Job Description
Role Overview:
We're looking for a Junior Data Engineer to join our Data Platform team. You'll design and maintain scalable data pipelines and architectures using AWS services, enabling reliable data movement, transformation, and analytics at scale.
You'll collaborate with analytics, product, and engineering teams to support reporting, dashboards, and insights for millions of students and schools.
Key Responsibilities:
- Design, build, and maintain ETL/ELT pipelines for large-scale data ingestion, transformation, and loading.
- Develop and optimize Spark and PySpark jobs for batch and real-time data processing.
- Work with AWS services such as S3, Glue, Lambda, Redshift, Athena, and EMR to manage the data ecosystem.
- Support the design and implementation of Data Lake and Data Warehouse architectures.
- Implement data validation, partitioning, and schema management for efficient query performance.
- Collaborate with data analysts and BI teams to ensure data availability and consistency.
- Maintain data lineage, metadata, and ensure data quality and governance.
- Implement monitoring and alerting for data ingestion and transformation pipelines.
- Use Git and CI/CD tools to manage code and automate deployment of data workflows.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, Data Engineering, or a related field.
- 1-3 years of hands-on experience in data engineering, data pipeline development, or cloud-based data systems.
- Strong knowledge of SQL and experience with Python or PySpark.
- Practical experience with the AWS data stack (S3, Glue, Lambda, Redshift, Athena, EMR, Step Functions, etc.).
- Understanding of data lake architecture, ETL/ELT frameworks, and data warehousing concepts.
- Familiarity with Delta Lake, Spark SQL, or big data frameworks.
- Good understanding of data modeling, partitioning, and performance tuning.
- Excellent analytical, troubleshooting, and collaboration skills.
Good to Have / Plus
- Exposure to GCP (BigQuery, Dataflow, Cloud Storage) or Azure (Data Factory, Synapse, ADLS, Databricks).
- Experience with Databricks for scalable data processing and Delta Lake management.
- Knowledge of PostgreSQL, MySQL, or NoSQL databases.
- Familiarity with Airflow, Step Functions, or other orchestration tools.
- Understanding of DevOps practices, CI/CD pipelines, and infrastructure automation.
- Experience working in an EdTech or public data ecosystem is an advantage.
Skills
Data Validation, Big Data, Python, Data Modeling, Data Warehousing, Data Warehousing Concepts, ETL, Data Processing, Implementation, MySQL, Data Engineering, Analytics, SQL
Important Dates & Deadlines:
Application Deadline
28 Mar 26, 03:40 PM IST