Work from Office
Data Engineer III
- Define, implement, and operate the strategy for robust, scalable data pipelines for business intelligence and machine learning.
- Develop and maintain the core data framework and key infrastructure
- Build and support ETL pipelines that move data reliably from existing and new sources into our data warehouse.
- Design the data warehouse and data models for efficient, cost-effective reporting
- Collaborate with data analysts, data scientists, and other data consumers within the business to manage the data warehouse table structure and optimize it for reporting.
- Continuously improve the software development process and team productivity
- Define and implement Data Governance processes related to data discovery, lineage, access control and quality assurance
- Perform code reviews and QA the data imported by various processes
- 6-10 years of experience.
- 3+ years of experience in data engineering and data infrastructure with big data technologies such as Hive, Spark, PySpark (batch and streaming), Airflow, Redshift, and Delta Lake.
- Experience in product-based companies or startups.
- Strong understanding of data warehousing concepts and the data ecosystem.
- Strong experience architecting, developing, and maintaining solutions on AWS.
- Experience building data pipelines and operating them after deployment.
- Experience building data pipelines from business applications using their APIs.
- Prior experience with Databricks is a big plus.
- Understanding of DevOps is preferred, though not required
- Working knowledge of BI tools such as Metabase or Power BI is a plus
- Experience architecting systems for data access is a major plus.