Job Description
Job Summary:
We are seeking a highly skilled Data Engineer to lead the development of our enterprise-grade Data Lakehouse platform on Databricks. This role is central to building scalable, reliable, and high-performance data pipelines that transform diverse data types (structured, semi-structured, and unstructured) into curated, analytics-ready assets.
Experience:
- 3 to 10 years of hands-on experience in data engineering, with a focus on building scalable data pipelines.
- Proven experience building and maintaining data platforms on a major cloud provider (Azure preferred).
- Hands-on expertise with the Databricks platform, including PySpark, Delta Lake, and performance tuning (a brief Delta Lake sketch follows this list).
- Practical experience with modern data transformation tools, preferably dbt.
- Strong understanding of data modeling concepts and architectural patterns (e.g., Kimball, Inmon, Data Vault).
- Exposure to CI/CD for data pipelines and infrastructure-as-code principles.
- Familiarity with data governance principles and tools like Unity Catalog.
- Experience connecting data platforms to BI tools like Power BI is a plus.
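For context on the Databricks and Delta Lake points above, the sketch below shows the flavor of day-to-day work: appending to a Delta table with schema evolution enabled, then reading an earlier version back. It assumes a Databricks runtime (or local PySpark with the open-source delta-spark package); the `events` table name is hypothetical.

```python
from pyspark.sql import SparkSession

# On Databricks a SparkSession is provided as `spark`; this builder is only
# needed for local runs with the open-source delta-spark package.
spark = (
    SparkSession.builder.appName("delta-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Append to a Delta table, letting new columns merge into the schema over time.
df = spark.createDataFrame([(1, "click"), (2, "view")], ["event_id", "event_type"])
(df.write.format("delta")
   .mode("append")
   .option("mergeSchema", "true")   # schema evolution on append
   .saveAsTable("events"))          # hypothetical table name

# Time travel: read the table as of an earlier version.
v0 = spark.read.option("versionAsOf", 0).table("events")
```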
Key Responsibilities:
- Architect and implement robust ELT pipelines to ingest data from a wide range of sources, including databases, APIs, files, and event streams, with support for schema evolution and data validation (an ingestion sketch follows this list).
- Design and manage end-to-end data workflows using the Medallion Architecture (Bronze, Silver, Gold) to incrementally refine and structure data for downstream consumption (a bronze-to-silver sketch follows this list).
- Develop scalable data transformation jobs using PySpark and SQL, leveraging frameworks like dbt to build modular, testable, and well-documented data models.
- Apply data governance and quality controls using Databricks Unity Catalog and Delta Live Tables (DLT), ensuring metadata management, lineage tracking, and secure access to sensitive data (a DLT sketch follows this list).
- Optimize the performance and cost-efficiency of Databricks jobs through effective partitioning, Z-Ordering, caching, and cluster tuning (an optimization sketch follows this list).
- Integrate the Lakehouse with downstream analytics and AI systems, exposing clean data views and APIs for BI dashboards and machine learning workflows.
- Implement CI/CD practices for data pipelines, collaborating with DevOps teams to automate testing, deployment, and monitoring using tools like Azure DevOps or GitHub Actions (a testing sketch follows this list).
- Ensure compliance with data privacy and security standards, including HIPAA and other relevant regulations.
- Communicate technical architecture and design decisions effectively to both technical and non-technical stakeholders.
- Evaluate and adopt emerging tools and best practices within the Databricks ecosystem to continuously improve platform capabilities.
- Contribute to internal data engineering libraries, reusable pipeline templates, and documentation to support team scalability.
- Work in an agile environment following Scrum or Kanban methodologies.
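To ground the ingestion bullet: the sketch below uses Databricks Auto Loader (the `cloudFiles` source) to incrementally ingest JSON files with schema evolution enabled. All paths and the `bronze.orders` table are hypothetical placeholders.

```python
# Incremental file ingestion with Auto Loader; runs as a triggered batch.
raw = (spark.readStream.format("cloudFiles")
       .option("cloudFiles.format", "json")
       .option("cloudFiles.schemaLocation", "/mnt/schemas/orders")   # inferred-schema store
       .option("cloudFiles.schemaEvolutionMode", "addNewColumns")    # evolve on new fields
       .load("/mnt/landing/orders/"))

(raw.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders_bronze")
    .option("mergeSchema", "true")
    .trigger(availableNow=True)      # process all available data, then stop
    .toTable("bronze.orders"))
```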
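The bronze-to-silver step of the Medallion pattern often amounts to deduplication, type enforcement, and basic validity filters, as in this sketch; the table and column names are illustrative.

```python
from pyspark.sql import functions as F

# Bronze -> Silver: deduplicate, enforce types, and drop clearly invalid rows.
bronze = spark.read.table("bronze.orders")
silver = (bronze
          .dropDuplicates(["order_id"])
          .withColumn("order_ts", F.to_timestamp("order_ts"))
          .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
          .filter(F.col("order_id").isNotNull()))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")
```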
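For the Delta Live Tables bullet, quality rules are declared as expectations; the sketch below drops rows that fail a rule and surfaces violation counts in the pipeline event log. The `dlt` module is only available inside a DLT pipeline, and the rule and table names are illustrative.

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Validated orders (illustrative)")
@dlt.expect_or_drop("valid_amount", "amount > 0")   # failing rows are dropped
def silver_orders():
    return (dlt.read_stream("bronze_orders")
            .withColumn("ingested_at", F.current_timestamp()))
```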
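For the performance bullet, file compaction and Z-Ordering are typically one-line SQL commands run from a notebook or scheduled job; the table and column names are placeholders.

```python
# Compact small files and co-locate rows by a frequently filtered column.
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Clean up files no longer referenced by the table (default retention applies).
spark.sql("VACUUM silver.orders")
```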
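On the CI/CD bullet, the testing half is simplest when transformations are factored into pure functions that pytest can exercise against a local SparkSession; this is a minimal sketch, with `add_order_flags` as a hypothetical transformation.

```python
from pyspark.sql import SparkSession, functions as F

def add_order_flags(df):
    """Transformation under test: flag high-value orders (hypothetical rule)."""
    return df.withColumn("is_high_value", F.col("amount") > 1000)

def test_add_order_flags():
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(1, 1500.0), (2, 10.0)], ["order_id", "amount"])
    result = {r["order_id"]: r["is_high_value"] for r in add_order_flags(df).collect()}
    assert result == {1: True, 2: False}
```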
Skills
PySpark, Databricks, Delta Lake, dbt, CI/CD & Deployment, Python, SQL, DevOps, Testing, Cloud
Important Dates & Deadlines
Application Deadline
08 Jan 26, 05:36 PM IST