Data Engineer

Department: Data Science Analytics & Machine Learning
149+ Applicants
Posted: 1 month ago
3-6 years
Bengaluru / Bangalore, Karnataka
work from office

Job Description

Synopsis of the role
As a Data Engineer, you will be a key builder within our data ecosystem, responsible for developing and maintaining the scalable data pipelines that power our business. Working closely with Lead Engineers and Architects, you will use Azure Databricks, PySpark, and Azure Data Factory to transform raw data into actionable insights. You will apply software engineering best practices to data processing, ensuring our Medallion Architecture remains performant, reliable, and secure.
What you'll do
As a Data Engineer, you will focus on the development, automation, and optimization of our cloud data platform. Your core responsibilities include:
  • Pipeline Development: Build and deploy robust ETL/ELT workflows using Azure Data Factory (ADF) to ingest data from diverse internal and external sources.
  • Spark Engineering: Write clean, efficient PySpark code to perform complex data transformations, ensuring optimal resource utilization on Databricks clusters.
  • Lakehouse Maintenance: Develop and manage Delta Lake tables across Bronze, Silver, and Gold layers, implementing schema enforcement and data quality checks.
  • Data Modeling: Translate business requirements into physical data models, implementing Star Schemas and dimensional modeling to support BI tools like Power BI.
  • SQL Optimization: Author and tune sophisticated SQL queries for data validation, ad-hoc analysis, and reporting layer performance.
  • Data Governance Support: Work within Unity Catalog to manage data assets, ensuring proper tagging, documentation, and adherence to access control policies.
  • Automated Testing & CI/CD: Participate in the full DevOps lifecycle, writing unit tests for Spark logic and using Azure DevOps for continuous integration and deployment.
  • Monitoring & Troubleshooting: Proactively monitor pipeline health, identify bottlenecks, and resolve production issues to maintain high data availability.
What Experience You Need
  • Total Data Engineering Experience: 3-6 years of hands-on experience in data engineering, ETL development, or backend software engineering with a data focus.
  • Azure Foundations: 4+ years of experience working within the Azure cloud environment (Storage Accounts, Key Vault, Resource Groups).
  • Databricks & PySpark: 4+ years of experience building data transformation logic specifically using Databricks and Spark (Python preferred).
  • Relational Mastery: 3+ years of strong SQL skills, with a deep understanding of joins, window functions, and query execution plans.
  • Orchestration: Proven experience building multi-stage pipelines in Azure Data Factory or similar tools (e.g., Airflow, Synapse Pipelines).
  • Data Modeling Basics: Solid understanding of data warehousing concepts, including slowly changing dimensions (SCD) and Fact/Dimension table design.
  • Education/Certifications: Bachelor's degree in CS or a related field. An Azure Data Engineer Associate (DP-203) certification is highly preferred.
What could set you apart
Modern Data Stack Features
  • Delta Live Tables (DLT): Experience using DLT to simplify streaming and batch ETL development.
  • Databricks SQL: Familiarity with configuring SQL Warehouses for analyst self-service.
Software Engineering Rigor
  • Testing Frameworks: Experience with pytest or chispa for validating Spark transformations.
  • Python Proficiency: Strong general-purpose Python skills beyond just Spark (API integrations, automation scripts).
Performance & Scaling
  • Partitioning & Z-Ordering: Deep understanding of how to optimize Delta tables for large-scale query performance.
  • Streaming: Experience with Structured Streaming for real-time data ingestion from Event Hubs or Kafka.
Security & Compliance
  • Networking: Understanding of Azure VNet integration, Private Links, and secure data transit.
  • Data Privacy: Experience implementing data masking or encryption at rest/in transit.
Advanced Data Governance & Security
  • Unity Catalog Implementation: Experience configuring and managing Unity Catalog for fine-grained access control (Row-Level Security and Column-Level Masking) and tracking end-to-end data lineage.
  • Data Quality Frameworks: Expertise in building automated data validation using frameworks like Great Expectations or Databricks Expectations (DLT) to ensure data integrity before it reaches the Gold layer.
  • Metadata Management: Ability to maintain a searchable data catalog, ensuring all assets are tagged for PII (Personally Identifiable Information) and comply with GDPR/CCPA regulations.
Sophisticated CI/CD & DataOps
  • Infrastructure as Code (IaC): Proficiency in using Terraform or Bicep to deploy and manage Azure resources (Databricks workspaces, Key Vaults, Storage Accounts) as code.
  • Automated Testing Suites: Experience implementing a Test-Driven Development (TDD) approach for data, using pytest or chispa to run unit tests on PySpark transformations within the build pipeline.
  • Azure DevOps Integration: Mastery of YAML-based Azure Pipelines for automated deployment, including specialized tasks for Databricks Asset Bundles (DABs) or the Databricks CLI.
  • Environment Parity & Promotion: Proven ability to manage complex deployment patterns (Dev > QA > Prod) ensuring seamless promotion of code, ADF triggers, and Databricks job configurations.
  • Monitoring & Alerting: Experience setting up proactive monitoring using Azure Monitor and Log Analytics to track pipeline failures and cluster performance in real time.
#India

Skills

Data Validation, Python, Data Governance, Data Integrity, Data Modeling, Data Privacy, Data Warehousing, Data Warehousing Concepts, ETL, Data Processing, Implementation, Data Engineer, Analyst, Analytics, SQL

About Company

Equifax is a global data, analytics, and technology company. We believe knowledge drives progress. We blend unique data, analytics and technology with a passion for serving customers globally, to create insights that power decisions to move people forward.

Important dates & deadlines

Application Deadline

30 May 26, 04:23 PM IST
