Data Engineer

Equifax India

Data Science Analytics & Machine Learning

149+ Applicants

Posted: 1 month ago

3-6 years

Bengaluru / Bangalore, Karnataka

Work from office

Job Description


Synopsis of the role
As a Data Engineer, you will be a key builder within our data ecosystem, responsible for developing and maintaining the scalable data pipelines that power our business. Working closely with Lead Engineers and Architects, you will use Azure Databricks, PySpark, and Azure Data Factory to transform raw data into actionable insights. You will apply software engineering best practices to data processing, ensuring our Medallion Architecture remains performant, reliable, and secure.
What you'll do
As a Data Engineer, you will focus on the development, automation, and optimization of our cloud data platform. Your core responsibilities include:

Pipeline Development: Build and deploy robust ETL/ELT workflows using Azure Data Factory (ADF) to ingest data from diverse internal and external sources.
Spark Engineering: Write clean, efficient PySpark code to perform complex data transformations, ensuring optimal resource utilization on Databricks clusters.
Lakehouse Maintenance: Develop and manage Delta Lake tables across Bronze, Silver, and Gold layers, implementing schema enforcement and data quality checks.
Data Modeling: Translate business requirements into physical data models, implementing Star Schemas and dimensional modeling to support BI tools like Power BI.
SQL Optimization: Author and tune sophisticated SQL queries for data validation, ad-hoc analysis, and reporting layer performance.
Data Governance Support: Work within Unity Catalog to manage data assets, ensuring proper tagging, documentation, and adherence to access control policies.
Automated Testing & CI/CD: Participate in the full DevOps lifecycle, writing unit tests for Spark logic and using Azure DevOps for continuous integration and deployment.
Monitoring & Troubleshooting: Proactively monitor pipeline health, identify bottlenecks, and resolve production issues to maintain high data availability.

What Experience You Need

Total Data Engineering Experience: 3-6 years of hands-on experience in data engineering, ETL development, or backend software engineering with a data focus.
Azure Foundations: 4+ years of experience working within the Azure cloud environment (Storage Accounts, Key Vault, Resource Groups).
Databricks & PySpark: 4+ years of experience building data transformation logic specifically using Databricks and Spark (Python preferred).
Relational Mastery: 3+ years of strong SQL skills, with a deep understanding of joins, window functions, and query execution plans.
Orchestration: Proven experience building multi-stage pipelines in Azure Data Factory or similar tools (e.g., Airflow, Synapse Pipelines).
Data Modeling Basics: Solid understanding of data warehousing concepts, including slowly changing dimensions (SCD) and Fact/Dimension table design.
Education/Certifications: Bachelor's degree in CS or a related field. An Azure Data Engineer Associate (DP-203) certification is highly preferred.

What could set you apart
Modern Data Stack Features

Delta Live Tables (DLT): Experience using DLT to simplify streaming and batch ETL development.
Databricks SQL: Familiarity with configuring SQL Warehouses for analyst self-service.

Software Engineering Rigor

Testing Frameworks: Experience with pytest or chispa for validating Spark transformations.
Python Proficiency: Strong general-purpose Python skills beyond just Spark (API integrations, automation scripts).

Performance & Scaling

Partitioning & Z-Ordering: Deep understanding of how to optimize Delta tables for large-scale query performance.
Streaming: Experience with Structured Streaming for real-time data ingestion from Event Hubs or Kafka.

Security & Compliance

Networking: Understanding of Azure VNet integration, Private Links, and secure data transit.
Data Privacy: Experience implementing data masking or encryption at rest/in transit.

Advanced Data Governance & Security

Unity Catalog Implementation: Experience configuring and managing Unity Catalog for fine-grained access control (Row-Level Security and Column-Level Masking) and tracking end-to-end data lineage.
Data Quality Frameworks: Expertise in building automated data validation using frameworks like Great Expectations or Databricks Expectations (DLT) to ensure data integrity before it reaches the Gold layer.
Metadata Management: Ability to maintain a searchable data catalog, ensuring all assets are tagged for PII (Personally Identifiable Information) and comply with GDPR/CCPA regulations.

Sophisticated CI/CD & DataOps

Infrastructure as Code (IaC): Proficiency in using Terraform or Bicep to deploy and manage Azure resources (Databricks workspaces, Key Vaults, Storage Accounts) as code.
Automated Testing Suites: Experience implementing a Test-Driven Development (TDD) approach for data, using pytest or chispa to run unit tests on PySpark transformations within the build pipeline.
Azure DevOps Integration: Mastery of YAML-based Azure Pipelines for automated deployment, including specialized tasks for Databricks Asset Bundles (DABs) or the Databricks CLI.
Environment Parity & Promotion: Proven ability to manage complex deployment patterns (Dev > QA > Prod) ensuring seamless promotion of code, ADF triggers, and Databricks job configurations.
Monitoring & Alerting: Experience setting up proactive monitoring with Azure Monitor and Log Analytics to track pipeline failures and cluster performance in real time.

Skills

Data Validation, Python, Data Governance, Data Integrity, Data Modeling, Data Privacy, Data Warehousing, Data Warehousing Concepts, ETL, Data Processing, Implementation, Data Engineer, Analyst, Analytics, SQL

About Company

Equifax is a global data, analytics, and technology company. We believe knowledge drives progress. We blend unique data, analytics and technology with a passion for serving customers globally, to create insights that power decisions to move people forward.

Important dates & deadlines

Application Deadline

30 May 26, 04:23 PM IST
