Senior Data Engineer

Department: Data Science Analytics & Machine Learning
149+ Applicants
Posted: 1 week ago
6-8 years
Bengaluru / Bangalore, Karnataka
work from office

Job Description

Staff/Senior Data Engineer: AI Training Data (2-4 Months Contract)

Location: Remote

Role Type: Contract (2-4 Months)

Time Commitment: 40 hrs/week (Full-time availability required)

Compensation: Hyper-competitive hourly rate (matching Tier-1 Staff engineering bands)

Experience: 6-12+ years

About BespokeLabs

BespokeLabs is a premier, VC-backed AI research lab with an exceptionally talent-dense team of IIT and Ivy League alumni. We don't just build tooling around AI; we build the massive-scale data systems and reasoning architectures that directly power next-generation models. Our research shapes the frontier of AI: we've published breakthroughs like GEPA, driven foundational datasets like OpenThoughts, and shipped state-of-the-art models including Bespoke-MiniCheck and Bespoke-MiniChart. More on our website https://www.bespokelabs.ai/ :)

Role Overview

We are looking for a top-tier Senior/Staff Data Engineer for a high-impact, 2-4 month sprint. You will leverage your deep expertise in enterprise-grade data platforms to architect and build the complex curation systems required for advanced AI model training.

This is not a traditional ETL pipeline role. We need a heavy-hitter who has already operated production data platforms at scale inside large, complex organizations (FAANG, Fortune 100). You will use the mental models, architectural intuition, and coding skills you've developed over your career to generate, transform, and evaluate the data that trains the next generation of AI.

What You Will Do (The Contract)

  • Architect AI-Scale Systems: Design the overarching data architecture and processing topology needed to programmatically curate and shape datasets at TB/PB scale, ensuring low latency and high consistency.
  • Hands-On Development: Write production-grade code (Python/Scala, Spark) to build custom ingestion logic, highly efficient transformation scripts, and performant data validation checks.
  • Complex Data Logic: Implement advanced filtering, deduplication, and quality-scoring algorithms at scale, ensuring the resulting data objects are optimized for LLM/ML consumption.
  • Quality & Performance Tuning: Rigorously test, benchmark, and optimize processing workloads (CPU/memory tuning, partitioning strategies in Spark/Iceberg) to meet aggressive throughput targets.
  • Domain Subject Matter Expert: Act as the ultimate technical authority on distributed systems, data processing, and cloud structures to ensure the training data factory meets enterprise-grade accuracy.

What You Bring to the Table (Your Past Experience)

To be successful in this contract, you must have a track record of:

  • End-to-End Ownership: Designing and owning enterprise data platforms (batch + streaming).
  • High-Throughput Processing: Building and operating Kafka-first streaming pipelines.
  • Lakehouse Architecture: Utilizing Apache Iceberg, Delta Lake, or Hudi for analytics and ML at scale.
  • Reliability Engineering: Ensuring data reliability through SLAs, monitoring, backfills, and recovery.
  • Scale: Processing billions of events and managing TB/PB-scale data systems.

Required Qualifications (Non-Negotiable)

  • Experience: 6+ years of Data Engineering experience.
  • Seniority: Demonstrated Senior/Staff-level ownership of production data platforms.
  • Pedigree: Background at Tier-1 enterprises (FAANG, large SaaS, Fortune 100).
  • Technical Stack: Deep fluency in Python/Scala, Spark, Kafka, Airflow, and major cloud data warehouses (Snowflake, BigQuery, Redshift).

Skills

Data Validation, Python, Data Architecture, ETL, Data Processing, Snowflake, Data Engineer, Analytics, AI, ML

About Company

Bespoke Labs is a technology company that specializes in building custom software solutions for businesses. They focus on understanding client needs and delivering tailored software, web, and mobile applications.

Important dates & deadlines

Application Deadline

28 Mar 26, 05:25 PM IST
