Data Scientist

Department: Data Science Analytics & Machine Learning
149+ Applicants
Posted: 2 weeks ago
6-8 years
Bengaluru / Bangalore, Karnataka
work from office



Job Description

Senior Data Scientist: AI Training Data (2-4 Months Contract)

Company: BespokeLabs (VC-backed; founded by IIT & Ivy League alumni)

Location: Remote

Role Type: Contract (2-4 Months)

Time Commitment: 40 hrs/week (Full-time availability required)

Compensation: Hyper-competitive hourly rate (matching top-tier Senior Data Scientist bands)

Experience: 6+ years

About BespokeLabs

BespokeLabs is a premier, VC-backed AI research lab with an exceptionally talent-dense team of IIT and Ivy League alumni. We don't just build tooling around AI; we build the massive-scale data systems and reasoning architectures that directly power next-generation models. Our research shapes the frontier of AI: we've published breakthroughs like GEPA, driven foundational datasets like OpenThoughts, and shipped state-of-the-art models including Bespoke-MiniCheck and Bespoke-MiniChart. More on our website: bespokelabs.ai :)

Role Overview

We are looking for a high-impact Senior Data Scientist for an intensive 2-4 month sprint. You will leverage your deep expertise in production-grade machine learning and applied statistics to develop the algorithms and logic that curate and evaluate datasets for advanced AI model training.

This is not a traditional model-building or research role. We need a seasoned practitioner who has already owned the end-to-end DS lifecycle at scale. You will use your intuition for feature engineering, statistical validity, and large-scale data processing to programmatically generate, shape, and validate AI training data.

What You Will Do (The Contract)

  • Algorithm Design: Design and implement custom statistical models and programmatic logic (e.g., anomaly detection, active learning, similarity scoring) to evaluate data quality, complexity, and redundancy at scale.
  • Hands-on At-Scale Coding: Write scalable PySpark and Python (NumPy/Pandas) code to apply these algorithms across massive datasets, translating experimental logic into reliable, large-scale workflows.
  • Metric Formulation: Develop custom quantitative metrics and heuristic benchmarks to rigorously assess the fidelity and suitability of data subsets for specific AI training objectives.
  • Validation & Iteration: Run high-speed validation cycles, analyzing the output of data-curation algorithms to diagnose skew, bias, or noise, and iteratively refining the logic.
  • High-Level Curation: Apply senior-level domain expertise in predictive modeling and feature engineering to ensure the final training inputs meet the strict standards required for state-of-the-art ML systems.

What You Bring to the Table (Your Past Experience)

To be successful in this contract, you must have a track record of:

  • The End-to-End DS Lifecycle: Framing problems, modeling, validation, production, and iteration.
  • Production Ownership: Building and deploying ML and statistical models on large-scale datasets.
  • Large-Scale Data Processing: Working with Apache Spark to develop scalable feature pipelines and offline training workflows.
  • Experimentation: Designing and analyzing rigorous experiments (A/B tests, causal inference).
  • Impact: Translating complex model outputs into clear product and business decisions.

Required Qualifications (Non-Negotiable)

  • Experience: 6+ years as a Data Scientist or Applied Scientist.
  • Production Background: Proven ownership of models running in production environments.

  • Applied Statistics: Strong background in applied statistics and experimentation frameworks.

Core Technical Skills

  • Languages: Python (NumPy, Pandas, Scikit-learn, PyTorch / TensorFlow) and strong SQL.
  • Big Data: Apache Spark (PySpark or Spark SQL) for large-scale data processing.
  • Methodologies: Feature engineering, model evaluation, statistical modeling, and hypothesis testing.

Strong Signals (Highly Valued)

  • Scale: Models trained on TB-scale datasets.
  • Domain Specificity: Experience in high-complexity domains such as recommendations, pricing, fraud / risk, search / ranking, or growth & experimentation.
  • Collaboration: Experience deploying models alongside data engineering pipelines.

Out of Scope (Who Should Not Apply)

  • BI / reporting-only roles
  • SQL-only analysts
  • Research-only ML roles with no production ownership
  • Early-career profiles

Skills

Big Data, Python, Data Processing, Machine Learning, Predictive Modeling, Statistical Modeling, Applied Scientist, Data Scientist, AI, ML, SQL


About Company

Bespoke Labs is a technology company that specializes in building custom software solutions for businesses. They focus on understanding client needs and delivering tailored software, web, and mobile applications.

Important dates & deadlines

Application Deadline

28 Mar 26, 03:40 PM IST
