Job Description
We're building a Multimodal AI Companion Platform (Telegram Bot + Web Dashboard) where creators can monetize personalized AI clones through text, voice, image, and video interactions.
We're looking for a Senior Machine Learning Engineer to own model fine-tuning, personalization pipelines, and production-grade ML delivery, working closely with backend systems under strict timelines.
This is not a research role. This is about shipping real ML into production, fast.
The Mission:
Phase 1: Build (MVP Delivery)
Fine-tune and adapt LLMs to capture creator personality and conversational style
Implement RAG pipelines that feel consistent, contextual, and in-character
Ship stable ML inference flows that integrate cleanly with backend APIs
Phase 2: Launch (Pilot Rollout)
Optimize latency, cost, and reliability for real user traffic
Ensure voice, image, and text generation pipelines are production-ready
Actively debug and refine models based on real interaction data
Phase 3: Scale (Quality + Cost Control)
Improve personalization quality without blowing up GPU costs
Help shape longer-term ML architecture (LoRAs, memory, inference patterns)
Support roadmap decisions with realistic ML constraints and tradeoffs
Core Responsibilities:
Model Fine-Tuning & Personalization
Fine-tune LLMs for conversational style, tone, and behavioral consistency
Train and manage LoRA adapters for image/video personalization
Continuously improve response quality using real interaction feedback
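The core idea behind the LoRA adapters mentioned above fits in a few lines: keep the pretrained weight matrix W frozen and train only a low-rank pair (A, B), adding their scaled product to the forward pass. A minimal pure-Python sketch with toy dimensions (no ML framework; real adapters use a training library such as PEFT):

```python
# Minimal LoRA forward pass: y = W x + (alpha / r) * B (A x)
# W is frozen; only A (r x d_in) and B (d_out x r) are trained.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=1.0):
    r = len(A)                        # rank of the adapter
    base = matvec(W, x)               # frozen pretrained path
    delta = matvec(B, matvec(A, x))   # low-rank trained path
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

# Toy example: d_in = d_out = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen identity weight
A = [[1.0, 1.0]]               # 1 x 2
B = [[0.5], [0.5]]             # 2 x 1
x = [2.0, 3.0]

y = lora_forward(W, A, B, x, alpha=1.0)
# base = [2, 3]; A x = [5]; B (A x) = [2.5, 2.5]; y = [4.5, 5.5]
```

Because only A and B are stored per creator, an adapter is tiny compared to the base model, which is what makes per-creator personalization affordable.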
RAG & Memory Systems
Design and maintain RAG pipelines using vector databases
Balance retrieval accuracy, latency, and token cost
Ensure long-term memory feels natural and coherent across sessions
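The retrieval step of such a pipeline can be sketched concretely: embed stored passages and the incoming query, rank by cosine similarity, and feed the top-k hits into the prompt. The `embed` function below is a deliberately toy bag-of-words stand-in; a real pipeline would use a learned embedding model and a vector database:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Rank stored docs by similarity to the query; return the top-k passages."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "the creator loves hiking and mountain photography",
    "billing questions go to support",
    "favorite hobby: hiking in the alps",
]
top = retrieve("creator hobbies like hiking", docs, k=2)
```

The latency/token-cost balance mentioned above shows up directly in `k`: more retrieved passages mean better recall but a longer, costlier prompt.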
Multimodal ML Pipelines
Own voice cloning and TTS workflows (training + inference)
Integrate STT, TTS, image, and video generation into a unified experience
Ensure outputs are reliable, high-quality, and monetization-ready
Production ML & Backend Integration
Build ML systems that work cleanly with FastAPI-based backends
Handle async workflows, background jobs, and inference queues
Optimize for serverless GPU execution and cost efficiency
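The async/queue pattern described above can be sketched with stdlib `asyncio`: requests go onto a bounded queue, worker tasks drain it and run inference, and callers await futures for their results. `run_model` here is a hypothetical placeholder for a real GPU inference call:

```python
import asyncio

async def run_model(prompt):
    """Hypothetical inference call -- stands in for a real GPU backend."""
    await asyncio.sleep(0.01)  # simulate inference latency
    return f"response to: {prompt}"

async def worker(queue):
    """Drain the queue, run inference, and resolve each caller's future."""
    while True:
        prompt, fut = await queue.get()
        fut.set_result(await run_model(prompt))
        queue.task_done()

async def main():
    queue = asyncio.Queue(maxsize=100)  # bounded: gives backpressure under load
    workers = [asyncio.create_task(worker(queue)) for _ in range(2)]
    loop = asyncio.get_running_loop()
    futs = []
    for prompt in ["hi", "tell me a story"]:
        fut = loop.create_future()
        await queue.put((prompt, fut))
        futs.append(fut)
    replies = await asyncio.gather(*futs)
    for w in workers:
        w.cancel()
    return replies

replies = asyncio.run(main())
```

The bounded queue is the key production detail: when traffic outpaces GPU capacity, `queue.put` blocks instead of letting memory and latency grow without limit.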
Delivery Under Tight Timelines
Ship working ML features within aggressive deadlines
Make pragmatic tradeoffs between perfect and production-ready
Take ownership when things break and fix them fast
Requirements:
Strong Applied ML Experience
5+ years in Machine Learning / Applied AI
Hands-on experience with LLM fine-tuning, embeddings, and RAG
Experience deploying ML systems used by real users
Backend-Aware Engineer
Strong Python skills beyond notebooks
Comfortable working with APIs, async flows, and production constraints
Experience collaborating closely with backend engineers
Multimodal Exposure (Preferred)
Voice cloning / TTS / STT pipelines
Image or video generation with diffusion models
Serverless GPU inference platforms
Execution Mindset
You prioritize shipping over theory
You're comfortable with ambiguity and fast decisions
You can work independently without heavy process overhead
Why Join
Real Ownership: You own ML quality, not just experiments
High Impact: Models you ship directly drive revenue
Speed: Small team, aggressive timelines, zero bureaucracy
Depth: Work across LLMs, RAG, voice, image, and infra, not just one slice
Skills
Python, Machine Learning, AI/ML
Important Dates & Deadlines
Application Deadline
28 Mar 26, 05:25 PM IST