Job Description
- 7+ years of related experience with a bachelor's degree.
- Proven experience designing and deploying applications using Generative AI and large language models (LLMs) (e.g., GPT-4, Claude, open-weight models).
- Understanding of retrieval-augmented generation, embeddings-based search, agent orchestration, or prompt chaining.
- Familiarity with modern LLM/GenAI tools such as LangChain, LlamaIndex, HuggingFace Transformers, Semantic Kernel, or LangGraph.
- Advanced knowledge of SQL and query authoring, with working experience across relational and NoSQL databases (e.g., SQL Server).
- Experience building and optimizing data pipelines on Azure Databricks.
- In-depth knowledge of data engineering, machine learning, data warehousing, and Delta Lake on Databricks.
- Strong knowledge of Spark and Python.
- A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
- Excellent stakeholder management and communication skills for working effectively across global teams.
- Familiarity with Fivetran.
- Familiarity with BI tools such as Power BI.
- Understanding of building and deploying ML and feature engineering pipelines to production using MLflow.
- Experience building data pipelines from business applications such as Salesforce and NetSuite.
- Knowledge of message queuing, stream processing, and highly scalable data stores.
- Experience working in a compliance-based environment, including building and deploying compliant software solutions throughout the software life cycle.
- Familiarity with cloud-based AI/ML services and Generative AI tools.
Responsibilities:
- Design and develop systems to maintain Azure Databricks, ETL processes, business intelligence, and data ingestion pipelines for AI/ML use cases.
- Build, scale, and optimize GenAI and ML workloads across Databricks and other production environments, with strong attention to cost-efficiency, compliance, and robustness.
- Build ML pipelines to train, serve, and monitor reinforcement learning or supervised learning models using Databricks and MLflow.
- Create and support ETL pipelines and table schemas to facilitate the integration of new and existing data sources into the Lakehouse on Databricks.
- Maintain data governance and data privacy standards.
- Collaborate with data architects, data scientists, analysts, and other business consumers to analyze business requirements quickly and thoroughly, and to populate a data warehouse optimized for reporting and analytics.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain technical documentation and mentor junior data engineers on best practices in data engineering and Lakehouse architecture.
- Drive innovation and contribute to the development of cutting-edge Generative AI and analytical capabilities for the Next-Gen research enablement platform.
We Offer:
- US and EU projects based on advanced technologies.
- Competitive compensation based on skills and experience.
- Regular performance appraisals to support your growth.
- 15 vacation days, 10 national holidays, 5 sick days.
- Free tech webinars and meetups organized by Svitla.
- Reimbursement for private medical insurance.
- Personalized learning program tailored to your interests and skill development.
- Bonuses for article writing, public talks, and other activities.
- Fun corporate online/offline celebrations and activities.
- Awesome team, friendly and supportive community!
Skills
Python, Data Governance, Data Privacy, Data Warehousing, ETL, Machine Learning, Reporting and Analytics, Root Cause Analysis, AI/ML, Large Language Models, Analytics, AI, ML, SQL, AI Engineer
Application Deadline: 13 Jul 26, 07:03 PM IST