Job Overview

Functional Area

Data

Work preferred

Work from Office

Experience

Min Experience

5 Years

Max Experience

7 Years

Description

About Fusemachines


Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey, Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in four countries (Nepal, the United States, Canada, and the Dominican Republic) and more than 350 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.


About the role:


This is a remote, 3-month contract role with a possibility of extension, with some working hours overlapping with US Eastern Time. The engineer in this role will be responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and advanced analytics).


Qualification / Skill Set Requirement:


  • 5+ years of real-world data engineering development experience in AWS (certifications preferred)
  • Strong programming skills in one or more languages such as Python or Scala, and proficiency in writing efficient, optimized code for data integration, storage, processing, and manipulation.
  • Strong knowledge of SDLC tools and technologies, including project management software (Jira or similar), source code management (GitHub or similar), CI/CD systems (GitHub Actions, AWS CodeBuild, or similar), and binary repository managers (AWS CodeArtifact or similar).
  • Good understanding of data modeling and database design principles, with the ability to design and implement efficient database schemas that meet the requirements of the data architecture and support data solutions.
  • Strong experience with relational (SQL) and NoSQL databases, including Postgres, MongoDB, and Elasticsearch.
  • Skilled in data integration from sources such as APIs, databases, flat files, and event streams.
  • Strong experience working with ELT and ETL tools, with the ability to develop custom integration solutions as needed.
  • Strong experience with scalable, distributed data technologies such as Spark/PySpark, dbt, and Kafka for handling large volumes of data.
  • Strong experience designing and implementing data warehousing solutions in AWS with Redshift, including demonstrated experience building efficient ELT/ETL processes that extract data from source systems, transform it (e.g., with dbt), and load it into the data warehouse.
  • Strong experience in orchestration using Apache Airflow.
  • Expertise in cloud computing on AWS, including deep knowledge of a variety of AWS services such as Lambda, Kinesis, S3, Lake Formation, EC2, ECS/ECR, IAM, CloudWatch, and Redshift.
  • Good understanding of data quality and governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.
  • Good understanding of BI solutions, including Looker and LookML (Looker Modeling Language).
  • Good problem-solving skills: able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues.
  • Strong leadership skills, with a willingness to lead, contribute ideas, and be assertive.


Responsibilities:


  • Design and develop high-performance, large-scale, complex data architectures that support data integration (batch and real-time), storage (data lakes, warehouses, marts, etc.), processing, orchestration, and infrastructure.
  • Ensure the scalability, reliability, and performance of data systems.
  • Mentor and guide junior/mid-level data engineers.
  • Collaborate with Product, Engineering, Data Scientists, and Analysts to understand data requirements and develop data solutions, including reusable components.
  • Evaluate and implement new technologies and tools to improve data integration, data processing, and analysis.
  • Care about architecture, observability, testing, and building reliable infrastructure and data pipelines.
  • Conduct discovery on existing data infrastructure and proposed architecture.
  • Evaluate, design, and implement data governance solutions — cataloging, lineage, quality, and data governance frameworks — suitable for a modern analytics solution, following industry-standard best practices and patterns.
  • Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive).
  • Be an active member of our Agile team, participating in all ceremonies and continuous improvement activities.


Equal Opportunity Employer: Fusemachines does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, protected veteran status, or any other legally protected group status.


Powered by JazzHR

Skills

Agile, Analytics, Continuous Improvement, Data Architecture, Data Governance, Data Integration, Data Modeling, Data Processing, Data Quality, Data Warehousing, Designing, ETL, Implementation, Project Management, Python, Quality, SQL, Strategy, Visualization