Job Description
As Point72 reimagines the future of investing, our Technology group is constantly improving our company's IT infrastructure, positioning us at the forefront of a rapidly evolving technology landscape. We're a team of experts experimenting, discovering new ways to harness the power of open-source solutions, and embracing enterprise agile methodology. We encourage professional development to ensure you bring innovative ideas to our products while satisfying your own intellectual curiosity.
What you'll do
- Own the day-to-day operational health of compliance-approved AI and Data platforms in production, ensuring high availability, performance, and reliability
- Monitor AI and Data services, model inference layers, APIs, and data dependencies using logs, metrics, dashboards, and alerts
- Provide production-focused user support for AI tools and Data platforms, prioritizing issue resolution
- Lead incident triage, coordination, and resolution for platform outages or service degradations, partnering with development and infrastructure teams
- Perform deep technical troubleshooting across applications, data, and system layers
- Enhance observability, alerting, and operational runbooks to reduce mean time to detect (MTTD) and mean time to resolve (MTTR) incidents
- Conduct post-incident root cause analysis and drive corrective and preventive improvements
- Support production deployments, configuration changes, and platform upgrades with a strong focus on risk mitigation and stability
- Automate repetitive operational tasks and support workflows using Python and other scripting tools
- Collaborate closely with AI and Data engineering teams to improve platform resilience, scalability, and overall supportability
What's required
- Bachelor's degree in Computer Science, Engineering, Mathematics, Physics, or a related technical discipline
- 3 to 6 years of experience in application support, production engineering, SRE, or platform operations roles
- Strong proficiency in Python for debugging, automation, and operational tooling, as well as SQL for data validation, issue investigation, and platform troubleshooting
- Working knowledge of cloud operations (preferably AWS and Azure), Windows environments, .NET applications, SQL Server, and Databricks
- Prior experience supporting Reference/Alternate Data applications is an added advantage
- Good understanding of production systems, APIs, and distributed services
- Experience supporting or operating AI/ML platforms, with knowledge of model serving, inference pipelines, and dependency management
- Excellent analytical, troubleshooting, and incident management skills
- Commitment to the highest ethical standards
We invest in our people, their careers, their health, and their well-being. When you work here, we provide:
- Health care benefits
- Maternity, Adoption & related leave policies
- Generous paternity and family care leave policies
- Employee Assistance Program & Mental wellness programs
- Transportation support
- Tuition assistance
Point72 is a leading global alternative investment firm led by Steven A. Cohen. Building on more than 30 years of investing experience, Point72 seeks to deliver superior returns for its investors through fundamental and systematic investing strategies across asset classes and geographies. We aim to attract and retain the industry's brightest talent by cultivating an investor-led culture and committing to our people's long-term growth. For more information, visit https://point72.com/.
Skills
Data Validation, Python, Incident Management, Root Cause Analysis, AI/ML, AI, ML, SQL
Important dates & deadlines
Application Deadline
15 Jul 26, 03:04 PM IST

