
The cloud monitoring platform engineers love
Datadog (NYSE: DDOG) is a leading cloud observability platform that provides monitoring and analytics for applications, infrastructure, and logs. Trusted by over 26,000 customers including major companies like Netflix, Samsung, and Airbnb, Datadog is headquartered in New York City. The company went ...
Datadog offers competitive salaries, equity options, generous PTO policies, and a flexible remote work policy. Employees also benefit from a learning ...
Datadog fosters an engineering-first culture, with engineers making up 70% of its workforce. The company emphasizes a strong focus on solving complex...

Datadog • New York, New York, USA
Datadog is seeking a Staff Generative AI Engineer to lead machine learning projects within its Application Performance Monitoring (APM) team. You'll design, train, and deploy GenAI/ML models at scale, collaborating with cross-functional teams. This role requires deep expertise in GenAI and machine learning.
You have extensive experience in Generative AI and machine learning, with a proven track record of driving impactful projects from concept to production. As a technical leader, you excel in building and benchmarking GenAI/ML models using state-of-the-art techniques. Your strong communication skills enable you to collaborate effectively with cross-functional teams, influencing product direction and driving innovation.
You are product-minded and understand the importance of user experience and performance optimization in application monitoring. Your background includes work with automated investigation and triaging tools, which prepares you to contribute to solutions that improve application performance visibility. You thrive in a hybrid workplace and value collaboration and creativity in your work environment.
Experience with distributed tracing, profiling, and telemetry data is a plus, as it aligns with the goals of the APM team. Familiarity with cloud platforms and data visualization tools will further enhance your contributions to the team.
In this role, you will act as a technical leader within the APM organization, driving Generative AI and machine learning projects from concept to production. You will build and benchmark GenAI/ML models, collaborating closely with cross-functional teams to develop automated investigation and triaging tools. Your influence will shape product direction, ensuring that the APM team remains at the forefront of application performance monitoring.
You will lead efforts to design, train, evaluate, and deploy machine learning models at scale, focusing on agentic workflows that enhance user experience. Your work will involve troubleshooting issues and optimizing services, providing deep visibility into applications for users. You will also mentor junior engineers, fostering a culture of learning and innovation within the team.
Datadog values a collaborative office culture that encourages creativity and relationship-building among team members. As part of a hybrid workplace, you will have the flexibility to create the work-life harmony that suits you best. You will join a team dedicated to becoming a world leader in agentic investigations and incident troubleshooting, making a significant impact in the field of application performance monitoring.