Staff Software Engineer - ML Observability

Datadog
Posted 1 April 2026

About the Role

The ML Observability team builds cutting-edge tools to monitor, explain, and improve AI systems in production, particularly those leveraging Large Language Models (LLMs) and generative AI. We provide robust, scalable observability for AI workloads, including drift detection, model evaluation, and behavior tracing, enabling customers to ship AI with confidence.

As a Staff Engineer, you’ll lead the development of new features and foundational capabilities within Datadog’s LLM Observability product. You will shape product direction, drive experimentation, and apply your deep understanding of both AI systems and software engineering to solve open-ended problems in the fast-moving AI landscape. Your work will directly impact how our customers monitor, troubleshoot, and optimize LLM-based applications in production.

Join us in building the foundational tools that make AI systems observable, understandable, and reliable in the real world.

At Datadog, we place value in our office culture - the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.

What You’ll Do:

- Drive design and implementation of LLM observability features
- Ideate, prototype, and scale new product features to provide insights and drive improvements for generative AI systems
- Work cross-functionally with other engineering teams, product, UX, and applied science to iterate fast and find product-market fit
- Develop and extend tools for tracing, evaluating, and debugging LLMs
- Influence architecture decisions and mentor engineers to build resilient, high-performance systems
- Stay close to customer pain points and use those insights to guide product and engineering priorities
- Stay current with industry trends and advancements in machine learning and observability, driving innovation within the team
