
Staff Engineer, Datacenter Server Lifecycle

anthropic
Posted 29 April 2026

About the Role

<div class="content-intro"><h2><strong>About Anthropic</strong></h2> <p>Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p></div><h2 class="text-text-100 mt-3 -mb-1 text-[1.125rem] font-bold" data-sourcepos="7:1-7:18;390-407">About the role</h2> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]" data-sourcepos="9:1-9:529;409-937">Anthropic is expanding beyond cloud infrastructure, and this role sits at the heart of that effort. As a Staff Engineer on the Datacenter Server Lifecycle team, you will own the end-to-end operational journey of every machine in our facility — from initial provisioning and deployment, across its working life, through maintenance and refresh, and all the way to decommissioning. This is greenfield work: you will help define the processes, tooling, and operational standards that govern how we run and retire hardware at scale.</p> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]" data-sourcepos="11:1-11:609;939-1547">A distinguishing aspect of this role is its deep intersection with security. The machines in our datacenter handle some of the most sensitive workloads in AI — training frontier models and serving millions of users interacting with Claude. Ensuring that every machine in the fleet is trusted, attested, and operating with a verified chain of integrity from the hardware up is a core part of the job, not an afterthought. 
You will partner closely with our Infrastructure Security team to define and enforce trusted compute standards across the lifecycle, from secure provisioning through end-of-life handling.</p> <h2 class="text-text-100 mt-3 -mb-1 text-[1.125rem] font-bold" data-sourcepos="13:1-13:24;1549-1572">Key responsibilities</h2> <ul class="[li_&amp;]:mb-0 [li_&amp;]:mt-1 [li_&amp;]:gap-1 [&amp;:not(:last-child)ul]:pb-1 [&amp;:not(:last-child)ol]:pb-1 list-disc flex flex-col gap-1 pl-8 mb-3" data-sourcepos="15:1-19:126;1574-2312"> <li class="whitespace-normal break-words pl-2" data-sourcepos="15:1-15:99;1574-1672">Lead the build-out of automation to support datacenters containing tens of thousands of servers.</li> <li class="whitespace-normal break-words pl-2" data-sourcepos="16:1-16:298;1673-1970">Define and own the end-to-end server lifecycle strategy — from provisioning and deployment through operation, maintenance, refresh, and decommissioning — and maintain automation and operational procedures for common lifecycle events (e.g., hardware failures, firmware upgrades, fleet rotations).</li> <li class="whitespace-normal break-words pl-2" data-sourcepos="17:1-17:124;1971-2094">Partner closely with Infrastructure Security to design and enforce trusted compute standards across the server lifecycle.</li> <li class="whitespace-normal break-words pl-2" data-sourcepos="18:1-18:92;2095-2186">Work closely with our Networking team to ensure end-to-end connectivity across all sites.</li> <li class="whitespace-normal break-words pl-2" data-sourcepos="19:1-19:126;2187-2312">Build and maintain tooling to track machine health, configuration, and operational status across the full datacenter fleet.</li> </ul> <h2 class="text-text-100 mt-3 -mb-1 text-[1.125rem] font-bold" data-sourcepos="21:1-21:26;2314-2339">Minimum qualifications</h2> <ul class="[li&amp;]:mb-0 [li&amp;]:mt-1 [li_&amp;]:gap-1 [&amp;:not(:last-child)_ul]:pb-1 [&amp;:not(:last-child)_ol]:pb-1 list-disc flex flex-col gap-1 pl-8 
mb-3" data-sourcepos="23:1-29:79;2341-3090"> <li class="whitespace-normal break-words pl-2" data-sourcepos="23:1-23:139;2341-2479">Hands-on experience with server hardware, including rack deployment, cabling, troubleshooting, and understanding failure modes at scale.</li> <li class="whitespace-normal break-words pl-2" data-sourcepos="24:1-24:156;2480-2635">End-to-end understanding of hardware lifecycle management: asset tracking, provisioning workflows, maintenance scheduling, and decommissioning practices.</li> <li class="whitespace-normal break-words pl-2" data-sourcepos="25:1-25:86;2636-2721">Proficiency in at least one programming language (e.g., Python, Rust, Go, or Java).</li> <li class="whit


