
Empowering the world through technology and information
Google LLC, headquartered in Mountain View, California, is a global leader in internet-related services and products, including its flagship search engine, Google Search, and the Android operating system. With over 100,000 employees, Google also offers cloud computing services through Google Cloud P...
Google offers competitive salaries, equity options, generous PTO policies, comprehensive health benefits, and a remote work policy that allows flexibi...
Google is known for its engineering-first culture, emphasizing innovation and collaboration. The company fosters a unique environment that encourages ...

Google • Kirkland, WA, USA; Sunnyvale, CA, USA
Google is seeking a Staff Software Engineer to develop emerging on-prem AI infrastructure. You'll leverage your expertise in C++, distributed systems, and cloud technologies. This role requires 8+ years of experience in software engineering.
You have a Bachelor's degree or equivalent practical experience, along with 8 years of experience programming in C++. Your background includes 5 years of experience testing and launching software products, as well as building and developing large-scale infrastructure or distributed systems. You possess a strong understanding of software design and architecture, with at least 3 years of experience in this area. You are familiar with cloud or systems-level infrastructure that spans the entire hardware and software stack, and you have experience in diagnostics, troubleshooting, and supportability. Your ability to navigate ambiguity and deliver solutions for complex technical problems sets you apart.
You have experience leading SWAT team efforts on complex issues and developing sustainable long-term solutions. Familiarity with Service Level Objectives (SLOs) and metrics measurement is a plus, as is an understanding of low-level system software, operating systems, firmware, and networking. You are passionate about building systems skills.
In this role, you will be responsible for diagnostics and troubleshooting of end-to-end supportability issues, uncovering and addressing complex technical problems. You will build repair automation systems and implement success metrics for the team, focusing on operational plane metrics and RMA/spares metrics. Your contributions will help Google develop next-generation technologies that change how billions of users connect and interact with information. You will work closely with cross-functional teams to ensure the infrastructure meets the needs of large-scale applications.
At Google, you will be part of a team that values innovation and collaboration. You will have the opportunity to work on cutting-edge technologies that impact millions of users worldwide. We encourage you to apply even if your experience doesn't match every requirement, as we believe diverse teams build better products. Join us in shaping the future of AI infrastructure and making a difference in the tech landscape.