Site Reliability Engineer, Google Distributed Cloud, Connected SRE

Google • Sunnyvale, CA, USA

Posted 3w ago • On-Site • Senior • Site Reliability Engineer • Sunnyvale

Skills & Technologies

Python, Linux, Kubernetes

Overview

Google is seeking a Senior Site Reliability Engineer to build and maintain scalable, reliable systems for Google Cloud. You'll work with Python, Linux, and Kubernetes to ensure high reliability and performance. This role requires 8 years of experience in software development and 4 years of experience in site reliability engineering.

Job Description

Who you are

You have a Bachelor's degree in Computer Science or a related field, or equivalent practical experience. With 8 years of experience in software development across various programming languages, you have a strong foundation in coding and system design. Your 4 years of experience in site reliability engineering have equipped you with the skills to build and maintain scalable, reliable systems. You are proficient in automation and coding in Python, and you have a solid understanding of distributed systems, having spent 3 years designing, analyzing, and troubleshooting them. Your experience in Linux system administration spans at least 5 years, and you have worked with Kubernetes for 3 years. You thrive in a culture of intellectual curiosity and problem-solving, and you are eager to foster reliability across engineering teams.

Desirable

A Master's degree in Computer Science or Engineering would be a plus. Experience in leading projects and collaborating with development teams on design and operations is highly valued. You are familiar with SLOs, monitoring, rollout safety, and production safety systems, which are crucial for maintaining the reliability of large-scale systems.

What you'll do

As a Senior Site Reliability Engineer at Google, you will play a critical role in ensuring the reliability and uptime of Google Cloud's services. You will collaborate with development teams to design and launch new features while sharing on-call responsibilities to meet customer expectations and SLAs. Your expertise in coding, algorithms, and large-scale system design will be essential as you optimize existing systems and build infrastructure to eliminate manual work through automation. You will monitor system capacity and performance, ensuring that our services meet the high standards expected by our customers. Your contributions will directly impact the reliability of our platform, and you will have the opportunity to manage the unique scale of Google Cloud.

What we offer

At Google, you will be part of a culture that values intellectual curiosity and openness. You will work alongside talented engineers who are passionate about building and maintaining reliable systems. We offer competitive compensation and benefits, along with opportunities for professional growth and development. You will have access to the tools and resources necessary to operate at scale, and you will be encouraged to innovate and improve our systems continuously. Join us in our mission to provide reliable cloud services that empower businesses and developers around the world.
