
Empowering the world through technology and information
Google LLC, headquartered in Mountain View, California, is a global leader in internet-related services and products, including its flagship search engine, Google Search, and the Android operating system. With over 100,000 employees, Google also offers cloud computing services through Google Cloud P...
Google offers competitive salaries, equity options, generous PTO policies, comprehensive health benefits, and a remote work policy that allows flexibi...
Google is known for its engineering-first culture, emphasizing innovation and collaboration. The company fosters a unique environment that encourages ...

Google • Munich, Germany
Google is seeking a Senior Site Reliability Engineer to ensure the reliability and performance of Google Cloud's services. You'll leverage your expertise in Java, Python, and distributed systems to optimize large-scale systems. This role requires 5+ years of experience in software development and systems engineering.
You have a Bachelor's degree in Computer Science or a related field, along with at least 5 years of experience in software development using one or more programming languages. Your background includes at least 3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems, and you have spent 2 years leading projects and providing technical leadership. You are passionate about Site Reliability Engineering (SRE), which combines software and systems engineering to build and run large-scale, fault-tolerant systems. You thrive in a culture of intellectual curiosity and problem-solving, and you are eager to collaborate with diverse teams to tackle complex challenges.
Your technical expertise includes proficiency in Java and Python, as well as a solid understanding of Linux environments. You have hands-on experience with containerization technologies like Docker and orchestration tools such as Kubernetes. Your knowledge of Google Cloud Platform (GCP) and distributed systems design allows you to optimize existing systems and build infrastructure that scales effectively. You are also skilled in automation, which helps eliminate repetitive tasks and improve system reliability.
You understand the importance of measuring and monitoring system health, availability, and latency. You are committed to practicing sustainable incident response and conducting blameless postmortems to learn from failures and improve processes. You are a proactive problem-solver who enjoys managing the unique challenges of scale that come with working at Google Cloud.
A Master's degree in Computer Science or Engineering is preferred, as is experience with capacity planning and launch reviews. Familiarity with software platforms and frameworks that enhance system performance is a plus. You are open to learning new technologies and methodologies that can further improve the reliability and efficiency of the systems you manage.
As a Senior Site Reliability Engineer at Google, you will play a crucial role in ensuring that our cloud services maintain high reliability and performance standards. You will work closely with engineering teams to design and implement solutions that enhance system capacity and performance. Your responsibilities will include maintaining services once they are live by measuring and monitoring their availability, latency, and overall health. You will also be involved in scaling systems sustainably through automation and advocating for changes that improve reliability and velocity.
You will lead projects that focus on optimizing existing systems and building new infrastructure to support Google's growing cloud services. Your technical leadership will guide teams in implementing best practices for incident management and response, ensuring that we learn from each incident to prevent future occurrences. You will collaborate with cross-functional teams to drive improvements in system architecture and performance, leveraging your expertise in coding, algorithms, and large-scale system design.
In this role, you will have the opportunity to manage the unique challenges of scale that are inherent to Google Cloud. You will be part of a culture that encourages collaboration, innovation, and risk-taking in a blame-free environment. Your contributions will directly impact the reliability and efficiency of our services, helping to deliver exceptional experiences to our customers.
At Google, we offer a dynamic work environment that fosters growth and development. You will have access to cutting-edge technologies and the opportunity to work on meaningful projects that have a real impact on users worldwide. We provide competitive compensation and benefits, including opportunities for professional development and career advancement. Our culture promotes self-direction and encourages you to take ownership of your work while collaborating with talented individuals from diverse backgrounds. Join us in shaping the future of cloud computing and making a difference in the world.