
The single application for the entire DevOps lifecycle
GitLab is a comprehensive DevOps platform headquartered in San Francisco, California, serving over 30,000 organizations including NASA, IBM, and Goldman Sachs. The platform integrates project planning, source code management, CI/CD, and monitoring into a single application, streamlining the software...
GitLab offers competitive salaries, equity options, unlimited PTO, and a flexible remote work policy, allowing employees to work from anywhere. They a...
GitLab is known for its remote-first culture, with a strong emphasis on transparency and collaboration across global teams. The company values results...

GitLab • Remote, Americas; Remote, EMEA
GitLab is seeking an Intermediate Site Reliability Engineer to ensure the smooth operation of GitLab.com and other production systems. You'll focus on systems layer operations and leverage Kubernetes to enhance service reliability. This role requires a strong background in software engineering practices.
You have a solid background in site reliability engineering, with experience in maintaining production systems and ensuring their reliability for millions of users. You understand the importance of combining operations with software engineering practices to create efficient and effective solutions. Your expertise includes working with operating systems, storage, networking, and edge services, particularly in a Kubernetes environment. You are comfortable with the challenges of managing complex systems and are proactive in identifying and resolving issues before they impact users. You embrace AI as a productivity multiplier and are eager to incorporate it into your daily workflows to drive efficiency and innovation.
As an Intermediate Site Reliability Engineer at GitLab, you will be responsible for keeping GitLab.com and other production systems running smoothly. You will work closely with engineering teams to implement best practices in reliability and performance. Your role will involve monitoring system performance, troubleshooting issues, and optimizing infrastructure to ensure high availability. You will also contribute to the development of automation tools and processes that enhance operational efficiency. Collaboration with cross-functional teams will be key as you help to define and implement reliability standards across the organization. You will participate in incident response and post-mortem analysis to continuously improve system reliability and performance. Your contributions will directly impact the user experience and the overall success of GitLab's mission to transform software development.
At GitLab, you will be part of a high-performance culture that values collaboration and continuous knowledge exchange. We provide opportunities for professional growth and development, allowing you to reach your full potential while working alongside industry leaders. Our commitment to innovation means you will have the chance to work with cutting-edge technologies and contribute to meaningful projects that make a difference in the software development landscape. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.