
The leading developer data platform for modern applications
MongoDB is a leading developer data platform company headquartered in the Theater District of New York, NY. The company specializes in a document-oriented database system that stores data as JSON-like documents, making it a popular choice for modernizing legacy applications. With over 1,000 employees and $31...
MongoDB offers a comprehensive benefits package including equity and an Employee Stock Purchase Program, 20 weeks of fully paid gender-neutral parenta...
MongoDB fosters a culture centered around open-source development and innovation. The company is committed to helping businesses modernize their appli...

MongoDB • Ireland
MongoDB is seeking a Senior Site Reliability Engineer for its Observability team to build and maintain the observability stack. You'll work with technologies like Splunk, Prometheus, and Docker to ensure service reliability. This role requires strong collaboration skills and experience with observability infrastructure.
You have a strong background in Site Reliability Engineering with a focus on observability — you've designed and implemented observability stacks that include metrics, logging, and tracing to ensure service reliability across various platforms. Your experience includes working with tools like Splunk and Prometheus, and you understand the importance of monitoring and alerting in maintaining service health.
You thrive in collaborative environments — you enjoy working closely with software engineering and other SRE teams to promote best practices in service instrumentation and monitoring. Your ability to communicate effectively with cross-functional teams ensures that observability standards are met and maintained across the organization.
You are proactive in identifying and troubleshooting issues — your analytical skills allow you to define key metrics that detect incidents and quantify service performance. You have experience in building reliable, fault-tolerant systems that are self-healing, and you understand the complexities of operating in a multi-cloud environment.
You are comfortable participating in on-call rotations — your experience has prepared you to handle incidents effectively and to contribute to the continuous improvement of incident response processes. You are dedicated to building a culture of reliability and resilience within your team and the broader organization.
Experience with telemetry pipelines and monitoring infrastructure is a plus — you have a keen interest in exploring new technologies and methodologies that enhance observability practices. Familiarity with cloud providers and their observability tools will help you adapt quickly to MongoDB's infrastructure.
As a Senior Site Reliability Engineer on the Observability team, you will define the standards and vision for the observability platform used by all engineering teams — your role will involve designing, architecting, and delivering core pieces of observability services in collaboration with various stakeholders. You will be responsible for ensuring that the observability stack is robust and meets the needs of the organization.
You will build and maintain the observability stack, which includes metrics, logging, and tracing. Your expertise will help in troubleshooting and implementing monitoring solutions that span multiple cloud providers. You will also identify and configure key metrics that help detect incidents and quantify service health, availability, and performance.
Collaboration is key in this role — you will partner with other SRE and software engineering teams to promote best practices in instrumenting and monitoring services. Your contributions will directly impact the reliability and performance of MongoDB's services, making them more resilient and self-healing.
You will participate in a week-long on-call rotation, ensuring that you are hands-on with incident management and response — your experience will guide you in improving the incident response process and enhancing the overall reliability of the services.
MongoDB offers a hybrid working model that lets you split your time between the office and remote work, in a collaborative environment that values innovation and reliability. The company is committed to providing necessary accommodations for individuals with disabilities during the application and interview process.
You will be part of a team that is dedicated to building impactful observability solutions — your work will contribute to the overall success of MongoDB's engineering efforts, ensuring that services are reliable and performant. The culture at MongoDB encourages continuous learning and growth, providing you with opportunities to expand your skills and knowledge in the field of Site Reliability Engineering.