Senior Software Engineer - SRE, Backend (Reliability Engineering)

Affirm • Remote Canada

Posted 22h ago • Remote • Senior • Site Reliability Engineer • Canada

Skills & Technologies

Java, Python, AWS, Docker, Kubernetes, Linux, Git, Prometheus, Grafana, REST API

Overview

Affirm is seeking a Senior Site Reliability Engineer to enhance application reliability and performance. You'll work with technologies like Java, Python, and AWS to implement best practices in reliability engineering. This role requires strong experience in SRE practices and distributed systems.

Job Description

Who you are

You have 5+ years of experience in software engineering with a focus on Site Reliability Engineering, where you've successfully contributed to the reliability and performance of large-scale applications. Your background includes a deep understanding of distributed systems and the challenges they present, allowing you to effectively guide teams in operational excellence.

Your expertise in programming languages such as Java and Python enables you to build robust tooling and automation solutions that enhance operational efficiency. You are well-versed in cloud platforms like AWS, and you understand how to leverage their services to improve application performance and reliability.

You have hands-on experience with containerization and orchestration technologies, particularly Docker and Kubernetes, which you use to streamline deployment processes and manage application lifecycles. Your familiarity with Linux systems allows you to troubleshoot and optimize performance in production environments effectively.

You are proficient with version control systems like Git, which you use to manage code changes and collaborate with your team. Your experience with monitoring and observability tools such as Prometheus and Grafana helps you provide visibility into application performance and address issues before they impact users.

You understand the importance of defining Service Level Objectives (SLOs) and are skilled in driving incident management processes to ensure swift resolution of issues. Your ability to engage in architectural discussions allows you to recommend improvements that enhance system reliability and resilience.

You are a strong communicator who enjoys collaborating with cross-functional teams, sharing your knowledge, and mentoring junior engineers. You thrive in environments where you can lead initiatives and drive change, ensuring that best practices are adopted across the organization.

Desirable

Experience with configuration management tools and practices is a plus, as is familiarity with CI/CD pipelines. You may also have exposure to chaos engineering principles, which you can apply to test system resilience under unexpected conditions.

What you'll do

In this role, you will own and deliver quarterly goals for your team, leading engineers through ambiguity to solve complex, open-ended problems. You will ensure that your team is supported throughout the delivery process, fostering a culture of collaboration and continuous improvement.

You will guide the development of SLOs and ensure that they align with business objectives, providing teams with the data and visibility they need to operate effectively. Your leadership will drive the incident management and analysis process, ensuring that lessons learned are documented and shared across the organization.

You will steer the implementation of change management and deployment practices, working closely with engineering teams to ensure that changes are made safely and efficiently. Your engagement in service and architectural conversations will help shape the direction of the systems you support, ensuring they are built for reliability and scalability.

You will recommend observability and alerting configurations that empower teams to monitor their applications effectively, enabling them to respond quickly to performance issues. Your contributions will help establish a culture of ownership and accountability, where engineers take pride in the reliability of the services they operate.

What we offer

At Affirm, you will be part of a mission-driven team that is reinventing credit to make it more honest and friendly. We offer a flexible remote work environment that allows you to balance your professional and personal life. You will have the opportunity to work with cutting-edge technologies and contribute to meaningful projects that impact our customers' experiences.

We believe in fostering a culture of learning and growth, providing you with opportunities to develop your skills and advance your career. Our team is committed to supporting each other and sharing knowledge, ensuring that everyone has the resources they need to succeed.

Join us at Affirm, where you can make a difference in the world of finance while working in a supportive and innovative environment.

