Crusoe

About Crusoe

Sustainable AI cloud solutions for a greener future

🏢 Tech • 👥 501-1000 • 📅 Founded 2018 • 📍 Denver, Colorado, United States

Key Highlights

  • Headquartered in Denver, Colorado
  • 501-1000 employees focused on AI and renewable energy
  • First vertically integrated AI cloud platform
  • Committed to sustainable computing practices

Crusoe is a pioneering AI cloud platform headquartered in Denver, Colorado, that utilizes clean, renewable energy to power its operations. The company focuses on providing scalable computing resources for AI and machine learning applications, serving a diverse range of clients across various industr...

🎁 Benefits

Crusoe offers competitive salaries, equity options, generous PTO, and a flexible remote work policy to support work-life balance....

🌟 Culture

Crusoe fosters a culture centered on sustainability and innovation, encouraging employees to contribute to environmentally friendly computing solution...

Site Reliability Engineer

Crusoe • Dublin - IE

Posted 15h ago • 🏛️ On-Site • Mid-Level • Site reliability engineer • 📍 Dublin

Skills & Technologies

Linux, Networking, Automation

Overview

Crusoe is hiring a Site Reliability Engineer to ensure the reliability and performance of its cloud infrastructure. You'll work with Linux, networking, and automation to maintain high service levels. This role requires experience in SRE practices and distributed systems.

Job Description

Who you are

You have a strong background in Site Reliability Engineering (SRE) practices, with a focus on maintaining high service levels through effective monitoring and automation. Your experience with distributed systems allows you to understand the complexities involved in ensuring reliability and performance. You are proficient in Linux and have a solid understanding of networking principles, which are crucial for troubleshooting and optimizing infrastructure. Your passion for automation drives you to seek out opportunities to improve processes and reduce manual intervention, ensuring that systems run smoothly and efficiently.

You thrive in a collaborative environment, working closely with engineering teams to advise on building resilient code. Your problem-solving skills enable you to anticipate potential issues and implement proactive measures to prevent them from impacting customers. You are committed to continuous improvement and conduct thorough post-mortems to learn from incidents, sharing insights with your team to enhance overall performance. You understand the importance of a customer-centric approach and strive to ensure that clients have reliable access to the virtual machines they depend on.

Desirable

Experience with cloud infrastructure and familiarity with various cloud service providers would be a plus. Knowledge of monitoring tools and practices, as well as experience with incident management, will further enhance your ability to contribute to the team's success. A background in software development can also be beneficial, as it allows for better collaboration with engineering teams.

What you'll do

In this role, you will be responsible for ensuring the reliability and performance of Crusoe's AI platform. You will work on automation and tool development to streamline routine processes, allowing for more efficient operations. Your expertise in SRE practices will guide you in detecting, analyzing, and preventing issues that could affect service levels. You will collaborate with various engineering teams to advise them on best practices for building resilient code, ensuring that systems are designed with reliability in mind.

You will also conduct thorough post-mortems following incidents, identifying root causes and implementing solutions to prevent recurrence. Your proactive approach will help anticipate issues before they impact customers, maintaining the high standards of service that Crusoe is known for. You will play a key role in driving continuous improvement initiatives, working to enhance the overall performance of the infrastructure.

What we offer

At Crusoe, you will be part of a mission-driven team that is dedicated to accelerating the abundance of energy and intelligence through sustainable technology. We offer a collaborative work environment where innovation is encouraged, and your contributions will have a tangible impact on the future of AI and cloud infrastructure. You will have opportunities for professional growth and development, as well as the chance to work on cutting-edge projects that are shaping the industry. Join us in our commitment to responsible and transformative technology solutions.
