Groupon

About Groupon

Find amazing deals on experiences near you

🏢 Tech | 👥 1K-5K | 📅 Founded 2008 | 📍 Chicago, Illinois, United States

Key Highlights

  • Headquartered in Chicago, Illinois
  • Over 300,000 deals available across various categories
  • Approximately 3,000 employees
  • Publicly traded since 2011 (NASDAQ: GRPN)

Groupon, headquartered in Chicago, Illinois, connects consumers with local businesses through its platform, offering over 300,000 deals across various categories including travel, dining, and entertainment. Founded in 2008, Groupon has served millions of customers and employs approximately 3,000 peo...

🎁 Benefits

Groupon offers competitive salaries, stock options, flexible PTO, and a remote work policy to support work-life balance. Employees also benefit from w...

🌟 Culture

Groupon fosters a culture of innovation and customer-centricity, encouraging employees to explore new ideas and solutions to enhance user experiences....


Principal Site Reliability Engineer (AI-first SRE)

Groupon • Remote - Peru

Posted 4d ago | 🏠 Remote | Principal | Site reliability engineer | 📍 Peru

Skills & Technologies

AI, Machine learning, Infrastructure as code, Monitoring and alerting

Overview

Groupon is seeking a Principal Site Reliability Engineer to lead the evolution of its platform toward AI-driven resilience. You'll design self-healing systems that deliver high availability and reliability. This role requires expertise in AI and machine learning.

Job Description

Who you are

You have extensive experience in site reliability engineering, with a strong focus on building and maintaining resilient systems. Your background includes designing intelligent, self-healing systems that achieve high availability targets, and you understand the importance of proactive maintenance and automation in infrastructure management. You are well-versed in using AI and machine learning to enhance system reliability and governance so that incidents are prevented before they occur. Your technical skills are complemented by your ability to collaborate effectively with cross-functional teams, driving innovation and improvement in system performance.

Desirable

Experience with predictive analytics and automation tools is a plus, as is familiarity with cloud infrastructure and services. You have a passion for continuous learning and for staying current with the latest trends in site reliability and AI technologies. You thrive in environments that encourage risk-taking and innovation, and you are eager to contribute to a culture that celebrates success and autonomy.

What you'll do

In this role, you will lead the modernization of Groupon's global platform, focusing on reliability as a core component of the transformation. You will architect and maintain self-healing systems that meet or exceed 99.9% availability targets, leveraging AI and machine learning to automate infrastructure governance and incident detection. Your responsibilities will include designing and implementing monitoring and alerting systems that provide real-time insights into system performance, enabling rapid response to potential issues. You will collaborate with engineering teams to integrate reliability practices into the development lifecycle, ensuring that reliability is prioritized from the outset of new projects.

You will also be responsible for conducting post-incident reviews to identify root causes and implement preventive measures, fostering a culture of continuous improvement within the team. Your leadership will guide the evolution of the site reliability engineering practice at Groupon, influencing technical direction and mentoring junior engineers. You will have the opportunity to make a significant impact on the reliability and performance of systems that serve millions of customers daily.

What we offer

Groupon offers a dynamic work environment where you can make a meaningful impact on the business. You will have the autonomy to drive initiatives and the support of a collaborative team. We provide competitive compensation and benefits, along with opportunities for professional growth and development. Join us in our mission to help local businesses thrive and transform the way customers discover experiences and services.
