Director - Site Reliability Engineering

UKG | Lowell, MA | $179,800 - $258,500
Full Time | Director Level | 10+ years

Posted 1 week ago

About This Role

Lead and shape reliability at enterprise scale for UKG's global platforms, with a focus on the reliability, resilience, and operational excellence of hundreds of applications across a hybrid infrastructure. The role leads an established SRE organization and drives consistent reliability practices across diverse technologies.

Responsibilities

  • Manage reliability outcomes across a large and varied application portfolio, including availability, performance, scalability, and recoverability
  • Ensure applications meet defined reliability expectations on both on-premises and cloud platforms
  • Lead and participate in major incident response, acting as a senior escalation point and ensuring effective executive communication
  • Drive post-incident learning and systemic improvements to reduce repeat issues
  • Lead teams responsible for understanding application behavior in production, including runtime performance, resource utilization, and failure modes
  • Partner with Infrastructure, Cloud, Security, and Product Engineering teams to address cross-layer reliability concerns
  • Establish standards for operational readiness, release safety, capacity planning, and disaster recovery across platforms
  • Apply Site Reliability Engineering principles pragmatically across both legacy and cloud-native systems, including SLOs, error budgets, and automation
  • Lead and develop SRE managers and engineers across a global organization
  • Translate business priorities into reliability-focused technical initiatives and partner with senior leadership to balance delivery velocity, reliability, and operational risk

Requirements

  • 10+ years of experience in software engineering, systems engineering, SRE, or related disciplines
  • Proven experience leading established, globally distributed engineering organizations
  • Strong understanding of production systems and application behavior at scale
  • Experience operating and leading teams across hybrid environments (on-prem and public cloud)
  • Demonstrated ability to influence outcomes in a matrixed enterprise environment
  • Experience owning incident response, operational reviews, and executive-level communication
  • Excellent communication skills, with the ability to clearly articulate technical and operational concepts to varied audiences

Nice to Have

  • Experience supporting large-scale application portfolios across both Windows/.NET and cloud-native environments
  • Familiarity with Google Cloud Platform and enterprise-scale cloud operations
  • Strong understanding of observability practices across application, platform, and infrastructure layers
  • Prior experience partnering closely with Product, Infrastructure, and Cloud leadership

Skills

Agile * .NET * CI/CD * Windows * Google Cloud Platform * Observability * SLO *

* Required skills

Benefits

Restricted stock unit awards

About UKG

UKG is the Workforce Operating Platform that puts workforce understanding to work, leveraging insights and people-first AI to build trust, amplify productivity, and empower talent.

Technology