Senior Engineer, SRE
Remote
Posted 1 month ago (Expired)
About This Role
As a Senior Site Reliability Engineer, you will ensure hyper-stable online experiences for Sephora customers by monitoring, optimizing, and safeguarding the reliability of the Dotcom platform and OMNI services. You will lead incident response, drive automation, enhance observability, and validate release readiness to maintain high availability and performance.
Responsibilities
- Ensure Platform Stability for Dotcom and OMNI services, including BOPIS and Same-Day Delivery, maintaining high availability and resilience.
- Lead Incident Response by triaging, diagnosing, and resolving L2/L3 production incidents, and partnering on corrective actions.
- Drive Intelligent Automation by building solutions, reducing operational toil, and creating AI-driven reliability tools and workflows.
- Enhance Observability by developing and optimizing logs, metrics, traces, dashboards, and anomaly detection, and by refining alerting pipelines.
- Validate Release Readiness for seasonal events, feature launches, and traffic spikes through resiliency checks and performance validation.
- Maintain Reliability Standards by optimizing SLO/SLI frameworks, monitoring error budgets, and collaborating on continuous reliability improvements.
Requirements
- 6+ years of SRE, DevOps, or Production Engineering experience
- Strong understanding of reliability principles and operational excellence
- Exposure to Azure AKS, Kubernetes, Docker, Service Mesh, API-driven architectures
- Operational support experience for React front-end and Spring Boot microservices
- Hands-on experience with Dynatrace, Splunk, Grafana, Prometheus
- Strong scripting abilities (Python, Bash, PowerShell, YAML)
- Proven experience in incident management and root cause analysis
- Experience with SRE principles and CI/CD pipelines (Jenkins, GitHub Actions)
- Azure cloud platform experience
Qualifications
- 6+ years of hands-on SRE, DevOps, or Production Engineering experience in high-scale digital applications.
Nice to Have
- AWS/GCP/OCI experience