Site Reliability Engineer
Contract
Mid Level
4+ years
Posted 2 months ago (expired)
About This Role
Support and maintain production-grade cloud infrastructure and Kubernetes-based platforms for a client, ensuring high availability, performance, and reliability.
Responsibilities
- Support production-grade cloud infrastructure on a major cloud provider (AWS, GCP, or Azure)
- Operate and maintain Kubernetes-based platforms in production environments
- Implement or support monitoring, alerting, and observability solutions (metrics, logs, traces)
- Troubleshoot distributed systems, including performance, availability, and reliability issues
- Participate in on-call rotations, incident response, and root cause analysis
Requirements
- 4+ years of relevant technology experience
- Hands-on experience supporting production-grade cloud infrastructure in AWS, GCP, or Azure
- Practical experience operating and maintaining Kubernetes-based platforms in production environments
- Experience with Infrastructure as Code (IaC) tools such as Terraform, Helm, or CloudFormation
- Working knowledge of CI/CD and GitOps practices, including automated testing and deployment pipelines
- Experience implementing or supporting monitoring, alerting, and observability solutions
- Strong troubleshooting skills across distributed systems
- Proficiency in at least one scripting or programming language (e.g., Python, Go, Bash)
- Experience participating in on-call rotations, incident response, and root cause analysis
- Authorised to work in the US (USC/GC/GC-EAD/H4-EAD/L2S Only)
- Must be local to Irvine, California, with a local driver's license (DL) and a local project in CA
Qualifications
- BS degree in Computer Science or a related field, or an equivalent combination of education and experience
- 4+ years of relevant technology experience or equivalent
Nice to Have
- Experience operating multi-cloud environments (AWS, GCP, Azure)
- Experience with event streaming platforms such as Apache Kafka, Kafka Connect, or Amazon MSK
- Familiarity with service mesh technologies (e.g., Istio)
- Exposure to stream processing frameworks (e.g., Apache Flink) and CDC tools such as Debezium
- Experience supporting MLOps or AI infrastructure
- Familiarity with observability standards such as OpenTelemetry and Golden Signals
- Experience working in regulated environments and supporting compliance frameworks (HIPAA, SOC 2, ISO 27001)
- Experience implementing security best practices for cloud-native platforms (IAM, secrets management, RBAC)
- Prior experience in platform engineering or internal developer platforms
- Exposure to cost optimization and FinOps practices in cloud environments
Skills
- Python *
- AWS *
- Azure *
- Kubernetes *
- CloudFormation *
- CI/CD *
- Terraform *
- Go *
- Apache Kafka *
- GCP *
- OpenTelemetry *
- Bash *
- Helm *
- MLOps *
- GitOps *
- Istio *
- Apache Flink *
- Kafka Connect *
- Amazon MSK *
- Debezium *

* Required skills