Site Reliability Engineer III

JPMorganChase, Plano, TX

Full Time, Mid Level, 2+ years

Posted 3 months ago (expired)

This job has expired


About This Role

This role enhances intelligent, resilient platform operations for a global financial institution by combining traditional support with modern Site Reliability Engineering (SRE) principles and agentic AI. The Site Reliability Engineer will drive innovation within the Corporate Sector, Enterprise Technology team.

Responsibilities

Advocate and embody site reliability principles, fostering a culture of excellence and technical influence within your team
Leverage AI tools to enhance operational effectiveness and automate processes, ensuring high-quality customer service
Spearhead projects aimed at enhancing the reliability and stability of applications and platforms
Utilize data-driven analytics and AI technologies to automate detection, diagnosis, and resolution processes, elevate service levels, and drive continuous improvement
Engage stakeholders to establish realistic service level objectives and error budgets, ensuring alignment with customer expectations
Exhibit technical proficiency in one or more domains, proactively addressing technology-related bottlenecks
Employ AI-driven solutions to streamline processes and enhance operational efficiency
Participate in troubleshooting during incidents, demonstrating the ability to swiftly identify and resolve issues to prevent financial losses
Act as a culture carrier by documenting learnings and disseminating knowledge through internal forums and communities of practice
Mentor team members, guiding them in the strategic adoption of AI technologies to enhance operational effectiveness and customer service

Requirements

Formal training or certification on site reliability engineering concepts
2+ years applied experience in resiliency, scalability, performance and security
Proven success in an SRE or DevOps role, with knowledge of SLIs/SLOs, incident management, blameless postmortem analysis, and systems reliability
Expertise with observability stacks (e.g., Prometheus, Grafana, Splunk, OpenTelemetry)
Deep experience correlating telemetry across services and time
Hands-on skills in coding (at least one high-level programming language)
Hands-on skills in cloud platforms (AWS or GCP)
Hands-on skills in container orchestration (Kubernetes)
Hands-on skills in infrastructure as code (Terraform)
Hands-on skills in resilient CI/CD pipelines
Active experience or deep curiosity in applying AI to operations, such as LLM-based copilots, anomaly detection, automated runbooks, autonomous agents (e.g., CrewAI, LangGraph), or Retrieval-Augmented Generation (RAG) workflows for support
A track record of delivering under pressure
Ability to deconstruct complexity, organize effectively, and drive clarity into ambiguous operational environments
Outstanding communication, empathy, and professionalism

Qualifications

2+ years applied experience in resiliency, scalability, performance and security or proven success in an SRE or DevOps role

Nice to Have

Experience with operational and compliance rigor in banking, fintech, or similar
Practical use of LLM frameworks (e.g., LangChain, Semantic Kernel), AI orchestration tools, vector databases, or custom agents supporting reliability workflows
Experience with game days, chaos experiments, or failure-mode analysis to improve service robustness
A background in mentoring engineers or leading technical knowledge-sharing, especially around AI and SRE best practices

Skills

AWS*, Splunk*, Kubernetes*, AI*, CI/CD*, Terraform*, LangChain*, Grafana*, GCP*, LangGraph*, LLM*, Prometheus*, OpenTelemetry*, Semantic Kernel*, CrewAI*

* Required skills

Benefits

Tuition Reimbursement

Comprehensive health care coverage

Mental Health Support

On-site health and wellness centers

Financial coaching

Retirement savings plan

Backup childcare

About JPMorganChase

Chase is a leading financial services firm, helping nearly half of America’s households and small businesses achieve their financial goals through a broad range of financial products. Our mission is to create engaged, lifelong relationships and put our customers at the heart of everything we do. We...

Finance
