Production Reliability Engineer
Full Time
Mid Level
5+ years
Posted 2 weeks ago
About This Role
The Production Reliability Engineer will join the Central Operations and Reliability Engineering team at a top global trading firm, managing and supporting a real-time, high-performance global trading environment. The role is central to system reliability, performance, and operational risk management, and sits at the intersection of infrastructure, automation, and live production support.
Responsibilities
- Own and improve a large-scale production environment with a focus on reliability, performance, and operability
- Proactively monitor and troubleshoot distributed, latency-sensitive systems
- Build and maintain DevOps and automation tooling across configuration management, deployments, monitoring, data collection, and analysis
- Use system and operational metrics to improve scalability and stability
- Partner with engineers, operators, and stakeholders to investigate and resolve complex system issues
- Coordinate production changes and manage incidents in collaboration with risk and operational support teams
- Communicate directly with end users to manage incidents and drive technology improvements
- Support reconciliation workflows related to system output and downstream processes
- Evaluate and manage operational risk for production changes
- Define, document, and continuously refine operational procedures
- Mentor and support other reliability and operations engineers
- Participate in shared operational and on-call responsibilities
Requirements
- Degree in Computer Science, Engineering, or equivalent professional experience
- 5+ years in DevOps, SRE, Linux Systems Engineering, or Network Engineering roles
- 3+ years of experience with Python and shell scripting
- Strong Linux expertise, including system internals, performance tuning, and system/network configuration
- Solid understanding of networking fundamentals (routing, multicast, VLANs, Ethernet)
- Ability to support periodic on-call duties
Nice to Have
- Familiarity with C++
Skills
Python *
C++ *
DevOps *
Networking *
Linux *
Shell Scripting *
SRE *
* Required skills