About This Role
Waymo is seeking a Staff-level Software Engineer with expertise in Machine Learning to optimize core ML pipelines and enhance the efficiency and reliability of autonomous driving systems. This role involves managing multiple ML systems, developing high-performance integration code, and influencing infrastructure roadmaps to accelerate development cycles for self-driving technology.
Responsibilities
- Partner with Core Infrastructure teams to develop and influence the strategic infrastructure roadmap, translating ML product requirements into actionable infrastructure requests.
- Serve as the technical escalation point for production issues, troubleshooting failures across the entire stack—from Python-level errors to cluster scheduling and network latency problems.
- Profile and optimize end-to-end ML pipelines, identifying bottlenecks in data loading and implementing C++ optimizations to reduce Python overhead.
- Build robust CLI tools and middleware to streamline the ML development "inner loop," automating repetitive tasks and improving change management workflows.
- Develop shims and wrappers to integrate new infrastructure features into the ML stack, enabling faster iteration and deployment.
- Instrument system modules for active monitoring and implement recovery mechanisms to maintain system stability and uptime.
- Collaborate with research and infrastructure teams to ensure seamless integration of hardware accelerators and storage solutions.
Requirements
- Expertise in C++ with a focus on system performance, concurrency, and memory management.
- Proficiency in Python, especially for ML modeling and scripting.
- Deep understanding of distributed systems, including resource management, job scheduling, RPC systems, and distributed storage.
- Strong debugging skills with experience in diagnosing complex system failures and performance issues.
- Experience in designing and implementing scalable, high-performance infrastructure solutions.
- Proven ability to lead technical discussions and influence cross-team collaboration.
- Demonstrated experience mentoring engineers and establishing engineering standards.