Senior Software Dev Engineer, EC2 Nitro
Full Time
Senior Level
5+ years
Posted 1 week ago
About This Role
Join the EC2 Nitro Machine Learning Systems team to build and optimize infrastructure powering computationally intensive AI/ML workloads. This role involves establishing EC2 as the definitive source for best-known-configurations across diverse ML applications and influencing future accelerated platform designs.
Responsibilities
- Design and implement scalable performance measurement infrastructure for ML benchmarking across AWS, incorporating critical metrics
- Lead technical projects establishing EC2 as the definitive source for ML performance best practices across diverse applications
- Develop and maintain comprehensive regression testing systems that validate performance across major component releases
- Collaborate with hardware engineering teams to influence future accelerator platform designs based on performance insights
- Build customer relationships by investigating complex performance challenges, developing solutions, and publishing actionable best practices
Requirements
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one software programming language
- 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor or tech lead, or leading an engineering team
- Knowledge of Machine Learning and LLM fundamentals, including transformer architecture, training/inference lifecycles, and optimization techniques
Qualifications
- Bachelor's degree in computer science or equivalent
- 5+ years of non-internship professional software development experience, including programming and leading design or architecture of new and existing systems.
Nice to Have
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Knowledge of ML frameworks including JAX, PyTorch, vLLM, SGLang, Dynamo, TorchXLA, and TensorRT
- Knowledge of machine learning model architecture and inference
Skills
Machine Learning *
PyTorch *
LLM *
vLLM *
TensorRT *
Dynamo *
JAX *
SGLang *
TorchXLA *
* Required skills
Benefits
Health Insurance
Paid Time Off
Flexible spending accounts
Basic Life & AD&D Insurance
Adoption and Surrogacy Reimbursement coverage
Dental Insurance
Parental Leave
Prescription coverage
401K Matching
Medical advice line
Vision Insurance
Mental Health Support
Employee Assistance Program (EAP)
Supplemental life plans
About Amazon Web Services (AWS)
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure, powering millions of businesses and services worldwide.
Technology