Software Engineer - AI/ML, AWS Neuron
Posted 2 weeks ago
About This Role
This role is for a Machine Learning Engineer on the Distributed Training team for AWS Neuron, responsible for the development, enablement, and performance tuning of a wide variety of ML model families. You will help lead efforts to build distributed training support into PyTorch and JAX, using the Neuron compiler and runtime stacks to tune models for the highest performance and efficiency on AWS Trainium instances.
Responsibilities
- Develop, enable, and performance-tune a wide variety of ML model families, including massive-scale Large Language Models (LLMs) such as GPT and Llama, as well as Stable Diffusion and Vision Transformers (ViT)
- Work with chip architects, compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trainium instances
- Lead efforts to build distributed training support into PyTorch and JAX using the Neuron compiler and runtime stacks
- Tune ML models for the highest performance and efficiency on customers' AWS Trainium instances
- Apply strong software development and ML knowledge to contribute to the team
Requirements
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture experience (design patterns, reliability, and scaling) with new and existing systems
- Experience programming in at least one programming language
- Experience with training large ML models using Python
Qualifications
- Bachelor's degree in computer science or equivalent
Nice to Have
- 3+ years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations experience
About Amazon Web Services (AWS)
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure, powering millions of businesses and services worldwide.