Senior ML Infrastructure Engineer (Compute)

General Motors, Mountain View, CA
Full Time, Senior Level, 4+ years

Posted 1 month ago Expired

About This Role

This role involves building and scaling robust Compute platforms for Simulation workflows, with a focus on keeping cutting-edge GPUs highly utilized and the platform reliable. The engineer will play a key role in shaping the architecture, roadmap, and user experience of a service that supports AI Validation and Simulation needs.

Responsibilities

  • Design and implement core platform backend software components
  • Collaborate with Simulation engineers, ML engineers, and researchers to understand critical workflows, translate them into platform requirements, and deliver incremental value
  • Lead technical decision-making on Compute architecture, cloud capacity provisioning, caching, and auto-scaling mechanisms
  • Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization
  • Proactively research and integrate frameworks, hardware accelerators, and distributed computing techniques
  • Lead large-scale technical initiatives across GM’s ML infrastructure
  • Raise the engineering bar through technical leadership and by establishing best practices

Requirements

  • 4+ years of industry experience, with a focus on high performance backend services
  • Strong expertise in Go or a similar programming language
  • Experience working with cloud platforms such as GCP, Azure, or AWS
  • Strong communication skills and a proven ability to drive and deliver cross-functional initiatives
  • Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities

Nice to Have

  • Hands-on experience with cloud VM services such as Google Compute Engine
  • Experience with hardware-in-the-loop validation systems
  • Experience with high performance computing (HPC)
  • Experience working with or designing interfaces and clients for developer workflows
  • Familiarity with telemetry and other feedback loops that inform product improvements
  • Familiarity with hardware acceleration (GPUs) and optimizations

Skills

AWS * Azure * Go * Distributed Systems * GCP * Google Compute Engine * GPUs *

* Required skills

About General Motors

The AI Validation Platform team owns the cloud-agnostic, reliable, and cost-efficient platform that powers GM’s AV efforts, supporting the simulated validation of state-of-the-art (SOTA) machine learning models.

Automotive