Senior Machine Learning Applications and Compiler Engineer

NVIDIA · Santa Clara, CA · $152,000 – $287,500
Full Time · Senior Level · 5+ years

Posted 2 months ago · Expired

About This Role

NVIDIA is seeking a Senior Machine Learning Applications and Compiler Engineer to develop algorithms and optimizations for inference and compiler stacks. This role involves working at the intersection of large-scale systems, compilers, and deep learning, defining how neural network workloads map onto future NVIDIA platforms.

Responsibilities

  • Design, develop, and maintain high-performance runtime and compiler components, focusing on end-to-end inference optimization.
  • Define and implement mappings of large-scale inference workloads onto NVIDIA’s systems.
  • Extend and integrate with NVIDIA’s software ecosystem, contributing to libraries, tooling, and interfaces.
  • Benchmark, profile, and monitor key performance and efficiency metrics to ensure the compiler generates efficient mappings.
  • Collaborate closely with hardware architects and design teams, feeding software observations back to influence future architectures.
  • Prototype and evaluate new compilation and runtime techniques, including graph transformations, scheduling strategies, and memory/layout optimizations.
  • Publish and present technical work on novel compilation approaches for inference and related spatial accelerators at top-tier ML, compiler, and computer architecture venues.

Requirements

  • MS or PhD in Computer Science, Electrical/Computer Engineering, or related field, or equivalent experience, with 5+ years of relevant experience.
  • Strong software engineering background with proficiency in systems-level programming (e.g., C/C++ and/or Rust) and solid CS fundamentals in data structures, algorithms, and concurrency.
  • Hands-on experience with compiler or runtime development, including IR design, optimization passes, or code generation.
  • Experience with LLVM and/or MLIR, including building custom passes, dialects, or integrations.
  • Familiarity with deep learning frameworks such as TensorFlow and PyTorch, and experience working with portable graph formats such as ONNX.
  • Solid understanding of parallel and heterogeneous compute architectures, such as GPUs, spatial accelerators, or other domain-specific processors.
  • Strong analytical and debugging skills, with experience using profiling, tracing, and benchmarking tools to drive performance improvements.
  • Excellent communication and collaboration skills, with the ability to work across hardware, systems, and software teams.

Nice to Have

  • Prior work on spatial or dataflow architectures, including static scheduling, pipeline parallelism, or tensor parallelism at scale.
  • Contributions to open-source ML frameworks, compilers, or runtime systems, particularly in areas related to performance or scalability.
  • Demonstrated research impact, such as publications or presentations at conferences like PLDI, CGO, ASPLOS, ISCA, MICRO, MLSys, NeurIPS, or similar.
  • Experience with large-scale distributed AI inference or training systems, including performance modeling and capacity planning for multi-rack deployments.
  • Direct experience with MLIR-based compilers or other multi-level IR stacks, especially in the context of graph-based deep learning workloads.

Skills

Communication*, TensorFlow*, PyTorch*, C/C++*, Collaboration*, Rust*, Deep Learning*, Algorithms*, Data Structures*, Debugging*, Profiling*, ONNX*, Software Engineering*, GPUs*, Tracing*, Concurrency*, Benchmarking*, LLVM*, MLIR*

* Required skills

Benefits

Equity

About NVIDIA

Technology