Senior Software Engineer, Infrastructure Software for AI
Posted 3 weeks ago
About This Role
Join our infrastructure team to develop foundational systems software for large-scale AI workloads on next-generation GPU platforms, emphasizing Kubernetes orchestration and GPU resource management. You will drive architectural innovation to maximize efficiency and utilization.
Responsibilities
- Design and develop robust systems software to enable AI workloads on large-scale GPU clusters (e.g., GB200 NVL72 rack-scale systems)
- Deliver critical control plane components for workload scheduling, orchestration, and resource management
- Build the management plane for the underlying hardware platforms
- Create northbound APIs and interfaces for customer portals and self-service access to the infrastructure
- Contribute to product requirements documents (PRDs), sprint planning, and agile program execution
- Help attract, mentor, and grow top engineering talent
- Exemplify and cultivate a culture of humility, bold innovation, and disciplined delivery to bring products to market
Requirements
- 5+ years of experience in software engineering, hardware platforms, distributed systems, or infrastructure development
- 2+ years in technical lead roles, owning high-impact projects and leading teams
- Proven hands-on experience building systems software, AI frameworks, or applied AI systems
Qualifications
- Bachelor's degree in Computer Science, Electrical Engineering, or a related technical field
Nice to Have
- Master's degree in a relevant technical discipline (e.g., CS, Systems Engineering)
- Direct experience with Kubernetes and container orchestration at scale
- Hands-on work with GPU-accelerated systems and high-performance computing (HPC) environments
- Expertise in designing scalable infrastructure for demanding AI workloads (training, fine-tuning, serving)
- Familiarity with AI developer frameworks, MLOps tools, automation pipelines, and CI/CD systems
About VeeAR Projects Inc.
VeeAR Projects Inc. challenges conventional limits by building transformative products that fully exploit state-of-the-art infrastructure including NVIDIA GB200, MGX modular architectures, and DGX Grace Hopper platforms, combined with cloud-native software, to power centralized AI data centers and d...