AWS DevOps Engineer (With deep expertise in MLOps)

Jobs via Dice | Plano, TX
Full Time | Mid Level | 5+ years

Posted 1 month ago Expired

About This Role

The engineer in this role will design, deploy, and support a scalable SageMaker platform for Toyota Financial Services that accelerates the end-to-end machine learning lifecycle. The engineer will enable data scientists to develop, version, deploy, and monitor models in production.

Responsibilities

  • Design, deploy, and maintain a robust AWS SageMaker platform to support the full ML lifecycle
  • Collaborate closely with data scientists to productionize machine learning models, ensuring scalability, reliability, and performance
  • Implement model versioning, lineage tracking, and governance to support reproducibility and auditability
  • Build and maintain MLOps pipelines that automate continuous integration and continuous delivery (CI/CD) and continuous training (CT) of ML models
  • Manage AWS infrastructure including EC2, ECS Fargate, ALB, S3, DynamoDB, OpenSearch, and AWS Bedrock to support AI/ML workloads
  • Enforce enterprise security best practices using IAM, Guardrails, and AWS security services
  • Configure and manage Single Sign-On (SSO) integration with Okta and a MuleSoft proxy for secure platform access
  • Automate infrastructure provisioning and management using Infrastructure as Code (IaC) tools such as Terraform and OpenTofu
  • Monitor deployed models and infrastructure for performance, drift, and anomalies; implement alerting and remediation workflows
  • Support containerized microservices architecture with Python-based services, establishing CI/CD pipelines for rapid deployment
  • Stay current with AWS services, MLOps frameworks, AI ecosystem trends, and DevOps best practices

Requirements

  • 5+ years of hands-on experience in AWS cloud infrastructure and DevOps engineering with a strong focus on MLOps
  • Expertise in AWS services: SageMaker (including SageMaker Pipelines, Model Registry), ECS Fargate, EC2, ALB, S3, DynamoDB, OpenSearch, AWS Bedrock
  • Proven experience productionizing ML models and managing model versioning, lineage, and lifecycle
  • Strong skills in Infrastructure as Code using Terraform and OpenTofu
  • Experience designing and implementing CI/CD pipelines for ML workflows and Python microservices
  • Deep understanding of AWS security best practices, IAM policies, Guardrails, and enterprise security frameworks
  • Experience integrating Single Sign-On (SSO) solutions using Okta and a MuleSoft proxy
  • Familiarity with ML lifecycle management tools and frameworks
  • Strong scripting and automation skills
  • Excellent problem-solving, collaboration, and communication skills

Nice to Have

  • AWS certifications (e.g., AWS Certified DevOps Engineer - Professional, AWS Certified Machine Learning - Specialty)
  • Experience with container orchestration, microservices, and serverless architectures
  • Knowledge of monitoring, logging, and alerting tools for ML models and cloud infrastructure
  • Familiarity with open-source MLOps tools (e.g., MLflow, Kubeflow)

Skills

Python*, AWS EC2*, AWS S3*, CI/CD*, Terraform*, MuleSoft*, IAM*, OpenSearch*, AWS SageMaker*, MLOps*, AWS ECS Fargate*, AWS ALB*, AWS DynamoDB*, AWS Bedrock*, Okta*, OpenTofu*

* Required skills

Benefits

  • Paid Time Off
  • 401(k) savings plan with company match
  • Toyota Team Member Lease Vehicle Program
  • Paid Holidays
  • Wellness plans
  • Team Member Vehicle Purchase Discount
  • Comprehensive health care plans
  • Tuition Reimbursement
  • Professional growth and development programs
  • Annual retirement contribution from Toyota

About Jobs via Dice

Professional Services