Cloud Platform Engineer
Value Innovation Labs
Brooklyn, NY
Contract
Senior Level
10+ years
Posted 2 weeks ago
About This Role
Seeking an experienced Cloud Platform Engineer with a strong background in AWS, AI/ML, and data engineering to support large-scale cloud platforms and intelligent data-driven systems within a mission-critical government environment.
Responsibilities
- Monitor system and database performance using AWS CloudWatch metrics, alarms, and logs; proactively troubleshoot and resolve issues
- Design, develop, deploy, and optimize AI/ML solutions using AWS services such as SageMaker and Bedrock, supporting model training, inference, and production integration
- Automate infrastructure and operational workflows using AWS Lambda, Systems Manager (SSM), and Infrastructure as Code (IaC) tools such as CloudFormation and Terraform
- Design, build, and maintain scalable, fault-tolerant AWS data and analytics platforms using API Gateway, S3, EC2, RDS, Lambda, Glue, Athena, DynamoDB, EMR, Kinesis, and DataSync
- Architect and integrate agentic AI systems, including LLM-based agents, multi-agent workflows, and autonomous orchestration pipelines using LangChain and LangGraph
- Build and support ETL/ELT pipelines and data architectures for machine learning, analytics, and intelligent agent-based applications
- Support CI/CD pipelines for AI models and data workflows using Jenkins, ECS, EKS, or Kubernetes
- Apply cloud and AI security best practices, including IAM least-privilege, encryption, audit logging, and compliance controls
- Maintain technical documentation covering AI architectures, data pipelines, infrastructure configurations, and operational runbooks
Requirements
- 10+ years of hands-on AWS experience including EC2, RDS, S3, CloudWatch, CloudTrail, IAM, KMS, AWS Backup, and Lambda
- 10+ years of Linux/Unix administration and automation scripting using Bash, Shell, and Python
- 10+ years of Infrastructure as Code (IaC) and automation experience using CloudFormation, Terraform, and Ansible
- 10+ years of AWS networking experience including VPC, subnets, NACLs, security groups, Route 53, and multi-AZ architectures
- 7+ years of CI/CD experience using Jenkins, IaC, and container platforms (ECS, EKS, Kubernetes) supporting MLOps and autonomous workflows
- 6+ years designing and maintaining scalable data processing workflows using AWS managed services and Python/PySpark, with strong knowledge of ETL/ELT architectures
- 6+ years of experience with AWS AI/ML services such as SageMaker, Bedrock, and OpenSearch / vector databases
- Strong understanding of machine learning algorithms, NLP concepts, and deep learning frameworks including TensorFlow, PyTorch, and Hugging Face
Qualifications
- 10+ years of hands-on AWS experience, Linux/Unix administration, infrastructure as code, and AWS networking. 7+ years of CI/CD experience. 6+ years designing and maintaining scalable data processing workflows and AWS AI/ML services.
Nice to Have
- Local candidates
- US Citizens
Skills
- Python *
- AWS *
- Jenkins *
- Kubernetes *
- TensorFlow *
- PyTorch *
- AWS Lambda *
- API Gateway *
- CloudFormation *
- EMR *
- Terraform *
- Shell *
- Ansible *
- IAM *
- DynamoDB *
- EC2 *
- S3 *
- LangChain *
- LangGraph *
- EKS *
- Bash *
- OpenSearch *
- Athena *
- AWS CloudWatch *
- VPC *
- PySpark *
- Hugging Face *
- Glue *
- ECS *
- RDS *
- Route 53 *
- Bedrock *
- Kinesis *
- SageMaker *
- KMS *
- Systems Manager (SSM) *
- DataSync *
- AWS Backup *
* Required skills