Senior Data Engineer

Honeywell Atlanta, GA
Full Time · Senior Level · 5+ years

Posted 2 weeks ago

About This Role

As a Senior Data Engineer, you will design and implement scalable data architectures and pipelines focusing on IoT and real-time data processing for Honeywell’s industrial customers. This role involves transforming high-volume IoT telemetry into reliable, actionable insights to enable advanced AI and data solutions. You will work within a high-performing global team to deliver next-generation AI capabilities including large-scale machine learning, intelligent automation, and real-time analytics.

Responsibilities

  • Design and implement scalable data architectures to process high-volume IoT sensor data and telemetry streams (see the ingestion sketch after this list)
  • Build and maintain data pipelines for the AI product lifecycle, including training data preparation, feature engineering, and inference data flows
  • Develop and optimize RAG (Retrieval-Augmented Generation) systems, vector databases, embedding pipelines, and efficient retrieval mechanisms
  • Lead the architecture and development of scalable data platforms on Databricks
  • Drive the integration of GenAI capabilities into data workflows and applications
  • Optimize data processing for performance, cost, and reliability at scale
  • Create robust data integration solutions combining industrial IoT data with enterprise data sources for AI model training and inference
  • Implement DataOps practices for continuous integration and delivery of data pipelines powering AI solutions
  • Design and maintain automated testing frameworks for data quality, data drift detection, and AI model performance monitoring
  • Partner with ML engineers and data scientists to implement efficient data workflows for model training, fine-tuning, and deployment
  • Mentor team members and provide technical leadership on complex data engineering challenges

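To make the streaming responsibilities concrete, below is a minimal sketch of a Bronze-layer ingestion job of the kind this role describes: PySpark Structured Streaming reading JSON telemetry from Kafka and appending it to a Delta table. The broker address, topic name, schema, and storage paths are illustrative assumptions, not details from this posting.

```python
# Minimal Bronze-layer ingestion sketch. Broker, topic, schema, and
# paths are hypothetical assumptions for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("bronze-iot-ingest").getOrCreate()

# Hypothetical telemetry payload: device id, metric name, reading, event time.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("metric", StringType()),
    StructField("value", DoubleType()),
    StructField("event_ts", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumption
    .option("subscribe", "iot-telemetry")              # assumption
    .load()
)

# Parse the Kafka value bytes as JSON and keep the broker timestamp for auditing.
bronze = (
    raw.select(
        F.from_json(F.col("value").cast("string"), schema).alias("payload"),
        F.col("timestamp").alias("kafka_ts"),
    )
    .select("payload.*", "kafka_ts")
)

# Append to a Delta table; the checkpoint makes the stream restartable.
query = (
    bronze.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/bronze/_checkpoints/iot")  # assumption
    .outputMode("append")
    .start("/mnt/bronze/iot_telemetry")                            # assumption
)
query.awaitTermination()
```
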
Requirements

  • Minimum 5 years of experience building production data pipelines in Databricks processing terabyte-scale data
  • Extensive experience implementing medallion architecture (Bronze/Silver/Gold) with Delta Lake, Delta Live Tables (DLT), and Lakeflow for batch and streaming pipelines from Event Hub or Kafka sources (see the DLT sketch after this list)
  • Strong hands-on proficiency with PySpark for distributed data processing and transformation
  • Strong experience working with cloud and data platforms such as Azure, GCP, and Databricks, especially in designing and implementing AI/ML-driven data workflows
  • Proficiency in CI/CD practices using Databricks Asset Bundles (DAB), Git workflows, and GitHub Actions, with an understanding of DataOps practices including data quality testing and observability
  • Hands-on experience building RAG applications with vector databases, LLM integration, and agentic frameworks such as LangChain and LangGraph
  • Natural analytical mindset with demonstrated ability to explore data, debug complex distributed systems, and optimize pipeline performance at scale

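As a rough illustration of the medallion/DLT requirement, here is a short Delta Live Tables sketch: a Bronze table ingested with Auto Loader and a Silver table guarded by data-quality expectations. It assumes it runs inside a DLT pipeline (where `spark` and the `dlt` module are provided); the landing path, column names, and expectations are hypothetical.

```python
# Illustrative DLT sketch (runs inside a Delta Live Tables pipeline).
# Paths, table names, and expectations are assumptions, not posting specifics.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Bronze: raw IoT telemetry as landed.")
def bronze_telemetry():
    # Auto Loader incrementally picks up new JSON files from a landing zone.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/iot")  # hypothetical path
    )

@dlt.table(comment="Silver: validated, deduplicated telemetry.")
@dlt.expect_or_drop("valid_device", "device_id IS NOT NULL")
@dlt.expect_or_drop("valid_value", "value IS NOT NULL")
def silver_telemetry():
    # Rows failing an expectation are dropped; metrics are surfaced by DLT.
    return (
        dlt.read_stream("bronze_telemetry")
        .withColumn("event_ts", F.col("event_ts").cast("timestamp"))
        .withWatermark("event_ts", "1 hour")
        .dropDuplicates(["device_id", "metric", "event_ts"])
    )
```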

Nice to Have

  • Experience building RAG and agentic architecture solutions and working with LLM-powered applications
  • Expertise in real-time data processing frameworks (Apache Spark Streaming, Structured Streaming)
  • Knowledge of MLOps practices and experience building data pipelines for AI model deployment
  • Experience with time-series databases and IoT data modeling patterns
  • Familiarity with containerization (Docker) and orchestration (Kubernetes) for AI workloads
  • Strong background in data quality implementation for AI training data (see the quality-gate sketch after this list)
  • Experience working with distributed teams and cross-functional collaboration
  • Knowledge of data security and governance practices for AI systems
  • Experience working on analytics projects with Agile and Scrum Methodologies
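
Several bullets above and in the Responsibilities section touch on data quality for AI training data. A lightweight quality gate might look like the following sketch, where the table path, column names, and 1% thresholds are assumptions; failing loudly lets a CI/CD step (for example, in GitHub Actions) block promotion of bad data.

```python
# Hedged sketch of a data-quality gate over a Silver table.
# Table location, column names, and thresholds are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-gate").getOrCreate()
df = spark.read.format("delta").load("/mnt/silver/iot_telemetry")  # assumption

total = max(df.count(), 1)
checks = {
    # Required identifiers must be present.
    "device_id_null_rate": df.filter(F.col("device_id").isNull()).count() / total,
    # Readings far outside a plausible window can indicate drift or sensor faults.
    "value_out_of_range_rate": df.filter(~F.col("value").between(-1e6, 1e6)).count() / total,
}

failures = {name: rate for name, rate in checks.items() if rate > 0.01}  # 1% threshold (assumption)
if failures:
    # A non-zero exit fails the CI job and stops downstream promotion.
    raise ValueError(f"Data-quality gate failed: {failures}")
print("Data-quality gate passed:", checks)
```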

Skills

Azure, Kubernetes, Docker, Agile, Scrum, Git, Databricks, LangChain, Kafka, GCP, LangGraph, LLM, MLOps, GitHub Actions, Delta Lake, Vector Databases, PySpark, DataOps, Event Hub, Structured Streaming, Delta Live Tables (DLT), Lakeflow, Databricks Asset Bundles (DAB), Apache Spark Streaming

All skills listed above are required.

Benefits

Life Insurance
Short-Term Disability
Medical
Dental
Educational Assistance
401(k) Match
Paid Holidays
Long-Term Disability
Vision
EAP
Paid Time Off
Parental Leave
Flexible Spending Accounts
Health Savings Accounts

About Honeywell

Honeywell invents and commercializes technologies that address some of the world's most critical challenges around energy, safety, security, air travel, productivity, and global urbanization. They are a software-industrial company committed to introducing state-of-the-art technology solutions.

Industry: Manufacturing