Lead Data Engineer

Honeywell
Innovate to solve the world's most important challenges
The Lead Data Engineer role will be part of a high-performing global team that delivers cutting-edge AI/ML data products for Honeywell's Industrial customers, with a specific focus on IoT and real-time data processing. As a data engineer, you will architect and implement scalable data pipelines that power next-generation AI solutions, including Large Language Models (LLMs), autonomous agents, and real-time inference systems. You will work at the intersection of IoT telemetry data and modern AI technologies to create innovative industrial solutions.
KEY RESPONSIBILITIES
Data Engineering & AI Pipeline Development:
- Design and implement scalable data architectures to process high-volume IoT sensor data and telemetry streams, ensuring reliable data capture and processing for AI/ML workloads
- Architect and build data pipelines for the AI product lifecycle, including training data preparation, feature engineering, and inference data flows
- Develop and optimize RAG (Retrieval-Augmented Generation) systems, including vector databases, embedding pipelines, and efficient retrieval mechanisms
- Design and implement robust data integration solutions that combine industrial IoT data streams with enterprise data sources for AI model training and inference
DataOps & Governance:
- Define mature DataOps strategies to ensure continuous integration and delivery of data pipelines powering AI solutions
- Lead efforts in data quality, observability, and lineage tracking to maintain high integrity in AI/ML datasets
- Create self-service data assets enabling data scientists and ML engineers to access and utilize data efficiently
- Design and maintain automated documentation systems for data lineage and AI model provenance
- Ensure compliance with data governance policies, including security, privacy, and regulatory requirements for AI-driven applications
Technical Leadership & Innovation:
- Lead architectural discussions, establish standards, and drive technical excellence across teams
- Partner with ML engineers and data scientists to implement efficient data workflows for model training, fine-tuning, and deployment
- Mentor data engineers on standards, best practices, and innovative approaches to build extensible and reusable solutions
- Drive innovation and continuous improvement in data engineering practices and tooling
- Manage stakeholder expectations, aligning data engineering roadmaps with business and AI strategy
YOU MUST HAVE
- Minimum 8 years of hands-on experience in building data pipelines using large-scale distributed data processing tools, frameworks & platforms (Python, Spark, Databricks)
- 6+ years of extensive experience in data management concepts, including data modeling, CDC, ETL/ELT processes, data lakes, and data governance
- 4+ years of hands-on experience with PySpark/Scala
- 4+ years of experience with cloud platforms (Azure/GCP/Databricks), particularly in implementing AI/ML solutions
WE VALUE
- Experience implementing RAG architectures and working with LLM-powered applications
- Expertise in real-time data processing frameworks (Kafka, Apache Spark Streaming, Structured Streaming)
- Knowledge of MLOps practices and experience building data pipelines for AI model deployment
- Experience with time-series databases and IoT data modeling patterns
- Familiarity with containerization (Docker) and orchestration (Kubernetes) for AI workloads
- Strong background in data quality implementation for AI training data
- Experience with graph databases and knowledge graphs for AI applications
- Experience working with distributed teams and cross-functional collaboration
- Knowledge of data security and governance practices for AI systems
- Expertise in version control systems and CI/CD methodologies
- Experience working on analytics projects with Agile and Scrum methodologies
Additional Information
- JOB ID: HRD9085977
- Category: Engineering
- Location: 715 Peachtree Street, N.E., Atlanta, Georgia, 30308, United States
- Exempt
Global (ALL)
Honeywell is an equal opportunity employer. Qualified applicants will be considered without regard to age, race, creed, color, national origin, ancestry, marital status, affectional or sexual orientation, gender identity or expression, disability, nationality, sex, religion, or veteran status.
JOB SUMMARY
Lead Data Engineer

Honeywell
Atlanta
9 days ago
Full-time