About This Role
This is a hands-on technical lead role: defining the architecture for, coding, deploying, and maintaining scalable ETL pipelines and data structures within Syneos Health's Clinical Development model. The position focuses on implementing data ingestion for the Translational Data Lake, managing the ingestion of complex datasets into modern cloud architectures, and designing bespoke integration solutions for diverse scientific data sources.
Responsibilities
- Act as a hands-on technical lead who defines the architecture, codes, deploys, and maintains scalable ETL pipelines and data structures
- Spearhead the technical implementation of the Translational Data Lake data ingestion, managing the ingestion of complex datasets into modern cloud architectures
- Lead data engineering projects beyond the Data Lake, designing bespoke integration solutions for diverse scientific data sources across the Research organization
- Design and script automated procedures to normalize unformatted data from external vendors (CROs) into a structured Common Data Model (CDM)
- Partner with various functions in Research and IT to align infrastructure with scientific needs, ensuring solutions are robust, FAIR-compliant, and scalable
- Develop and communicate the technical vision for biomarker data integration and reuse
- Architect and implement scalable ETL procedures, APIs and front-end tools for data access and visualization
- Engage stakeholders to gather requirements and incorporate feedback into design
- Lead user acceptance testing (UAT) and ensure high-quality deliverables
- Collaborate with IT and Translational leads to align infrastructure and governance processes
- Champion FAIR principles and interoperability across translational and clinical programs
Requirements
- 8+ years of professional experience in data engineering or software architecture
- Expert-level coding proficiency in Python with mastery of modern data engineering libraries (Pandas, PySpark, Dask, SQLAlchemy)
- Advanced proficiency with SQL
- Advanced proficiency with workflow orchestration tools (Airflow, Dagster, or Prefect)
- Advanced proficiency with containerization (Docker/Kubernetes)
- Deep experience with modern Data Lake and Lakehouse architectures (e.g., Azure Fabric, Databricks, Snowflake)
- Proven track record of connecting and integrating disparate data sources
- Solid understanding of data modeling, ETL processes, and schema design for complex datasets
- Experience designing and deploying APIs for data access
- Excellent communication skills to bridge the gap between IT infrastructure and scientific stakeholders
- Familiarity with FAIR principles and metadata standards for scientific data
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Bioinformatics, or a related field
- 8+ years of professional experience in data engineering or software architecture, with a focus on building production-grade data pipelines
Nice to Have
- Familiarity with clinical data standards including SDTM, ADaM, and CDISC, and biomarker data formats (NGS variant results, flow cytometry, serum proteomics, gene expression profiling)
- Direct experience with Azure Fabric tools for connecting and integrating data sources
- Proficiency in R for interoperability with bioinformatics teams
About Syneos Health
Syneos Health® is a leading fully integrated biopharmaceutical solutions organization built to accelerate customer success. It translates unique clinical, medical affairs, and commercial insights into outcomes that address modern market realities.