Senior Data Engineer
Full Time
Senior Level
5+ years
Posted 3 weeks ago
About This Role
This is a contingent (contract) position for a senior technical contributor focused on large-scale data engineering initiatives. Responsibilities center on designing and implementing scalable data lake architectures and optimizing data processing on Google Cloud Platform.
Responsibilities
- Design and implement scalable data lake architectures (e.g., Bronze/Silver/Gold layered models)
- Define Cloud Storage (GCS) architecture including bucket structures, naming standards, lifecycle policies, and IAM models
- Apply best practices for Hadoop/HDFS-like storage, distributed file systems, and data locality
- Work with columnar formats (Parquet, Avro, ORC) and compression for performance and cost optimization
- Develop effective partitioning strategies, organization techniques, and backfill approaches
- Build curated and analytical data models optimized for BI and visualization tools
- Build batch and streaming ingestion pipelines using Google Cloud Platform-native tools
- Develop workflows using Cloud Composer / Apache Airflow
- Build scalable batch and streaming data pipelines using Dataflow (Apache Beam) and/or Spark (Dataproc)
- Write optimized BigQuery SQL leveraging clustering, partitioning, and cost-efficient design
- Write production-grade Python for data engineering with maintainable, testable code
- Implement metadata management, cataloging, and ownership standards
- Build data quality frameworks (validation, freshness, SLAs, alerting)
- Manage Google Cloud Platform environments including project setup, resource boundaries, billing, quotas, and cost optimization
- Build and maintain CI/CD pipelines for data platform and pipeline deployments
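As an illustration of the layered (Bronze/Silver/Gold) lake design and Hive-style partitioning strategies described above, the sketch below builds partitioned GCS object prefixes in plain Python. The bucket, dataset, and table names are hypothetical, not part of this posting:

```python
from datetime import date

# Hypothetical GCS bucket per lake layer (Bronze/Silver/Gold model).
LAYER_BUCKETS = {
    "bronze": "gs://example-lake-bronze",
    "silver": "gs://example-lake-silver",
    "gold": "gs://example-lake-gold",
}

def partitioned_path(layer: str, dataset: str, table: str, dt: date) -> str:
    """Build a Hive-style date-partitioned object prefix for a lake layer.

    Example result: gs://example-lake-bronze/sales/orders/dt=2024-01-31/
    """
    bucket = LAYER_BUCKETS[layer]
    return f"{bucket}/{dataset}/{table}/dt={dt.isoformat()}/"

print(partitioned_path("bronze", "sales", "orders", date(2024, 1, 31)))
# gs://example-lake-bronze/sales/orders/dt=2024-01-31/
```

A consistent `dt=YYYY-MM-DD` partition convention like this lets BigQuery external tables, Dataflow, and Spark jobs prune partitions by date and keeps backfills to simple prefix-scoped rewrites.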
Requirements
- 5+ years of software engineering or data engineering experience
- Deep experience in cloud-native data platforms
- Experience with large-scale distributed processing
- Experience with advanced analytics data models
- Cloud Storage (GCS) architecture design expertise
- Experience with columnar formats (Parquet, Avro, ORC)
- Proficiency in Google Cloud Platform-native ingestion tools
- Experience with Cloud Composer / Apache Airflow
- Skilled in Dataflow (Apache Beam) and/or Spark (Dataproc)
- Ability to write optimized BigQuery SQL
- Proficiency in Python for data engineering
- Experience with metadata management, cataloging, and data quality frameworks
- Knowledge of Google Cloud Platform environment management and cost optimization
- Ability to implement IAM best practices
- Experience with CI/CD pipelines for data platforms
Nice to Have
- Experience with VPC Service Controls, perimeter security, and data exfiltration prevention
- Understanding of PII protection, data masking, tokenization, and audit/compliance practices
Skills
- Python *
- SQL *
- CI/CD *
- BigQuery *
- IAM *
- Apache Airflow *
- Google Cloud Platform *
- Spark *
- Hadoop *
- Pig *
- Hive *
- Dataflow *
- Pub/Sub *
- Parquet *
- Cloud Composer *
- Cloud Storage (GCS) *
- HDFS *
- Avro *
- ORC *
- Apache Beam *
- Dataproc *
- Sqoop *
- KMS/CMEK *
- Google Cloud Platform Secret Manager *
- P4 *
- P2 *

* Required skills