Full Job Description
We are looking for a seasoned GCP Data Engineer to design, build, and maintain large-scale data pipelines and infrastructure on Google Cloud Platform. You will work closely with data architects, analysts, and product teams to deliver reliable, performant, and cost-efficient data solutions.
Responsibilities
Design and implement scalable data pipelines using GCP-native services (Dataflow, Dataproc, Pub/Sub, Cloud Composer/Airflow)
Architect and optimize BigQuery datasets, tables, and queries for analytical workloads at scale
Design and manage Cloud Spanner schemas for globally distributed, strongly consistent transactional data
Build and maintain data models, transformations, and orchestration workflows using Cloud Workflows and related tools
Develop backend data services and ETL/ELT scripts in Python
Integrate and manage Firestore for real-time, document-oriented data use cases
Design and manage the GraphQL schema
Build highly optimized resolver functions that bridge the GraphQL schema directly to data warehouses
Implement GraphQL Subscriptions to stream live data, event changes, or real-time metrics using message brokers such as Apache Kafka
Implement data governance, lineage, and quality frameworks using tools such as Dataplex or Data Catalog
Collaborate on infrastructure-as-code using Terraform for GCP resource provisioning
Monitor pipeline health, optimize costs, and troubleshoot production issues
Mentor junior engineers and contribute to architectural decisions and best practices
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field, or an equivalent combination of education and relevant experience
10+ years of overall experience in data engineering or a related field
5+ years of hands-on experience on Google Cloud Platform
Strong proficiency in Python for data processing, automation, and pipeline development
Deep expertise in BigQuery: schema design, partitioning, clustering, query optimization, and cost governance
Production experience with Cloud Spanner: schema design, interleaving, transaction patterns, and performance tuning
Solid understanding of GCP data services: Dataflow, Pub/Sub, Cloud Storage, Dataproc, Cloud Composer
Experience with Cloud Workflows for serverless orchestration
Hands-on experience with Firestore (Native mode preferred) for NoSQL/document storage patterns
Strong SQL skills and understanding of data warehousing concepts
Experience with CI/CD pipelines (Cloud Build, GitHub Actions) and version control (Git)
Preferred / Nice to Have
Experience with dbt for the transformation layer on BigQuery
Familiarity with streaming architectures (exactly-once semantics, late data handling)
Knowledge of data mesh or data lakehouse patterns
Exposure to Vertex AI or ML pipelines for MLOps workflows
GCP Professional Data Engineer certification