What You'll Do
- Lead the design and delivery of end-to-end data platforms and pipelines (batch and streaming) using AWS services to support analytics, reporting, ML, and operational use cases.
- Define and own data architecture decisions, standards, and best practices. Produce and maintain Architecture Decision Records (ADRs), runbooks, and high-quality diagrams.
- Architect and implement data lake and/or data warehouse solutions (ingestion, storage, transformation, serving) using S3, Glue, Redshift, Athena, and related services.
- Build and optimize ETL/ELT workflows with Spark, Glue, Lambda, Step Functions, and orchestration frameworks (Airflow, Dagster).
- Design and operate streaming ingestion and processing using Kinesis, MSK, and Kafka ecosystems; collaborate on re-architecture efforts to reduce cost and improve resilience.
- Drive data modeling, schema design, partitioning, and query optimization for large data sets to meet performance and cost targets.
- Establish and operationalize data quality, validation, observability, lineage, and metadata practices (Glue Data Catalog, DataHub, Amundsen, or equivalent).
- Implement robust security, governance, and compliance controls (IAM, Lake Formation, KMS, VPC, data masking/security scanning) and collaborate with security and privacy teams.
- Champion cost optimization across the data stack (serverless architectures, right-sizing, Graviton/spot/commitment strategies) and provide continuous cost governance.
- Own production reliability: monitoring, SLIs/SLOs, incident response, postmortems, and continuous improvements.
- Lead and mentor engineers, run design reviews, and promote cross-team collaboration with product, analytics, ML, and platform teams.
- Influence roadmap, capacity planning, and hiring for the data engineering function; represent the team in senior technical forums.
What You'll Bring to the Role
- 7+ years of hands-on experience in data engineering, architecting and operating data platforms at scale.
- Demonstrated leadership on large, complex data projects and strong ability to influence technical and product stakeholders.
- Extensive hands-on experience with AWS data services: S3, Glue, Redshift (Spectrum/RA3), Athena, EMR, Lambda, Kinesis/MSK, DMS.
- Strong programming skills in Python and/or Scala (or Java); proven track record writing production-quality, testable code.
- Deep expertise in SQL with experience designing and optimizing queries for high-volume data.
- Experience with distributed processing frameworks (Spark) and modern ETL/ELT patterns.
- Proficiency with orchestration and workflow tools (Airflow, Step Functions, Dagster) and CI/CD for data pipelines.
- Strong skills in infrastructure-as-code (Terraform, CloudFormation, or CDK) and GitOps practices.
- Experience implementing data security, governance, and access controls at scale.
- Expertise in observability, monitoring, and incident management (CloudWatch, Prometheus, Grafana, structured logging, tracing).
- Excellent communication skills and proven ability to mentor and lead engineers.
Preferred Qualifications
- AWS certifications (e.g., AWS Certified Data Analytics - Specialty, AWS Certified Solutions Architect) or equivalent.
- Experience with data lakehouse technologies (Delta Lake, Iceberg, Databricks Unity Catalog) and large-scale performance troubleshooting.
- Familiarity with metadata and lineage tooling (Amundsen, DataHub, OpenLineage).
- Prior experience with cost-optimization programs (Graviton, serverless, spot instances) and demonstrable results.
- Contributions to open-source projects, publications, or conference presentations on data engineering topics.
- Experience working in regulated industries (finance, healthcare) and implementing compliance controls.
- Experience using AI tools to build data pipelines.
Skills You'll Have
- Strategic thinker with strong problem-solving and architectural judgment.
- Ability to manage multiple complex priorities and deliver high-quality work under deadlines.
- Strong mentoring and team leadership skills; fosters a culture of knowledge sharing and continuous improvement.
- Customer-focused: translates business needs into pragmatic, maintainable technical solutions.
- Strong attention to detail, excellent documentation habits, and commitment to operational excellence.
Compensation Range:
Pay Range: $118,960.00 - $178,440.00
Geographic-Specific Pay Structures:
Structure 110: $130,880.00 USD - $196,320.00 USD
Structure 115: $136,800.00 USD - $205,200.00 USD
We believe in fairness and transparency, which is why we share the salary range for most of our roles. Final salaries are based on a number of factors, including the candidate's skills and experience, the current market, the candidate's location, and other factors uncovered in the hiring process. The standard pay structure is listed above. If you live in California, New York City, or another eligible location, geographic-specific pay structures, compensation, and benefits may apply.