Guidehouse

Data Infrastructure Engineer

Guidehouse: $98K - $163K *
US-Anywhere, Remote in United States
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree in Engineering, IT, Computer Science, or related field (or equivalent experience)
  • 4+ years of experience in building production data pipelines and/or data platforms
  • Strong experience with data ingestion and ETL/ELT workflows
  • Hands-on experience with cloud-based data lakes/delta lakes on AWS
  • Proficient in SQL and a programming language (Python preferred, Scala/Java accepted)
  • Experience in metadata management and governance
  • Hands-on experience with automated AWS provisioning using IaC

Responsibilities

  • Design and implement batch and streaming data ingestion from various sources
  • Build and optimize ETL/ELT pipelines for analytics-ready datasets
  • Manage a scalable lakehouse architecture on AWS S3
  • Establish environment standards for development and production environments
  • Implement a managed metadata repository for dataset cataloging and governance
  • Build operational data quality checks and publish SLAs/SLOs
  • Collaborate with cross-functional teams to support engineering needs

Benefits

  • Medical, Rx, Dental & Vision Insurance
  • Personal and Family Sick Time & Company Paid Holidays
  • Parental Leave
  • 401(k) Retirement Plan
  • Tuition Reimbursement, Certifications & Learning Opportunities
  • Employee Assistance Program
  • Corporate Sponsored Events & Community Outreach
Full Job Description

Job Family:

Software Development & Support


Travel Required:

None


Clearance Required:

Ability to Obtain Public Trust

We are seeking a Data Infrastructure Engineer to build and operate the data platform that powers AI/ML analytics modules. You will design and implement scalable data ingestion pipelines, robust ETL/ELT, and a modern data lake / delta lake (lakehouse) on AWS. You'll also establish a managed metadata repository and governance layers (catalog, lineage, quality, access controls) and deliver automated cloud provisioning plus CI/CD for data pipelines to enable reliable, repeatable deployments across environments.

This role is ideal for an engineer who enjoys platform building, automation, and enabling advanced analytics through trusted, well-governed data.

What You Will Do:

Build & Operate Data Pipelines (Batch + Streaming)

  • Design and implement batch and streaming ingestion from APIs, relational databases, file drops, event streams, and external partners.

  • Build and optimize ETL/ELT pipelines to produce curated, analytics-ready datasets for reporting and ML consumption.

  • Implement incremental processing patterns, change data capture (CDC) approaches where appropriate, and data contract standards.

Deliver a Modern Lakehouse (Data Lake / Delta Lake)

  • Build and manage a scalable lakehouse on AWS object storage (e.g., S3) using open table/file formats and delta/lakehouse concepts (e.g., ACID tables, schema evolution, time travel patterns).

  • Optimize performance and cost through partitioning, compaction, lifecycle policies, and efficient compute/storage usage.

  • Establish environment standards for dev/test/prod and consistent promotion across stages.

Metadata, Governance, Lineage & Quality (Trust Layer)

  • Implement a managed metadata repository for dataset cataloging, ownership, glossary/definitions, tagging, and discoverability.

  • Enable end-to-end lineage (source → transformations → consumption) to support auditability and impact analysis.

  • Implement governance controls including policy-based access, data classification, retention, and secure data handling.

  • Build operational data quality checks (freshness, completeness, validity, anomaly detection) and publish SLAs/SLOs.

AWS Automation + CI/CD for Data Pipelines

  • Implement automated cloud provisioning in AWS using Infrastructure as Code (IaC) for consistent environments and secure-by-default baselines.

  • Build and enhance CI/CD for data pipelines, including automated tests, validation gates, promotion workflows, and rollback strategies.

  • Improve observability with metrics/logs/alerts, dashboards, runbooks, and incident response readiness.

Cross-Team Collaboration & Documentation

  • Work closely with engineering, security, networking, and application teams to support mission needs and delivery timelines.

  • Maintain high-quality engineering documentation including SOPs, system diagrams, and secure configuration baselines.

  • Summarize and present findings and recommendations, both written and verbal, to technical and non-technical stakeholders.

What You Will Need:

  • Must be able to OBTAIN and MAINTAIN a Federal or DoD "PUBLIC TRUST"; candidates must obtain approved adjudication of their PUBLIC TRUST prior to onboarding with Guidehouse. Candidates with an ACTIVE PUBLIC TRUST or SUITABILITY are preferred.

  • Bachelor's degree in Engineering, IT, Computer Science, or related field (or equivalent experience).

  • Minimum of FOUR (4) years of experience building production data pipelines and/or data platforms.

  • Strong experience implementing data ingestion and ETL/ELT workflows, including data modeling and transformation best practices.

  • Hands-on experience building a data lake / delta lake (lakehouse) on AWS (or equivalent cloud) using object storage and modern table formats/patterns.

  • Proficiency in SQL and one programming language commonly used for data engineering (Python preferred; Scala/Java acceptable).

  • Experience with metadata management and governance: cataloging, lineage, ownership, access controls, classification, and policy enforcement.

  • Experience implementing automated AWS provisioning using IaC and operating across multiple environments.

  • Experience building or operating CI/CD pipelines for data workflows (testing, packaging, deployment automation, environment promotion).

  • Solid security fundamentals: IAM/least privilege, encryption, secrets management, secure SDLC practices.

What Would Be Nice To Have:

  • Hands-on experience with Databricks

  • Hands-on experience utilizing modern DevOps practices, including tools like Git, Terraform, Jenkins, AWS CodePipeline, and Docker.

  • Experience utilizing AI-assisted coding tools (e.g., GitHub Copilot, ChatGPT, Cursor, Kiro) to safely accelerate implementation while maintaining strict code quality through testing, code reviews, and security practices.

  • Knowledge graph and Graph RAG experience, including:

    • Graph modeling and ontology/taxonomy alignment

    • Entity resolution and relationship extraction

    • Hybrid retrieval approaches combining graph traversal with semantic/vector search to improve grounding and explainability

The annual salary range for this position is $98,000.00-$163,000.00. Compensation decisions depend on a wide range of factors, including but not limited to skill sets, experience and training, security clearances, licensure and certifications, and other business and organizational needs.


What We Offer:

Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.

Benefits include:

  • Medical, Rx, Dental & Vision Insurance

  • Personal and Family Sick Time & Company Paid Holidays

  • Parental Leave

  • 401(k) Retirement Plan

  • Group Term Life and Travel Assistance

  • Voluntary Life and AD&D Insurance

  • Health Savings Account, Health Care & Dependent Care Flexible Spending Accounts

  • Transit and Parking Commuter Benefits

  • Short-Term & Long-Term Disability

  • Tuition Reimbursement, Personal Development, Certifications & Learning Opportunities

  • Employee Referral Program

  • Corporate Sponsored Events & Community Outreach

  • Care.com annual membership

  • Employee Assistance Program

  • Supplemental Benefits via Corestream (Critical Care, Hospital Indemnity, Accident Insurance, Legal Assistance and ID theft protection, etc.)

  • Position may be eligible for a discretionary variable incentive bonus

About Guidehouse

Guidehouse is a management consulting firm headquartered in Washington, D.C. The firm provides consulting services to clients in the public and commercial sectors, with a focus on energy, financial services, healthcare, national security, and aerospace and defense. Guidehouse was founded in 2018 as a spin-off from PwC. The firm has over 7,000 employees and operates in more than 50 locations worldwide.
Size: 8,000 employees
Founded: 2018