Apache Spark Developer

Absolute Business Solutions Corp.

$100K - $130K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • TS/SCI eligibility with ability/willingness to obtain CI polygraph
  • Bachelor's degree plus 5 years' experience in data engineering or Spark development (or additional years' experience in lieu of degree)
  • Strong hands-on experience with Apache Spark (mandatory) and Python (PySpark)
  • Experience with Parquet and/or Delta Lake
  • Familiarity with Docker/containerization and Kubernetes (basic to intermediate level)
  • Experience with object storage systems (e.g., S3 or equivalent)
  • Strong troubleshooting and performance tuning skills

Responsibilities

  • Design, develop, and maintain Apache Spark pipelines using PySpark and/or Scala
  • Process and transform large-scale datasets using modern data lake architectures
  • Optimize Spark jobs for performance through partitioning, shuffle optimization, memory tuning, and file sizing
  • Implement Structured Streaming pipelines for near real-time data processing
  • Develop and deploy Spark applications within containerized environments
  • Execute workloads in Kubernetes clusters for scalable processing
  • Troubleshoot data pipeline failures and ensure reliability in mission-critical environments

Benefits

  • Generous PTO plus 11 Federal Holidays
  • 401k Fully Vested with Match
  • Tuition Assistance Program for loan repayment
  • Annual Health and Wellness Allowance
  • Annual Funds for Education and Training
  • 8 hours of Volunteer Time Off for charity support
  • Charitable Match program for donations
  • Referral Program for internal and external referrals
  • Bonuses through the Living Our Values awards program
Full Job Description
We are actively hiring a TS/SCI-cleared Apache Spark Developer to support NGA's Data Modernization Services (DMS) mission by building and optimizing large-scale data processing pipelines. This role focuses on developing high-performance Spark applications within a containerized, Kubernetes-based environment, supporting mission analytics, data exploitation, and AI/ML integration. The ideal candidate thrives in distributed data environments, understands performance tuning deeply, and can operate effectively in secure, air-gapped systems.

This role is on-site, with flexible hours, in Herndon, VA; Springfield, VA; St. Louis, MO; or Aurora, CO.

Clearance Required for this role: TS/SCI eligibility with willingness/ability to obtain CI polygraph.

Core Technology Stack

Data / Processing
  • Apache Spark (PySpark, Scala)
  • Delta Lake, Parquet
  • Structured Streaming

Infrastructure
  • Kubernetes (execution environment)
  • Docker

Storage / Cloud (Abstracted)
  • S3 / object storage
  • AWS / GCP / Azure (environment-dependent)

DevOps (Exposure Level)
  • Git, Jenkins (CI/CD)

Languages
  • Python (PySpark)
  • Scala (preferred)
  • Bash / scripting

Key Responsibilities
  • Design, develop, and maintain Apache Spark pipelines (batch and streaming) using PySpark and/or Scala
  • Process and transform large-scale datasets using modern data lake architectures (Delta Lake, Parquet)
  • Optimize Spark jobs for performance, including:
      o Partitioning strategies
      o Shuffle optimization
      o Memory tuning
      o File sizing and storage efficiency
  • Implement Structured Streaming pipelines for near real-time data processing
  • Develop and deploy Spark applications within containerized environments (Docker)
  • Execute workloads in Kubernetes clusters, supporting scalable and distributed processing
  • Integrate Spark pipelines with downstream systems, including:
      o Analytics platforms (SQL, notebooks)
      o AI/ML workflows and feature engineering pipelines
  • Support data ingestion and storage in object-based systems (e.g., S3-compatible storage)
  • Troubleshoot data pipeline failures and ensure reliability in mission-critical environments
  • Operate within secure, air-gapped environments, including:
      o Managing dependencies without internet access
      o Working within controlled network and security constraints

Required Qualifications:
  • TS/SCI eligibility with the ability and willingness to obtain and maintain a counterintelligence (CI) polygraph
  • Bachelor's degree plus 5 years' experience in data engineering or Spark development (additional years of experience will be considered in lieu of a degree)
  • Strong hands-on experience with:
      o Apache Spark (mandatory)
      o Python (PySpark)
      o Data processing at scale
  • Experience working with:
      o Parquet and/or Delta Lake
      o Distributed data systems
  • Familiarity with:
      o Docker / containerization
      o Kubernetes (basic to intermediate experience)
  • Experience with object storage systems (e.g., S3 or equivalent)
  • Strong troubleshooting and performance tuning skills
  • Proficiency in Bash or scripting

Preferred Qualifications:
  • Experience with Scala for Spark development
  • Experience with Structured Streaming in production environments
  • Familiarity with Iceberg or lakehouse architectures
  • Experience with CI/CD pipelines (Jenkins, Git)
  • Exposure to Terraform or Infrastructure as Code
  • Experience supporting AI/ML data pipelines
  • Prior experience supporting NGA, IC, or DoD programs

Some of our benefits include:
  • Generous PTO plus 11 Federal Holidays
  • Retirement Planning - 401k Fully Vested with Match
  • Tuition Assistance Program - Annual contributions to help you pay down your loans
  • Annual Health and Wellness Allowance - buy an Apple Watch, a treadmill, or hit the gym on us
  • Career Development - Annual Funds to spend on Education and Training
  • Volunteer Time Off - Annually, all employees can spend 8 hours directly supporting a charity of choice
  • Charitable Match - ABSC matches an employee's donation to a qualifying charity
  • Referral Program - We pay for internal and external referrals!
  • LOV Awards - Earn bonus awards throughout the year from our Living Our Values awards program
