DISQO

Staff Data Engineer (Scala, Spark, & Gen AI)

DISQO · $130K - $180K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years of experience in building and supporting production data pipelines and distributed systems.
  • Expert-level experience in Scala and Apache Spark, including performance tuning.
  • Proven ability to integrate Generative AI/LLMs into data engineering workflows.
  • Strong proficiency in Python for interfacing with AI tools and data APIs.
  • Extensive experience with AWS architecture and services.
  • Deep knowledge of advanced ETL/ELT processes and complex data modeling.
  • Expertise in workflow orchestration tools, notably Airflow.

Responsibilities

  • Architect and lead the design of scalable, fault-tolerant data pipelines using Scala and Spark.
  • Pioneer Generative AI integration to enrich data and automate metadata generation.
  • Collaborate with Product and Engineering to develop AI-enhanced data architectures.
  • Optimize ETL/ELT workflows to resolve performance bottlenecks in Spark.
  • Implement infrastructure for AI capabilities, including embedding and retrieval architectures.
  • Establish coding standards and ensure quality across projects.
  • Mentor engineers and foster a continuous learning environment.

Benefits

  • 100% coverage for Medical/Dental/Vision for employees and competitive dependent coverage.
  • Stock options and 401K plan available.
  • Generous PTO policy with team-building events.
  • Catered lunch and fully stocked kitchen at the workplace.
  • Paid maternity/paternity leave and comprehensive disability insurance.
  • Travel assistance program and 24/7 counseling services offered to employees.
Full Job Description
Position Description

As a Staff Data Engineer, you will set the technical direction for DISQO's ad measurement platform. You will architect, build, and scale our most complex data pipelines while spearheading the integration of Generative AI capabilities directly into our core data infrastructure and products. You will tackle our hardest scalability challenges, utilizing expert-level Spark and Scala to process massive datasets, while leveraging LLMs to unlock new value from unstructured and structured data.

Operating with a high degree of autonomy, you will lead cross-functional technical initiatives, drive architectural decisions, and pioneer how we use AI to enrich data, automate pipelines, and improve data quality. You will mentor senior and mid-level engineers, raising the technical bar for the entire team while expanding DISQO's technical depth across big data systems, cloud infrastructure, and applied AI.

What you will do:

  • Architect and Lead: Design, build, and maintain highly scalable, fault-tolerant data pipelines using expert-level Scala and Apache Spark.
  • Gen AI Integration: Pioneer the use of Generative AI within our data ecosystem, incorporating LLMs to enrich datasets, extract value from unstructured data, automate metadata generation, and build intelligent data products.
  • Cross-Functional Strategy: Partner with Product and Engineering leadership to translate complex business requirements into forward-looking data and AI-augmented architectures.
  • Optimize Systems: Architect and aggressively optimize large-scale ETL/ELT workflows. Dive deep into Spark internals to resolve complex performance bottlenecks, memory issues, and data skew.
  • Modern AI Tooling: Implement and manage infrastructure to support AI integration, including vector databases, embeddings pipelines, and Retrieval-Augmented Generation (RAG) architectures.
  • Set the Standard: Write clean, highly optimized, and maintainable code, while establishing standards for code quality, testing, and system architecture across the organization.
  • Ensure Operational Excellence: Champion data quality, observability, and system health to consistently meet enterprise SLAs and customer commitments.
  • Mentorship: Actively mentor engineers, lead technical design reviews, and foster a culture of continuous learning and technical rigor.

What we're looking for:

  • 8+ years of experience building, architecting, and supporting complex production data pipelines, distributed systems, and backend infrastructure.
  • Expert-Level Scala & Spark: Deep, hands-on expertise in Scala and Apache Spark. You must understand Spark internals, query plans, memory management, and advanced performance tuning for massive-scale batch processing.
  • Applied Generative AI Experience: Proven experience integrating Gen AI / LLMs (e.g., OpenAI APIs, Anthropic, Bedrock) into data products or data engineering workflows. Hands-on experience developing with AI dev tools such as Claude Code.
  • Strong Python Skills: Proficiency in Python specifically to interface with modern AI ecosystems, data APIs, and orchestration tools.
  • Cloud Mastery: Extensive architectural experience within the AWS ecosystem (EMR, Glue, Athena, S3, Bedrock, etc.).
  • Core Data Foundations: Deep understanding of advanced ETL/ELT concepts, complex data modeling, and performance-tuning SQL.
  • Orchestration: Expert-level experience with workflow orchestration tools such as Airflow.
  • Leadership: Proven track record of leading technical initiatives, making architectural decisions, and mentoring teams in an agile, fast-moving environment.

Nice to have:

  • Experience with Snowflake or other modern cloud data warehouses.
  • Deep exposure to streaming or real-time event processing (Kafka, Flink, Kinesis, etc.).
  • Experience utilizing AI for automated data observability, anomaly detection, or data quality tooling.
  • Background in ad tech, measurement, attribution modeling, or specialized analytics platforms.

Why DISQO?

  • Lead the architecture of intelligent data products that directly influence how the world's top brands measure advertising impact.
  • Work with bleeding-edge data and Gen AI infrastructure at a highly meaningful scale.
  • Shape the technical culture and elevate a talented engineering organization while owning massive-scale production systems.

#LI-MV1

At DISQO, we pride ourselves on having a positive, performance-oriented workplace that includes a flexible hybrid approach, competitive medical benefits, and an amazing vacation policy. Read more about our culture on Glassdoor.

You can learn more about what's happening at DISQO by visiting the DISQO Company Blog.

Perks & Benefits:
• 100% covered Medical/Dental/Vision for employee, competitive dependent coverage
• Stock options
• 401K
• Generous PTO policy
• Team offsites, social events & happy hours
• Life Insurance
• Health FSA
• Commuter FSA (for hybrid employees)
• Catered lunch and fully stocked kitchen
• Paid Maternity/Paternity leave
• Disability Insurance
• Travel Assistance Program
• 24/7 Counseling Services offered to Employees

Note: The benefits noted above are for full-time, US-based employees only.

About DISQO

DISQO is a market research company that provides consumer insights to businesses. The company uses a proprietary platform to collect and analyze data from millions of consumers across the United States. DISQO's platform is designed to provide accurate and reliable insights into consumer behavior, preferences, and opinions. The company's clients include some of the world's largest brands and market research firms. DISQO was founded in 2015 and is headquartered in Costa Mesa, California.
Size
200 employees
Founded
2015
