About the Team
We are a newly formed, forward-looking Cybersecurity Data Engineering & Enablement Team driving the future of our enterprise defense strategy. Our mission is to build a next-generation, centralized data lakehouse that unifies all security telemetry into a single, high-performance ecosystem. Operating across two specialized verticals, Data Engineering (ingestion, enrichment, and semantic layers) and Data Platform (foundational infrastructure, security architecture, and AI enablement), we are designing a scalable, cloud-native foundation from the ground up. By combining cutting-edge data architecture with advanced analytics, we empower our threat hunters, data scientists, and incident responders with the real-time, trusted intelligence needed to protect the enterprise at scale.
About the Role
We are seeking a highly specialized Senior Data Engineer - Cybersecurity to serve as the Subject Matter Expert (SME) for AI/ML and Platform Integration. This critical role sits at the intersection of core data platform infrastructure, advanced analytics, and external system integrations. Your primary mission is to optimize our data platform to serve as a high-performance engine for Data Science, Machine Learning (ML), and Generative AI (GenAI) workloads.
Additionally, you will own the integration fabric of the platform: building the robust APIs, webhook ingestion engines, and data connectors that seamlessly sync our central lakehouse with downstream business applications, SaaS platforms, and third-party ecosystems.
Key Responsibilities
- AI/ML Data Infrastructure & Tooling: Design, provision, and maintain the platform infrastructure required for end-to-end machine learning lifecycles. Optimize the platform for distributed training, model evaluation, and batch/real-time inference.
- Enterprise Feature Store Architecture: Design and manage the enterprise Feature Store. Ensure consistent, low-latency feature delivery, preventing data leakage between training pipelines and real-time production inference.
- Vector Infrastructure for GenAI: Architect and maintain vector databases and indexing pipelines required to support Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) patterns, and semantic search.
- Platform Integration & API Management: Serve as the SME for how external applications interact with the data lakehouse. Design, build, and secure high-throughput APIs, data connectors, and reverse-ETL patterns to sync data back into business systems (e.g., CRMs, ERPs, marketing automation).
- MLOps Collaboration & Automation: Partner closely with Data Scientists and MLOps teams to establish CI/CD automation for ML. Transition experimental, unoptimized data science notebooks into resilient, production-grade automated workflows.
- Compute Optimization for Data Science: Configure and optimize compute engines tailored for heavy mathematical and data science workloads (e.g., Ray, Spark/EMR GPU instances).
About You
Basic Qualifications
- Experience: 5+ years of data engineering experience, with at least 2 years dedicated to supporting machine learning platforms, MLOps, or complex platform integrations.
- ML Data Stack: Deep hands-on experience with AWS SageMaker, MLflow, or equivalent cloud-native ML platforms.
- Feature Stores & Vector DBs: Proven experience implementing feature store frameworks (e.g., Feast, SageMaker Feature Store) and vector databases (e.g., Pinecone, Milvus, Qdrant, or pgvector).
- Distributed Compute & ML Libraries: Strong experience using Apache Spark / AWS EMR, Ray, or Dask to process massive datasets for feature extraction and model preparation.
- Integration Patterns: Expert knowledge of building REST APIs and webhooks, and of using streaming tools (e.g., AWS Kinesis, Kafka) for real-time integration.
- Languages & CI/CD: Advanced proficiency in Python (including ML ecosystems like Pandas, NumPy, Scikit-Learn) and SQL. Extensive experience with GitHub Actions, GitLab CI, or Jenkins for data/ML pipelines.
Other Qualifications
- Experience deploying and fine-tuning open-source LLMs or orchestrating AI agents using frameworks like LangChain or LlamaIndex.
- Experience with reverse-ETL tools (e.g., Census, Hightouch) or enterprise integration platforms.
Workday Pay Transparency Statement
The annualized base salary ranges for the primary location and any additional locations are listed below. Workday pay ranges vary based on work location. As a part of the total compensation package, this role may be eligible for the Workday Bonus Plan or a role-specific commission/bonus, as well as annual refresh stock grants. Recruiters can share more detail during the hiring process. Each candidate's compensation offer will be based on multiple factors including, but not limited to, geography, experience, skills, job duties, and business need, among other things.
Primary Location: USA.VA.Reston
Primary Location Base Pay Range: $159,600 USD - $239,400 USD
Additional US Location(s) Base Pay Range: $144,400 USD - $258,000 USD
Our Approach to Flexible Work
With Flex Work, we're combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional to make the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.
Pursuant to applicable Fair Chance law, Workday will consider for employment qualified applicants with arrest and conviction records.