Description
The Data Science and Assessments (DS&A) team within People, Purpose, & Brand (PP&B) is hiring a Director I, Data Scientist (STP) to serve as a hands-on technical leader for GenAI evaluation, enablement, and responsible AI adoption. PP&B develops programs across talent acquisition, workforce planning, performance & rewards, employer branding, and wellbeing to build a diverse, future-ready workforce.
In this role, you will shape how DS&A generates, evaluates, and scales ML and GenAI-based programs and tooling to improve productivity, decision-making, and employee experience across PP&B. Core responsibilities span three areas:
- GenAI Enablement: Design, build, and maintain evaluation frameworks and pipelines that accelerate the creation, iteration, and safe expansion of GenAI capabilities, including agentic workflows, data readiness, and post-deployment monitoring.
- Vendor & AI/ML Risk Management: Define and execute a vendor AI evaluation vision that enables repeatable vendor selection, value validation, quality monitoring, and responsible model risk management.
- Broader Data Science: Support classic ML, automation, and technical consulting to empower PP&B colleagues and drive enterprise-wide outcomes.
This individual contributor role reports through the Office of Data & Data Science. The ideal candidate is proactive, highly technical, and collaborative, and can translate complex tradeoffs into clear, actionable recommendations that drive meaningful impact.
Key Responsibilities
- Architect, develop, and maintain tooling and pipelines to support GenAI model development, evaluation, deployment, and monitoring across PP&B programs.
- Design and operationalize scalable evaluation frameworks and metrics for GenAI systems (including automated and human-in-the-loop evaluations) to ensure quality, safety, and organizational alignment.
- Lead the vendor AI evaluation program: define criteria, run benchmarks and pilots, synthesize results, and provide clear recommendations for vendor selection and integration.
- Build reusable components leveraging APIs and templates that enable rapid iteration and reliable deployment of GenAI features.
- Partner with stakeholders across PP&B to translate business needs into technical designs, evaluation plans, and implementation roadmaps.
- Promote strong engineering hygiene across projects (CI/CD, version control, testing, documentation, reproducible pipelines).
- Provide informal mentorship for data science and analytics colleagues on tools and evaluation processes.
Qualifications
- Strong foundation in Data Science principles (Probability, Statistics, AI/ML).
- Experience with LLMs, embeddings, and generative/agentic systems.
- Proficient in Python; comfortable writing production-quality code.
- Strong SQL skills for querying, validation, and data exploration.
- Experience with APIs and integrating external model or vendor services.
- Familiarity with cloud computing concepts and services.
- Experience evaluating models for fairness, bias, privacy, explainability, or other responsible AI criteria.
- Comfortable with Git-based version control and collaborative code review.
- Competencies typically acquired through a Ph.D. (in Statistics, Mathematics, Economics, Actuarial Science, or another scientific field of study) and a minimum of 3 years of relevant experience; a Master's degree and a minimum of 6 years of relevant experience; or a Bachelor's degree and a minimum of 8 years of relevant experience.
What We Value
- A collaborative, customer-focused mindset oriented toward pragmatic, high-impact solutions and enabling others.
- Strong project management skills: building clear plans, coordinating across stakeholders, and driving accountability.
- Clear communication of technical tradeoffs, evaluation results, and recommendations to both technical and non-technical audiences.
- Product-minded engineering: building systems that are maintainable, scalable, and easy to adopt.
- A bias toward action combined with rigorous attention to evaluation and safety.
Work Arrangement & Travel
- Telecommuting up to 60% of the time when located near an office; fully remote work is supported otherwise.
- Travel up to 10% for team meetings, planning sessions, or stakeholder engagements.