Junior AI/ML Engineer
Location: Herndon, VA (Hybrid Work)
Preferred: US Citizenship
Key Responsibilities:
• Support data preprocessing and feature engineering pipelines under senior engineer direction: clean, normalize, and validate HRSA fraud-related datasets; handle class imbalance preparation (SMOTE, undersampling) and train/validation/test split management.
• Assist in the development, training, and evaluation of supervised fraud classification models; compute and document standard evaluation metrics (accuracy, precision, recall, F1 score, AUC-ROC, confusion matrices) for government review in EPLC-required model evaluation reports.
• Maintain and monitor ML experiment tracking using MLflow or equivalent tooling approved for the IRMS environment; log hyperparameter configurations, training runs, and evaluation results with full reproducibility documentation.
• Support model drift detection and retraining pipelines: run scheduled evaluation jobs, flag performance degradation against established baselines, and escalate findings to the AI/ML Lead Engineer and Fraud AI/ML SME.
• Assist the NLP/NER pipeline team (Rohit) with data transformation tasks: format-convert NER pipeline outputs into feature-compatible schemas for downstream ML models; validate entity extraction quality against labeled reference sets.
• Develop and maintain Jupyter notebook-based model exploration and reporting artifacts for use in EPLC deliverables, sprint reviews, and government demonstrations.
• Support UiPath Maestro agent integration testing: prepare model inference payloads, validate agent input/output schemas, and assist with integration testing between ML model inference APIs and the persona-based agent layer.
• Implement and maintain data pipeline scripts (Python/Pandas/NumPy) for batch data ingestion, feature store updates, and model scoring batch runs within the IRMS security boundary.
• Follow and enforce IRMS boundary data handling procedures: ensure no PII/PHI is processed outside approved environments; maintain developer/test environment segregation per HHS security policy.
• Produce supporting artifacts for EPLC deliverables: training data specifications, model evaluation appendices, data dictionary updates, and sprint retrospective documentation as directed by the PM and AI/ML Lead.
• Participate in code reviews; adhere to OWASP secure coding standards, NIST SP 800-160 engineering principles, and Node's internal CI/CD quality gates.
Requirements
Required Skills:
• Bachelor's degree in Computer Science, Data Science, Mathematics, Statistics, or a closely related field; recent graduates with strong applied ML coursework or project portfolios will be considered.
• 1-3 years of hands-on experience (including internships, graduate research, or project work) in machine learning, data science, or data engineering with Python.
• Proficiency in Python ML stack: scikit-learn, Pandas, NumPy; familiarity with at least one deep learning framework (TensorFlow or PyTorch) for model evaluation and inference tasks.
• Demonstrated experience with standard ML evaluation workflows: train/validation/test split design, cross-validation, metric computation, and results documentation.
• Experience with Jupyter notebooks for data exploration, model evaluation, and technical reporting.
• Familiarity with Git-based version control and CI/CD principles; ability to work within a structured sprint cadence with documented deliverable commitments.
• Demonstrated ability to handle sensitive data responsibly; understanding of data governance, access control, and the importance of environment segregation in a regulated or government setting.
• Strong written communication skills: ability to produce clear, organized technical documentation suitable for government review.
Benefits
- Medical
- Dental
- Vision
- Basic Life
- Health Savings Account
- 401(k) Matching
- Three weeks of PTO/Sick
- 11 Paid Holidays
- Pre-Approved Online Training