Data Scientist

Peraton

$80K — $128K *
US-Anywhere, Remote in United States
Healthcare
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Associate's degree with 6 years of experience, Bachelor's degree with 4+ years, or Master's degree with 2+ years in relevant fields; High School diploma with 8 years of experience is also acceptable.
  • Proficient in SQL and Python, with hands-on experience using ML frameworks such as scikit-learn, XGBoost, PyTorch, or TensorFlow.
  • Experience with MLflow or similar tools for experiment tracking and lifecycle management.
  • Understanding of SDLC fundamentals and experience with version control systems like GitHub.
  • Experience in distributed computing environments (e.g., Spark, Databricks) and cloud-native services.
  • Basic knowledge of Bash or shell scripting for automation purposes.
  • Ability to communicate technical concepts to diverse audiences and collaborate across teams.
  • Must be able to obtain and maintain a Public Trust clearance.

Responsibilities

  • Develop, train, and evaluate various ML models and contribute to LLM-based capabilities.
  • Support model governance and deployment practices with MLflow, including tracking and versioning.
  • Contribute to production ML operations by monitoring model performance and detecting drift.
  • Enhance model serving infrastructure and lifecycle automation for scalable development and inference.
  • Apply explainability techniques and produce documentation for stakeholder transparency.
  • Assist in data ingestion and ELT/ETL transformation using Spark and SQL within Snowflake and Databricks.
  • Support pipeline orchestration and data stewardship practices including metadata management.
  • Coordinate with platform teams on system administration tasks and environment setup.

Benefits

  • Comprehensive health benefits package including medical, dental, and vision coverage.
  • Flexible work schedule and remote work options.
  • Professional development opportunities and support for continuing education.
  • Generous paid time off (PTO) policy, including holidays and parental leave.
  • Retirement savings plan with company matching contributions.
Full Job Description
Responsibilities

We are looking for a Data Scientist to contribute across the full ML development lifecycle — from model building and experimentation to production deployment and monitoring. Core responsibilities are in applied data science and MLOps, with secondary contributions to data engineering and light platform operations. This role works within established platform patterns alongside dedicated infrastructure engineers, without requiring their involvement for routine ML and data tasks. All work is performed in a HIPAA-governed, FedRAMP-compliant healthcare analytics environment.

 

What you'll do:

  • Develop, train, and evaluate ML models (classification, regression, clustering, anomaly detection) and contribute to LLM-based capabilities such as RAG pipelines and prompt evaluation.
  • Support model governance and deployment practices using MLflow, including experiment tracking, model versioning, registry promotion workflows, and automated testing across the ML lifecycle.
  • Contribute to production ML operations: model performance monitoring, drift detection, automated alerting, and incident escalation to maintain reliability and SLA compliance.
  • Build and improve model serving infrastructure, feature pipelines, and lifecycle automation to support reproducible, scalable model development and inference.
  • Apply explainability techniques (e.g., SHAP, LIME) and produce technical documentation to support stakeholder transparency and compliance requirements.
  • Contribute to data ingestion, ELT/ETL transformation, and pipeline reliability using Spark and SQL-based frameworks within Snowflake and Databricks environments.
  • Support pipeline orchestration, medallion architecture conventions, and data stewardship practices (metadata management, PII handling, lineage tracking in Unity Catalog).
  • Perform occasional system administration tasks in collaboration with platform teams, including environment configuration, access management, compute troubleshooting, and secrets handling using platform-native tools.
Qualifications

 

Basic Qualifications:

  • Associate's degree with 6 years of experience, Bachelor's degree with 4+ years of relevant experience, Master's degree with 2+ years of relevant experience, or High School diploma with 8 years of experience.
  • Demonstrated experience with SQL and Python, including Python-based ML frameworks (e.g., scikit-learn, XGBoost, PyTorch, or TensorFlow).
  • Hands-on experience with MLflow or equivalent tools for experiment tracking, model governance, and lifecycle management.
  • Strong understanding of SDLC fundamentals and experience with GitHub or equivalent version control.
  • Experience with distributed compute environments (e.g., Spark, Databricks) and cloud-native services.
  • Basic proficiency with Bash or shell scripting for automation and environment setup.
  • Ability to collaborate across multidisciplinary teams and communicate technical concepts to varied audiences.
  • Ability to obtain and maintain a Public Trust clearance.
  • US citizenship required, or must be a Green Card holder and have been in the USA for 3 of the last 5 years.

Preferred Qualifications:

  • Experience with MLOps practices including CI/CD for ML, containerization, feature pipeline automation, and model deployment frameworks.
  • Experience with Databricks E2 components (Unity Catalog, Feature Store, Delta Live Tables) and/or model serving and drift monitoring tools (e.g., Databricks Model Serving, Evidently).
  • Experience with LLM frameworks (e.g., LangChain, LlamaIndex, Hugging Face Transformers) and familiarity with model explainability libraries (e.g., SHAP, LIME).
  • Advanced Spark performance optimization experience and/or API development using Databricks REST APIs.
  • Experience with healthcare analytics data (preferably Medicare or Medicaid) and familiarity with HIPAA or FedRAMP compliance constraints.
  • Experience building data pipelines in a Snowflake or Databricks environment.
  • Familiarity with orchestration tools (Airflow, Databricks Workflows).
  • Exposure to streaming data patterns using Spark Structured Streaming, Delta Live Tables, or Kafka.
  • Familiarity with environment reproducibility tooling (Docker, conda) and scripting (Python, Bash) to support automation and CI/CD tasks.

Target Salary Range: $80,000 - $128,000. This represents the typical salary range for this position. Salary is determined by various factors, including, but not limited to, the scope and responsibilities of the position, the individual’s experience, education, knowledge, skills, and competencies, as well as geographic location and business and contract considerations. Depending on the position, employees may be eligible for overtime, shift differential, and a discretionary bonus in addition to base pay.
