RAG and Evaluation Engineer

LTS

• $90K — $120K *

US-Anywhere, Remote in United States

Information Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

Bachelor's degree in Computer Science, Engineering, Information Science, or related field, plus 4 years of software engineering experience (or equivalent experience).
Experience with production RAG systems and the ability to quantify their quality.
Proven ability to work in fast-paced, collaborative environments.
Expertise in retrieval pipelines including ingestion, chunking, embedding, and reranking.
Strong skills in evaluation, including benchmark design and regression detection.
Understanding of when to use BM25 versus embeddings for retrieval.
Ability to define and measure success metrics independently.

Responsibilities

Own the ingestion pipelines for various types of data, including source code and technical documentation.
Ensure retrieval quality through chunking, embeddings, and reranking techniques.
Manage the evaluation harness to benchmark accuracy and quality of outputs.
Conduct A/B testing and regression detection for product improvements.
Facilitate feedback loops from production usage back to evaluations and retrieval processes.
Define key quality metrics for the platform where no clear benchmarks exist yet.
Collaborate with Agent Engineers on prompt and evaluation iterations.

Benefits

Support high-visibility federal missions in IT and healthcare.
A culture focused on innovation, growth, collaboration, and quality.
Access to cutting-edge tools and technologies.
Comprehensive benefits for employees and their families.
Career paths that reward ambition and performance.

Full Job Description

RAG & Evaluation Engineer

What You'll Do:

The RAG & Evaluation Engineer owns the knowledge surface and the eval harness. That covers ingestion pipelines for source code, structured metadata, technical documentation, patches, and additional corpora the customer provides; retrieval quality across chunking, embeddings, hybrid retrieval, reranking, and freshness; and benchmarks for translation accuracy, dependency-map correctness, and overall agent quality. The feedback loop from production usage back into evals and retrieval also lives here.

Own the knowledge surface - ingestion pipelines for source code, structured metadata, technical documentation, patches, and additional corpora the customer provides.
Own retrieval quality - chunking, embeddings, hybrid retrieval, reranking, and freshness.
Own the eval harness - benchmarks for translation accuracy, dependency-map correctness, and overall agent quality.
Run A/B testing and regression detection across prompts, retrieval, and model changes.
Operate the feedback loop from production usage back into evals and retrieval.
Define what "good" means for the platform when no one else has a clear view, so the team can tell whether the agent is actually improving.
Pair with the Agent Engineers on the prompt-and-eval iteration cycle.

What We're Looking For:

Bachelor's degree in Computer Science, Engineering, Information Science, or a related field, plus 4 years of professional software engineering experience; equivalent experience may substitute for the degree requirement.
Has shipped a production RAG system with quality the candidate can describe in numbers (rigor matters more than scale).
Ability to work in a fast-paced, collaborative environment.
Production experience with retrieval pipelines - ingestion, chunking, embedding, hybrid retrieval, reranking.
Strong applied evaluation skills - benchmark design, regression detection, LLM-as-judge patterns.
Knows when BM25 beats embeddings and when neither is enough.
Measures everything they ship; opinions about chunking are backed by benchmarks.
Patient with detail; comfortable defining metrics before the team has agreed on them.
Heavy native use of AI tooling: agents in parallel, model as collaborator.
Strong TypeScript or Python.
Demonstrated experience in a remote work environment.

Nice to Have:

Code-as-corpus retrieval (search over source code rather than prose).
Applied IR or search-engine background.
Synthetic data generation and LLM-as-judge patterns.
Open-source contributions to retrieval, eval, or RAG tooling.
Experience integrating retrieval feedback loops with production usage.
Healthcare IT or legacy modernization domain experience.
Public technical writing or conference talks on retrieval or evaluation.

What's in it for you?

The opportunity to support high-visibility federal missions in IT and healthcare
A culture that values innovation, growth, collaboration, and quality
Access to cutting-edge tools and technologies
Comprehensive benefits for you and your family
A career path that rewards ambition and performance

If you're ready to push boundaries, sharpen your skills, and join a team that is passionate about building what's next, we'd love to meet you. Apply today and let's build a future together!

LTS shares salary ranges to promote transparency. Compensation ranges are provided for informational purposes, and final compensation may vary based on experience, skills, location, and role requirements.

LTS is committed to offering eligible employees comprehensive benefits that will provide them with options intended to meet their needs and the needs of their family.

* Ladders Estimates
