Backend ML Engineer

Sterling Computers Corporation

• $90K — $120K *

North Sioux City, SD 57049 (In-Person)

Information Technology

Less than 5 years of experience

2 weeks ago

Job Overview by Ladders

Qualifications

3-5 years of backend or ML engineering experience.
Proficient in Python with experience using FastAPI or Flask.
Hands-on knowledge of popular ML libraries like PyTorch and Hugging Face.
Familiar with cloud platforms such as AWS, GCP, or Azure.
Experience integrating large language models into production environments.
Knowledge of vector databases for enhanced ML retrieval functionalities.
Experience with retrieval-augmented generation (RAG) techniques.

Responsibilities

Build, test, and maintain production ML services including inference APIs and retrieval pipelines.
Design scalable RESTful and streaming APIs to serve ML model outputs efficiently.
Integrate and fine-tune LLMs and embedding models while evaluating options for cost and latency.
Develop ingestion pipelines for unstructured data and manage vector store schemas.
Implement evaluation harnesses to measure and improve retrieval quality and answer correctness.
Containerize and deploy ML workloads using Docker and Kubernetes, while managing resource allocation.
Collaborate with cross-functional teams to translate ML capabilities into product features.

Benefits

Work in a collaborative and innovative environment focused on AI/ML development.
Opportunities for professional development in cutting-edge technology.
Engage directly with real users and client-focused projects.
Potential for significant impact by shipping AI features.
Flexible work arrangements with a focus on work-life balance.

Full Job Description

Title: Backend ML Engineer

Reports to: Senior Software Architect

Location: North Sioux City, SD

Job Description: We are looking for a Backend ML Engineer who is interested in taking AI/ML systems from prototype to production, designing inference APIs, building retrieval and orchestration pipelines, integrating large language models, and operating ML infrastructure at scale. If you thrive in a collaborative, client-focused environment and enjoy shipping AI features that real users depend on, we'd love to have you on our team.

Required Technical Skills:

3-5 years of experience in backend or ML engineering
Strong working knowledge of Python, including FastAPI or Flask
Experience with modern ML libraries such as PyTorch, Hugging Face Transformers, and sentence-transformers
Proficiency with cloud platforms including AWS, GCP, or Azure
Hands-on experience integrating LLMs (OpenAI, Anthropic, Gemini, or open-source models) into production systems
Familiarity with vector databases such as Weaviate, pgvector, Pinecone, or similar
Experience with retrieval-augmented generation (RAG) patterns
Self-motivated with a positive and professional attitude
Knowledge of additional languages or runtimes, such as JavaScript (Node.js), is a plus

Required Education/Experience:

Bachelor's degree in Computer Science, Machine Learning, or a related field (minimum requirement), or equivalent practical experience
Graduate-level coursework or specialization in ML/AI is a plus
Relevant cloud certifications are a plus
Demonstrated experience shipping ML systems to production is a plus
US DoD clearance preferred, or willingness to obtain one

Qualifications:

Strong experience building backend services with Python (FastAPI/Flask); comfort working with async APIs and request/response patterns for ML inference workloads.
Hands-on experience integrating LLMs and embedding models into production applications, including prompt engineering, context management, and handling rate limits, retries, and streaming responses.
Familiarity with RAG architectures: chunking strategies, embedding pipelines, vector search, reranking, and evaluation metrics (Recall[redacted], MRR, faithfulness, answer relevance).
Experience with vector databases (Weaviate, pgvector, Pinecone, Qdrant, or similar) and traditional databases (PostgreSQL, MariaDB) for hybrid retrieval and metadata filtering.
Cloud experience (AWS/GCP/Azure) for deploying ML services, including managed inference endpoints, GPU instances, or serverless model hosting.
Strong understanding of API authentication, secure handling of model inputs/outputs, and PII/PHI-aware design where applicable.
Experience with ML observability: tracking latency, token usage, cost-per-query, retrieval quality, and model drift in production.
Background in data pipelines, document ingestion/parsing (Docling), or evaluation frameworks (Ragas, TruLens, custom harnesses) is needed.
Familiarity with fine-tuning, LoRA/PEFT, or model distillation is appreciated.
Experience with MLOps tooling (MLflow, Weights & Biases, Kubeflow) or LLM orchestration frameworks (LangChain, LlamaIndex, Haystack, or custom orchestrators) is a plus.

Responsibilities:

Build, test, and maintain production ML services: inference APIs, retrieval pipelines, orchestration layers, and guardrail/evaluation components.
Design scalable RESTful and streaming APIs that serve ML model outputs reliably under real-world load.
Integrate and tune LLMs, embedding models, and rerankers; evaluate trade-offs across hosted (Anthropic, OpenAI, Vertex) and self-hosted (Hugging Face, vLLM) options on cost, latency, and quality.
Build ingestion and chunking pipelines for unstructured data (PDFs, HTML, transcripts) and maintain vector store schemas for multi-tenant or multi-domain retrieval.
Implement evaluation harnesses to measure retrieval quality, generation faithfulness, and end-to-end answer correctness; close the loop from evals back into pipeline improvements.
Containerize and deploy ML workloads with Docker and Kubernetes; manage GPU/CPU resource allocation and model versioning.
Optimize database queries, vector search performance, and caching strategies (including LLM prompt caching) to reduce latency and cost.
Implement CI/CD pipelines for ML services and instrument monitoring for both system metrics (latency, error rate) and ML-specific metrics (retrieval quality, hallucination rate, drift).
Collaborate with frontend engineers, ML researchers, and product analysts to translate model capabilities into shipped features.
Document backend and ML infrastructure, including model cards, evaluation results, and architectural decisions.
Travel: must be willing to travel 25% of the time, and periodically up to 50%.

* Ladders Estimates
