Job Summary
We are looking for a Lead Engineer to own the architecture and delivery of a production-grade retrieval-augmented generation (RAG) platform. The platform will extract and unify data from AWS Aurora MySQL and AWS DocumentDB, index knowledge using MongoDB Atlas Vector Search, and generate grounded responses using OpenAI models. The Lead Engineer will drive technical design, implementation, quality, and mentoring, and will establish best practices for tool integration and safe context retrieval.
Key Responsibilities
- Own the end-to-end RAG architecture, covering ingestion, indexing, retrieval, and LLM orchestration.
- Lead implementation in Python, including hands-on coding, design reviews, and establishing standards.
- Build and optimize data extraction pipelines from AWS Aurora MySQL (SQL optimization, indexing, incremental loads/CDC-style approaches) and AWS DocumentDB (aggregation pipelines, indexing, incremental sync patterns).
- Design chunking/metadata strategy and indexing into MongoDB Atlas Vector Search, including schema design, versioning, upserts/deletes, and re-embedding strategy.
- Implement vector search with metadata filters (tenant, source, permissions, time range).
- Integrate OpenAI for embeddings and generation, focusing on prompting for grounded responses, structured outputs, and citations/source tracking.
- Ensure reliability through retries, batching, caching, rate-limit handling, and cost/performance controls.
- Implement patterns for safe retrieval and extensible tooling, including tool schemas, permissions, audit logs, and deterministic tool behavior.
- Establish production readiness with observability (logs/metrics/tracing), alerting, incident response runbooks, and testing strategy (unit/integration).
- Implement CI/CD and secure secret management.
- Provide technical leadership, mentor engineers, set coding standards, and conduct design/code reviews.
- Coordinate with product, data, and security stakeholders and manage the technical roadmap.
Required Qualifications
- 8+ years of hands-on Python backend or data engineering experience (or equivalent depth).
- Prior experience as a Lead/Tech Lead owning architecture and delivery of production systems.
- Proven experience shipping RAG/LLM applications in production.
- Strong experience with OpenAI APIs (embeddings, chat, tool/function calling patterns).
- Hands-on experience with MongoDB Atlas Vector Search: index creation/tuning, schema design, metadata filtering, multi-tenant patterns.
- Strong integration experience with Aurora MySQL (performance tuning, incremental sync) and DocumentDB (queries/aggregations, indexing, performance).
- Experience with API/service development (FastAPI preferred) and background jobs (Celery/RQ/SQS/Kafka).
- Experience with engineering excellence practices: Docker, CI/CD, pytest, secure secrets, monitoring, performance tuning.
Preferred Qualifications
- Hybrid retrieval (keyword + vector), reranking, and evaluation frameworks (RAGAS/TruLens).
- Document parsing/OCR, table extraction, knowledge graph/entity resolution.
- AWS deployment experience (ECS/EKS/Lambda), VPC networking, IAM best practices.
Certifications