Job Summary
Design, build, deploy, and optimize enterprise-grade AI systems powered by foundation models, LLMs, retrieval-augmented generation, and agentic AI workflows. The role converts AI concepts into secure, scalable, observable, and supportable production systems suitable for a regulated financial-services environment.
Key Responsibilities
Design and implement LLM-powered applications such as knowledge assistants, document intelligence solutions, workflow agents, summarization tools, and decision-support systems.
Build RAG pipelines using embeddings, chunking strategies, vector databases, semantic retrieval, reranking, response grounding, and citation patterns.
Adapt and optimize models using parameter-efficient fine-tuning (PEFT, e.g. LoRA), instruction tuning, distillation, transfer learning, quantization, and domain adaptation techniques.
Develop scalable APIs, microservices, model-serving components, and integration patterns across cloud, hybrid, or containerized environments.
Optimize inference workloads for latency, throughput, token efficiency, cost, reliability, and user experience.
Implement model and application observability, including prompt logs, retrieval quality, hallucination indicators, drift signals, feedback loops, cost telemetry, and service health.
Embed security, privacy, Responsible AI, and model risk controls into AI application design and delivery.
Create production documentation, runbooks, release notes, test evidence, and audit-ready implementation records.
Required Qualifications
7+ years of experience in AI/ML engineering, platform engineering, software engineering, or applied machine learning.
Hands-on experience with LLMs, transformers, embeddings, RAG, semantic search, and GenAI application patterns.
Strong Python engineering skills with PyTorch, TensorFlow, Hugging Face, LangChain, LlamaIndex, Semantic Kernel, or equivalent frameworks.
Experience deploying production AI services using APIs, containers, Kubernetes, CI/CD, cloud-native services, and monitoring platforms.
Practical knowledge of model evaluation, fine-tuning, inference optimization, and secure data handling.
Preferred Qualifications
Background in banking, risk, compliance, financial crime, operations, or enterprise technology.
Experience with Azure OpenAI, Amazon Bedrock, Vertex AI, Databricks, vLLM, Triton, MLflow, Kubeflow, or model gateways.
Exposure to model risk, AI governance, audit controls, AI cost governance, and private or open-source LLM deployments.