MLOps Engineer
Job Summary
We are looking for an MLOps Engineer to build, deploy, monitor, and maintain machine learning systems in production. The ideal candidate will work closely with data scientists, software engineers, and DevOps teams to automate the ML lifecycle and ensure scalable, reliable, and secure ML applications.
Key Responsibilities
- Design and implement end-to-end ML pipelines.
- Deploy machine learning models to production environments.
- Build CI/CD pipelines for ML workflows.
- Automate model training, testing, deployment, and monitoring.
- Manage model versioning and experiment tracking.
- Monitor model performance, drift, and system health.
- Optimize infrastructure for scalable ML workloads.
- Collaborate with data scientists and software engineers.
- Implement security, governance, and compliance best practices.
- Troubleshoot production ML issues.
Required Skills
Machine Learning
- ML model lifecycle
- Model deployment
- Model monitoring
- Feature engineering basics
- Model versioning
Programming
- Python
- SQL
- Bash scripting
Cloud Platforms
- AWS
- Azure
- Google Cloud Platform (GCP)
MLOps Tools
- MLflow
- Kubeflow
- Airflow
- DVC
- Weights & Biases (optional)
DevOps
- Docker
- Kubernetes
- Git
- Jenkins / GitHub Actions / GitLab CI
Data Engineering
- Apache Spark
- Kafka
- ETL pipelines
Monitoring
- Prometheus
- Grafana
- ELK Stack
Databases
Qualifications
- Bachelor's or Master's degree in Computer Science, Data Science, AI, or a related field.
- 2-6+ years of experience in MLOps, DevOps, Data Engineering, or Machine Learning Engineering.
- Experience deploying ML models in cloud environments.
- Strong understanding of software engineering best practices.
Preferred Qualifications
- Experience with Large Language Models (LLMs).
- Experience with Retrieval-Augmented Generation (RAG).
- Knowledge of vector databases.
- Familiarity with Infrastructure as Code (Terraform or CloudFormation).
- Experience with distributed computing frameworks.
Nice-to-Have Skills
- Generative AI
- LangChain or similar orchestration frameworks
- FastAPI
- REST APIs
- Linux administration
- Terraform
- Helm
- Argo CD
- Ray
- Feature Stores (e.g., Feast)
Example Tech Stack
- Languages: Python, SQL, Bash
- Cloud: AWS, Azure, GCP
- Containers: Docker, Kubernetes
- CI/CD: GitHub Actions, Jenkins
- ML: MLflow, Kubeflow, Airflow
- Monitoring: Prometheus, Grafana
- Data: Spark, Kafka
- APIs: FastAPI
- Version Control: Git
Common Interview Topics
- End-to-end ML pipeline design
- CI/CD for machine learning
- Docker and Kubernetes
- Model serving and deployment
- Model monitoring and drift detection