Computer Vision Engineer

Job Summary
We are seeking a Computer Vision Engineer to design, develop, and deploy AI solutions that analyze images and videos. The ideal candidate has experience with deep learning, image processing, and computer vision algorithms, and can build scalable solutions for applications such as object detection, image classification, facial recognition, OCR, pose estimation, and video analytics.

Key Responsibilities
- Design and develop computer vision and deep learning models.
- Build image and video processing pipelines.
- Train, fine-tune, and evaluate computer vision models.
- Develop applications for:
  - Image classification
  - Object detection
  - Object tracking
  - Image segmentation
  - OCR (Optical Character Recognition)
  - Face detection and recognition
  - Pose estimation
  - Video analytics
- Optimize models for speed, accuracy, and deployment.
- Deploy models using APIs or edge devices.
- Collaborate with data scientists, software engineers, and product teams.
- Monitor model performance and improve production systems.

Required Qualifications
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Electronics, Robotics, or a related field.
- Strong programming skills in Python.
- Solid understanding of machine learning and deep learning.
- Experience working with image and video datasets.
- Knowledge of software engineering best practices.

Required Technical Skills

Programming
- Python
- C++ (preferred)
- SQL
- Git

Computer Vision Libraries
- OpenCV
- Pillow
- scikit-image
- Albumentations

Deep Learning Frameworks

Computer Vision Models
- CNNs
- ResNet
- EfficientNet
- Vision Transformer (ViT)
- YOLO
- Faster R-CNN
- Mask R-CNN
- U-Net
- SSD

AI & ML Concepts
- Image Classification
- Object Detection
- Image Segmentation
- Feature Extraction
- Transfer Learning
- Model Optimization
- Hyperparameter Tuning

Deployment
- FastAPI or Flask
- Docker
- Kubernetes (preferred)
- ONNX
- TensorRT
- NVIDIA CUDA
- Edge AI deployment

Cloud Platforms
- AWS
- Microsoft Azure
- Google Cloud Platform (GCP)

Preferred Qualifications
- Experience with real-time video analytics.
- Knowledge of OCR and document AI.
- Experience with edge devices such as NVIDIA Jetson.
- Familiarity with 3D vision, LiDAR, or depth cameras.
- Understanding of MLOps and CI/CD pipelines.
- Experience working with large-scale datasets such as COCO or ImageNet.

Soft Skills
- Strong analytical and problem-solving abilities.
- Good communication skills.
- Team collaboration.
- Attention to detail.
- Ability to work in an Agile environment.

Nice-to-Have Skills
- Vision-Language Models (VLMs)
- Multimodal AI
- Generative AI for images
- Image captioning
- Diffusion models
- Stable Diffusion
- Retrieval-Augmented Generation (RAG) with image embeddings
- LangChain or similar orchestration frameworks
- Vector databases (e.g., FAISS, Pinecone)

Common Tools
- OpenCV
- Label Studio
- CVAT
- Roboflow
- Weights & Biases
- MLflow
- Docker
- GitHub Actions

Common Interview Topics
- Image processing fundamentals
- Convolutional Neural Networks (CNNs)
- Transfer learning
- YOLO architecture
- Vision Transformers (ViTs)
- Object detection metrics (IoU, mAP)
- Image segmentation metrics (Dice coefficient, IoU)
- OCR concepts
- OpenCV coding
- Python programming
- PyTorch or TensorFlow implementation
- Model deployment and optimization
- REST API development
- SQL basics
- Computer vision system design
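
For candidates preparing for the metrics topics listed above, the two overlap measures named there (IoU for boxes, Dice coefficient for masks) can be sketched in a few lines of plain Python. This is an illustrative sketch only, not a required implementation: the function names `box_iou` and `dice_coefficient`, the `(x1, y1, x2, y2)` box layout, and the flat 0/1 mask representation are assumptions made for the example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes, each given as (x1, y1, x2, y2).

    The corner layout is an assumption for this sketch; some datasets
    store boxes as (x, y, width, height) instead.
    """
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def dice_coefficient(pred, target):
    """Dice coefficient of two binary masks given as flat sequences of 0/1."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks are treated as a perfect match (a common convention,
    # chosen here as an assumption).
    return 2.0 * inter / total if total > 0 else 1.0


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

For a single pair of binary masks, Dice and IoU are related by Dice = 2·IoU / (1 + IoU), so the two rank predictions the same way. mAP builds on box IoU by thresholding it to decide which detections count as true positives before averaging precision over recall levels and classes.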

Experience Levels

Junior (0-2 years):
- Knowledge of OpenCV and CNNs
- Basic deep learning projects
- Python proficiency

Mid-Level (2-5 years):
- Production experience with object detection and segmentation
- Model deployment
- Performance optimization
- Cloud platform experience

Senior (5+ years):
- End-to-end computer vision system architecture
- MLOps and scalable deployment
- Team leadership and mentoring
- Advanced optimization for edge and cloud environments