Senior AI/ML Research Engineer (Computer Vision)

Intuitive Surgical, Inc • $130K — $180K *

Sunnyvale, CA 94087 • In-Person

Healthcare

5 - 7 years of experience

Today

Job Overview by Ladders

Qualifications

MS or PhD in Computer Science, Electrical Engineering, Robotics, or related field with 5+ years in applied computer vision research.
Expertise in computer vision and deep learning, including CNNs and vision transformers.
Experience in video understanding, focusing on action segmentation and recognition.
Hands-on familiarity with video architectures and self-supervised pretraining techniques.
Knowledge of vision-action models and frameworks for learning visual representations.
Proficiency in Python and C++, with experience in frameworks like PyTorch, TensorFlow, or JAX.
Ability to collaborate effectively with cross-disciplinary teams and communicate findings clearly.

Responsibilities

Develop temporal models for understanding surgical workflows and actions.
Benchmark in-house models against leading solutions to inform architectural choices.
Define specifications for perception input/output and validate with recorded data.
Establish a continuous-improvement process involving human-in-the-loop methodologies.
Collaborate with data teams on labeling taxonomies and data quality.
Transition models from offline evaluation to real-time application in surgical settings.
Work with AI/ML researchers and robotics teams to enhance the perception layer for rapid prototyping.

Benefits

Comprehensive healthcare coverage including medical, dental, and vision plans.
Retirement plan with company matching contributions.
Stock options allowing employees to share in the company’s success.
Flexible work environment with remote work opportunities.
Generous paid time off plus holidays and paid parental leave.

Full Job Description

Primary Function of Position

We are building advanced augmented dexterity capabilities for next-generation robotic platforms. As a Senior AI/ML Research Engineer (Computer Vision), you will develop the perception models that let our Embodied-AI system understand the surgical scene. Working within a hierarchical, multimodal stack (where a high-level model interprets sensory observations into structured intent and a low-level policy turns that intent into precise, safe, real-time control), you will focus on the vision layer: designing, training, and evaluating models that extract anatomy, instruments, actions, and surgical context from intraoperative video. You will partner with the broader AI/ML team to define how perception feeds reasoning and control, and you will drive the research-to-deployment path for your models, taking them from offline experimentation to robust, real-time performance in the OR.

Working within Intuitive's Future Forward research organization, you will identify, build, and fine-tune the AI/ML models and algorithms that enable us to deliver safe and performant embodied AI systems. This role calls for someone who is equally comfortable getting hands-on with models and data and designing systems that scale.

Roles and Responsibilities

Develop temporal models for activity and workflow understanding: event/state recognition and fine-grained temporal action segmentation.
Benchmark in-house models against the state of the art and recommend the target perception architecture.
Define the perception input/output specification and demonstrate offline feasibility on recorded data.
Stand up a continuous-improvement loop (discrepancy flagging, active learning, human-in-the-loop relabeling) and the tooling/UI needed for offline evaluation and the path to real-time use.
Partner with annotation and data teams to shape label taxonomies, QC, and the data pipeline that feeds the AI/ML models.
Establish the path from offline evaluation on recorded data to real-time integration, including the continuous-improvement (human-in-the-loop) data loop.
Partner with AI/ML researchers, robotics, data engineers, and other stakeholders to deliver a perception layer that enables rapid prototyping and learning while working toward a product solution.

Qualifications

Minimum Qualifications

MS or PhD in CS, EE, Robotics, or a related field, with 5+ years of applied computer-vision research experience.
Strong grasp of modern CV and deep-learning fundamentals: CNNs and vision transformers, segmentation, detection, tracking, and representation/self-supervised learning.
Demonstrated work in video understanding, including temporal action segmentation, action/phase recognition, and video segmentation.
Hands-on experience with modern video architectures, including video transformers and self-supervised video pretraining.
Exposure to vision-action (VA) / vision-language-action (VLA) models and world-model / self-supervised predictive architectures (e.g., JEPA-style models, MAE, DINO) for learning visual representations and dynamics.
Experience working with large, messy, real-world video datasets at scale.
Strong software and experimentation skills in Python and C++, with proficiency in one or more of PyTorch/TensorFlow/JAX, and the ability to stand up clean, reproducible experiments and run the full loop (data curation, augmentation, loss design, metrics, error analysis).
A research-and-prototyping mindset: comfortable working in ambiguity, framing open-ended problems, running rapid experiments, and reading and reproducing recent papers to pull promising techniques into practice.
Sound judgment about the path from prototype to product: writing code others can build on, knowing when to optimize versus when to move fast, and thinking ahead about data quality, evaluation, and robustness even at the research stage.
Solid foundations in linear algebra, probability, and optimization, enough to reason about and debug model behavior from first principles.
Comfort collaborating across a multidisciplinary team (ML, robotics, software, and clinical/domain experts) and communicating tradeoffs and findings clearly.

Preferred Qualifications

Background in healthcare, medical devices, surgical robotics, or other regulated technical domains.
Sim-to-real workflows and experience with robotics simulators (e.g., NVIDIA Isaac).
Experience with structured, ontology- or taxonomy-based labeling frameworks for fine-grained activity.
Multimodal fusion of video with sensor, telemetry, and system-log streams.
Designing annotation pipelines, QC processes, and active-learning loops.
Real-time / edge inference optimization (e.g., TensorRT, NVIDIA Jetson).
Fine-grained interaction and object-relationship modeling.
Relevant peer-reviewed publications (CVPR, ICCV, ECCV, NeurIPS, etc.).

Additional Information

Due to the nature of our business and the role, please note that Intuitive and/or your customer(s) may require that you show current proof of vaccination against certain diseases including COVID-19. Details can vary by role.

This position may be filled at a different job level than listed here depending on business need and/or on the selected candidate's experience, knowledge and skills. Compensation will be based primarily on the job level at which the role is filled and the candidate's qualifications, consistent with applicable law.

We provide market-competitive compensation packages, inclusive of base pay, incentives, benefits, and equity. It would not be typical for someone to be hired at the top end of the range for the role, as actual pay will be determined based on several factors, including experience, skills, and qualifications. The target compensation ranges are listed.

About Intuitive Surgical, Inc

Intuitive Surgical, Inc. is an American corporation that develops, manufactures, and markets robotic products designed to improve clinical outcomes of patients through minimally invasive surgery, most notably with the da Vinci Surgical System. The company is part of the NASDAQ-100 and S&P 500. Intuitive Surgical has installed more than 5,000 surgical systems worldwide, and has more than 4,000 employees.

Size: 9,793 employees
Market Cap: $93.6 billion
Industry: Healthcare
Net Income: $1 billion
Founded: 1999
5 Year Trend: +16.1%
Revenue: $4.3 billion
NASDAQ: ISRG

* Ladders Estimates
