Researcher - Computer Vision and Multimodal Foundation Models

Huawei Technologies Canada Co., Ltd.

$90K - $130K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • Advanced degree in Computer Science, Robotics, or related field with a focus on computer vision and multimodal models (Ph.D. preferred).
  • Deep understanding of spatial reasoning and foundation models such as VLMs, LLMs, and MLLMs.
  • Proven experience applying computer vision algorithms to real-world challenges.
  • Strong programming skills in Python and C++, with hands-on experience using deep learning frameworks like PyTorch and TensorFlow.
  • Exceptional problem-solving and analytical abilities to tackle complex technical issues.
  • Robust publication record in top-tier AI conferences and journals (e.g., CVPR, ICCV, NeurIPS).

Responsibilities

  • Drive research and development in computer vision and multimodal foundation models independently.
  • Collaborate with cross-functional teams to embed computer vision solutions into broader spatial reasoning systems.
  • Mentor junior engineers and interns to create a collaborative and innovative team environment.
  • Develop the company’s intellectual property through patent filings and research publications.
  • Stay updated on the latest advancements in computer vision and machine learning to proactively identify innovative opportunities.
  • Manage projects independently and deliver high-quality results within set timelines.

Full Job Description

Huawei Canada has an immediate permanent opening for a Researcher.

About the team:

Founded in 2012, Noah's Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab's mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance state-of-the-art research while integrating innovations into the company's products and services. Its research areas include LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

Responsibilities:
  • Independently contribute to research and development efforts within the area of computer vision and multimodal foundation models.
  • Collaborate closely with cross-functional teams to integrate computer vision solutions into the broader spatial reasoning system.
  • Mentor and guide junior engineers and interns, fostering a collaborative and innovative team environment.
  • Contribute to the development of our intellectual property portfolio through patent filings and publications.
  • Stay abreast of the latest research and advancements in computer vision and machine learning, and proactively identify opportunities for innovation.
  • Independently manage projects and deliver high-quality results within established timelines.


Qualifications:
  • Advanced degree (Ph.D. preferred) in Computer Science, Robotics, or a related field with a focus on computer vision and multimodal foundation models.
  • Deep understanding of spatial reasoning and foundation models such as VLMs, LLMs, and MLLMs.
  • Proven expertise in applying computer vision algorithms to real-world applications.
  • Strong programming skills in Python and C++, with experience using deep learning frameworks (PyTorch, TensorFlow).
  • Excellent problem-solving and analytical skills, with the ability to tackle complex technical challenges.
  • A strong publication record in top-tier AI conferences and journals (e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI).
