NOPE: Novel Object Pose Estimation from a Single Image

  • Van Nguyen Nguyen
  • Thibault Groueix
  • Georgy Ponimatkin
  • Yinlin Hu
  • Renaud Marlet
  • Mathieu Salzmann
  • Vincent Lepetit

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The practicality of 3D object pose estimation remains limited for many applications due to the need for prior knowledge of a 3D model and a training period for new objects. To address this limitation, we propose an approach that takes a single image of a new object as input and predicts the relative pose of this object in new images without prior knowledge of the object's 3D model and without requiring training time for new objects and categories. We achieve this by training a model to directly predict discriminative embeddings for viewpoints surrounding the object. This prediction is done using a simple U-Net architecture with attention and conditioned on the desired pose, which yields extremely fast inference. We compare our approach to state-of-the-art methods and show it outperforms them both in terms of accuracy and robustness.

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Publisher: IEEE Computer Society
Pages: 17923-17932
Number of pages: 10
ISBN (Electronic): 9798350353006
ISBN (Print): 9798350353006
DOIs
Publication status: Published - 1 Jan 2024
Externally published: Yes
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States
Duration: 16 Jun 2024 to 22 Jun 2024

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Country/Territory: United States
City: Seattle
Period: 16/06/24 to 22/06/24

Keywords

  • object pose estimation
