Abstract
Localizing objects in image collections without supervision can help to avoid expensive annotation campaigns. We propose a simple approach to this problem that leverages the activation features of a vision transformer pre-trained in a self-supervised manner. Our method, LOST, does not require any external object proposal nor any exploration of the image collection; it operates on a single image. Yet, we outperform state-of-the-art object discovery methods by up to 8 CorLoc points on PASCAL VOC 2012. We also show that training a class-agnostic detector on the discovered objects boosts results by another 7 points. Moreover, we show promising results on the unsupervised object discovery task. The code can be found at https://github.com/valeoai/LOST.
| Original language | English |
|---|---|
| Publication status | Published - 1 Jan 2021 |
| Externally published | Yes |
| Event | 32nd British Machine Vision Conference, BMVC 2021 - Virtual, Online. Duration: 22 Nov 2021 → 25 Nov 2021 |
Conference
| Conference | 32nd British Machine Vision Conference, BMVC 2021 |
|---|---|
| City | Virtual, Online |
| Period | 22/11/21 → 25/11/21 |
Fingerprint
Dive into the research topics of 'Localizing Objects with Self-Supervised Transformers and no Labels'. Together they form a unique fingerprint.