One point is all you need for weakly supervised object detection

Shiwei Zhang, Zhengzheng Wang, Wei Ke

Research output: Contribution to journal, Article, peer-review

Abstract

Object detection with weak annotations has attracted much attention recently. Weakly supervised object detection (WSOD) methods, which train a detector using only image-level labels, suffer from two severe problems: their detections often fail to cover the whole object, and their region proposal methods consume a large amount of time. Meanwhile, point supervised object detection (PSOD) leverages point annotations, which remarkably improves performance. However, point annotation is still complex and increases the cost of annotation. To overcome these issues, we propose a novel method that requires only one point per category for each training image. Compared to point annotation, our method significantly reduces the annotation cost because far fewer points need to be annotated. We design a framework to train a detector with one-point-per-category annotation. First, a pseudo box generation module is introduced to generate the pseudo boxes corresponding to the annotated points. Then, inspired by the observation that the features of objects of the same class in an image are very similar, a dense instances mining module is proposed that uses this feature similarity to discover unlabeled instances and generate pseudo category heatmaps. Finally, the pseudo boxes and pseudo category heatmaps are used to train a detector. Experiments conducted on popular open-source datasets verify the effectiveness of our annotation method and framework. Our proposed method outperforms previous WSOD methods and achieves performance comparable to some PSOD methods in a more efficient way.
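
The similarity-based mining step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `similarity_heatmap`, `mine_instances`), the choice of cosine similarity, the fixed threshold, and the toy feature grid are all assumptions made for illustration. The idea shown is only that the feature at the single annotated point is compared with the feature at every other location, and that highly similar locations are treated as candidate unlabeled instances of the same category.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_heatmap(features, point):
    """Build a pseudo category heatmap for one annotated point.

    features: H x W grid (nested lists) of C-dimensional feature vectors.
    point:    (row, col) of the annotated point for this category.
    Negative similarities are clipped to 0 (an assumption of this sketch).
    """
    row, col = point
    reference = features[row][col]
    return [[max(0.0, cosine(vec, reference)) for vec in grid_row]
            for grid_row in features]


def mine_instances(heatmap, threshold=0.9):
    """Return grid locations whose similarity reaches the (assumed) threshold."""
    return [(i, j)
            for i, grid_row in enumerate(heatmap)
            for j, value in enumerate(grid_row)
            if value >= threshold]


# Toy 2 x 3 feature grid with 2-dimensional features (hypothetical values).
features = [
    [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]],
    [[0.0, 1.0], [1.0, 0.05], [0.0, 0.0]],
]

heatmap = similarity_heatmap(features, point=(0, 0))
candidates = mine_instances(heatmap, threshold=0.9)
print(candidates)
```

In this toy grid, the locations whose features point in nearly the same direction as the annotated point's feature are selected alongside the annotated point itself. A real detector would operate on learned backbone features and would likely use a more elaborate selection rule than a single fixed threshold.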

Original language: English
Article number: 111087
Journal: Pattern Recognition
Volume: 159
Publication status: Published - Mar 2025
Externally published: Yes

Keywords

  • Object detection
  • Similarity-based learning
  • Weak annotation
