Object detection based on Transformer with prefiltered attention
Affiliation: College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao 266061, China

CLC Number: TP391.9

Abstract:

Transformer-based object detectors proposed in recent years simplify the model structure and deliver competitive performance. However, most of these models converge slowly and detect small objects poorly because of the way the Transformer attention module processes feature maps. To address these issues, this study proposes a Transformer detection model built on a prefiltered attention module. Taking the target point as a reference, the module samples only a subset of feature points near it, which shortens training and improves detection accuracy. A newly defined directional relative position encoding is also integrated into the module; it compensates for the relative position information lost in the attention weight computation, which is particularly helpful for detecting small objects. Experiments on the COCO 2017 dataset show that our model reduces training time by a factor of 10 and improves detection accuracy, especially for small objects (26.8 APS).
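The paper's implementation is not reproduced here, but the sampling-and-weighting idea described in the abstract can be sketched in PyTorch. The sketch below is an assumption-laden illustration: the class name PrefilteredAttention, the offset and weight projections, the small MLP used for the directional relative-position bias, and hyperparameters such as num_points are all hypothetical. Only the general mechanism follows the abstract: each query samples a few feature points around a reference point instead of attending to the whole feature map, and the attention weights are biased by a term derived from the directional (dx, dy) offsets.

```python
# Hypothetical sketch of a prefiltered attention module in PyTorch.
# Names, projections and hyperparameters are assumptions; only the overall idea
# (sample a few feature points near a reference point, bias the attention
# weights with a directional relative-position term) follows the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrefilteredAttention(nn.Module):
    def __init__(self, d_model=256, num_heads=8, num_points=4):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads, self.num_points = num_heads, num_points
        self.head_dim = d_model // num_heads
        # per-head, per-point sampling offsets around the reference point
        self.offset_proj = nn.Linear(d_model, num_heads * num_points * 2)
        # unnormalized attention weights over the sampled points
        self.weight_proj = nn.Linear(d_model, num_heads * num_points)
        # directional relative position encoding: an MLP on the (dx, dy) offset
        # producing a scalar bias per sampled point (assumed form)
        self.dir_pos_enc = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
        self.value_proj = nn.Linear(d_model, d_model)
        self.output_proj = nn.Linear(d_model, d_model)

    def forward(self, queries, ref_points, feat_map):
        # queries: (B, Q, C); ref_points: (B, Q, 2) in [0, 1]; feat_map: (B, C, H, W)
        B, Q, C = queries.shape
        H, W = feat_map.shape[-2:]

        # sampling locations = reference point + predicted offsets ("prefiltering")
        offsets = self.offset_proj(queries).view(B, Q, self.num_heads, self.num_points, 2)
        offsets = offsets.tanh() * 0.1                      # keep samples near the reference point
        locs = ref_points[:, :, None, None, :] + offsets    # (B, Q, heads, points, 2)

        # attention weights over sampled points, biased by the directional encoding
        weights = self.weight_proj(queries).view(B, Q, self.num_heads, self.num_points)
        weights = (weights + self.dir_pos_enc(offsets).squeeze(-1)).softmax(dim=-1)

        values = self.value_proj(feat_map.flatten(2).transpose(1, 2))   # (B, H*W, C)
        values = values.transpose(1, 2).view(B, C, H, W)

        head_outputs = []
        for h in range(self.num_heads):
            v_h = values[:, h * self.head_dim:(h + 1) * self.head_dim]  # (B, d, H, W)
            grid_h = locs[:, :, h] * 2.0 - 1.0                          # grid_sample expects [-1, 1]
            s_h = F.grid_sample(v_h, grid_h, align_corners=False)       # (B, d, Q, points)
            head_outputs.append(torch.einsum('bdqp,bqp->bqd', s_h, weights[:, :, h]))
        return self.output_proj(torch.cat(head_outputs, dim=-1))        # (B, Q, C)


if __name__ == "__main__":
    attn = PrefilteredAttention()
    out = attn(torch.randn(2, 100, 256),      # 100 object queries
               torch.rand(2, 100, 2),         # their reference points
               torch.randn(2, 256, 32, 32))   # a single-scale feature map
    print(out.shape)                          # torch.Size([2, 100, 256])
```

Under these assumptions, each query attends to only num_points locations per head rather than all H×W positions, which is consistent with the abstract's claims of reduced training cost; the directional bias reinjects relative-position information that a plain weighting over sampled points would otherwise discard.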

History
  • Online: March 08, 2024