Title

Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection

Author
Publication Years
2022
DOI
Source Title
IEEE Transactions on Circuits and Systems for Video Technology
ISSN
1051-8215
EISSN
1558-2205
Pages
1-1
Abstract
The misalignment between classification and localization is a significant source of performance loss in object detection. To address this misalignment, prior work has typically separated the tasks (e.g., classification and bounding-box regression) by introducing extra heads, emphasizing the separation of tasks to accommodate their differences. In this paper, we argue that both separation and crosstalk between classification and localization are important. Because the two tasks differ and attend to different regions and features, they conflict with each other and therefore need to be separated. However, they also need to be fused, since classification and localization are, after all, about understanding the same object. To realize this idea, we systematically introduce a bidirectional crosstalk detection head that provides full, deep cross-fusion between classification and localization. To the best of our knowledge, this is the first time full bidirectional crosstalk has been introduced between classification and localization in a one-stage detector. Extensive experiments demonstrate the effectiveness of the proposed method. With a ResNet-50 backbone, our method improves the GFLV1 baseline by 2.0 AP at similar inference speed (18.5 fps vs. 18.3 fps), and boosts GFLV1 by a larger margin (4.3 AP) when the model size is increased. Fair comparisons also show that the proposed head outperforms state-of-the-art heads (T-Head, DyHead) at comparable or faster inference speed under the same ATSS baseline model. With a Res2Net-DCN backbone, our model achieves 51.7 AP in single-model, single-scale testing. The code and pretrained models will be made publicly available.
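The bidirectional crosstalk idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the additive update, and the scalar `gate` are all hypothetical simplifications chosen only to show the core idea that each branch keeps its own features (separation) while mixing in information from the other branch (crosstalk), in both directions.

```python
# Illustrative sketch (NOT the paper's method): bidirectional crosstalk
# between a classification branch and a localization branch, reduced to
# plain feature vectors. The gating scheme and names are hypothetical.

def crosstalk_step(cls_feat, loc_feat, gate=0.5):
    """One round of bidirectional crosstalk.

    Each branch keeps its own features and adds a gate-weighted
    contribution from the other branch, so information flows both ways.
    """
    new_cls = [c + gate * l for c, l in zip(cls_feat, loc_feat)]
    new_loc = [l + gate * c for c, l in zip(cls_feat, loc_feat)]
    return new_cls, new_loc

# Stacking several rounds yields a dense exchange: after every step,
# each branch has absorbed information from the other.
cls_feat, loc_feat = [1.0, 0.0], [0.0, 1.0]
for _ in range(2):
    cls_feat, loc_feat = crosstalk_step(cls_feat, loc_feat)
```

In the paper's actual head, the exchanged quantities would be convolutional feature maps rather than scalars, but the separation-plus-fusion structure is the point of the sketch.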
Keywords
URL
[Source Record]
Indexed By
Language
English
SUSTech Authorship
Others
EI Accession Number
20224613110664
EI Keywords
Alignment ; Feature extraction ; Object detection ; Object recognition ; Separation
ESI Classification Code
Mechanical Devices:601.1 ; Data Processing and Image Processing:723.2 ; Chemical Operations:802.3
ESI Research Field
ENGINEERING
Scopus EID
2-s2.0-85141601655
Data Source
Scopus
Citation statistics
Cited Times [WOS]:0
Document Type
Journal Article
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/411893
Department
Southern University of Science and Technology
Affiliation
1.School of Computer Science and Technology, National University of Defense Technology, Changsha, China
2.Space Engineering University, Beijing, China
3.Southern University of Science and Technology and Harbin Institute of Technology, Shenzhen, China
Recommended Citation
GB/T 7714
Li, Yuanwei, Zhu, En, Chen, Hang, et al. Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022: 1-1.
APA
Li, Yuanwei, Zhu, En, Chen, Hang, Tan, Jiyong, & Shen, Li. (2022). Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 1-1.
MLA
Li, Yuanwei, et al. "Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection". IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022): 1-1.
Files in This Item:
There are no files associated with this item.
Google Scholar
Similar articles in Google Scholar
[Li,Yuanwei]'s Articles
[Zhu,En]'s Articles
[Chen,Hang]'s Articles

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.