VPU-RTDETR: A Lightweight, Self-Adaptive, Real-Time Model for Small Object Detection on UAVs


Abstract

Small target detection in UAV aerial images is challenging because of low resolution, complex backgrounds, and scale variation. The existing RT-DETR has three shortcomings in this setting: insufficient feature extraction for small targets, inadequate capture of local information by the attention mechanism, and a loss function with low sensitivity to small targets. To address them, this paper proposes a lightweight and adaptive detection model named VPU-RTDETR. In the backbone network, the VASM module is introduced to achieve dynamic fusion of multi-scale features. In the encoder, the AIFI-Pola module uses a polarized linear attention mechanism to enhance global and local features simultaneously. In the feature fusion stage, the USOS scheme uses SPDConv and C-OKM modules to improve the utilization of low-resolution features. Additionally, a hybrid loss function based on FocalerIoU and MPDIoU is constructed to improve the localization accuracy of small targets. Experimental results on the VisDrone2019 dataset show that, compared with the baseline model, VPU-RTDETR improves mAP50 by 3.1% and mAP50:95 by 2.4% while maintaining real-time performance at 64 FPS and a relatively low parameter count, giving a favorable trade-off between accuracy and computational cost for detection on UAV platforms.
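The abstract does not state how FocalerIoU and MPDIoU are combined, so the following is only a minimal sketch of one plausible combination, not the authors' implementation. It assumes boxes in (x1, y1, x2, y2) pixel format, the MPDIoU form that penalizes the squared distances between matching top-left and bottom-right corners normalized by the squared image diagonal, and the Focaler form that linearly remaps IoU onto an interval [d, u]. The function names and the values d = 0.0 and u = 0.95 are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Plain IoU for two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """IoU minus normalized squared distances between matching corners."""
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right
    return iou(pred, target) - d1 / norm - d2 / norm


def focaler(iou_value, d=0.0, u=0.95):
    """Linearly remap IoU from [d, u] to [0, 1], clamping outside the interval."""
    return min(1.0, max(0.0, (iou_value - d) / (u - d)))


def focaler_mpdiou_loss(pred, target, img_w, img_h, d=0.0, u=0.95):
    """Assumed combination: MPDIoU loss plus the (IoU - Focaler IoU) correction."""
    plain = iou(pred, target)
    l_mpdiou = 1.0 - mpdiou(pred, target, img_w, img_h)
    return l_mpdiou + plain - focaler(plain, d, u)


if __name__ == "__main__":
    # A perfectly matched box gives zero loss; a distant box gives a loss above 1.
    print(focaler_mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100))
    print(focaler_mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100))
```

In this sketch the corner-distance term still produces a gradient signal when the boxes do not overlap at all, which is one reason such a term may help with small targets, where a few pixels of offset can drive plain IoU to zero.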
