BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR
Abstract
This paper presents a Bidirectional Feature Fusion Network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components: (1) the SAF-Conv module, which extends the receptive field through improved submanifold sparse convolution, enhancing feature extraction from pseudo-point clouds while effectively reducing edge noise; (2) the Bidirectional Cross-modal Attention Feature Interaction Module (BiCSAFIM), which employs a multi-head cross-attention mechanism to enable global information interaction between point cloud and image features; and (3) the Attention-based Feature Fusion Module (ADFM), which adaptively fuses dual-stream features to improve robustness. Extensive experiments on the KITTI dataset demonstrate that BiDFNet achieves state-of-the-art performance, with a 3D AP (R40) of 88.79% on the validation set and 85.27% on the test set for the Car category, significantly outperforming existing methods. These results highlight the effectiveness of BiDFNet in complex scenarios, showcasing its potential for real-world applications such as autonomous driving.
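The bidirectional cross-attention interaction summarized above can be illustrated with a minimal PyTorch sketch, assuming both modalities have already been projected to a common token dimension. This is not the authors' implementation; the class and variable names (BidirectionalCrossAttention, pc_feats, img_feats) are hypothetical and serve only to show the general mechanism of two streams attending to each other.

```python
# Illustrative sketch (not the paper's code): bidirectional cross-modal
# attention between point-cloud and image feature tokens, in the spirit of
# the BiCSAFIM described in the abstract. All names here are hypothetical.
import torch
import torch.nn as nn


class BidirectionalCrossAttention(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Point tokens attend to image tokens, and vice versa.
        self.pc_from_img = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.img_from_pc = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_pc = nn.LayerNorm(dim)
        self.norm_img = nn.LayerNorm(dim)

    def forward(self, pc_feats: torch.Tensor, img_feats: torch.Tensor):
        # pc_feats:  (B, N_pc,  dim) pseudo-point-cloud features as tokens
        # img_feats: (B, N_img, dim) image features flattened to tokens
        pc_upd, _ = self.pc_from_img(query=pc_feats, key=img_feats, value=img_feats)
        img_upd, _ = self.img_from_pc(query=img_feats, key=pc_feats, value=pc_feats)
        # Residual connections keep each stream's original information.
        return self.norm_pc(pc_feats + pc_upd), self.norm_img(img_feats + img_upd)


if __name__ == "__main__":
    fuse = BidirectionalCrossAttention(dim=256, num_heads=8)
    pc = torch.randn(2, 1024, 256)   # e.g. pooled pseudo-point features
    img = torch.randn(2, 400, 256)   # e.g. a 20x20 image feature map as tokens
    pc_out, img_out = fuse(pc, img)
    print(pc_out.shape, img_out.shape)
```

In such a design, each stream is updated with globally aggregated context from the other modality while residual connections preserve its own features; an adaptive fusion stage (such as the ADFM described above) could then weight and merge the two updated streams.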