Abstract:
Underwater images typically suffer from light attenuation, color distortion, complex background interference, widely varying target scales, and blurred target features. This paper addresses two of these challenges, varying target scales and difficult feature localization, by proposing an improved underwater target detection algorithm based on Faster R-CNN (sg_Faster R-CNN). First, we introduce switchable atrous convolution into the feature extractor to reduce the feature loss caused by missing global contextual information. Second, we use a recursive feature pyramid to enable repeated interaction between high-level and low-level features, which improves the model's ability to detect small underwater targets and objects with complex shapes. Finally, we introduce a region proposal network based on guided anchoring, which dynamically generates sparser, shape-adaptive anchors from the image's semantic features and thereby improves the model's detection accuracy and localization ability. Experiments show that the improved algorithm achieves a 5.7% increase in mAP (mean average precision) on the DUO underwater dataset and also performs well on the general-purpose object detection dataset VOC.
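To make the first component more concrete, the following is a minimal one-dimensional sketch of the idea behind switchable atrous convolution: the same kernel is applied at a small and a large dilation rate, and a per-position switch blends the two responses. It is an illustration under stated assumptions, not the implementation used in sg_Faster R-CNN. The function names, the dilation rates (1 and 3), and the switch (a sigmoid of a local average) are all hypothetical choices made for this example; in a real detector the switch would be learned and the operation would be two-dimensional.

```python
import math


def atrous_conv1d(signal, kernel, rate):
    """1-D dilated ("atrous") convolution with zero padding; output length equals input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def switch(signal, window=5):
    """Hypothetical switch: sigmoid of a local average, giving a weight in (0, 1) per position."""
    half = window // 2
    weights = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        mean = sum(signal[lo:hi]) / (hi - lo)
        weights.append(1.0 / (1.0 + math.exp(-mean)))
    return weights


def switchable_atrous_conv1d(signal, kernel, small_rate=1, large_rate=3):
    """Blend a small-rate and a large-rate response using the per-position switch."""
    s = switch(signal)
    near = atrous_conv1d(signal, kernel, small_rate)  # narrow receptive field
    far = atrous_conv1d(signal, kernel, large_rate)   # wide receptive field
    return [si * n + (1.0 - si) * f for si, n, f in zip(s, near, far)]
```

Calling `switchable_atrous_conv1d([0, 0, 0, 1, 0, 0, 0, 0], [1, 1, 1])` blends a response that sees only immediate neighbors with one that reaches three samples away, which is how a single layer can mix local detail with wider context.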