Abstract: Underwater object detection is widely applied in fields such as marine biology research, archaeological exploration, and military defense. With the rapid development of artificial intelligence, underwater object detection is becoming increasingly unmanned and intelligent. Deep learning uses neural networks to extract informative features, performs well in both speed and accuracy, and has become the mainstream approach in computer vision. However, applying it to object detection in underwater images remains a significant challenge in complex underwater environments. The different modalities of an underwater target carry complementary information and rich features that benefit target detection and recognition. This article therefore surveys existing technologies in the context of their application scenarios and then designs a multimodal underwater target detection system based on deep learning. It also compares and analyzes the advantages and disadvantages of the existing core technologies. Finally, it summarizes the work and gives an outlook on the future development of multimodal object detection systems.