SEMA-YOLO: Lightweight Small Object Detection in Remote Sensing Images via Shallow-Layer Enhancement and Multi-Scale Adaptation

水静川流 · Published 2025-09-13 01:45:05

Title: SEMA-YOLO: Lightweight Small Object Detection in Remote Sensing Image via Shallow-Layer Enhancement and Multi-Scale Adaptation

Journal: Remote Sensing

First impression: the Shallow Layer Enhancement (SLE) strategy is "simple but works".
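The abstract describes SLE as reducing backbone depth and adding a small-object detection head so that the feature maps get larger. A rough way to see why that helps is to count how many grid cells a tiny object occupies at each stride. The sketch below is only an illustration: the 640-pixel input, the P2 to P5 level names, the strides 4/8/16/32, and the 8x8-pixel object are assumptions of this example, not values taken from the paper.

```python
# Assumed values for illustration only; the post does not state them.
INPUT_SIZE = 640                                   # assumed square input size
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32}   # assumed head strides


def grid_size(input_size: int, stride: int) -> int:
    """Side length of the feature map produced at a given stride."""
    return input_size // stride


def cells_covered(object_px: int, stride: int) -> float:
    """Approximate number of grid cells spanned by a square object."""
    side = object_px / stride
    return side * side


if __name__ == "__main__":
    obj = 8  # hypothetical 8x8-pixel object
    for name, stride in STRIDES.items():
        g = grid_size(INPUT_SIZE, stride)
        print(f"{name}: stride {stride:>2} -> {g}x{g} grid, "
              f"{cells_covered(obj, stride):.3f} cells per {obj}x{obj} object")
```

Under these assumptions, the stride-4 map is 160x160 and the object spans about four cells, while at stride 32 the map is 20x20 and the same object fills only a small fraction of one cell. That is the intuition behind moving detection toward shallower, higher-resolution layers.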

Authors: Zhenchuan Wu, Hang Zhen, Xiaoxinxi Zhang, Xuechen Bai, Xinghua Li

Original Title: Remote Sensing, Vol. 17, Pages 1917: SEMA-YOLO: Lightweight Small Object Detection in Remote Sensing Image via Shallow-Layer Enhancement and Multi-Scale Adaptation

Abstract: Small object detection remains a challenge in the remote sensing field due to feature loss during downsampling and interference from complex backgrounds. A novel network, termed SEMA-YOLO, is proposed in this paper as an enhanced YOLOv11-based framework incorporating three technical advancements. By fundamentally reducing information loss and incorporating a cross-scale feature fusion mechanism, the proposed framework significantly enhances small object detection performance. First, the Shallow Layer Enhancement (SLE) strategy reduces backbone depth and introduces small-object detection heads, thereby increasing feature map size and improving small object detection performance. Then, the Global Context Pooling-enhanced Adaptively Spatial Feature Fusion (GCP-ASFF) architecture is designed to optimize cross-scale feature interaction across four detection heads. Finally, the RFA-C3k2 module, which integrates Receptive Field Adaptation (RFA) with the C3k2 structure, is introduced to achieve more refined feature extraction. SEMA-YOLO demonstrates significant advantages in complex urban environments and dense target areas, while its generalization capability meets the detection requirements across diverse scenarios. The experimental results show that SEMA-YOLO achieves mAP50 scores of 72.5% on the RS-STOD dataset and 61.5% on the AI-TOD dataset, surpassing state-of-the-art models.
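The abstract does not spell out how GCP-ASFF works, so the following is only a minimal sketch of the general idea behind adaptive spatial feature fusion: feature maps from different scales are resized to a common resolution and combined with per-pixel softmax weights. It leaves out the global context pooling part entirely. The function names `softmax`, `upsample_nearest`, and `fuse` are hypothetical, and the fusion logits, which a real network would learn, are supplied by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    return [[fmap[r // factor][c // factor]
             for c in range(len(fmap[0]) * factor)]
            for r in range(len(fmap) * factor)]


def fuse(levels, logits):
    """Per-pixel softmax-weighted sum of same-sized 2D maps.

    levels: list of 2D maps, all H x W (already resized to a common size)
    logits: list of 2D maps of fusion logits, one per level; learned in a
            real network, hand-written here (assumption of this sketch)
    """
    h, w = len(levels[0]), len(levels[0][0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            weights = softmax([lg[r][c] for lg in logits])
            out[r][c] = sum(wt * lv[r][c] for wt, lv in zip(weights, levels))
    return out


if __name__ == "__main__":
    fine = [[1.0, 2.0], [3.0, 4.0]]            # toy high-resolution map
    coarse = upsample_nearest([[10.0]], 2)     # toy low-resolution map, resized
    equal = [[0.0, 0.0], [0.0, 0.0]]           # equal logits -> equal weights
    print(fuse([fine, coarse], [equal, equal]))
```

With equal logits each level contributes half of every output pixel. Raising the logits of one level at a given location shifts the weight toward that level there, which is what lets a fusion layer of this kind favor the shallow, high-resolution map wherever small objects appear.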

DOI: 10.3390/rs17111917

Link: https://www.mdpi.com/2072-4292/17/11/1917

#Small Object Detection# #Deep Learning# #Feature Fusion# #Remote Sensing Images# #Urban Environments#
