留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

面向光学遥感目标的全局上下文检测模型设计

张瑞琰,姜秀杰,安军社,崔天舒

downloadPDF
张瑞琰, 姜秀杰, 安军社, 崔天舒. 面向光学遥感目标的全局上下文检测模型设计[J]. , 2020, 13(6): 1302-1313. doi: 10.37188/CO.2020-0057
引用本文: 张瑞琰, 姜秀杰, 安军社, 崔天舒. 面向光学遥感目标的全局上下文检测模型设计[J]. , 2020, 13(6): 1302-1313.doi:10.37188/CO.2020-0057
ZHANG Rui-yan, JIANG Xiu-jie, AN Jun-she, CUI Tian-shu. Design of global-contextual detection model for optical remote sensing targets[J]. Chinese Optics, 2020, 13(6): 1302-1313. doi: 10.37188/CO.2020-0057
Citation: ZHANG Rui-yan, JIANG Xiu-jie, AN Jun-she, CUI Tian-shu. Design of global-contextual detection model for optical remote sensing targets[J].Chinese Optics, 2020, 13(6): 1302-1313.doi:10.37188/CO.2020-0057

面向光学遥感目标的全局上下文检测模型设计

doi:10.37188/CO.2020-0057
基金项目:中国科学院复杂航天系统电子信息技术重点实验室自主部署基金(No. Y42613A32S)
详细信息
    作者简介:

    张瑞琰(1995—),女,河南商丘人,博士研究生,2016年于南开大学获得理学学士学位,2018年于中国科学院大学国家空间科学中心转为硕博连读,主要从事空间数据处理、遥感目标检测网络优化及压缩的相关研究。E-mail:zhangruiyan16@mails.ucas.ac.cn

    姜秀杰(1965—),女,黑龙江鹤岗人,博士,研究员,博士生导师,1988年于北京航空航天大学获得工学学士学位,1991年于中国科学院大学获得工学硕士学位,2007年于清华大学获得工学博士学位,主要从事火箭探空技术研究、电场探测技术研究和空间综合电子技术研究。E-mail:jiangxj@nssc.ac.cn

  • 中图分类号:TP391.4

Design of global-contextual detection model for optical remote sensing targets

Funds:Supported by Laboratory Fund of Key Laboratory of Electronics and Information Technology for Space Systems, CAS (No. Y42613A32S)
More Information
  • 摘要:在复杂背景下的光学遥感图像目标检测中,为了提高检测精度,同时降低检测网络复杂度,提出了面向光学遥感目标的全局上下文检测模型。首先,采用结构简单的特征编码-特征解码网络进行特征提取。其次,为提高对多尺度目标的定位能力,采取全局上下文特征与目标中心点局部特征相结合的方式生成高分辨率热点图,并利用全局特征实现目标的预分类。最后,提出不同尺度的定位损失函数,用于增强模型的回归能力。实验结果表明: 当使用主干网络Root-ResNet18时,本文模型在公开遥感数据集NWPU VHR-10上的检测精度可达97.6%AP 50和83.4%AP 75,检测速度达16 PFS,基本满足设计需求,实现了网络速度和精度的有效平衡,便于后续算法在移动设备端的移植和应用。

  • 图 1特征编码-特征解码网络架构

    Figure 1.Framework of the feature encoder-feature decoder network

    图 2全局上下文检测模型总体架构

    Figure 2.Overall framework of the global-contextual detection model

    图 3普通卷积采样和变形卷积采样示意图

    Figure 3.Sampling diagrams in standard convolution and deformation convolution

    图 4全局上下文特征提取流程

    Figure 4.Flow chart of global-contextual feature extraction

    图 5(a)添加目标框的原图及(b)高斯椭圆掩模示意图

    Figure 5.(a) Original image with a target box and (b) schematic diagram of gaussian elliptical mask

    图 6不同γ值对结果的影响

    Figure 6.Effects of differentγvalues on results

    图 7不同ν值对结果的影响

    Figure 7.Effects of differentνvalues on results

    图 8GCDN的可视化检测效果图

    Figure 8.Visual detection results of the GCDN

    表 1ResNet18与Root-ResNet18结构

    Table 1.Structures of ResNet18 and Root-ResNet18

    阶段 输出尺寸 ResNet18 Root-ResNet18
    C1 128×128 7×7, 64 3×3, 64
    3×3, 64
    3×3, 64
    3×3, MaxPool
    $\left[ {\begin{array}{*{20}{c}} {3 \times 3,64} \\ {3 \times 3,64} \end{array}} \right]\times 2$
    C2 64×64 3×3, MaxPool $\left[ {\begin{array}{*{20}{c}} {3 \times 3,128} \\ {3 \times 3,128} \end{array}} \right]\times 2$
    $\left[ {\begin{array}{*{20}{c}} {3 \times 3,64} \\ {3 \times 3,64} \end{array}} \right]\times 2$
    C3 32×32 $\left[ {\begin{array}{*{20}{c}} {3 \times 3,128} \\ {3 \times 3,128} \end{array}} \right]\times 2$ $\left[ {\begin{array}{*{20}{c}} {3 \times 3,256} \\ {3 \times 3,256} \end{array}} \right]\times 2$
    C4 16×16 $\left[ {\begin{array}{*{20}{c}} {3 \times 3,256} \\ {3 \times 3,256} \end{array}} \right]\times 2$ $\left[ {\begin{array}{*{20}{c}} {3 \times 3,512} \\ {3 \times 3,512} \end{array}} \right]\times 2$
    C5 8×8 $\left[ {\begin{array}{*{20}{c}} {3 \times 3,512} \\ {3 \times 3,512} \end{array}} \right]\times 2$
    下载: 导出CSV

    表 2数据集NWPU VHR-10目标尺寸统计表

    Table 2.Statistics of target sizes in the NWPU VHR-10 dataset

    尺度(pixel) 0−10 10−40 40−100 100−300 300−500 500以上
    0 0.1327 0.6948 0.1599 0.0126 0
    0 0.1448 0.7205 0.1242 0.0105 0
    下载: 导出CSV

    表 3不同模型在数据集NWPU VHR-10上的平均精确度对比

    Table 3.Comparison of mean average precisions of different models in the NWPU VHR-10 dataset

    模型 主干网络 飞机 船舰 油罐 棒球场 网球场 篮球场 田径场 港口 桥梁 车辆 平均精确度(mAP)
    RICAOD AlexNet 0.9970 0.9080 0.9061 0.9291 0.9029 0.8031 0.9081 0.8029 0.6853 0.8714 0.8712
    SSD VGG16 0.9839 0.8993 0.8918 0.9851 0.8791 0.8481 0.9949 0.7730 0.7828 0.8739 0.8912
    YOLOv3 Darknet53 0.9091 0.9091 0.9081 0.9913 0.9086 0.9091 0.9947 0.9005 0.9091 0.9035 0.9243
    MMDFN VGG16 0.9934 0.9227 0.9918 0.9668 0.9632 0.9756 1.0000 0.9740 0.8027 0.9136 0.9504
    MSDN ResNet50 0.9976 0.9721 0.8383 0.9909 0.9734 0.9991 0.9868 0.9719 0.9267 0.9010 0.9558
    MSCNN ResNet50 0.9940 0.9530 0.9180 0.9630 0.9540 0.9670 0.9930 0.9550 0.9720 0.9330 0.9600
    GCDN Root-ResNet18 0.9991 0.9983 0.9677 0.9916 0.9991 0.9759 0.9988 0.9412 0.9224 0.9636 0.9757
    下载: 导出CSV

    表 4检测阈值0.75下的不同模型平均精确度对比

    Table 4.Comparison of mean-average precision of different models under the 0.75 detection threshold

    模型 平均准确率 (AP) 平均准确率(AP)($I_{ {\rm{{ou} } } }$=0.50:0.95) 平均召回率(AR)($I_{ {\rm{{ou} } } }$=0.50:0.95)
    $I_{ {\rm{{ou} } } }$=0.50:0.95 $I_{ {\rm{{ou} } } }$=0.50 $I_{ {\rm{{ou} } } }$=0.75 小目标 中目标 大目标 小目标 中目标 大目标
    CenterNet 0.663 0.968 0.768 0.576 0.669 0.704 0.630 0.725 0.741
    MSCNN 0.706 0.960 0.824 0.547 0.578 0.701 0.600 0.605 0.700
    GCDN-woGC 0.679 0.973 0.775 0.572 0.690 0.750 0.639 0.744 0.786
    GCDN 0.705 0.976 0.834 0.612 0.727 0.706 0.683 0.773 0.740
    下载: 导出CSV

    表 5不同模型的平均检测时间对比

    Table 5.Comparison of the average detection times with different models

    模型 输入尺寸 时间/s
    RICAOD 400×400 2.89
    MMDFN 400×400 0.75
    YOLOv3 640×640 0.13
    MSDN 600×800 0.21
    GCDN 640×640 0.06
    下载: 导出CSV

    表 6不同模型在数据集DOTA上的平均精确度对比

    Table 6.Comparison of the mean-average precisions with different models in the DOTA dataset

    模型 主干网络 平均检测精度(mAP)
    YOLOv2 GoogleNet 39.20
    R-FCN ResNet101 52.58
    Faster-RCNN ResNet101 60.46
    SCFPN-scf ResNet101 75.22
    GCDN Root-ResNet18 75.95
    下载: 导出CSV
  • [1] 许夙晖, 慕晓冬, 柯冰, 等. 基于遥感影像的军事阵地动态监测技术研究[J]. 遥感技术与应用,2014,29(3):511-516.

    XU S H, MU X D, KE B,et al. Dynamic monitoring of military position based on remote sensing image[J].Remote Sensing Technology and Application, 2014, 29(3): 511-516. (in Chinese)
    [2] VALERO S, CHANUSSOT J, BENEDIKTSSON J A,et al. Advanced directional mathematical morphology for the detection of the road network in very high resolution remote sensing images[J].Pattern Recognition Letters, 2010, 31(10): 1120-1127.doi:10.1016/j.patrec.2009.12.018
    [3] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C].Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 2005: 886-893.
    [4] LOWE D G. Distinctive image features from scale-invariant keypoints[J].International Journal of Computer Vision, 2004, 60(2): 91-110.doi:10.1023/B:VISI.0000029664.99615.94
    [5] LIU W, ANGUELOV D, ERHAN D,et al.. SSD: single shot multibox detector[C].Proceedings of the 14th European Conference on Computer Vision, Springer, 2016: 21-37.
    [6] REDMON J, FARHADI A. Yolov3: an incremental improvement[J]. arXiv: 1804.02767, 2018.
    [7] 马永杰, 宋晓凤. 基于YOLO和嵌入式系统的车流量检测[J]. 液晶与显示,2019,34(6):613-618.doi:10.3788/YJYXS20193406.0613

    MA Y J, SONG X F. Vehicle flow detection based on YOLO and embedded system[J].Chinese Journal of Liquid Crystals and Displays, 2019, 34(6): 613-618. (in Chinese)doi:10.3788/YJYXS20193406.0613
    [8] LIN T Y, GOYAL P, GIRSHICK R,et al.. Focal loss for dense object detection[C].Proceedings of 2017 IEEE International Conference on Computer Vision, IEEE, 2017: 2999-3007.
    [9] LAW H, DENG J. Cornernet: detecting objects as paired keypoints[C].Proceedings of the 15th European Conference on Computer Vision, Springer, 2018: 765-781.
    [10] ZHOU X Y, WANG D Q, KRÄHENBÜHL P. Objects as points[J]. arXiv: 1904.07850, 2019.
    [11] XIAO B, WU H P, WEI Y CH. Simple baselines for human pose estimation and tracking[C].Proceedings of the 15th European Conference on Computer Vision, Springer, 2018: 472-487.
    [12] LI K, CHENG G, BU SH H,et al. Rotation-insensitive and context-augmented object detection in remote sensing images[J].IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4): 2337-2348.doi:10.1109/TGRS.2017.2778300
    [13] MA W P, GUO Q Q, WU Y,et al. A novel multi-model decision fusion network for object detection in remote sensing images[J].Remote Sensing, 2019, 11(7): 737.doi:10.3390/rs11070737
    [14] 梁华, 宋玉龙, 钱锋, 等. 基于深度学习的航空对地小目标检测[J]. 液晶与显示,2018,33(9):793-800.doi:10.3788/YJYXS20183309.0793

    LIANG H, SONG Y L, QIAN F,et al. Detection of small target in aerial photography based on deep learning[J].Chinese Journal of Liquid Crystals and Displays, 2018, 33(9): 793-800. (in Chinese)doi:10.3788/YJYXS20183309.0793
    [15] 姚群力, 胡显, 雷宏. 基于多尺度卷积神经网络的遥感目标检测研究[J]. 光学学报,2019,39(11):1128002.doi:10.3788/AOS201939.1128002

    YAO Q L, HU X, LEI H. Object detection in remote sensing images using multiscale convolutional neural networks[J].Acta Optica Sinica, 2019, 39(11): 1128002. (in Chinese)doi:10.3788/AOS201939.1128002
    [16] 邓志鹏, 孙浩, 雷琳, 等. 基于多尺度形变特征卷积网络的高分辨率遥感影像目标检测[J]. 测绘学报,2018,47(9):1216-1227.

    DENG ZH P, SUN H, LEI L,et al. Object detection in remote sensing imagery with multi-scale deformable convolutional networks[J].Acta Geodaetica et Cartographica Sinica, 2018, 47(9): 1216-1227. (in Chinese)
    [17] 董潇潇, 何小海, 吴晓红, 等. 基于注意力掩模融合的目标检测算法[J]. 液晶与显示,2019,34(8):825-833.doi:10.3788/YJYXS20193408.0825

    DONG X X, HE X H, WU X H,et al. Object detection algorithm based on attention mask fusion[J].Chinese Journal of Liquid Crystals and Displays, 2019, 34(8): 825-833. (in Chinese)doi:10.3788/YJYXS20193408.0825
    [18] WANG CH, BAI X, WANG SH,et al. Multiscale visual attention networks for object detection in VHR remote sensing images[J].IEEE Geoscience and Remote Sensing Letters, 2019, 16(2): 310-314.doi:10.1109/LGRS.2018.2872355
    [19] 左俊皓, 赵聪, 朱晓龙, 等. Faster-RCNN和Level-Set结合的高分遥感影像建筑物提取[J]. 液晶与显示,2019,34(4):439-447.doi:10.3788/YJYXS20193404.0439

    ZUO J H, ZHAO C, ZHU X L,et al. High-resolution remote sensing image building extraction combined with Faster-RCNN and Level-Set[J].Chinese Journal of Liquid Crystals and Displays, 2019, 34(4): 439-447. (in Chinese)doi:10.3788/YJYXS20193404.0439
    [20] 于渊博, 张涛, 郭立红, 等. 卫星视频运动目标检测算法[J]. 液晶与显示,2017,32(2):138-143.doi:10.3788/YJYXS20173202.0138

    YU Y B, ZHANG T, GUO L H,et al. Moving objects detection on satellite video[J].Chinese Journal of Liquid Crystals and Displays, 2017, 32(2): 138-143. (in Chinese)doi:10.3788/YJYXS20173202.0138
    [21] LIU W, RABINOVICH A, BERG A C. Parsenet: looking wider to see better[J]. arXiv: 1506.04579, 2015.
    [22] ZHANG H, DANA K, SHI J P,et al.. Context encoding for semantic segmentation[C].Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018: 7151-7160.
    [23] HE K M, ZHANG X Y, REN SH Q,et al.. Deep residual learning for image recognition[C].Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2016: 770-778.
    [24] ZHU R, ZHANG SH F, WANG X B,et al.. ScratchDet: training single-shot object detectors from scratch[C].Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2019: 2263-2272.
    [25] LIN M, CHEN Q, YAN SH CH. Network in network[J]. arXiv: 1312.4400, 2013.
    [26] CHENG G, ZHOU P CH, HAN J W. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images[J].IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(12): 7405-7415.doi:10.1109/TGRS.2016.2601622
    [27] XIA G S, BAI X, DING J,et al.. DOTA: a large-scale dataset for object detection in aerial images[C].Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018: 3974-3983.
    [28] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C].Proceedings of the 13th European Conference on Computer Vision, Springer, 2014: 818-833.
    [29] CHEN K, WANG J Q, PANG J M,et al.. MMDetection: open MMLab detection toolbox and benchmark[J]. arXiv: 1906.07155v1, 2019.
    [30] CHEN CH Y, GONG W G, CHEN Y L,et al. Object detection in remote sensing images based on a scene-contextual feature pyramid network[J].Remote Sensing, 2019, 11(3): 339.doi:10.3390/rs11030339
  • 加载中
图(8)/ 表(6)
计量
  • 文章访问数:1905
  • HTML全文浏览量:451
  • PDF下载量:98
  • 被引次数:0
出版历程
  • 收稿日期:2020-04-07
  • 修回日期:2020-05-11
  • 网络出版日期:2020-10-22
  • 刊出日期:2020-12-01

目录

    /

      返回文章
      返回
        Baidu
        map