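X-ray security inspection image classification combining octave convolution and bidirectional gated recurrent units

WU Hai-bin^1; WEI Xi-ying^1; WANG Ai-li^1; Yuji Iwahori^2

doi: 10.37188/CO.2020-0073

1. College of Measurement-Control Technology and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, Heilongjiang, China
2. College of Computer Science, Chubu University, Aichi 487-8501, Japan

Funding: National Natural Science Foundation of China (No. 61671190)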

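More Information

Author biographies:
WU Hai-bin (born 1977), male, from Shanghai, Ph.D., professor. He received his master's degree from Harbin Institute of Technology in 2002 and his Ph.D. from Harbin University of Science and Technology in 2008. He is currently a professor at the College of Measurement-Control Technology and Communication Engineering, Harbin University of Science and Technology. His research interests include machine vision, medical virtual reality, and deep-learning-based image classification. E-mail: woo@hrbust.edu.cn

WANG Ai-li (born 1979), female, from Tianjin, Ph.D., associate professor. She received her Ph.D. from Harbin Institute of Technology in 2008. She is currently an associate professor at the College of Measurement-Control Technology and Communication Engineering, Harbin University of Science and Technology. Her research interests include machine vision and deep-learning-based image classification. E-mail: aili925@hrbust.edu.cn

CLC number: TP391.4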
Publication history
- Received: 2020-04-23
- Revised: 2020-06-15
- Published online: 2020-09-16
- Issue date: 2020-10-01

Corresponding author: aili925@hrbust.edu.cn

Abstract

Active vision security inspection methods suffer from low accuracy and slow speed, which makes them unsuitable for real-time traffic security inspection. To address this problem, we propose an X-ray security inspection image classification method that combines octave convolution (OctConv) with an attention-based bidirectional gated recurrent unit (GRU) neural network. First, OctConv replaces traditional convolution: it splits the input features into high-frequency and low-frequency components and reduces the resolution of the low-frequency features, which extracts the features of X-ray security images effectively while reducing spatial redundancy. Second, the attention-based bidirectional GRU dynamically learns and adjusts the feature weights to improve the classification accuracy of threat objects. Finally, experiments on the public SIXray dataset show that, on 8000 test samples, the overall classification accuracy (ACC), the area under the ROC curve (AUC), and the positive-class classification accuracy (PRE) are 98.73%, 91.39%, and 85.44%, respectively, with a detection time of 36.80 s. Compared with current mainstream models, the proposed method improves both the accuracy and the speed of threat object classification in X-ray security images.

Keywords:
- X-ray security inspection images
- octave convolution
- bidirectional gated recurrent unit (GRU)
- attention mechanism

Figure 1. Block diagram of X-ray security image classification algorithm

Figure 2. The structure of octave convolution
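
The abstract describes OctConv as keeping part of the channels at full resolution and storing the rest at reduced resolution. The following minimal sketch counts the multiply-accumulates of the four OctConv paths (high to high, high to low, low to high, low to low) to show where the saving comes from. The layer size (64 channels, 3 x 3 kernel, 32 x 32 feature map) and the low-frequency ratio `alpha = 0.5` are assumptions chosen for illustration; this page does not state the values the authors used, and the cost of pooling and upsampling is ignored.

```python
def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate count of a stride-1, same-padded k x k convolution."""
    return c_in * c_out * k * k * h * w


def octconv_flops(c_in, c_out, k, h, w, alpha):
    """Multiply-accumulates of the four OctConv paths for low-frequency ratio alpha.

    A fraction alpha of the channels is stored at half resolution (h/2 x w/2);
    the remaining channels stay at full resolution (h x w).
    """
    lo_in, lo_out = round(alpha * c_in), round(alpha * c_out)
    hi_in, hi_out = c_in - lo_in, c_out - lo_out
    h2, w2 = h // 2, w // 2
    high_to_high = conv_flops(hi_in, hi_out, k, h, w)
    high_to_low = conv_flops(hi_in, lo_out, k, h2, w2)  # pool first, then convolve
    low_to_high = conv_flops(lo_in, hi_out, k, h2, w2)  # convolve, then upsample
    low_to_low = conv_flops(lo_in, lo_out, k, h2, w2)
    return high_to_high + high_to_low + low_to_high + low_to_low


def octconv_memory_ratio(alpha):
    """Feature-map storage relative to keeping every channel at full resolution."""
    return (1 - alpha) + alpha / 4


# Hypothetical layer configuration, not taken from the paper.
vanilla = conv_flops(64, 64, 3, 32, 32)
octave = octconv_flops(64, 64, 3, 32, 32, alpha=0.5)
flops_ratio = octave / vanilla

print(f"FLOPs ratio:  {flops_ratio:.4f}")
print(f"memory ratio: {octconv_memory_ratio(0.5):.4f}")
```

Under these assumptions the count matches the closed form 1 - (3/4) * alpha * (2 - alpha) for compute and 1 - (3/4) * alpha for feature-map storage.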

Figure 3. The structure of double-layer BiGRU
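
As a rough illustration of how an attention-based bidirectional GRU can reweight features, the sketch below runs a scalar GRU forward and backward over a short sequence and pools the paired states with softmax attention. It is a single-layer toy with scalar states, not the double-layer network of Figure 3. All parameter values, the input sequence, and the scoring vector `v` are hypothetical, and the update rule uses one common sign convention for the update gate (some formulations swap `z` and `1 - z`).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU update for a scalar input x and scalar hidden state h_prev."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev)  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev)  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * r * h_prev)  # candidate state
    return (1.0 - z) * h_prev + z * h_cand


def run_gru(xs, p):
    """Run the GRU over a sequence and return every hidden state."""
    h, states = 0.0, []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bigru(xs, p_fwd, p_bwd):
    """Pair each position's forward state with its backward state."""
    fwd = run_gru(xs, p_fwd)
    bwd = run_gru(xs[::-1], p_bwd)[::-1]
    return list(zip(fwd, bwd))


def attention_pool(states, v):
    """Softmax-weighted sum of the BiGRU states; v is a scoring vector."""
    scores = [math.tanh(v[0] * f + v[1] * b) for f, b in states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = (
        sum(w * f for w, (f, _) in zip(weights, states)),
        sum(w * b for w, (_, b) in zip(weights, states)),
    )
    return weights, context


# Hypothetical toy parameters and input sequence, chosen only for illustration.
P_FWD = {"wz": 0.5, "uz": 0.1, "wr": 0.3, "ur": 0.2, "wh": 0.8, "uh": 0.4}
P_BWD = {"wz": 0.4, "uz": 0.2, "wr": 0.6, "ur": 0.1, "wh": 0.7, "uh": 0.3}
features = [0.2, -0.5, 1.0, 0.3]

states = bigru(features, P_FWD, P_BWD)
weights, context = attention_pool(states, v=(1.0, 1.0))
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", tuple(round(c, 3) for c in context))
```

The attention weights sum to one, so the context vector is a convex combination of the per-position states; in a trained model the scoring parameters would be learned so that informative positions receive larger weights.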

Figure 4. SIXray dataset

Table 1. Sample distribution in SIXray dataset

Positive samples (8929)					Negative samples
Gun	Knife	Wrench	Pliers	Scissors	Negative samples
3131	1943	2199	3961	983	1050302

Table 2. Comparison results of different types of data before and after data augmentation

Class	Augmentation	Negative samples	Positive samples	Imbalance ratio
Gun	Before	72255	2705	26.27
Gun	After	89672	12659	7.08
Knife	Before	73212	1748	41.88
Knife	After	93723	8608	10.89
Wrench	Before	72948	2012	36.26
Wrench	After	92380	9951	9.28
Pliers	Before	71524	3436	20.82
Pliers	After	85574	16757	5.10
Scissors	Before	74153	807	91.89
Scissors	After	99760	2571	38.80
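
The imbalance ratio in Table 2 appears to be the number of negative samples divided by the number of positive samples; that reading is an assumption, since this page gives no formula. The short sketch below recomputes the ratio from the counts in the table. Most rows agree with the tabulated values to within the last digit; the pre-augmentation gun row comes out near 26.71 from the listed counts, whereas the table lists 26.27.

```python
# (negative, positive) sample counts copied from Table 2.
TABLE_2 = {
    ("gun", "before"): (72255, 2705),
    ("gun", "after"): (89672, 12659),
    ("knife", "before"): (73212, 1748),
    ("knife", "after"): (93723, 8608),
    ("wrench", "before"): (72948, 2012),
    ("wrench", "after"): (92380, 9951),
    ("pliers", "before"): (71524, 3436),
    ("pliers", "after"): (85574, 16757),
    ("scissors", "before"): (74153, 807),
    ("scissors", "after"): (99760, 2571),
}


def imbalance_ratio(negatives, positives):
    """Negative-to-positive ratio (assumed definition of the table column)."""
    return negatives / positives


ratios = {key: imbalance_ratio(*counts) for key, counts in TABLE_2.items()}
totals = {key: sum(counts) for key, counts in TABLE_2.items()}

for (cls, stage), ratio in ratios.items():
    print(f"{cls:8s} {stage:6s} ratio={ratio:6.2f} total={totals[(cls, stage)]}")
```

The totals are the same for every class within a stage, which is consistent with each class being treated as a one-versus-rest problem over the same image set.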

Table 3. Comparison of ACC (%) for different models

Method	Gun	Knife	Wrench	Pliers	Scissors	Average
InceptionV3	94.63	87.52	88.97	80.50	96.95	89.71
VGG19	97.88	98.36	97.48	96.03	97.33	97.42
ResNet	98.36	99.20	98.16	96.10	97.80	97.92
DenseNet	98.69	99.25	98.18	96.16	97.65	97.99
STN-DenseNet	99.15	98.73	97.52	96.32	98.46	98.03
OnlyBiGRU	98.77	99.40	97.73	94.37	99.14	97.88
CNN-ABiGRU	98.89	99.42	98.89	97.07	98.96	98.65
OctConv-ABiGRU	98.60	99.25	99.10	97.50	99.20	98.73

Table 4. Comparison of AUC (%) for different models

Method	Gun	Knife	Wrench	Pliers	Scissors	Average
InceptionV3	63.34	54.57	51.33	52.92	50.74	54.57
VGG19	93.34	89.03	77.49	76.57	71.08	81.50
ResNet	94.06	88.68	76.00	73.92	60.45	78.64
DenseNet	93.91	90.37	72.59	74.65	61.08	78.52
STN-DenseNet	95.69	93.58	75.60	76.98	65.09	81.39
OnlyBiGRU	92.73	93.90	68.03	73.33	89.42	83.48
CNN-ABiGRU	93.96	93.94	82.22	80.09	87.99	87.65
OctConv-ABiGRU	91.53	94.59	87.84	86.15	96.70	91.39

Table 5. Comparison of detection time for different networks

Method	Parameters (millions)	Model size (MB)	Detection time (s)
VGG19	45.12	344	41.56
DenseNet	57.22	437	24.91
CNN-ABiGRU	14.42	108	75.14
OctConv-ABiGRU	121.47	1382	36.80
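
To relate the detection times to real-time use, the sketch below converts them into per-image latency and throughput. The abstract states that the proposed method's 36.80 s covers 8000 test samples; applying the same test-set size to the other three models is an assumption.

```python
TEST_IMAGES = 8000  # test-set size quoted in the abstract

# Detection times in seconds, copied from Table 5.
DETECTION_TIME_S = {
    "VGG19": 41.56,
    "DenseNet": 24.91,
    "CNN-ABiGRU": 75.14,
    "OctConv-ABiGRU": 36.80,
}

per_image_ms = {m: t / TEST_IMAGES * 1000 for m, t in DETECTION_TIME_S.items()}
images_per_s = {m: TEST_IMAGES / t for m, t in DETECTION_TIME_S.items()}

for model in DETECTION_TIME_S:
    print(f"{model:15s} {per_image_ms[model]:6.2f} ms/image "
          f"{images_per_s[model]:7.1f} images/s")
```

Under this assumption the proposed model needs about 4.6 ms per image, roughly half the time of CNN-ABiGRU, while DenseNet remains the fastest of the four.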

Table 6. Comparison of PRE (%) for different methods

Method	Gun	Knife	Wrench	Pliers	Scissors	Average
VGG19	87.20	86.40	56.60	55.20	46.20	66.32
DenseNet	88.20	82.18	51.25	54.50	38.50	62.93
CNN-ABiGRU	88.50	87.20	63.00	61.20	76.40	75.26
OctConv-ABiGRU	86.78	92.22	77.44	76.22	94.56	85.44
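
For reference, the three metrics reported in Tables 3, 4, and 6 can be computed for one class as sketched below. This page does not define PRE; the sketch assumes it denotes precision, TP / (TP + FP), and computes AUC as the probability that a random positive sample scores above a random negative one. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not data from the paper.

```python
def accuracy(labels, preds):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def precision(labels, preds):
    """TP / (TP + FP); returns 0.0 when nothing is predicted positive."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    return tp / (tp + fp) if tp + fp else 0.0


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = threat object present, 0 = absent.
labels = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.35]
preds = [int(s >= 0.5) for s in scores]

acc_value = accuracy(labels, preds)
pre_value = precision(labels, preds)
auc_value = auc(labels, scores)
print(f"ACC={acc_value:.3f} PRE={pre_value:.3f} AUC={auc_value:.3f}")
```

Accuracy counts the many correctly rejected negatives, so on heavily imbalanced data it can stay high while AUC and precision remain low, which is the pattern seen for several baselines in the tables.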
