A Fully Convolutional Neural Network-Based Approach for the Simultaneous Detection of Roads and Buildings in Aerial Images

Article type: Research article

Authors

1 Associate Professor, Faculty of Information Technology and Computer Engineering, Azarbaijan Shahid Madani University, Tabriz

2 M.Sc. Student, Faculty of Information Technology and Computer Engineering, Azarbaijan Shahid Madani University, Tabriz

Abstract

Developing automatic systems for detecting roads and buildings in aerial images has always faced major challenges, such as the varied appearance of buildings, changes in illumination, the imaging angle, and the compactness and density of roads and buildings in urban areas. In recent years, multilayer artificial neural networks (deep neural networks) have attracted the attention of many researchers in this field (and in similar fields), and their use has produced striking results. Nevertheless, because the proposed solutions rely on fully connected layers, the average processing time is still very long and the resulting model quickly overfits. In addition, most methods proposed for interpreting aerial images in this way take a single-class approach. In other words, roads and buildings cannot be distinguished from natural features simultaneously, and a separate model must be built to detect each of them. The main goal of this research is to design a new architecture whose resulting model can simultaneously distinguish roads and buildings from natural features and thereby minimize the complexity of the classification task. The design of the proposed architecture also aims to remove the fully connected layers from the conventional multilayer architecture and, as a result, to reduce the average processing time. Results of experiments on the Massachusetts aerial image dataset show that the proposed architecture is 38% faster than other methods based on multilayer neural networks and increases detection accuracy by 2% on average.


Article title [English]

A Fully Convolutional Neural Network-Based Approach for Detecting Simultaneously Roads and Buildings in Aerial Imagery

Authors [English]

  • Nacer Farajzadeh 1
  • Hiwa Ebrahimzadeh 2
1 Associate Prof., Faculty of IT and Computer Engineering Dep., Azarbaijan Shahid Madani University, Tabriz
2 M.Sc. Student, Faculty of IT and Computer Engineering Dep., Azarbaijan Shahid Madani University, Tabriz
Abstract [English]

The development of automatic road and building detection systems for aerial imagery has always faced challenges such as the varied appearance of buildings, illumination changes, differing imaging angles, and the density of roads and buildings in urban areas. In recent years, multi-layered artificial neural networks, known as deep neural networks, have attracted many researchers in this field (and in related fields) and have achieved striking results. However, the use of fully connected layers in these approaches significantly increases the average processing time and makes the resulting model prone to overfitting. In addition, most of these methods take a single-class approach. That is, roads and buildings cannot be distinguished from natural scenes at the same time, so a separate binary model must be built for each of them. The main goal of this research is to design a new architecture whose resulting model can simultaneously detect roads and buildings against natural scenes, thereby minimizing the complexity of the classification process. The proposed architecture also excludes all fully connected layers from the traditional multi-layered architecture in order to reduce the average processing time. Experiments on the Massachusetts dataset show that the proposed architecture runs 38% faster than other deep neural network-based methods and increases accuracy by 2% on average.
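The two design ideas named in the abstract, dropping fully connected layers and labelling roads and buildings in a single multi-class decision, can be illustrated with a small arithmetic sketch. It is not the authors' architecture: the feature-map size, channel count, class names, and all function names below are hypothetical and chosen only for illustration. It compares the parameter count of a fully connected prediction head with that of a 1x1 convolutional head, and shows a single three-way decision per pixel.

```python
# Illustrative arithmetic only: every size and name here is hypothetical
# and is NOT taken from the paper.

def dense_head_params(height, width, channels, num_outputs):
    """Weights plus biases of a fully connected layer on a flattened feature map."""
    inputs = height * width * channels
    return inputs * num_outputs + num_outputs


def conv1x1_head_params(channels, num_classes):
    """Weights plus biases of a 1x1 convolution that scores each pixel independently."""
    return channels * num_classes + num_classes


def label_pixel(scores, class_names=("natural", "road", "building")):
    """Return the highest-scoring class for one pixel (one multi-class decision)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best]


if __name__ == "__main__":
    h, w, c = 16, 16, 64   # hypothetical feature-map height, width, channels
    classes = 3            # natural, road, building

    # Fully connected head predicting a 16x16 patch of 3-class scores.
    dense = dense_head_params(h, w, c, h * w * classes)
    # 1x1 convolutional head producing the same per-pixel scores.
    conv = conv1x1_head_params(c, classes)

    print("fully connected head parameters:", dense)
    print("1x1 convolutional head parameters:", conv)
    print("label for scores [0.1, 0.7, 0.2]:", label_pixel([0.1, 0.7, 0.2]))
```

Under these assumed sizes the convolutional head needs far fewer parameters because its weights are shared across all pixel positions. This is one plausible reason a fully convolutional design can be faster and less prone to overfitting, though the 38% figure reported above comes from the authors' experiments, not from this sketch.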

Keywords [English]

  • Deep learning
  • Artificial neural networks
  • Convolutional neural networks
  • Aerial imagery
  • Road detection
  • Building detection
  • Natural scene detection
  • Artificial intelligence