Evaluation of the Deep Learning-Based Sat-MVSF Algorithm in DSM Extraction from High Resolution Satellite Images

Document Type: Original Article

Authors

1 Dep. of Geomatics Engineering, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran

2 Dep. of Geomatics Engineering, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran

3 Dep. of Geomatics Engineering, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran

Abstract

Introduction: The extraction of 3D geospatial information about the Earth’s surface from remote sensing and photogrammetric data has become a central and widely studied topic in the geosciences, attracting increasing attention from researchers in recent years. One of the most important products of such data is the Digital Surface Model (DSM), which, in addition to the terrain represented by the Digital Elevation Model (DEM), includes all natural and man-made features such as vegetation, trees, buildings, and other structures. DSM extraction plays a crucial role in a wide range of applications, including urban planning, building detection, disaster management, 3D modeling, and change monitoring. In recent years, advances in deep learning have significantly influenced the extraction of 3D information from remote sensing data. Traditional 3D reconstruction methods often face challenges such as managing large datasets, complexity in extracting features, and difficulty in recovering accurate details. In this context, the use of deep neural networks to extract complex features from multi-view images has introduced a transformative approach in this domain.
Material and Methods: A novel deep learning-based algorithm, Sat-MVSF, has recently been developed for DSM extraction from multi-view satellite images. It performs all steps, from image preprocessing to final DSM generation, based on deep learning. Given the limited availability of training data and the authors' claims regarding the generalizability of the trained model weights, the objective of this study is to evaluate the performance of the Sat-MVSF algorithm in generating DSMs from high-resolution satellite images. The main innovations of this research include: 1) Preparation of three sets of WorldView-3 satellite data and two sets of ZY3-2 satellite data, involving block bundle adjustment for RPC refinement and reference DSM generation using LiDAR point clouds. 2) DSM extraction using the Sat-MVSF algorithm for multi-view images from both the WorldView-3 and ZY3-2 sensors, followed by performance comparison against existing algorithms such as S2P and SS-DSM, as well as commercial software including CATALYST and ERDAS IMAGINE. To ensure a comprehensive evaluation, the performance of all algorithms is analyzed across three types of areas: (1) non-built areas, (2) building areas with moderate elevation changes, and (3) building areas with significant elevation changes. The dataset used in this study consists of five sets of satellite images (three from WorldView-3 and two from ZY3-2), with each set containing three overlapping images.
Results and Discussion: The results demonstrate that Sat-MVSF outperforms many existing algorithms and commercial software in DSM extraction. For WorldView-3 imagery, Sat-MVSF achieves an average vertical accuracy of 1.1 meters and a completeness of 87%, surpassing SS-DSM and the commercial tools. On the other hand, S2P provides slightly better height accuracy (1.0 meters), suggesting that Sat-MVSF is less precise in terms of elevation RMSE but still competitive. However, the performance of the S2P algorithm on the WV3-3 dataset is highly dependent on the study area, given that it has low elevation completeness. In the ZY3-2 datasets, Sat-MVSF achieves elevation accuracies of 2.43 and 3.27 meters, indicating acceptable performance. More specifically, in the first two WorldView-3 datasets, S2P attains the best performance, with completeness of 90.76% and 90.16% and elevation accuracies of 0.94 and 1.1 meters, respectively. In the third dataset, Sat-MVSF leads with a completeness of 83% and an elevation accuracy of 1.04 meters. The results also show that S2P performs best in building zones with significant elevation changes, with accuracies of 1.03, 1.14, and 0.88 meters for the first, second, and third datasets, respectively, while CATALYST achieves the highest accuracy in non-built-up areas, with values of 0.71, 1.12, and 0.68 meters across the same datasets. Overall, commercial software such as CATALYST and ERDAS IMAGINE exhibits higher height errors in built-up areas with significant elevation differences. The reason is that these software packages use interpolation methods to fill gaps, which reduces accuracy in building areas with height differences.
The choice of height threshold also affects the evaluation criteria. If a large height threshold is used when computing height accuracy and height completeness, pixels with large height errors are counted as correct pixels, so the error increases and both criteria optimistically take high values. With a small height threshold, both criteria take low values.
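The effect of the height threshold on the two criteria can be illustrated with a minimal sketch. The definitions below are assumptions made for illustration, not the formulas used in this study: completeness is taken as the share of valid reference pixels whose absolute height error is below the threshold, and height accuracy as the RMSE over those accepted pixels. The function name `dsm_metrics` and the toy arrays are hypothetical.

```python
import numpy as np


def dsm_metrics(dsm, reference, threshold):
    """Return (completeness, rmse) of a DSM against a reference DSM.

    Assumed definitions (illustrative, not taken from the study):
      completeness = accepted pixels / valid reference pixels
      rmse         = root-mean-square height error over accepted pixels
    A pixel is accepted when both rasters have a height (not NaN) and
    the absolute height error is below `threshold` (same unit as heights).
    """
    dsm = np.asarray(dsm, dtype=float)
    reference = np.asarray(reference, dtype=float)

    valid = ~np.isnan(dsm) & ~np.isnan(reference)
    # Pixels without a height get an infinite error so they are never accepted.
    abs_err = np.abs(np.where(valid, dsm - reference, np.inf))
    accepted = abs_err < threshold

    n_ref = np.count_nonzero(~np.isnan(reference))
    completeness = accepted.sum() / n_ref if n_ref else float("nan")
    rmse = (
        float(np.sqrt(np.mean(abs_err[accepted] ** 2)))
        if accepted.any()
        else float("nan")
    )
    return float(completeness), rmse


# Toy 2 x 3 example: flat reference surface, one missing DSM pixel.
reference = np.zeros((2, 3))
dsm = np.array([[0.5, -0.5, 1.5],
                [3.0, np.nan, 0.0]])

for t in (1.0, 5.0):
    completeness, rmse = dsm_metrics(dsm, reference, t)
    print(f"threshold={t} m  completeness={completeness:.2f}  rmse={rmse:.2f} m")
```

Under these assumed definitions, raising the threshold from 1 m to 5 m admits the pixels with 1.5 m and 3.0 m errors, so completeness and RMSE both rise, which matches the behavior described above.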
