A new forecasting approach using the combination of machine learning to predict flood susceptibility (case study: Karun catchment)

Document Type : Original Article

Authors

1 MSc, Department of Environment, University of Environment, Karaj, Iran

2 Professor, Department of Environment, Islamic Azad University, Tabriz, Iran

Abstract

Introduction: Owing to its environmental diversity, Iran ranks high among countries affected by crises caused by natural disasters. Flooding, one of these disasters, causes severe social, economic, health, and environmental damage in many areas as a result of rapid urban growth and climate change. Spatial forecasting of floods is therefore crucial, because failing to identify flood-risk areas in a catchment can exacerbate the destructive effects of floods. Recent advances in remote sensing, geographic information systems, machine learning, and statistical modelling have made it possible to produce highly accurate flood prediction maps. This study aims to predict flood-risk areas in the Karun watershed using Sentinel satellite images and a novel ensemble approach that combines six machine learning models.
Materials and Methods: Synthetic Aperture Radar (SAR) data from Sentinel-1 images were used to identify areas affected by flooding. First, the dates of heavy rainfall and flood events in the study area were identified from various information sources. Sentinel-1 images representing the area before and after the flood events were then obtained from the Copernicus database and processed on the SNAP platform. Flood-affected areas were identified with a thresholding technique: the Normalized Difference Water Index (NDWI) derived from Sentinel-2 images and the land cover classes indicating permanent water bodies were used to determine the threshold. The resulting flood polygon layer was converted to a point layer, yielding a total of 70 flood occurrence points. A review of previous studies and of local characteristics identified seven main factors that significantly affect flood occurrence in the region: the Normalized Difference Vegetation Index (NDVI), Topographic Wetness Index (TWI), slope, flow direction, flow accumulation, distance from the river, and monthly rainfall. The Digital Elevation Model (DEM) of the region was obtained from the SRTM database, and the spatial resolution of all factors was aligned with the DEM layer. Six machine learning algorithms were then used to develop a combined model intended to predict flood-prone areas more accurately than any single model: the Generalized Linear Model (GLM), Boosted Regression Tree (BRT), Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Splines (MARS), and Maximum Entropy (MaxEnt).
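The two index definitions and the thresholding step described above can be illustrated with a minimal Python sketch. It is not the authors' SNAP workflow. The formulas are the standard ones (NDWI as the normalized difference of the green and near-infrared bands, TWI as the logarithm of specific catchment area over the tangent of slope). The function names, the use of backscatter in decibels, and the threshold value are assumptions made for illustration only; the abstract does not report them.

```python
import math


def ndwi(green, nir):
    """Standard NDWI: (Green - NIR) / (Green + NIR); returns 0.0 when both bands are 0."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def twi(specific_catchment_area, slope_radians):
    """Standard TWI: ln(a / tan(beta)); the tangent is floored to avoid division by zero on flat cells."""
    tan_beta = max(math.tan(slope_radians), 1e-6)
    return math.log(specific_catchment_area / tan_beta)


def flood_mask(backscatter_db, threshold_db, permanent_water):
    """Flag cells darker than the threshold that are not permanent water.

    Assumption: open water appears as low SAR backscatter, so values below
    `threshold_db` (a hypothetical value, not taken from the study) count as water.
    """
    return [
        [(value < threshold_db) and not is_permanent
         for value, is_permanent in zip(row, water_row)]
        for row, water_row in zip(backscatter_db, permanent_water)
    ]


# Hypothetical 1 x 3 post-flood scene: the first cell is a known permanent water body.
scene = [[-21.0, -19.5, -9.0]]
water = [[True, False, False]]
mask = flood_mask(scene, threshold_db=-15.0, permanent_water=water)
```

In this toy scene only the middle cell is flagged as flooded: the first cell is dark but is excluded as permanent water, and the third is brighter than the assumed threshold.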
                                
Results and Discussion: The results indicate that the following areas have the highest flood potential in the basin: the northeast of Aligudarz city and parts of Durud and Azna in Lorestan province; Khademmirza, Shahrekord, and Kiyar in Chaharmahal and Bakhtiari province; Dana and Boyer Ahmad in Kohgiluyeh and Boyer Ahmad province; Semirom city in Isfahan province; and the southern border areas of the Karun River in Khuzestan province. The performance evaluation showed that Random Forest (RF) and Maximum Entropy (MaxEnt) were the most accurate of the individual models. By combining environmental information with flood occurrence data, these models can produce highly accurate flood susceptibility maps. Such maps can serve as crucial management tools for mitigating the adverse effects of floods and preventing development in vulnerable areas.
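One common way to combine individual models into an ensemble is a performance-weighted average of their predictions. The Python sketch below illustrates that idea under stated assumptions: the abstract does not say which combination rule or accuracy metric the authors used, so the AUC-based weights, the function names, and all numbers here are hypothetical.

```python
def auc(scores_flood, scores_non_flood):
    """AUC as the share of (flood, non-flood) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in scores_flood:
        for n in scores_non_flood:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_flood) * len(scores_non_flood))


def weighted_ensemble(predictions, weights):
    """Weighted mean of per-cell susceptibility values across models.

    predictions: dict of model name -> list of values in [0, 1], one per cell
    weights:     dict of model name -> non-negative weight (for example, an AUC)
    """
    total_weight = sum(weights[name] for name in predictions)
    n_cells = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions) / total_weight
        for i in range(n_cells)
    ]


# Hypothetical susceptibility values for two cells from two of the six models.
predictions = {"RF": [1.0, 0.0], "GLM": [0.0, 1.0]}
weights = {"RF": 3.0, "GLM": 1.0}  # illustrative weights, not reported results
combined = weighted_ensemble(predictions, weights)
```

With these illustrative weights the better-scoring model dominates the combined map, which is the intended effect of weighting by performance.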
 
Conclusion: Overall, this study demonstrates that an ensemble approach combining several machine learning models can provide more reliable predictions of flood-risk areas. The findings are useful to managers and planners, who can use them to prevent development in vulnerable areas and thereby help reduce future financial and human losses.
 
