Air Quality Assessment by Monitoring PM10 and PM2.5 Parameters Using Multispectral Satellite Images

Document Type : Scientific Research Paper

Authors

1 School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran

2 Department of Geomatics Engineering, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia

Abstract

Background and Objectives: Air pollution, particularly particulate matter (PM2.5 and PM10), is a major problem in large urban areas, with severe impacts on public health, ecosystems, and overall quality of life. These problems are especially pronounced in densely populated cities such as Tehran, where air quality management is of utmost importance. Accurate monitoring and forecasting of air quality are essential for developing effective public health and policy strategies. However, the limited spatial coverage of ground-based air quality monitoring stations prevents comprehensive observation of air quality variations across the entire city. To address this limitation, this study used satellite imagery from Landsat-8 and Sentinel-2 to predict PM2.5 and PM10 concentrations. By combining spectral reflectance data with machine learning methods, it aimed to identify efficient predictive models and to determine the most influential spectral bands for estimating particulate matter concentrations.

Materials and Methods: The study began by developing linear regression models using single-band reflectance and multi-band combinations to establish relationships between spectral data and particulate matter concentrations. To capture more complex patterns, nonlinear regression models were also examined. For optimal feature selection, a hybrid Genetic Algorithm-Support Vector Regression (GA-SVR) method was implemented. The Genetic Algorithm (GA) identified the optimal spectral band combinations, while Support Vector Regression (SVR) constructed robust predictive models based on these optimized features. Key evaluation metrics, including the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE), were used to assess and compare model performance. To ensure reliability and generalizability, data were divided into training (70%) and testing (30%) subsets, and cross-validation was applied to validate the models’ robustness.
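The evaluation workflow described above (a 70/30 split, a single-band linear model, and the R², RMSE, and MAE metrics) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the random seed, and the synthetic reflectance and concentration values are all assumptions made for the example, and R² is returned as a fraction rather than a percentage.

```python
import math
import random

def train_test_split(rows, train_frac=0.7, seed=0):
    """Shuffle rows and split them into training and testing subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (one band as predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def r2(obs, pred):
    """Coefficient of determination, as a fraction (multiply by 100 for %)."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

# Synthetic (reflectance, PM concentration) pairs, invented for illustration only.
rng = random.Random(42)
samples = [(0.05 + 0.01 * i, 20.0 + 300.0 * (0.05 + 0.01 * i) + rng.gauss(0, 5))
           for i in range(30)]

train, test = train_test_split(samples)
a, b = fit_line([x for x, _ in train], [y for _, y in train])

for name, subset in (("train", train), ("test", test)):
    obs = [y for _, y in subset]
    pred = [a + b * x for x, _ in subset]
    print(name, round(r2(obs, pred), 3), round(rmse(obs, pred), 2),
          round(mae(obs, pred), 2))
```

Fitting on the training subset only and reporting the metrics separately for both subsets mirrors the training and testing figures quoted in the abstract.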

Results and Discussion: The findings revealed that the visible spectrum bands of Landsat-8 and Sentinel-2 showed strong correlations with PM2.5 and PM10 concentrations. Linear regression models developed using bands 1 and 2 of Landsat-8 and bands 2, 3, and 4 of Sentinel-2 achieved significant correlations in the training datasets. For Landsat-8, the R² values for PM2.5 were 70.56% and 67.24% for the training and testing datasets, respectively, while Sentinel-2 reached an R² of 68.89% for the testing dataset. The RMSE values for Landsat-8 were 7.01 and 7.48 for the training and testing datasets, respectively, while Sentinel-2 performed slightly better, with RMSE values of 6.93 and 7.32 for the same datasets. These results highlight the effectiveness of Sentinel-2 imagery in predicting particulate matter concentrations.

In the nonlinear regression analysis, power models showed the highest R² values among the tested models. The normalized RMSE (NRMSE) values ranged between 0.066 and 0.115, demonstrating greater accuracy than linear models. Although nonlinear models proved more capable of capturing complex relationships, their high computational costs and only marginal accuracy improvements suggest that combining linear models with feature optimization is a more practical approach.
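As an illustration of the power model and the NRMSE metric mentioned above, the sketch below fits y = a * x^b by linear least squares on log-transformed data. Two assumptions are made here and are not taken from the study: the fit is done in log space (which minimizes log-scale error rather than the raw residuals), and NRMSE is normalized by the range of the observations, since the abstract does not state which normalization was used. The sample values are invented.

```python
import math

def fit_power(xs, ys):
    """Fit y = a * x**b via least squares on (log x, log y); needs x, y > 0."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = math.exp(my - b * mx)
    return a, b

def nrmse(obs, pred):
    """RMSE divided by the observed range (assumed normalization)."""
    err = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return err / (max(obs) - min(obs))

# Hypothetical reflectance values and concentrations following an exact power law.
xs = [0.1, 0.2, 0.4, 0.8]
ys = [3.0 * x ** 1.5 for x in xs]

a, b = fit_power(xs, ys)
pred = [a * x ** b for x in xs]
print(a, b, nrmse(ys, pred))
```

Because NRMSE is dimensionless, it allows PM2.5 and PM10 models to be compared even though the two pollutants have different concentration ranges.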

The GA-SVR model yielded the best prediction accuracy, showing that shorter wavelengths play a crucial role in estimating particulate matter concentrations. With optimized feature selection, this model achieved an R² close to 70%, underscoring the potential of GA-SVR as a powerful tool for enhancing prediction accuracy in air quality studies.
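The band-selection half of a GA-SVR workflow can be sketched as a genetic algorithm over binary band masks. The sketch below is an assumption-laden illustration rather than the authors' setup: the population size, tournament size, crossover scheme, and mutation rate are arbitrary, and the fitness function is a toy stand-in with made-up weights. In a GA-SVR pipeline the fitness would instead come from training an SVR on the selected bands and scoring it on held-out data.

```python
import random

def ga_select(n_bands, fitness, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Search binary band masks (1 = band used) that maximize fitness(mask)."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bands))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        next_pop = [best]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_bands)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation applied independently to each band.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best

# Toy stand-in for "SVR validation score": hypothetical per-band usefulness,
# largest for the first (shortest-wavelength) bands, minus a per-band penalty.
TOY_WEIGHTS = [0.5, 0.4, 0.3, 0.05, 0.02, 0.01, 0.0]

def toy_fitness(mask):
    gain = sum(w * bit for w, bit in zip(TOY_WEIGHTS, mask))
    return gain - 0.1 * sum(mask)

best_mask = ga_select(len(TOY_WEIGHTS), toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Keeping the fitness function as a parameter means the same search loop can wrap any regressor, so the toy scorer can be replaced with a cross-validated SVR score without changing the genetic algorithm itself.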

Conclusion: This study underscores the importance of visible spectrum bands in predicting air quality. Sentinel-2 imagery, when combined with the optimal spectral bands identified through the GA-SVR method, demonstrated superior accuracy in estimating PM2.5 concentrations. Linear regression models yielded reliable results; however, the integration of feature optimization and advanced machine learning methods significantly enhanced prediction performance. The GA-SVR model achieved R² values as high as 70.56%, underscoring the effectiveness of optimized models for air quality monitoring across various spatial scales. These findings show that multispectral satellite imagery combined with machine learning techniques can help address the complexities of urban air pollution, offering a framework for more informed environmental management and decision-making.

Keywords