Daily PM2.5 concentration forecasting in Mashhad metropolis using long short-term memory (LSTM) neural networks and feature importance analysis

Authors: Aliasghar Dehqanpour Aliaqa, Seyed Javad Rasouli, Mohammad Reza Mansouri Daneshvar
Journal: Air Quality Atmosphere and Health
Page number: 1-18
Serial number: 19
Volume number: 27
Paper Type: Full Paper
Published At: 2026
Journal Grade: ISI
Journal Type: Electronic
Journal Country: Iran, Islamic Republic Of
Journal Index: JCR, Scopus
Keywords: PM2.5 forecasting · Deep learning · Feature selection · Air quality management · Urban

Abstract

Urban air quality, particularly the concentration of fine particulate matter less than 2.5 micrometers in diameter (PM2.5), poses a significant environmental and public health challenge, necessitating accurate forecasting in major metropolitan areas. This study develops and evaluates daily PM2.5 concentration forecasting models for Mashhad, one of Iran’s largest metropolitan centers. Long Short-Term Memory (LSTM) neural networks were utilized to model the complex PM2.5 concentration time series at three air quality monitoring stations: Taghiabad, Nakhrisi, and Resalat. The input dataset comprised the previous day’s mean PM2.5 concentration, meteorological parameters (temperature, relative humidity, wind speed and direction, precipitation, and solar radiation), and temporal features (day of the week and month of the year), spanning the period from 2018 to 2023. A feature importance analysis was conducted using Normalized Mutual Information (NMI) to identify the most influential predictors (specifically, the previous day’s PM2.5 concentration and relative humidity) and to exclude variables with low correlation, such as wind direction and solar radiation. To optimize forecasting performance and ensure station-specific adaptation, the LSTM model’s hyperparameters were independently tuned for each station using a Genetic Algorithm (GA). The results showed that the proposed LSTM models delivered strong and reliable performance. The Taghiabad station achieved the highest accuracy, with R² = 0.86 and RMSE = 5.82 µg/m³, followed by the Resalat and Nakhrisi stations with R² values of 0.83 and 0.73 and RMSEs of 6.37 µg/m³ and 8.36 µg/m³, respectively. These predictive accuracies highlight the considerable potential of the proposed model as an effective tool for urban air quality management and timely public health advisories in megacities with similar environmental conditions.
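
The NMI-based feature ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the variables were discretized or which NMI normalization was used, so the equal-width binning, the geometric-mean normalization, the function names, and the synthetic series below are all assumptions made for illustration; none of them are the authors' implementation or data.

```python
import math
import random
from collections import Counter


def equal_width_bins(values, n_bins=10):
    """Discretize a numeric series into equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant series: a single bin
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(x, y, n_bins=10):
    """NMI = I(X;Y) / sqrt(H(X) * H(Y)); the normalization is an assumption."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    hx, hy = entropy(bx), entropy(by)
    if hx <= 0 or hy <= 0:
        return 0.0  # a constant variable carries no information
    mi = max(0.0, hx + hy - entropy(list(zip(bx, by))))
    return mi / math.sqrt(hx * hy)


# Synthetic stand-ins (NOT the study's data): a lagged PM2.5 series that
# drives the target, and a wind direction that is unrelated to it.
random.seed(0)
pm25_lag1 = [random.uniform(10, 80) for _ in range(500)]
pm25_today = [0.8 * p + random.gauss(0, 5) for p in pm25_lag1]
wind_direction = [random.uniform(0, 360) for _ in range(500)]

scores = {
    "pm25_lag1": normalized_mutual_information(pm25_lag1, pm25_today),
    "wind_direction": normalized_mutual_information(wind_direction, pm25_today),
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: NMI = {score:.3f}")
```

In a workflow like the one described, features whose score falls below a chosen threshold would be dropped before the forecasting model is trained. The threshold itself is a modeling choice that the abstract does not report.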
