Forecasting Air Quality in Peninsular Malaysia: Unveiling the Power of Artificial Neural Networks

doi:10.21203/rs.3.rs-4063318/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Air pollution poses significant risks to human health, the environment, and the economy. Therefore, striving for cleaner air through efficient air quality monitoring is imperative for fostering a healthier and more sustainable future. Predicting air quality is essential to enhance the quality of life, maintain environmental sustainability, and reduce the economic burden associated with poor air quality issues. The artificial neural network (ANN) is widely recognized as a predominant computational tool in air quality studies due to its capabilities in predicting gaseous and particulate pollutant concentrations, as well as forecasting the air pollutant index (API). This study aimed to investigate the predictive performance of ANN in determining the API by utilizing identified potential sources of air pollutants. Five prediction models were created, namely ANN-PC2018, ANN-PC2019, ANN-PC2020, ANN-PC2021, and ANN-PC2022. Principal component analysis (PCA) was conducted to identify the most significant sources of air pollution, and the results were employed to predict the API using ANN. The ANN-PC2019 model exhibited the highest performance with an R² value of 0.8612 and RMSE of 7.7467, utilizing four major pollutants as input variables. These findings suggest that forecasting air quality using fewer parameters yields reliable outcomes.

Artificial neural networks

air pollutant index

principal component analysis

Rapid urbanization and industrialization, particularly in developing countries, have led to increased air pollution. According to Latif et al. (2018) and Abdullah et al. (2019), ongoing processes of urbanization, industrialization, and population expansion have collectively led to a steady deterioration in air quality. This deterioration is marked by a rise in air pollutants stemming from diverse origins, such as emissions linked to industrial operations, vehicular combustion, and agricultural practices, as documented by Kean Hua (2018), Sentian et al. (2019), and Halim et al. (2020).

Air pollution poses substantial risks and implications for human health, environmental well-being, and the economy. The public widely recognizes air quality as a crucial aspect significantly impacting their daily lives, comfort, and overall health (Cao et al., 2022; Raunaq et al., 2023). Exposure to these pollutants can cause various health complications for humans (Kampa, 2008; Sentian et al., 2019; Wong et al., 2020). The Ministry of Health Malaysia has reported that respiratory fatality and natural death show significant associations with daily mean levels of air pollutants (Usmani et al., 2020). Previous studies have shown that exposure to PM₁₀, NO₂, and CO is significantly associated with an increase in hospital admissions due to respiratory diseases (Sofwan et al., 2021).

Ambient air is considered contaminated when it contains harmful particulate and gaseous compounds that originate from various sources and are transformed under atmospheric conditions (Seinfeld, 1989). The concentrations of these pollutants vary considerably over space and time (Raunaq et al., 2023). Common air pollutants include particulate matter (PM₁₀ and PM_2.5), O₃, CO, NO_X, SO₂, heavy metals, VOCs, acid gases, pesticides, radiation, and bio-aerosols (Sanidas et al., 2017; Manisalidis et al., 2020). According to Manisalidis et al. (2020), air pollution adversely affects the environment by contaminating precipitation, which then infiltrates soil and water, degrading their quality. Acid precipitation alters soil chemistry, affecting plant growth, agricultural crops, and overall water quality. Moreover, soil acidity promotes the migration of heavy metals into aquatic ecosystems, posing a significant threat to wildlife and fish populations.

The impact of air pollution on health and environmental well-being has significant economic implications. According to a report entitled 'The Health and Economic Impacts of Ambient Air Quality in Malaysia', the cost associated with healthcare for diseases caused by air pollution and loss of productivity due to illnesses is estimated at RM303 billion. The study conducted by Li et al. (2020) revealed that air pollution increased the occurrence of respiratory diseases and worsened health conditions, resulting in higher healthcare costs. Consequently, it was suggested that to mitigate the burden of disease, the government should initiate reforms from the supply side of healthcare services, including the restructuring of medical insurance payments and adopting new technologies and equipment.

The detrimental effects of air pollution on both human health and the environment are profound and carry significant economic repercussions. Therefore, striving for cleaner air through efficient air quality monitoring and management is imperative for fostering a healthier and more sustainable future. Air quality monitoring based on atmospheric pollutant concentrations is used worldwide (Baldasano et al., 2003). In Malaysia, monitoring and data collection are conducted by the Department of Environment (DOE) in collaboration with Alam Sekitar Malaysia Sdn Bhd (ASMA), aiming to provide real-time information to the public regarding major pollutant concentrations (Hawari et al., 2019; Mohd Shafie et al., 2022). The Recommended Malaysian Air Quality Guidelines (RMAQG) were established in 1989, and the Malaysian Air Quality Index (MAQI) was introduced in 1993 to complement the existing framework. In 2015, an improved guideline was introduced with three-tier intervals of implementation. The New Malaysia Ambient Air Quality Standards (NMAAQS) serve as the basis for the Air Pollutant Index (API) calculation (DOE, 2020; Mohd Shafie et al., 2022). New limits for the concentrations of major pollutants have been in effect since then.

In Malaysia, the API is determined through the monitoring of key air pollutants, which encompass O₃, CO, NO₂, SO₂, PM_2.5, and PM₁₀. The API value is derived by assessing the highest concentration of these pollutants detected over a specific timeframe. This approach to API calculation is rooted in an averaging method (Wong et al., 2020). The interpretation of air quality, as derived from the API values, is succinctly presented in Table 1.

Table 1

Air quality guidelines (Yahaya, Ali, & Ishak, 2006).
API Scale	Classification	Description
0 to 50	Good	Ambient air is clean, with low air pollution levels and no significant impact on health.
51 to 100	Moderate	Moderate pollution levels and air quality are considered acceptable. However, exposure to air pollutants may cause health issues for sensitive groups.
101 to 200	Unhealthy	Individuals at high risk may experience mild symptoms. Those with heart or breathing issues should limit outdoor exercise.
201 to 300	Very Unhealthy	Healthy individuals may experience noticeable effects, while those with breathing or heart issues should limit activities and remain indoors. Elders are also encouraged to limit activities.
301 to 500	Hazardous	The elderly and sick are recommended to stay indoors. Healthy individuals should avoid outdoor activities or might experience a decrease in their physical strength while performing activities due to being exposed to air pollutants. Potentially having severe irritations and other symptoms could lead to serious illnesses.
Above 500	Emergency	There is a high risk to human and public health. It is strongly recommended to adhere to the guidance provided by National Security Councils and stay informed through announcements disseminated via mass media channels.
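The mapping from an API value to the classes in Table 1 can be sketched as a simple lookup. This is a minimal illustration, not the DOE's procedure: the function names and the example sub-index values are hypothetical, a reading of exactly 50 is assumed to fall in the "Good" class, and the API is taken as the highest pollutant sub-index as described above.

```python
# Upper bounds of each API class, following Table 1.
API_BANDS = [
    (50, "Good"),
    (100, "Moderate"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
    (500, "Hazardous"),
]


def classify_api(api):
    """Return the Table 1 classification for an API value."""
    for upper_bound, label in API_BANDS:
        if api <= upper_bound:  # assumption: boundary values belong to the lower class
            return label
    return "Emergency"


def api_from_subindices(subindices):
    """Return (dominant pollutant, API), taking the API as the highest sub-index."""
    pollutant = max(subindices, key=subindices.get)
    return pollutant, subindices[pollutant]


# Hypothetical sub-index values for one station and one day.
example = {"PM2.5": 72, "PM10": 48, "O3": 35, "CO": 12, "NO2": 9, "SO2": 3}
dominant, api = api_from_subindices(example)
print(dominant, api, classify_api(api))
```

The lookup keeps the class boundaries in one list, so the bands can be adjusted without touching the classification logic.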

According to Carnevale et al. (2011), air quality forecasting is an important measure for preventing air pollution from exceeding critical levels because it estimates air pollutant concentrations ahead of time. Forecasting models fall into three types: machine learning, numerical, and statistical (Raunaq et al., 2023). Artificial intelligence (AI) has become a popular technology for managing and preventing air pollution and has garnered notable interest in environmental studies in recent years (Bai et al., 2018). AI is a crucial tool for safeguarding the environment, as it helps authorities select effective mitigation methods to limit public exposure to air pollution (Hassan, 2010; Schürholz et al., 2020). Furthermore, AI's ability to handle intricate interactions among air quality parameters enables more precise forecasting of air pollutant concentrations (Liang et al., 2020).

The recent demand for air pollution prediction can be attributed to the emergence of AI-driven forecasting systems, known for their precision and accuracy. This interest has been fueled by significant technological advancements in big data analytics, including extensible storage solutions, advanced computing platforms, and high-speed data processing. Artificial neural networks have become the primary computational technique in the field of AI and are extensively used for predicting both gaseous and particulate pollutant levels. Notable examples include Benjamin et al. (2014), Liyana Zakri et al. (2018), Ganesh et al. (2018), Wong et al. (2020), and Agarwal et al. (2020). This study aimed to investigate the predictive performance of ANN in determining the API by utilizing identified potential sources of air pollutants.

2.1 Study region and retrospective data

Malaysia's Department of Environment, Ministry of Natural Resources and Environmental Sustainability, generously provided retrospective air quality data from all states in Peninsular Malaysia, including two federal territories. This study encompasses forty-seven Continuous Air Quality Monitoring (CAQM) stations, which are listed in Table 2.

Table 2

CAQM stations in Peninsular Malaysia.
No.	State	Location	Coordinates	Zone	Classification
1.	Perlis	Kangar	06° 25' 47.71" N, 100° 12' 39.84" E	North	Sub Urban
2.	Kedah	Langkawi	06° 19' 53.54" N, 099° 51' 30.45" E		Sub Urban
3.		Alor Setar	06° 08' 13.49" N, 100° 20' 48.71" E		Sub Urban
4.		Sungai Petani	05° 37' 46.63" N, 100° 28' 03.83" E		Sub Urban
5.		Kulim	05° 24' 05.82" N, 100° 35' 22.70" E		Industrial
6.	Pulau Pinang	Seberang Jaya	05° 23' 53.41" N, 100° 24' 14.20" E		Urban
7.		Seberang Perai	05° 19' 45.68" N, 100° 26' 36.51" E		Sub Urban
8.		Minden	05° 21' 22.35" N, 100° 18' 28.51" E		Urban
9.		Balik Pulau	05° 20' 13.61" N, 100° 12' 59.21" E		Sub Urban
10.	Perak	Taiping	04° 53' 55.86" N, 100° 40' 44.78" E		Sub Urban
11.		Ipoh	04° 37' 45.99" N, 101° 06' 59.94" E		Urban
12.		Pegoh	04° 33' 12.00" N, 101° 04' 48.84" E		Sub Urban
13.		Seri Manjung	04° 12' 01.23" N, 100° 39' 48.08" E		Rural
14.		Tanjung Malim	03° 41' 15.92" N, 101° 31' 28.17" E		Sub Urban
15.	Kuala Lumpur	Batu Muda	03° 12' 44.78" N, 101° 40' 56.02" E	Central	Sub Urban
16.	Kuala Lumpur	Cheras	03° 06' 22.44" N, 101° 43' 04.50" E		Urban
17.	Putrajaya	Putrajaya	02° 54' 53.33" N, 101° 41' 24.17" E		Sub Urban
18.	Selangor	Kuala Selangor	03° 19' 16.70" N, 101° 15' 22.47" E		Rural
19.		Petaling Jaya	03° 07' 59.40" N, 101° 36' 28.83" E		Sub Urban
20.		Shah Alam	03° 06' 16.98" N, 101° 33' 22.39" E		Urban
21.		Klang	03° 00' 53.60" N, 101° 24' 47.19" E		Sub Urban
22.		Banting	02° 49' 00.08" N, 101° 37' 23.36" E		Sub Urban
23.	Negeri Sembilan	Nilai	02° 49' 18.09" N, 101° 48' 41.34" E		Sub Urban
24.		Seremban	02° 43' 24.17" N, 101° 58' 06.58" E		Urban
25.		Port Dickson	02° 26' 28.97" N, 101° 52' 00.68" E		Sub Urban
26.	Melaka	Alor Gajah	02° 22' 15.33" N, 102° 13' 28.53" E	South	Rural
27.		Bukit Rambai	02° 16' 06.57" N, 102° 11' 37.19" E		Sub Urban
28.		Bandaraya Melaka	02° 11' 27.36" N, 102° 15' 25.40" E		Urban
29.	Johor	Segamat	02° 29' 38.09" N, 102° 51' 45.69" E		Sub Urban
30.		Batu Pahat	01° 55' 09.56" N, 102° 51' 59.82" E		Sub Urban
31.		Kluang	02° 02' 16.37" N, 103° 18' 43.42" E		Rural
32.		Larkin	01° 29' 40.65" N, 103° 44' 09.50" E		Urban
33.		Pasir Gudang	01° 28' 12.43" N, 103° 53' 36.44" E		Sub Urban
34.		Pengerang	01° 23' 22.16" N, 104° 08' 58.50" E		Industrial
35.		Kota Tinggi	01° 33' 50.60" N, 104° 13' 31.10" E		Sub Urban
36.		Tangkak	02° 29' 47.94" N, 102° 57' 15.95" E		Sub Urban
37.	Pahang	Rompin	02° 55' 35.92" N, 103° 25' 09.11" E	East	Rural
38.		Temerloh	03° 28' 17.77" N, 102° 22' 35.06" E		Sub Urban
39.		Jerantut	03° 56' 54.09" N, 102° 21' 59.87" E		Sub Urban
40.		Indera Mahkota	03° 49' 09.18" N, 103° 17' 47.57" E		Sub Urban
41.		Balok Baru	03° 57' 38.31" N, 103° 22' 55.76" E		Industrial
42.	Terengganu	Kemaman	04° 15' 43.46" N, 103° 25' 32.90" E		Industrial
43.		Paka	04° 35' 53.03" N, 103° 26' 05.34" E		Industrial
44.		Kuala Terengganu	05° 18' 29.13" N, 103° 07' 13.41" E		Urban
45.		Besut	05° 44' 54.41" N, 102° 30' 56.27" E		Sub Urban
46.	Kelantan	Tanah Merah	05° 48' 40.21" N, 102° 08' 04.20" E		Sub Urban
47.	Kelantan	Kota Bharu	06° 08' 50.75" N, 102° 14' 57.24" E		Sub Urban

The retrospective data employed in this study encompasses a duration from January 1, 2018, to December 31, 2022, spanning five years. This dataset comprises daily readings of O₃, CO, NO₂, SO₂, PM_2.5, and PM₁₀, categorized as major pollutants. These parameters served as variables in all analyses conducted.

2.2 Descriptive Analysis

Univariate statistics were conducted to analyze the minimum, maximum, mean, median, and standard deviation values of each air quality parameter. The results of this analysis will offer valuable insights into the ambient air quality in the study region. These findings were then evaluated against the New Malaysia Ambient Air Quality Standards (NMAAQS), as illustrated in Table 3.

Table 3

New Malaysia Ambient Air Quality Standards (DOE, 2020).
Pollutants	Averaging Time	Ambient Air Quality Standard
Pollutants	Averaging Time	ppm	µg/m³ / *mg/m³
Ozone, O₃	1-Hour	0.090	180
Ozone, O₃	8-Hour	0.050	100
Carbon Monoxide, CO	1-Hour	26.2	30*
Carbon Monoxide, CO	8-Hour	8.75	10*
Nitrogen Dioxide, NO₂	1-Hour	0.150	280
Nitrogen Dioxide, NO₂	24-Hour	0.037	70
Sulfur Dioxide, SO₂	1-Hour	0.095	250
Sulfur Dioxide, SO₂	24-Hour	0.030	80
Particulate Matter, PM₁₀	24-Hour		100
Particulate Matter, PM₁₀	1-Year		40
Particulate Matter, PM_2.5	24-Hour		35
Particulate Matter, PM_2.5	1-Year		15

2.3 Identification of Major Sources of Air Pollution

According to Gao et al. (2018), accurate identification and classification of air pollutants is crucial to effectively mitigate their harmful effects. For this reason, selecting appropriate statistical techniques is essential. Principal component analysis (PCA), sensitivity analysis (SA), and discriminant analysis (DA) are among the methods that are useful in discovering sources of variation in air pollution. These multivariate techniques have been applied successfully in studies by Yu (2015), Kean Hua (2018), Shihab (2022), and Azizan et al. (2023). Thus, PCA was chosen in this research to address the study objectives.

2.3.1 Principal Component Analysis

PCA simplifies a large set of interrelated variables into a smaller set of uncorrelated variables known as principal components (PCs). According to Mutalib et al. (2013) and Shihab (2022), the analysis aims to explain the variance within the data. The components are linear combinations of the original variables, as formulated in Eq. (1).

\(Z_{ij} = a_{i1}x_{1j} + a_{i2}x_{2j} + a_{i3}x_{3j} + \dots + a_{im}x_{mj}\) (1)

where Z is the component score, a is the component loading, x is the measured value of the variable, i is the component number, j is the sample number, and m is the total number of variables (Isiyaka & Azid, 2015; Shihab, 2022).

Air quality parameters exhibit varying magnitudes and measurement scales. Therefore, the original data must be standardized to z-scores with a mean of 0.0 and a variance of 1.0 by employing Eq. (2) (Isiyaka & Azid, 2015; Shihab, 2022).

\(Z_{ij} = \frac{X_{ij} - \mu}{\sigma}\) (2)

where Z_ij is the standard score of the jth value of the measured variable i; X_ij is the jth observation of variable i; µ is the variable mean value and σ is the standard deviation.

Standardizing the data ensures equal weighting of the air quality variables during statistical analysis. Additionally, it homogenizes the variance of the distribution (Simeonov et al., 2002).
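The standardization in Eq. (2) can be illustrated with a short sketch. The readings below are hypothetical, and the sample standard deviation (n-1) is assumed because it is the form reported in the descriptive statistics; this is not the authors' own code.

```python
from statistics import mean, stdev


def standardize(values):
    """Apply Eq. (2): subtract the variable mean and divide by its standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1), an assumption
    return [(value - mu) / sigma for value in values]


# Hypothetical daily PM10 readings in µg/m³.
pm10 = [20.0, 25.0, 30.0, 35.0, 40.0]
z_scores = standardize(pm10)
print([round(z, 4) for z in z_scores])
```

After the transformation each variable has a mean of 0 and a variance of 1, so pollutants measured in ppm and in µg/m³ contribute on the same scale.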

PCA is applied in conjunction with varimax rotation to reduce component complexity. Varimax rotation simplifies the factor structure, making the results more reliable and easier to interpret. Only principal components (PCs) with eigenvalues exceeding one, the Kaiser criterion for deciding how many components to retain (Kaiser, 1958), are considered significant (Kim and Mueller, 1987) and rotated to generate new variables called varimax factors (VFs) with factor loadings. Factor loadings indicate the strength of the correlation between each variable and a VF. Loadings exceeding 0.75 are considered strong, those between 0.50 and 0.75 moderate, and those from 0.30 to 0.49 weak (Liu et al., 2003). Only variables with loadings above 0.75 were selected in this research. Subsequently, the factor scores obtained from the varimax rotation, computed with XLSTAT 2021, were utilized for ANN modeling.

The sufficiency of the dataset for analysis was evaluated by conducting the Kaiser-Meyer-Olkin (KMO) and Bartlett's sphericity tests at the beginning of PCA. KMO values equal to or more than 0.5, alongside Bartlett's test with a p-value ≤ 0.05, indicate sufficient sampling for analysis (Bartlett, 1954; Tabachnick et al., 2018).
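As an illustration of Bartlett's test, the sketch below computes the standard chi-square statistic, -((n - 1) - (2p + 5)/6) ln|R|, with p(p - 1)/2 degrees of freedom, from a correlation matrix R of p variables and n observations. The 3 x 3 correlation matrix and the sample size are hypothetical, and the helper names are not taken from the study.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
    return det


def bartlett_sphericity(corr, n_samples):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    p = len(corr)
    statistic = -((n_samples - 1) - (2 * p + 5) / 6.0) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return statistic, dof


# Hypothetical correlation matrix for three variables and 100 observations.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
statistic, dof = bartlett_sphericity(corr, n_samples=100)
print(f"chi-square = {statistic:.2f}, df = {dof}")
```

With six variables the same formula gives 15 degrees of freedom. A statistic above the chi-square critical value at the chosen alpha rejects the hypothesis that the variables are uncorrelated.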

2.4 Artificial Neural Network for Air Quality Forecasting

An ANN is structured as a network of interconnected input, hidden, and output layers (Yadav & Nath, 2017). These networks are trained on training data, and their widespread use for modelling and forecasting is due to their capability to compare the actual value with the predicted output (Bączkiewicz et al., 2021). A multi-layer perceptron feed-forward artificial neural network (MLP-FF-ANN) was employed to forecast the air pollutant index, leveraging its capability to develop highly complex nonlinear models (Abdullah et al., 2019). Figure 1 shows the architecture of the three-layer perceptron that was used. The network variables (O₃, CO, NO₂, SO₂, PM₁₀, and PM_2.5) enter at the input layer. The signal then travels through weighted connections to the hidden layer, where processing occurs, and on to the output layer, which yields the dependent variable, the API (Thorat et al., 2023). The number of hidden nodes was optimized by trial and error (Arhami et al., 2013), and a backpropagation algorithm was used to minimize prediction errors (Juahir et al., 2010; Arhami et al., 2013).
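To make the three-layer structure concrete, the following is a minimal sketch of a feed-forward network with one hidden layer trained by backpropagation. It is not the software used in the study. The tanh hidden activation, the linear output, the learning rate, the stochastic gradient updates, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def init_network(n_inputs, n_hidden, seed=0):
    """Create small random weights for an [n_inputs, n_hidden, 1] network."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Propagate one input vector; return hidden activations and the output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def train_step(net, x, target, lr):
    """One backpropagation update for the squared error of a single sample."""
    hidden, output = forward(net, x)
    error = output - target
    for k, h in enumerate(hidden):
        # Gradient reaching hidden node k, computed before w2[k] is updated.
        grad_hidden = error * net["w2"][k] * (1.0 - h * h)
        net["w2"][k] -= lr * error * h
        net["b1"][k] -= lr * grad_hidden
        for i, xi in enumerate(x):
            net["w1"][k][i] -= lr * grad_hidden * xi
    net["b2"] -= lr * error


def mean_squared_error(net, samples):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in samples) / len(samples)


# Synthetic, standardized inputs standing in for four pollutants (hypothetical data).
rng = random.Random(1)
samples = []
for _ in range(60):
    x = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    samples.append((x, 0.6 * x[0] + 0.3 * x[1] - 0.2 * x[2] + 0.1 * x[3]))

net = init_network(n_inputs=4, n_hidden=8)  # a [4,8,1] structure
initial_mse = mean_squared_error(net, samples)
for _ in range(200):
    for x, target in samples:
        train_step(net, x, target, lr=0.05)
final_mse = mean_squared_error(net, samples)
print(f"MSE before training: {initial_mse:.4f}, after: {final_mse:.4f}")
```

Repeating the run with a different `n_hidden` value mirrors the trial-and-error search over the number of hidden nodes.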

The accuracy of the developed model in predicting API will be assessed based on the coefficient of determination (R²) value and root mean square error (RMSE). Enhanced prediction capabilities of the model are indicated by a higher R² and a lower RMSE (Sarkar & Kumar, 2012). The mathematical expressions for these metrics are denoted as Eq. (3) and Eq. (4).

\(R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}{\sum_{i=1}^{n} y_{i}^{2} - \frac{\left(\sum_{i=1}^{n} y_{i}\right)^{2}}{n}}\) (3)

\(RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}\) (4)

In the equation, x_i represents the observed data, y_i signifies the predicted data and n denotes the number of observations.
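The two metrics can be computed as in the sketch below. The observed and predicted values are hypothetical, and R² is implemented in its conventional form, with the total sum of squares taken about the mean of the observed values; that choice is an assumption of this sketch.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS of the observations."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((x - y) ** 2 for x, y in zip(observed, predicted))
    ss_total = sum((x - mean_observed) ** 2 for x in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical observed and predicted API values.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]
print(f"R2 = {r_squared(observed, predicted):.4f}, RMSE = {rmse(observed, predicted):.4f}")
```

RMSE is expressed in API units, while R² is unitless, so the two metrics give complementary views of the same residuals.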

3.1 Descriptive Analysis

Descriptive statistics were computed to gain insight into the air quality pattern in Peninsular Malaysia. The minimum, maximum, and mean values of the six air pollutants and the API are reported in Table 4. The minimum, maximum, and mean concentrations of O₃, CO, NO₂, and SO₂ were all below the NMAAQS limits. However, the maximum concentrations of PM₁₀ and PM_2.5 exceeded the NMAAQS limits, with values of 237.62 µg/m³ and 210.71 µg/m³, respectively. The highest API reading recorded in Peninsular Malaysia was 283, indicating a very unhealthy level, while the five-year average API of 55.17 corresponds to a moderate level.

Table 4

Descriptive analysis of forty-seven CAQM stations throughout Peninsular Malaysia
Statistic	Variables
	PM₁₀	PM_2.5	SO₂	NO₂	O₃	CO	API
	(µg/m³)	(µg/m³)	(ppm)	(ppm)	(ppm)	(ppm)	API
Minimum	3.001	1.730	0.000	0.000	0.000	0.061	9.00
Maximum	237.622	210.709	0.048	0.047	0.062	2.239	283.00
Mean	24.337	16.403	0.001	0.007	0.018	0.582	55.17
Standard deviation (n-1)	12.419	10.467	0.001	0.005	0.007	0.220	15.07
Averaging Period NMAAQS	24hrs 100	24hrs 35	1hr 0.095	1hr 0.150	1hr 0.090	1hr 26.2

3.2 Identification of Possible Sources of Air Pollution

In this study, five datasets were created and named PC2018, PC2019, PC2020, PC2021, and PC2022. To ensure the suitability of the datasets for PCA, preliminary assessments were conducted using Bartlett’s test and the KMO test. These assessments evaluate the correlation between variables and the adequacy of the sample. Table 5 shows p-values < 0.0001 for every dataset, with observed chi-square values far above the critical value of 24.996. The null hypothesis of sphericity, under which the variables are uncorrelated, is therefore rejected, indicating that the variables are sufficiently correlated for PCA.

Table 5

Result of Bartlett’s Sphericity test in each analysis model dataset.
Statistic	PC2018	PC2019	PC2020	PC2021	PC2022
Chi-square (Observed value)	63372.774	83159.321	48866.259	48722.757	44591.538
Chi-square (Critical value)	24.996	24.996	24.996	24.996	24.996
DF	15	15	15	15	15
p-value (Two-tailed)	< 0.0001	< 0.0001	< 0.0001	< 0.0001	< 0.0001
alpha	0.050	0.050	0.050	0.050	0.050

The overall KMO values in Table 6 range from 0.626 to 0.659 across the five datasets, indicating the degree of correlation between variables and the suitability of PCA. According to Rencher (2003) and Shrestha (2021), a value above 0.6 is rated mediocre and is acceptable for further analysis.

Table 6

Result of Kaiser-Meyer-Olkin test in each analysis model dataset.
Statistic	PC2018	PC2019	PC2020	PC2021	PC2022
PM₁₀ (µg/m³)	0.611	0.604	0.617	0.600	0.615
PM_2.5 (µg/m³)	0.609	0.595	0.606	0.596	0.602
SO₂ (ppm)	0.789	0.711	0.643	0.644	0.846
NO₂ (ppm)	0.681	0.616	0.702	0.724	0.842
O₃ (ppm)	0.582	0.718	0.479	0.480	0.449
CO (ppm)	0.717	0.692	0.733	0.789	0.804
KMO	0.650	0.626	0.644	0.638	0.659

PCA was performed on each dataset, which comprised six variables: O₃, CO, NO₂, SO₂, PM_2.5, and PM₁₀. The analysis yielded two PCs with eigenvalues exceeding 1.0 for all datasets (Kim and Mueller, 1987). The scree plots in Fig. 2(i)-(v) show the eigenvalues against the number of factors in descending order; the leading factors explain the most variance in the data.

Varimax rotation was performed with two PCs, and the results are displayed in Table 7. A threshold value of VF greater than 0.75 was established for selection. Figure 3(i)-(v) illustrates the percentage of variance after varimax rotation.

Table 7

Factor loading after varimax rotation.
Variables		PM₁₀ (µg/m³)	PM_2.5 (µg/m³)	SO₂ (ppm)	NO₂ (ppm)	O₃ (ppm)	CO (ppm)	Variability (%)	Cumulative (%)
PC2018	VF1	0.848	0.866	0.213	0.212	0.71	0.219	35.199	35.199
PC2018	VF2	0.35	0.361	0.522	0.881	-0.333	0.824	34.858	70.058
PC2019	VF1	0.893	0.899	0.357	0.191	0.642	0.437	39.578	39.578
PC2019	VF2	0.296	0.299	-0.039	0.853	-0.523	0.798	30.245	69.823
PC2020	VF1	0.919	0.934	0.097	0.665	0.256	0.636	43.986	43.986
PC2020	VF2	-0.093	-0.101	0.019	0.571	-0.833	0.515	21.749	65.735
PC2021	VF1	0.877	0.911	0.188	0.732	0.128	0.689	44.359	44.359
PC2021	VF2	0.223	0.201	0.287	-0.41	0.875	-0.315	20.088	64.447
PC2022	VF1	0.889	0.914	0.237	0.743	0.123	0.636	44.256	44.256
PC2022	VF2	0.184	0.196	0.299	-0.293	0.881	-0.37	19.33	63.586
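The selection rule can be reproduced directly from the loadings in Table 7. The sketch below is illustrative only: the function names are hypothetical, the "negligible" label for loadings under 0.30 is an added assumption, and the absolute value of each loading is compared with the 0.75 threshold, an assumption consistent with O₃ (-0.833) being treated as a major pollutant in PC2020.

```python
VARIABLES = ["PM10", "PM2.5", "SO2", "NO2", "O3", "CO"]

# Rotated loadings copied from Table 7: (VF1 row, VF2 row) for each dataset.
LOADINGS = {
    "PC2018": ([0.848, 0.866, 0.213, 0.212, 0.710, 0.219],
               [0.350, 0.361, 0.522, 0.881, -0.333, 0.824]),
    "PC2019": ([0.893, 0.899, 0.357, 0.191, 0.642, 0.437],
               [0.296, 0.299, -0.039, 0.853, -0.523, 0.798]),
    "PC2020": ([0.919, 0.934, 0.097, 0.665, 0.256, 0.636],
               [-0.093, -0.101, 0.019, 0.571, -0.833, 0.515]),
    "PC2021": ([0.877, 0.911, 0.188, 0.732, 0.128, 0.689],
               [0.223, 0.201, 0.287, -0.410, 0.875, -0.315]),
    "PC2022": ([0.889, 0.914, 0.237, 0.743, 0.123, 0.636],
               [0.184, 0.196, 0.299, -0.293, 0.881, -0.370]),
}


def loading_strength(value):
    """Label a factor loading by magnitude (thresholds after Liu et al., 2003)."""
    magnitude = abs(value)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"  # label assumed here; the source does not name this range


def select_inputs(factors, threshold=0.75):
    """Keep variables whose absolute loading exceeds the threshold on any factor."""
    return [name for index, name in enumerate(VARIABLES)
            if any(abs(factor[index]) > threshold for factor in factors)]


for dataset, factors in LOADINGS.items():
    print(dataset, select_inputs(factors))
```

Under these assumptions the rule returns four pollutants for PC2018 and PC2019 and three for PC2020 to PC2022.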

In dataset PC2018, VF1 and VF2 explained 35.199% and 34.858% of the variance, respectively, for a cumulative 70.058%. The two highest positive factor loadings in VF1 were from PM₁₀ (0.848) and PM_2.5 (0.866), while VF2 showed the highest positive factor loadings from NO₂ (0.881) and CO (0.824). In PC2019, the cumulative variance was 69.823%, with VF1 contributing 39.578% and VF2 contributing 30.245%. The major pollutants in VF1 were PM₁₀ and PM_2.5, with positive factor loadings of 0.893 and 0.899, respectively, while VF2 had its highest positive factor loadings from NO₂ (0.853) and CO (0.798). PC2020 showed a cumulative variance of 65.735%, with VF1 contributing 43.986% and VF2 contributing 21.749%. The two highest positive loadings in VF1 were from PM₁₀ (0.919) and PM_2.5 (0.934), while VF2 showed a strong negative factor loading from O₃ (-0.833). In PC2021, VF1 contributed 44.359% and VF2 contributed 20.088%, for a cumulative 64.447%. PM₁₀ and PM_2.5 were identified as the major pollutants in VF1 and O₃ in VF2, with factor loadings of 0.877, 0.911, and 0.875, respectively. In PC2022, the cumulative variance after varimax rotation was 63.586%, obtained from VF1 (44.256%) and VF2 (19.330%). PM₁₀ and PM_2.5 were the major pollutants in VF1, with positive factor loadings of 0.889 and 0.914, respectively, while O₃ in VF2 had a positive factor loading of 0.881.

The results show that the major pollutants were similar across the five years. PM₁₀ and PM_2.5 showed strong positive loadings in every dataset, ranging from 0.848 to 0.919 and from 0.866 to 0.934, respectively. According to Rahman et al. (2015), PM₁₀ and PM_2.5 stem from motor vehicles, factories, power generators, construction sites, quarries, and incinerators, which collectively contribute to atmospheric pollutant levels. Furthermore, Malaysia encountered a haze event characterized by the highest recorded intensity of PM_2.5, attributable to both open burning and transboundary haze from neighboring regions such as Sumatera and Kalimantan, Indonesia (Latif et al., 2018; Liyana Zakri et al., 2018; Ab. Rahman et al., 2022). NO₂ showed strong positive loadings in 2018 and 2019, with values of 0.881 and 0.853, respectively. The presence of NO₂ in ambient air is mainly caused by industrial operations and heavy traffic (Isiyaka & Azid, 2015; Ismail et al., 2017), and Dominick et al. (2012) likewise concluded that NO₂ is a product of traffic congestion and manufacturing activities. From 2020 to 2022, the NO₂ loadings fell below the 0.75 threshold (0.665 to 0.743), which can be attributed to the decrease in industrial and commercial activities during the COVID-19 pandemic (Mazlan et al., 2022). O₃ showed strong positive factor loadings in 2021 (0.875) and 2022 (0.881) and a strong negative loading in 2020 (-0.833). According to Mazlan et al. (2022), O₃ levels rose during the post-MCO period as industries resumed operations, road traffic increased, and people resumed their activities. The negative loading in 2020 indicates an inverse relationship; in this context, the decrease in VOCs and NOx during the pandemic is anticipated to have resulted in a reduction of ozone (Tavella & da Silva Júnior, 2021). CO showed strong positive loadings in 2018 and 2019, with values of 0.824 and 0.798, respectively. High concentrations of CO are primarily associated with incomplete fuel combustion in automobiles, making it an important indicator of atmospheric contamination in this region (Dominick et al., 2012; Angatha and Mehar, 2020).

3.2.3 Forecasting Air Quality Using ANN

The models for forecasting API were developed by combining the result obtained from PCA and ANN. The analysis was conducted by computing MLP-FF in JMP10 software, and the models were then renamed as ANN-PC2018, ANN-PC2019, ANN-PC2020, ANN-PC2021 and ANN-PC2022.

The R² and RMSE results of the 10 network structures were compared for forecasting air quality. R² ranges from zero to one, with higher values indicating greater explanatory power. Therefore, the best-performing model is the one with the highest R² value and the lowest RMSE (Chenard & Caissie, 2008; Nasir, 2011; Azid et al., 2013). As per Rumsey (2011), an R² greater than 0.90 is considered significant with perfect linear regression; 0.70–0.89, significant with strong linear regression; 0.50–0.69, significant with moderate linear regression; 0.30–0.49, not significant with a weak linear relationship; and 0.00–0.29, not significant with no linear relationship.

Table 8 shows the structure of constructing networks and the performance level based on training and validation. Input parameters in each model are based on PCA results. ANN-PC2018 and ANN-PC2019 used PM₁₀, PM_2.5, NO₂, and CO, while ANN-PC2020, ANN-PC2021, and ANN-PC2022 used PM₁₀, PM_2.5, and O₃. These pollutants are considered major sources in each dataset. Figure 4(i)-(x) depicts scatter plots of the predicted API (versus actual API) for both training and validation datasets.

ANN-PC2018 showed optimum performance with eight hidden nodes, with R² = 0.7905, RMSE = 5.4684 for training and R² = 0.7826, RMSE = 5.5446 for validation. This model is considered significant with strong linear regression. ANN-PC2019 scored the highest R², also with eight hidden nodes: R² = 0.8612, RMSE = 7.7467 for training and R² = 0.8356, RMSE = 7.7990 for validation, indicating significance with strong linear regression. ANN-PC2020 showed its best training and validation performance with nine hidden nodes, with R² = 0.7384, RMSE = 6.3382 and R² = 0.7586, RMSE = 5.9427, respectively. This model is also significant with strong linear regression. ANN-PC2021 obtained its highest result with nine hidden nodes, with R² = 0.8230, RMSE = 5.9020 for training and R² = 0.8270, RMSE = 5.9010 for validation, and is categorized as significant with strong linear regression. ANN-PC2022 performed best with five hidden nodes, with R² = 0.8057, RMSE = 5.9613 for training and R² = 0.8042, RMSE = 6.0240 for validation, and is likewise significant with strong linear regression. Therefore, all models are acceptable, and ranked by R², they are ordered as follows:

ANN-PC2019 > ANN-PC2021 > ANN-PC2022 > ANN-PC2018 > ANN-PC2020
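The ordering above can be checked with a few lines, using the best training result of each model as reported in Table 8. This is an illustrative sketch: the dictionary layout and function name are hypothetical, and the ranking key (training R²) is an assumption that happens to reproduce the stated order.

```python
# Best network structure per model, with training metrics copied from Table 8.
best_runs = {
    "ANN-PC2018": {"structure": (4, 8, 1), "r2": 0.7905, "rmse": 5.4684},
    "ANN-PC2019": {"structure": (4, 8, 1), "r2": 0.8612, "rmse": 7.7467},
    "ANN-PC2020": {"structure": (3, 9, 1), "r2": 0.7384, "rmse": 6.3382},
    "ANN-PC2021": {"structure": (3, 9, 1), "r2": 0.8230, "rmse": 5.9020},
    "ANN-PC2022": {"structure": (3, 5, 1), "r2": 0.8057, "rmse": 5.9613},
}


def rank_by_r2(runs):
    """Order model names from highest to lowest training R²."""
    return sorted(runs, key=lambda name: runs[name]["r2"], reverse=True)


ranking = rank_by_r2(best_runs)
print(" > ".join(ranking))
```

Sorting on RMSE instead would give a different order, so the ranking criterion should be stated alongside the result.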

The prediction models also introduced a new approach by eliminating the least significant pollutants from the inputs. Models ANN-PC2018 and ANN-PC2019 identified SO₂ and O₃ as the least important pollutants, whereas models ANN-PC2020, ANN-PC2021, and ANN-PC2022 identified SO₂, NO₂, and CO. Currently, the API is determined by the highest reading among the sub-indices for O₃, CO, NO₂, SO₂, PM_2.5, and PM₁₀. For that reason, the DOE collects data on all these pollutants, regardless of their significance to the overall API value. This information is crucial for accurately determining the API and monitoring air quality in Malaysia. The findings nevertheless suggest considering the removal of the least important pollutants from the list of major pollutants when determining API readings. Moreover, air pollution trends in Malaysia are predominantly influenced by PM_2.5 and PM₁₀ (DOE, 2018–2021; Sentian et al., 2019; Ab. Rahman et al., 2022), consistent with the findings of this study. Therefore, this study suggests that model ANN-PC2019 is the most suitable for forecasting the API. Although models ANN-PC2020, ANN-PC2021, and ANN-PC2022 were also significant, they are less recommended because they cover the COVID-19 pandemic and post-pandemic period, when air quality improved (Zahid et al., 2022).

Table 8

Forecasting performance of the ANN-PCA models.
Model	Network structure	Training R²	Training RMSE	Validation R²	Validation RMSE
ANN-PC2018	[4,1,1]	0.7872	5.5114	0.7791	5.5888
	[4,2,1]	0.7856	5.5316	0.7786	5.5944
	[4,3,1]	0.7866	5.5185	0.7797	5.5807
	[4,4,1]	0.7866	5.5185	0.7797	5.5807
	[4,5,1]	0.7880	5.5007	0.7811	5.5636
	[4,6,1]	0.7890	5.4882	0.7818	5.5541
	[4,7,1]	0.7897	5.4787	0.7810	5.5641
	[4,8,1]	0.7905	5.4684	0.7826	5.5446
	[4,9,1]	0.7889	5.4889	0.7820	5.5521
	[4,10,1]	0.7885	5.4936	0.7804	5.5721
ANN-PC2019	[4,1,1]	0.8600	7.7791	0.8345	7.8245
	[4,2,1]	0.8605	7.7658	0.8353	7.8064
	[4,3,1]	0.8606	7.7620	0.8348	7.8181
	[4,4,1]	0.8610	7.7532	0.8354	7.8033
	[4,5,1]	0.8597	7.7882	0.8355	7.8013
	[4,6,1]	0.8591	7.8037	0.8349	7.8159
	[4,7,1]	0.8600	7.7787	0.8351	7.8105
	[4,8,1]	0.8612	7.7467	0.8356	7.7990
	[4,9,1]	0.8608	7.7567	0.8352	7.8074
	[4,10,1]	0.8600	7.7788	0.8351	7.8111
ANN-PC2020	[3,1,1]	0.7329	6.4048	0.7502	6.0452
	[3,2,1]	0.7339	6.3924	0.7518	6.0251
	[3,3,1]	0.7376	6.3482	0.7572	5.9592
	[3,4,1]	0.7329	6.4047	0.7500	6.0478
	[3,5,1]	0.7367	6.3589	0.7574	5.9575
	[3,6,1]	0.7332	6.4013	0.7506	6.0396
	[3,7,1]	0.7365	6.3606	0.7534	6.0060
	[3,8,1]	0.7377	6.3461	0.7575	5.9555
	[3,9,1]	0.7384	6.3382	0.7586	5.9427
	[3,10,1]	0.7380	6.3424	0.7578	5.9523
ANN-PC2021	[3,1,1]	0.8199	5.9542	0.8254	5.9293
	[3,2,1]	0.8195	5.9602	0.8248	5.9397
	[3,3,1]	0.8174	5.9953	0.8234	5.9627
	[3,4,1]	0.8206	5.9417	0.8267	5.9071
	[3,5,1]	0.8220	5.9193	0.8269	5.9042
	[3,6,1]	0.8224	5.9117	0.8267	5.9070
	[3,7,1]	0.8227	5.9079	0.8263	5.9145
	[3,8,1]	0.8203	5.9470	0.8258	5.9219
	[3,9,1]	0.8230	5.9020	0.8270	5.9010
	[3,10,1]	0.8216	5.9248	0.8260	5.9191
ANN-PC2022	[3,1,1]	0.8025	6.0095	0.7967	6.1394
	[3,2,1]	0.8035	5.9950	0.7991	6.1026
	[3,3,1]	0.8043	5.9824	0.8001	6.0878
	[3,4,1]	0.8044	5.9810	0.8000	6.0894
	[3,5,1]	0.8057	5.9613	0.8042	6.0240
	[3,6,1]	0.8034	5.9959	0.7987	6.1085
	[3,7,1]	0.8040	5.9860	0.7996	6.0946
	[3,8,1]	0.8031	6.0000	0.7987	6.1094
	[3,9,1]	0.8046	5.9774	0.8017	6.0634
	[3,10,1]	0.8036	5.9931	0.7991	6.1032
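
For readers who wish to recompute the two metrics reported in Table 8, the following is a minimal sketch using their conventional definitions. It assumes that R² is the coefficient of determination (the study may instead use the squared correlation coefficient), and the observed and predicted values are made-up numbers for illustration only.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Illustrative values only (not data from the study).
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]

print(round(r_squared(observed, predicted), 4), round(rmse(observed, predicted), 4))
```

RMSE is expressed in the same units as the API and grows with the spread of the data, so a model can have both the highest R² and the highest RMSE, as ANN-PC2019 does.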

In conclusion, the findings show that artificial neural networks can produce a reliable model for forecasting the API. Specifically, the ANN-PC2019 model demonstrated the highest performance, achieving an R² value of 0.8612 and an RMSE value of 7.7467. This means that the model's predictions explain about 86% of the variation in the observed API values. The dataset for this study covers the periods before, during, and after the COVID-19 pandemic, providing five years of coverage from which to identify the major contributors to air quality degradation, namely PM₁₀, PM₂.₅, NO₂, and CO. Additionally, using different sets of input parameters leads to variations in the API predictions relative to the API values calculated by the DOE, which highlights the versatility and adaptability of the model. Notably, reducing the number of input parameters does not significantly affect the determination of API values. These findings underscore the ability of advanced analytical approaches to offer robust and precise air quality forecasts, which are critical for effective environmental management and public health planning in Malaysia.

ACKNOWLEDGMENT

The authors extend their sincere appreciation to the Department of Environment of Malaysia and the East Coast Environmental Research Institute (ESERI), Universiti Sultan Zainal Abidin, Malaysia, for their invaluable guidance in completing this research project.

AUTHOR CONTRIBUTIONS

Mohd Suzairi Mohd Shafi’i conducted the collection and treatment of the data, performed measurements, processed experimental data, conducted analysis, drafted the manuscript, and designed the figures. Hafizan Juahir was involved in planning and supervising the work, aided in interpreting the results, and contributed to the manuscript. All authors discussed the results and provided comments on the manuscript.

FUNDING

No funding was received for this research.

DATA AVAILABILITY

No data or materials are available.

CODE AVAILABILITY

No code is available.

The authors declare no competing interest.

Ab. Rahman, E., Hamzah, F. M., Latif, M. T., & Dominick, D. (2022). Assessment of PM2.5 Patterns in Malaysia Using the Clustering Method. Aerosol and Air Quality Research, 22(1), 210161. https://doi.org/10.4209/aaqr.210161
Abdullah, S., Ismail, M., & Najah, A. M. (2019). Multi-Layer Perceptron Model for Air Quality Prediction. Malaysian Journal of Mathematical Sciences, 13(S), 85–95. https://www.researchgate.net/publication/339088619_Multi-Layer_Perceptron_Model_for_Air_Quality_Prediction
Agarwal, S., Sharma, S., Suresh, R., Rahman, H., Vranckx, S., Maiheu, B., Blyth, L., Janssen, S., Gargava, P., Shukla, V. K., & Batra, S. (2020). Air quality forecasting using artificial neural networks with real time dynamic error correction in highly polluted regions. The Science of the Total Environment, 735, 139454. https://doi.org/10.1016/j.scitotenv.2020.139454
Angatha, R. K., & Mehar, A. (2020). Impact of Traffic on Carbon Monoxide Concentrations Near Urban Road Mid-Blocks. Journal of the Institution of Engineers (India): Series A. https://doi.org/10.1007/s40030-020-00464-2
Arhami, M., Kamali, N., & Rajabi, M. M. (2013). Predicting hourly air pollutant levels using artificial neural networks coupled with uncertainty analysis by Monte Carlo simulations. Environmental Science and Pollution Research, 20(7), 4777–4789. https://doi.org/10.1007/s11356-012-1451-6
Azid, A., Juahir, H., Latif, M. T., Zain, S. M., & Osman, M. R. (2013). Feed-Forward Artificial Neural Network Model for Air Pollutant Index Prediction in the Southern Region of Peninsular Malaysia. Journal of Environmental Protection, 04(12), 1–10. https://doi.org/10.4236/jep.2013.412a1001
Azizan, N. A., Othman, A. S., Meramat, A. A., Muhammad Amin, S. N. S., & Azid, A. (2023). A Framework to Spatially Cluster Air Quality Monitoring Stations in Peninsular Malaysia using the Hybrid Clustering Method. Malaysian Journal of Fundamental and Applied Sciences, 19(5), 804–816. https://doi.org/10.11113/mjfas.v19n5.2620
Bączkiewicz, A., Wątróbski, J., Sałabun, W., & Kołodziejczyk, J. (2021). An ANN Model Trained on Regional Data in the Prediction of Particular Weather Conditions. Applied Sciences, 11(11), 4757. https://doi.org/10.3390/app11114757
Bai, L., Wang, J., Ma, X., & Lu, H. (2018). Air Pollution Forecasts: An Overview. International Journal of Environmental Research and Public Health, 15(4), 780. https://doi.org/10.3390/ijerph15040780
Baldasano, J., Valera, E., & Jimenez, P. (2003). Air quality data from large cities. The Science of the Total Environment, 307(1–3), 141–165. https://doi.org/10.1016/s0048-9697(02)00537-5
Bartlett, M. S. (1954). A Note on the Multiplying Factors for Various χ2 Approximations. Journal of the Royal Statistical Society: Series B (Methodological), 16(2), 296–298. https://doi.org/10.1111/j.2517-6161.1954.tb00174.x
Benjamin, N. L., Sharma, S., Pendharker, U., & Shrivastava, J. (2014). Air quality prediction using artificial neural network. International Journal of Chemical Studies, 2, 7–9. https://www.semanticscholar.org/paper/Air-quality-prediction-using-artificial-neural-Benjamin-Sharma/356e7ebe29fa06c415667dda25eb120b47819229
Cao, L., Zhai, D., Kuang, M., & Xia, Y. (2022). Indoor air pollution and frailty: A cross-sectional and follow-up study among older Chinese adults. Environmental Research, 204, 112006. https://doi.org/10.1016/j.envres.2021.112006
Carnevale, C., Finzi, G., Pisoni, E., Singh, V., & Volta, M. (2011). An integrated air quality forecast system for a metropolitan area. Journal of Environmental Monitoring, 13, 3437–3447. https://doi.org/10.1039/c1em10303b
Chenard, J., & Caissie, D. (2008). Stream temperature modelling using artificial neural networks: application on Catamaran Brook, New Brunswick, Canada. Hydrological Processes, 22(17), 3361–3372. https://doi.org/10.1002/hyp.6928
Department of Environment (DOE). (2018). Malaysia Environmental Quality Report 2018. Department of Environment, Putrajaya, Malaysia. https://www.doe.gov.my/en/environmental-quality-report/
Department of Environment (DOE). (2019). Malaysia Environmental Quality Report 2019. Department of Environment, Putrajaya, Malaysia. https://www.doe.gov.my/en/environmental-quality-report/
Department of Environment (DOE). (2020). Malaysia Environmental Quality Report 2020. Department of Environment, Putrajaya, Malaysia. https://www.doe.gov.my/en/environmental-quality-report/
Department of Environment (DOE). (2021). Malaysia Environmental Quality Report 2021. Department of Environment, Putrajaya, Malaysia. https://www.doe.gov.my/en/environmental-quality-report/
Dominick, D., Juahir, H., Latif, M. T., Zain, S. M., & Aris, A. Z. (2012). Spatial assessment of air quality patterns in Malaysia using multivariate analysis. Atmospheric Environment, 60, 172–181. https://doi.org/10.1016/j.atmosenv.2012.06.021
Ganesh, S. S., Arulmozhivarman, P., & Tatavarti, V. R. (2018). Air quality index forecasting using artificial neural networks - a case study on Delhi. International Journal of Environment and Waste Management, 22(1/2/3/4), 4. https://doi.org/10.1504/ijewm.2018.094105
Gao, M., Yin, L., & Ning, J. (2018). Artificial neural network model for ozone concentration estimation and Monte Carlo analysis. Atmospheric Environment, 184, 129–139. https://doi.org/10.1016/j.atmosenv.2018.03.027
Halim, N. D. A., Latif, M. T., Mohamed, A. F., Maulud, K. N. A., Idrus, S., Azhari, A., Othman, M., & Sofwan, N. M. (2020). Spatial assessment of land use impact on air quality in mega urban regions, Malaysia. Sustainable Cities and Society, 63, 102436. https://doi.org/10.1016/j.scs.2020.102436
Hassan, R. (2010). Urban Air Pollution Forecasting Using Artificial Intelligence-Based Tools. In Air Pollution. IntechOpen. https://www.semanticscholar.org/paper/Urban-Air-Pollution-Forecasting-Using-Artificial-Hassan-Li/10b6ac2f09d9e67a88291ddf3b45679c7753fd37
Hawari, H. F., Zainal, A. A., & Ahmad, M. R. (2019). Development of real time internet of things (IoT) based air quality monitoring system. Indonesian Journal of Electrical Engineering and Computer Science, 13(3), 1039. https://doi.org/10.11591/ijeecs.v13.i3.pp1039-1047
Isiyaka, H. A., & Azid, A. (2015). Air Quality Pattern Assessment in Malaysia Using Multivariate Techniques. Malaysian Journal of Analytical Sciences, 19(5), 966–978. https://www.researchgate.net/publication/283842911_Air_quality_pattern_assessment_in_Malaysia_using_multivariate_techniques
Ismail, A. S., Abdullah, A. M., & Samah, M. A. A. (2017). Environmetric Study on Air Quality Pattern for Assessment in Northern Region of Peninsular Malaysia. Journal of Environmental Science and Technology, 10(4), 186–196. https://doi.org/10.3923/jest.2017.186.196
Juahir, H., Zain, S. M., Aris, A. Z., Yusoff, M. K., & Mokhtar, M. B. (2010). Spatial assessment of Langat River water quality using chemometrics. Journal of Environmental Monitoring, 12(1), 287–295. https://doi.org/10.1039/b907306j
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23(3), 187–200. https://doi.org/10.1007/bf02289233
Kampa, M. (2008). Human health effects of air pollution. Environmental Pollution, 151(2), 362–367. https://www.semanticscholar.org/paper/Human-health-effects-of-air-pollution.-Kampa-Castanas/30d7b5c1f9dee2b3b44c0f8359eb784051db74ba
Kean Hua, A. (2018). Applied Chemometric Approach in Identification Sources of Air Quality Pattern in Selangor, Malaysia. Sains Malaysiana, 47(3), 471–479. https://doi.org/10.17576/jsm-2018-4703-06
Kim, & Mueller. (1987). Introduction to factor analysis: What it is and how to do it. Quantitative Applications in the Social Sciences Series. Newbury Park: Saga University Press.
Latif, M. T., Othman, M., Idris, N., Juneng, L., Abdullah, A. M., Hamzah, W. P., Khan, M. F., Nik Sulaiman, N. M., Jewaratnam, J., Aghamohammadi, N., Sahani, M., Xiang, C. J., Ahamad, F., Amil, N., Darus, M., Varkkey, H., Tangang, F., & Jaafar, A. B. (2018). Impact of regional haze towards air quality in Malaysia: A review. Atmospheric Environment, 177, 28–44. https://doi.org/10.1016/j.atmosenv.2018.01.002
Li, L., Du, T., & Zhang, C. (2020). The Impact of Air Pollution on Healthcare Expenditure for Respiratory Diseases: Evidence from the People’s Republic of China. Risk Management and Healthcare Policy, Volume 13, 1723–1738. https://doi.org/10.2147/rmhp.s270587
Liang, Y. C., Maimury, Y., Chen, A. H. L., & Juarez, J. C. (2020). Machine Learning-Based Prediction of Air Quality. Applied Sciences, 10(24), 9151. https://doi.org/10.3390/app10249151
Liu, C. W., Lin, K. H., & Kuo, Y. M. (2003). Application of factor analysis in the assessment of groundwater quality in a blackfoot disease area in Taiwan. Science of the Total Environment, 313(1–3), 77–89. https://doi.org/10.1016/s0048-9697(02)00683-6
Liyana Zakri, N., Shakir Mohd Saudi, A., Juahir, H., Ekhwan Toriman, M., Fahmy Abu, I., Muaz Mahmud, M., & Feroz Khan, M. (2018). Identification Source of Variation on Regional Impact of Air Quality Pattern using Chemometric Techniques in Kuching, Sarawak. International Journal of Engineering & Technology, 7(3.14), 49. https://doi.org/10.14419/ijet.v7i3.14.16861
Manisalidis, I., Stavropoulou, E., Stavropoulos, A., & Bezirtzoglou, E. (2020). Environmental and Health Impacts of Air Pollution: A Review. Frontiers in Public Health, 8, 14. https://doi.org/10.3389/fpubh.2020.00014
Mazlan, N. A., Zaki, N. A. M., Narashid, R. H., Talib, N., Manokaran, J., Arshad, F. C., Fauzi, S. S. M., Dom, N. C., Valipour, M., Dambul, R., & Blenkinsop, S. (2022). COVID-19 Restriction Movement Control Order (MCO) Impacted Emissions of Peninsular Malaysia Using Sentinel-2a and Sentinel-5p Satellite. Earth Systems and Environment, 7(1), 347–358. https://doi.org/10.1007/s41748-022-00329-7
Mohd Shafie, S. H., Mahmud, M., Mohamad, S., Rameli, N. L. F., Abdullah, R., & Mohamed, A. F. (2022). Influence of urban air pollution on the population in the Klang Valley, Malaysia: a spatial approach. Ecological Processes, 11(1). https://doi.org/10.1186/s13717-021-00342-0
Mutalib, S. N. S. A., Juahir, H., Azid, A., Sharif, S. M., Latif, M. T., Aris, A. Z., Zain, S. M., & Dominick, D. (2013). Spatial and temporal air quality pattern recognition using environmetric techniques: a case study in Malaysia. Environmental Science: Processes & Impacts. https://doi.org/10.1039/c3em00161j
Nasir, M. (2011). Artificial Neural Networks Combined with Sensitivity Analysis as a Prediction Model for Water Quality Index in Juru River, Malaysia. International Journal of Environmental Protection, 1(3), 1–8. https://www.semanticscholar.org/paper/Artificial-Neural-Networks-Combined-with-Analysis-a-Nasir-Juahir/bcb37e269cf7d43d45c5e4445faa07738e9d2110
Rahman, S. R. A., Ismail, S. N. S., Raml, M. F., & Praveena, S. M. (2015). The Assessment of Ambient Air Pollution Trend in Klang Valley, Malaysia. World Environment, 5(1), 1–11. https://doi.org/10.5923/j.env.20150501.01
Raunaq, S. S., Ajay, K. J., Nishant, R. K., & Aman, K. (2023). Air Quality Prediction - A Study Using Neural Network Based Approach. Journal of Soft Computing in Civil Engineering, 7(1), 93–113.
Rencher, A. C. (2003). Methods of Multivariate Analysis. John Wiley & Sons. http://books.google.ie/books?id=SpvBd7IUCxkC&printsec=frontcover&dq=Methods+of+Multivariate+Analysis&hl=&cd=1&source=gbs_api
Rumsey, D. J. (2011). Statistics For Dummies. John Wiley & Sons. http://books.google.ie/books?id=kpMFklYskF8C&printsec=frontcover&dq=Statistics+For+Dummies+(For+Dummies+(Lifestyle))+2nd+Edition&hl=&cd=1&source=gbs_api
Sanidas, E., Papadopoulos, D. P., Grassos, H., Velliou, M., Tsioufis, K., Barbetseas, J., & Papademetriou, V. (2017). Air pollution and arterial hypertension. A new risk factor is in the air. Journal of the American Society of Hypertension, 11(11), 709–715. https://doi.org/10.1016/j.jash.2017.09.008
Sarkar, A., & Kumar, R. (2012). Artificial Neural Networks for Event Based Rainfall-Runoff Modeling. Journal of Water Resource and Protection, 04(10), 891–897. https://doi.org/10.4236/jwarp.2012.410105
Schürholz, D., Kubler, S., & Zaslavsky, A. (2020). Artificial intelligence-enabled context-aware air quality prediction for smart cities. Journal of Cleaner Production, 271, 121941. https://doi.org/10.1016/j.jclepro.2020.121941
Seinfeld, J. (1989). Urban Air Pollution: State of the Science. Science, 243(4892), 745–752. https://www.semanticscholar.org/paper/Urban-Air-Pollution%3A-State-of-the-Science-Seinfeld/3e192a872a48a167e325f97a363cb42074b5daec
Sentian, J., Herman, F., Yih, C. Y., & Hian Wui, J. C. (2019). Long-term air pollution trend analysis in Malaysia. International Journal of Environmental Impacts: Management, Mitigation and Recovery, 2(4), 309–324. https://doi.org/10.2495/ei-v2-n4-309-324
Shihab, A. (2022). Identification of Air Pollution Sources and Temporal Assessment of Air Quality at a Sector in Mosul City Using Principal Component Analysis. Polish Journal of Environmental Studies, 31(3), 2223–2235. https://doi.org/10.15244/pjoes/143295
Shrestha, N. (2021). Factor Analysis as a Tool for Survey Analysis. American Journal of Applied Mathematics and Statistics, 9(1), 4–11. https://doi.org/10.12691/ajams-9-1-2
Simeonov, V., Einax, J. W., Stanimirova, I., & Kraft, J. (2002). Environmetric modeling and interpretation of river water monitoring data. Analytical and Bioanalytical Chemistry, 374, 898–905. https://doi.org/10.1007/s00216-002-1559-5
Sofwan, N. M., Mahiyuddin, W. R. W., Latif, M. T., Ayub, N. A., Yatim, A. N. M., Mohtar, A. A. A., Othman, M., Aizuddin, A. N., & Sahani, M. (2021). Risks of exposure to ambient air pollutants on the admission of respiratory and cardiovascular diseases in Kuala Lumpur. Sustainable Cities and Society, 75, 103390. https://doi.org/10.1016/j.scs.2021.103390
Tabachnick, B. G., Fidell, L. S., & Ullman, J. B. (2018). Using Multivariate Statistics. http://books.google.ie/books?id=cev2swEACAAJ&dq=using+multivariate+statistick&hl=&cd=1&source=gbs_api
Tavella, R. A., & da Silva Júnior, F. M. R. (2021). Watch out for trends: did ozone increased or decreased during the COVID-19 pandemic? Environmental Science and Pollution Research, 28(47), 67880–67885. https://doi.org/10.1007/s11356-021-17142-w
The Health and Economic Impacts of Ambient Air Quality in Malaysia. (2022). https://energyandcleanair.org/publication/hia-ambient-aq-malaysia/
Thorat, M., Pandit, S., & Balote, S. (2023). Artificial Neural Network: A brief study. International Research Journal of Engineering and Technology (IRJET), 10(2), 771–776.
Usmani, R. S. A., Saeed, A., Abdullahi, A. M., Pillai, T. R., Jhanjhi, N. Z., & Hashem, I. A. T. (2020). Air pollution and its health impacts in Malaysia: a review. Air Quality, Atmosphere & Health, 13(9), 1093–1118. https://doi.org/10.1007/s11869-020-00867-x
Wong, K. S., Chew, Y. J., Ooi, S. Y., & Pang, Y. H. (2020). Toward forecasting future day air pollutant index in Malaysia. The Journal of Supercomputing, 77(5), 4813–4830. https://doi.org/10.1007/s11227-020-03463-z
Yadav, V., & Nath, S. (2017). Prediction of air quality using artificial Neural Network techniques: A review. Pollution Research, 36(3), 242–244. https://www.researchgate.net/publication/322043736_Prediction_of_air_quality_using_artificial_Neural_Network_techniques_A_review
Yahaya, Ali, A, & Ishak, F. (2006). Air pollution index (API) and the effects on human health: case study in Terengganu City, Terengganu, Malaysia. Paper Submitted to the International Association for People Environmental Studies (IAPS) Conference. 2006.
Yu, H. L. (2015). A time series analysis of multiple ambient pollutants to investigate the underlying air pollution dynamics and interactions. Chemosphere, 134, 571–580. https://www.semanticscholar.org/paper/A-time-series-analysis-of-multiple-ambient-to-the-Yu-Lin/ea76a90410b1cd1b333d5318e34ae8a12de98e0f
Zahid, A. Z. M., Bakar, A. A. A., Halim, N. F. M., & Salleh, M. Z. M. (2022). Air quality status before, during and after the pandemic COVID-19 Movement Control Order (MCO) at urban and suburban areas in Malaysia. IOP Conference Series: Earth and Environmental Science, 1013(1), 012007. https://doi.org/10.1088/1755-1315/1013/1/012007
