3.1 Data Collection and Analysis
The majority of research on risks in public-private partnership (PPP) projects focuses on risk identification and classification, risk analysis and evaluation, and risk allocation and management strategies (Wang et al. 2018). Extensive research over the past decade has investigated risk management in PPP projects and identified various types of risk, such as financial, operational, political, and environmental risks (Xu et al. 2010). Water environment systems are dynamic, complex, open systems with temporal, spatial, and volumetric variations. This complexity gives water environment treatment PPP projects distinct techno-economic characteristics compared with purely commercial PPP projects, including a strong quasi-public interest, high difficulty in integrating governance technologies, complex assessment of governance effects, and difficult project coordination and collaboration (An et al. 2018). Identifying risk factors in water environment treatment PPP projects therefore remains a difficulty in current research.
Using keywords or subject terms in both Chinese and English, such as "PPP", "risk", and "water environment treatment", a combined search was conducted in databases such as CNKI, ISI Web of Science, and ScienceDirect to find relevant literature for risk factor analysis. This resulted in a preliminary list of risk factors for water environment treatment PPP projects, as detailed in Table 1 below. The existing literature mainly discusses risks caused by the government, risks caused by social capital, and risks generated by the external environment. Government-caused risks primarily stem from government involvement in project management, including tax adjustments (Li et al. 2022; Liu et al. 2018), government intervention and credit issues (Wang et al. 2021; Cu et al. 2019), and inadequacies in existing laws, regulations, and regulatory systems (Su et al. 2022; Feng et al. 2022). Social capital-caused risks mainly arise from actual project construction and operation, such as completion risks (Su et al. 2022; Feng et al. 2022), construction technology risks (Zhang et al. 2021; Zhang et al. 2021), contract change risks (Su et al. 2022; Feng et al. 2022; Li et al. 2019), delay risks (Su et al. 2022; Li et al. 2022; Li et al. 2019), cost overrun risks (Su et al. 2022; El-Kholy et al. 2021), insufficient project revenue risks (Su et al. 2022; El-Kholy et al. 2021), dispute and infringement risks (Wang et al. 2019; Fu et al. 2023; Chou et al. 2013), and social capital change risks (El-Kholy et al. 2021; Wang et al. 2019). External environment risks refer to risks directly or indirectly caused by the external environment, including environmental damage risks (Su et al. 2022; An et al. 2018; Owolabi et al. 2020), geological condition risks (Feng et al. 2022; Cui et al. 2019), social stability risks (Li et al. 2021; Wang et al. 2019), public satisfaction (Li et al. 2020; Fu et al. 2023), inflation risks (Zhang et al. 2021; Wang et al. 2018), and force majeure (Li et al. 2018; Wang et al. 2018).
Although these summarized risk indicators can provide some reference for this study, they mostly analyze the project itself and do not consider the different categories of risks that different project participants should bear. Moreover, many of them involve qualitative data, which is often difficult to obtain comprehensively in practice.
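Table 1
Preliminary list of risk factors for water environment treatment PPP projects.

| Risk Type | Specific Risk Indicators |
| --- | --- |
| Government-induced | Tax adjustment risk |
| Government-induced | Government intervention and credit issues |
| Government-induced | Inadequate legal and regulatory frameworks |
| Social capital-induced | Quality completion risk |
| Social capital-induced | Construction technology risk |
| Social capital-induced | Contract change risk |
| Social capital-induced | Schedule delay risk |
| Social capital-induced | Operational cost overrun risk |
| Social capital-induced | Project revenue shortfall risk |
| Social capital-induced | Dispute and infringement risk |
| Social capital-induced | Social capital change risk |
| External environment-induced | Environmental damage risk |
| External environment-induced | Geological condition risk |
| External environment-induced | Social stability risk |
| External environment-induced | Public opinion risk (public satisfaction) |
| External environment-induced | Inflation risk |
| External environment-induced | Force majeure risk (political, natural conditions) |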
The primary objectives of water environment treatment PPP projects, from the government's perspective, are to restore aquatic ecosystems, address water pollution issues, and maximize environmental, social, and economic benefits (Li et al. 2022). From the perspective of government regulation, the public interest and public attributes of water environment treatment projects must be considered. Evidently, government departments bear the supervisory responsibility for project operation and ecological restoration, which is often overlooked or less emphasized by other stakeholders. Consequently, many risk incidents arise from this aspect (Li et al. 2020). In light of these realities, this study includes continuous operation risks and natural ecological environment risks among the categories of risks that the government must prioritize in PPP projects for water environment treatment. In addition, discussing specific water environment treatment projects allows for the collection of as much subjective risk data as possible and the application of empirical analysis to validate the model's viability.
As a result, Jiujiang City, one of the first batch of demonstration cities for green development in China's Yangtze River Economic Belt, was chosen as the area of study. Jiujiang, the only city in Jiangxi Province located along the Yangtze River, has 152 kilometers of Yangtze River shoreline and two-thirds of the water surface and shoreline of Poyang Lake, China's largest freshwater lake. High-quality water resources are Jiujiang's most important ecological assets for high-quality development. In promoting the "Yangtze River Protection", Jiujiang is therefore saddled with significant responsibilities, heavy burdens, and numerous obstacles. This PPP-modeled project has a total investment of 76.99 billion yuan, prioritizes ecological and green development, and focuses on the management of the "Two Rivers" basin system. It is modeled after the Phase I project of the comprehensive water environment treatment in the central urban area of Jiujiang.
Adjustments were made to the preliminary list of risk factors for water environment treatment PPP projects using resources such as the Chinese government's public data platform, various monitoring stations in Jiujiang City, and joint bidding units. Five years' worth of project risk factor information was compiled, and the specific contents of various risk categories were categorized. This resulted in a set of twelve risk data features for water environment treatment PPP projects, covering natural environment, ecological environment, socio-economic, and project entity subsystems. Indicators of evaluation are detailed in Table 2 below. The natural environment and ecological environment subsystems primarily reflect the government's natural ecological environment regulatory risks, whereas the socio-economic subsystem takes into account the significant public impact of water environment treatment PPP projects. The project entity subsystem, on the other hand, takes into account the government's primary role in the supervision of the sustainable operation of PPP projects, excluding engineering risks that the government does not share, such as cost overrun risks and construction technology risks.
Table 2
Risk data feature table for water environment treatment PPP projects.

| System Name | Risk Feature Name | Risk Feature-Related Evaluation Indicator Set |
| --- | --- | --- |
| Natural Environment Subsystem | Water environment | Hydro-sediment, Water quality, Water temperature, Water level, Sediment |
| Natural Environment Subsystem | Acoustic environment | Noise |
| Natural Environment Subsystem | Atmospheric environment | Dust, Exhaust emissions, Local climate |
| Natural Environment Subsystem | Surface environment | Solid waste, Soil nutrients, Geology, Soil erosion, Soil salinization, Soil marshification, Landslides |
| Ecological Environment Subsystem | Terrestrial organisms | Terrestrial animal and plant growth risks |
| Ecological Environment Subsystem | Aquatic organisms | Safety risks of aquatic animals, Aquatic plants, Aquatic microorganisms |
| Socio-economic Subsystem | Livelihood security | Public satisfaction, Employment opportunities |
| Socio-economic Subsystem | Local economic development | Regional industry, Regional agriculture, Urban planning, Surrounding landscape, Regional economic risk |
| Project Entity Subsystem | Legal risks | Dispute, Breach, Infringement risks, Planning, standards, and contract change risks |
| Project Entity Subsystem | Operational risks | Construction risks caused by social capital, Operation and maintenance management risks |
| Project Entity Subsystem | Financial risks | Interest rate change risk, Revenue shortfall risk, Social capital change risk |
| Project Entity Subsystem | Force majeure risks | Force majeure due to political and natural conditions |
Using the equal interval method, the collected data were labeled with one of four risk levels: low, medium, higher, and high. A total of 927 risk data records were collected, covering 12 risk features and 37 risk feature evaluation indicators. Of these, 789 were low-level, 86 medium-level, 37 higher-level, and 15 high-level risk records. The vast majority of the data had a low risk level, so the collected data constitute a small, imbalanced sample. Deep learning strategies exemplified by multilayer neural networks are unsuitable for this scenario (Tsai et al. 2023; Abdoli et al. 2023). Consequently, a conventional machine learning strategy was chosen for this research.
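The labeling step can be illustrated with a minimal sketch. It assumes that each record has a single numeric composite risk score (the quantity actually binned in the study is not stated here), and the function name `equal_interval_labels` and the sample scores are hypothetical.

```python
from collections import Counter

LEVELS = ["low", "medium", "higher", "high"]

def equal_interval_labels(scores, levels=LEVELS):
    """Assign each score to one of len(levels) equal-width bins over [min, max]."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / len(levels)
    labels = []
    for s in scores:
        if width == 0:
            idx = 0  # all scores identical: everything falls in the first bin
        else:
            # the maximum score would index one past the last bin, so clamp it
            idx = min(int((s - lo) / width), len(levels) - 1)
        labels.append(levels[idx])
    return labels

# Hypothetical composite risk scores on a 0-1 scale (illustrative only).
scores = [0.0, 0.05, 0.10, 0.20, 0.30, 0.55, 0.80, 1.0]
labels = equal_interval_labels(scores)

# Class counts as reported in the text, used to quantify the imbalance.
reported = Counter({"low": 789, "medium": 86, "higher": 37, "high": 15})
total = sum(reported.values())
shares = {level: round(count / total, 3) for level, count in reported.items()}
print(labels)
print(total, shares)
```

With the reported counts, roughly 85% of the records fall in the low class, which is why macro-averaged metrics are more informative than raw accuracy for this dataset.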
To analyze the relationships between the various indicators and risk levels, the collected risk data were presented as scatter matrix plots. In this study, a total of 29 indicator data were collected. Here, only four indicators have been selected: water quality, local industrial economy, local climate, and soil erosion. These indicators were plotted pairwise to illustrate the connection between the risk levels and the project risk characteristics. Figure 4 shows the results. Under the four risk levels, the distribution of the four indicators is relatively concentrated. This suggests that the risk indicators are correlated with one another and with the risk classification of water environment treatment PPP projects.
3.2 Risk Feature Contribution Analysis
By analyzing the contributions of different feature indicators, we can determine how strongly each feature influences the model, which helps to explain and adjust the model during its construction. Figure 5 illustrates the distribution of SHAP values for the risk indicator features. Water environment risk, operation and maintenance management risk, and local economic development have the greatest impact on the prediction of risk levels for water environment treatment PPP projects. In other words, the higher these three indicators are, the greater the probability that the project risk level will be high, which is consistent with the real-world scenario (Li et al. 2022). From the government's perspective, the primary objective of water environment treatment PPP projects is to strengthen water environment governance to achieve ecological restoration. Water environment risk is therefore the most influential indicator of project risk. Achieving this objective is contingent upon the project being operated and maintained effectively, so controlling operation and maintenance management risk is a prerequisite for the project's sustainable implementation. Moreover, the ultimate objective of the government's implementation of water environment treatment PPP projects is to improve or promote local economic development, which is directly related to the well-being of the entire society. Given this ascending order of objectives, it is reasonable that the contribution of local economic development risk ranks last among the three.
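To make the notion of a feature contribution concrete, the sketch below computes exact Shapley values, the quantity that SHAP estimates, for a toy value function with three features. The contribution numbers and the interaction term are invented for illustration and are not taken from Figure 5; practical SHAP implementations rely on approximations rather than this full enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values: weighted average marginal contribution over all coalitions."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # weight of a coalition of this size in the Shapley formula
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

# Hypothetical per-feature effects on a risk score (illustrative only).
CONTRIB = {"water_environment": 0.5, "operation_maintenance": 0.3, "local_economy": 0.1}

def toy_value(coalition):
    """Toy model output when only the features in `coalition` are known."""
    value = sum(CONTRIB[f] for f in coalition)
    # assumed interaction: the first two features reinforce each other
    if {"water_environment", "operation_maintenance"} <= coalition:
        value += 0.1
    return value

phi = shapley_values(toy_value, list(CONTRIB))
ranking = sorted(phi, key=phi.get, reverse=True)
print(phi, ranking)
```

The interaction term is split equally between the two interacting features, and the values sum to the difference between the full-coalition output and the empty-coalition output, which is the additivity property that makes Shapley-based rankings interpretable.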
3.3 Dataset Construction
Based on the determined risk feature set for water environment treatment PPP projects, we filled in the missing data values. The method begins by setting a threshold that determines whether a feature is retained. If the percentage of missing values for a feature exceeds this threshold, the feature is removed. In this study, the threshold for feature deletion is set at 80%. If the threshold is not exceeded, the KNN algorithm is used to locate the k nearest samples to the sample with the missing value, and the missing value is filled with the average value of the corresponding feature over those samples.
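The two-stage treatment of missing values can be sketched as follows. This is a minimal illustration under stated assumptions: missing entries are encoded as `None`, distances are computed only over features observed in both samples, and the function names, the value of k, and the sample rows are hypothetical. A library routine such as scikit-learn's `KNNImputer` would be a more practical choice on real data.

```python
import math

def drop_sparse_features(rows, threshold=0.8):
    """Remove columns whose share of missing values (None) exceeds the threshold."""
    n_rows, n_cols = len(rows), len(rows[0])
    keep = [j for j in range(n_cols)
            if sum(r[j] is None for r in rows) / n_rows <= threshold]
    return [[r[j] for j in keep] for r in rows], keep

def knn_impute(rows, k=2):
    """Fill each missing entry with the mean of that feature over the k nearest rows."""
    def distance(a, b):
        # root mean squared difference over features observed in both rows
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # only rows where feature j is observed can donate a value
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical indicator matrix: 6 samples x 4 features, None marks a missing value.
rows = [
    [1.0, 2.0, None, 10.0],
    [1.1, None, None, 11.0],
    [5.0, 6.0, None, 50.0],
    [5.2, 6.2, None, 52.0],
    [0.9, 1.8, 3.0, 9.0],
    [5.1, 6.1, None, 51.0],
]
reduced, kept = drop_sparse_features(rows, threshold=0.8)
filled = knn_impute(reduced, k=2)
print(kept, filled[1])
```

In this example the third feature is missing in five of six samples (about 83%), so it is dropped, and the remaining gap is filled from the two samples most similar to the incomplete one.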
To ensure model accuracy and eliminate the influence of dimensions, we standardized the original risk indicator data using a standardization algorithm (Wang et al. 2019). The formula is given in Eq. (2), where \({x}_{i}\) is the \(i\)th of the \(n\) observed values of a risk evaluation indicator and \({x}_{std}\) is the corresponding standardized value.
$${x}_{std}=\frac{{x}_{i}-\frac{1}{n}\sum _{j=1}^{n} {x}_{j}}{\sqrt{\frac{1}{n}\sum _{j=1}^{n}{\left({x}_{j}-\frac{1}{n}\sum _{k=1}^{n} {x}_{k}\right)}^{2}}}\tag{2}$$
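As a worked example of the standardization step, the sketch below applies a z-score transform (subtract the mean, divide by the population standard deviation) to one indicator. It assumes that Eq. (2) is intended as a z-score; the function name and the sample values are hypothetical.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Z-score standardization: (x - mean) / population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # divides by n, matching the 1/n factor in the formula
    if std == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - mean) / std for v in values]

# Hypothetical observations of one indicator (illustrative only).
water_level = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(water_level)
print(z)
```

For these values the mean is 5 and the standard deviation is 2, so the standardized series has zero mean and unit variance. Z-scores are not confined to the interval from 0 to 1; if that range is required, min-max scaling would be the usual alternative.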
After processing missing features and dimensionless treatment, the distribution of feature values is between 0 and 1. The dataset is divided into training and testing sets in a 7:3 ratio, and a support vector machine classifier is used to classify the data.
3.4 Model Construction and Training
The final performance of the Stacking ensemble learning model is largely determined by the accuracy and similarity of the base classifiers. A superior classifier based on ensemble learning should adhere to the "good but different" principle (Chung et al. 2023). Initially, we conducted experiments with the scikit-learn machine learning library for Python on the Jupyter Notebook platform. Six classification models were independently developed: a KNN classifier, a CART classifier, an LDA classifier, an NB classifier, an SVM classifier, and the WETPR-SVM classifier. We trained each of them on the training set using cross-validation, random search, and learning curves to determine the optimal hyperparameter combination.
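A Stacking ensemble of the kind described can be sketched with scikit-learn's `StackingClassifier`. This is only an illustration: the data are synthetic stand-ins for the 12-feature, four-class risk dataset, the hyperparameters are arbitrary, and the use of KNN, CART, LDA, and NB as base learners with an RBF-kernel SVM as the meta-learner is an assumption, since the exact composition of WETPR-SVM is not specified in this section.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 12 features, 4 classes (not the study's data).
X, y = make_classification(n_samples=400, n_features=12, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Deliberately heterogeneous base learners ("good but different").
base_learners = [
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("cart", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("lda", LinearDiscriminantAnalysis()),
    ("nb", GaussianNB()),
]

# Assumed meta-learner: an RBF-kernel SVM trained on out-of-fold base predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=SVC(kernel="rbf"),
                           cv=5)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
print(f"stacked test accuracy: {test_accuracy:.4f}")
```

The internal cross-validation (`cv=5`) ensures that the meta-learner is fitted on predictions the base learners made for samples they did not see during training, which limits leakage between the two levels.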
Hyperparameter tuning was then performed for the Support Vector Machine (SVM) classifier (Chou et al. 2013). By adjusting multiple SVM parameters and comparing the linear kernel function (LinearSVM), the Gaussian kernel function (RBFSVM), and the sigmoid kernel function (Sigmoid), the classification accuracy was progressively improved. The results are depicted in Fig. 6, and the accuracy on the test set is shown in Table 3.
Figure 6 and Table 3 reveal that the Gaussian kernel function (RBFSVM) achieved the highest accuracy on the test set in this experiment, with a value of 0.9043. By comparison, the test set accuracies of the linear kernel function (LinearSVM) and the sigmoid kernel function (Sigmoid) were 0.8191 and 0.8297, respectively. These results suggest that the Gaussian kernel function provides superior classification performance for this problem. A likely reason is that the Gaussian kernel implicitly maps the data to a higher-dimensional space in which the classes become closer to linearly separable. Given that risk classification for water environment governance PPP projects may involve complex nonlinear relationships, the Gaussian kernel function may be better suited to such problems.
Table 3
Test set accuracy of kernel functions.

| Kernel function | LinearSVM | RBFSVM | Sigmoid |
| --- | --- | --- | --- |
| Test set accuracy | 0.8191 | 0.9043 | 0.8297 |
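A kernel comparison of the kind summarized in Table 3 can be set up as in the sketch below, which fits one SVM per kernel on a stratified 7:3 split. The data are synthetic, the values of `C` and `gamma` are scikit-learn defaults rather than the tuned values of the study, and the accuracies it prints are not expected to match Table 3.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data: 12 features, 4 classes (not the study's data).
X, y = make_classification(n_samples=400, n_features=12, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

# 7:3 split, stratified so that each risk class keeps its proportion in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

results = {}
for kernel in ("linear", "rbf", "sigmoid"):
    # scaling is fitted on the training data only, inside the pipeline
    model = make_pipeline(StandardScaler(),
                          SVC(kernel=kernel, C=1.0, gamma="scale"))
    model.fit(X_train, y_train)
    results[kernel] = model.score(X_test, y_test)

best_kernel = max(results, key=results.get)
for kernel, acc in results.items():
    print(f"{kernel:8s} test accuracy = {acc:.4f}")
print("best kernel:", best_kernel)
```

Wrapping the scaler and the classifier in one pipeline keeps the test set out of the standardization statistics, so the reported test accuracy is not optimistically biased.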
3.5 Model Comparison and Evaluation
This study's ensemble learning model ultimately completes the four-classification task for water environment governance PPP project risk. Therefore, we use accuracy (Accuracy), macro-average precision (Macro_P), macro-average recall (Macro_R), and macro-average F1 score (Macro_F1) as four indicators to evaluate the performance of the model.
Accuracy is the ratio of the number of project risk samples correctly classified by the model to the total number of project risk samples, which reflects the overall classification accuracy of the model (Choubin et al. 2023). The formula is given in Eq. (3):
$$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}\tag{3}$$
The macro-average precision is the mean of the precision values across all classes. Precision is the ratio of the number of risk samples correctly classified into a particular category to the total number of risk samples that the model assigned to that category. Its formula is given in Eq. (4):
$$\text{Macro\_P}=\frac{1}{n}\sum _{i=1}^{n} {P}_{i}\tag{4}$$
The macro-average recall is the mean of all recall values across all classes. Recall is the proportion of risk samples correctly classified by the model for a particular project risk category relative to the total number of risk samples in that category. The formula for its calculation is shown in Eq. (5):
$$\text{Macro\_R}=\frac{1}{n}\sum _{i=1}^{n} {R}_{i}\tag{5}$$
To evaluate the performance of a classification model in practical applications, it is frequently necessary to consider the model's precision and recall jointly. The F1 score, which is the harmonic mean of the two, is therefore used as an evaluation metric. The macro F1 score is the mean of the F1 scores for all classes, as shown in Eq. (6):
$$\text{Macro\_F1}=\frac{1}{n}\sum _{i=1}^{n} {F}_{i}\tag{6}$$
In Equations (3)–(6), \(TP\) represents the number of positive samples predicted as positive by the model; \(FP\) represents the number of negative samples predicted as positive by the model; \(FN\) represents the number of positive samples predicted as negative by the model; \(TN\) represents the number of negative samples predicted as negative by the model; \(n\) represents the number of risk categories; and \({P}_{i}\), \({R}_{i}\), and \({F}_{i}\) represent the precision, recall, and F1 score of the model for the \(i\)th category, respectively.
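The four metrics can be computed directly from true and predicted labels, treating each class in turn as the positive class. The following sketch uses a small hypothetical set of eight predictions; the function name `macro_metrics` and the convention of scoring an undefined precision or recall as zero are assumptions made for illustration.

```python
def macro_metrics(y_true, y_pred, classes):
    """Accuracy plus macro-averaged precision, recall and F1 (one-vs-rest per class)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # assumed convention: undefined ratios (zero denominator) count as 0
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        precisions.append(p_c)
        recalls.append(r_c)
        f1s.append(f_c)
    n = len(classes)
    return {
        "accuracy": accuracy,
        "macro_p": sum(precisions) / n,
        "macro_r": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }

classes = ["low", "medium", "higher", "high"]
# Hypothetical labels: the single "high" sample is misclassified.
y_true = ["low", "low", "low", "low", "medium", "medium", "higher", "high"]
y_pred = ["low", "low", "low", "medium", "medium", "low", "higher", "higher"]
metrics = macro_metrics(y_true, y_pred, classes)
print(metrics)
```

Here the accuracy is 0.625 while the macro F1 is about 0.479, because the macro average gives the missed minority class the same weight as the majority class. This is the reason macro-averaged metrics are reported alongside accuracy for imbalanced risk data.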
To compare the performance of different classification algorithms, we trained multiple models on the dataset and evaluated their effectiveness. LDA is the only linear algorithm among them; the others are nonlinear. The procedure is as follows: (1) divide the training set; (2) evaluate the algorithm models using 10-fold cross-validation; (3) generate six distinct models to predict new data; and (4) compare classification accuracy. As shown in Table 4, the WETPR-SVM model achieves the highest Accuracy, Macro_P, Macro_R, and Macro_F1 scores, at 0.9025, 0.9055, 0.9026, and 0.9021, respectively. This indicates that the WETPR-SVM model constructed in this study outperforms the traditional single machine learning classification models in overall performance and classifies the risk of water environment governance PPP projects with greater accuracy and generalizability. Therefore, WETPR-SVM is selected as the optimal model for predicting the risk classification of water environment governance PPP projects.
Table 4
Performance evaluation of prediction models.

| Classifier | Accuracy | Macro_P | Macro_R | Macro_F1 |
| --- | --- | --- | --- | --- |
| KNN | 0.8532 | 0.8561 | 0.8537 | 0.853 |
| CART | 0.8251 | 0.8252 | 0.8249 | 0.8253 |
| LDA | 0.8469 | 0.8471 | 0.847 | 0.8469 |
| NB | 0.8778 | 0.8781 | 0.8779 | 0.8775 |
| SVM | 0.8467 | 0.8465 | 0.8469 | 0.8462 |
| WETPR-SVM | 0.9025 | 0.9055 | 0.9026 | 0.9021 |
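The comparison procedure behind Table 4 can be approximated with scikit-learn's `cross_validate`, which evaluates each model under 10-fold cross-validation on several metrics at once. In this sketch the data are synthetic, the hyperparameters are untuned defaults, and WETPR-SVM is omitted because its construction is not specified in this section, so the printed numbers will differ from Table 4.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 12 features, 4 classes (not the study's data).
X, y = make_classification(n_samples=400, n_features=12, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "CART": DecisionTreeClassifier(max_depth=5, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "NB": GaussianNB(),
    "SVM": SVC(kernel="rbf"),
}
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]

# Stratified folds keep the class proportions similar in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

summary = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    summary[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

for name, row in summary.items():
    print(name, {m: round(v, 4) for m, v in row.items()})
```

Using the same fold object for every model means all classifiers are scored on identical partitions, so differences in the summary reflect the models rather than the split.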
The box plots in Fig. 7 show that the range of prediction accuracy for the WETPR-SVM model is the smallest, at 0.07, indicating relatively high stability and consistent performance across various datasets. In contrast, the difference between the upper and lower quartiles of prediction accuracy for the NB model is 0.2636, the greatest variation among the models, which may be attributable to its inability to effectively manage multi-source heterogeneous data, missing values, and class imbalance. The CART model has the lowest median prediction accuracy, 0.8251, which may be because decision-tree-based classifiers are sensitive to noisy data and prone to overfitting. These results indicate that the WETPR-SVM model is superior for predicting the risk classification of water environment treatment PPP projects.
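The stability measures read from a box plot (median, range, and interquartile range) can be computed as in the sketch below. The two lists of per-fold accuracies are hypothetical and only illustrate a stable and a volatile model; they are not the values behind Fig. 7.

```python
from statistics import median, quantiles

def spread_summary(fold_accuracies):
    """Median, range and interquartile range of per-fold accuracies."""
    q1, _, q3 = quantiles(fold_accuracies, n=4, method="inclusive")
    return {
        "median": median(fold_accuracies),
        "range": max(fold_accuracies) - min(fold_accuracies),
        "iqr": q3 - q1,
    }

# Hypothetical 10-fold accuracies (illustrative only).
stable_model = [0.88, 0.89, 0.90, 0.90, 0.91, 0.91, 0.92, 0.92, 0.93, 0.95]
volatile_model = [0.60, 0.70, 0.75, 0.85, 0.88, 0.90, 0.92, 0.93, 0.95, 0.97]

stable = spread_summary(stable_model)
volatile = spread_summary(volatile_model)
print(stable)
print(volatile)
```

A small range and interquartile range indicate that a model's accuracy changes little from fold to fold, whereas a wide box signals sensitivity to the particular training and validation partition.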
In conclusion, the ensemble learning-based approach proposed in this study can better utilize the benefits of various machine learning algorithms, overcome the limitations of single algorithms in dealing with multi-source heterogeneous data, missing values, and class imbalance issues, and improve prediction accuracy and generalization ability.