We carefully assess the performance of the developed models throughout our study and benchmark the outcomes of each. To evaluate how well they recognize different kinds of network intrusions, we examined their detection accuracy, precision, recall, F1-score, and ROC-AUC. Table 1 shows that the models achieve high accuracy in detecting network intrusions.
Table 1
Performance metrics of our XAI-empowered architectures

| Proposed Architecture | Accuracy | Precision | Recall | F1 | ROC-AUC |
| ExplainDTC | 93.6% | 95.0% | 94.9% | 95.0% | 93.0% |
| SecureForest-RFE | 94.6% | 95.8% | 95.8% | 95.8% | 94.0% |
| RationaleNet | 93.7% | 95.1% | 95.0% | 95.0% | 93.0% |
| CNNShield | 93.6% | 95.5% | 94.4% | 94.8% | 93.0% |
The outcomes of our XAI-enhanced architectures, reported in Table 2, demonstrate their advantage over other state-of-the-art approaches: the experimental results confirm that our framework achieves the highest accuracy and detection rate among the compared methods on UNSW-NB15.
Table 2
Comparison of our XAI-empowered architectures with state-of-the-art ML/DL-based models

| Proposed Solution | Dataset | Accuracy (%) | XAI |
| Marwa et al. [50] | UNSW-NB15 | 86.6 | Yes |
| Sree et al. [52] DT | UNSW-NB15 | 85.0 | Yes |
| Sree et al. [52] XGBoost | UNSW-NB15 | 89.8 | Yes |
| Sree et al. [52] MLP | UNSW-NB15 | 89.9 | Yes |
| [53] XGBoost | UNSW-NB15 | 88.13 | No |
| [54] MLP | UNSW-NB15 | 84.24 | No |
| [55] DT | UNSW-NB15 | 89.7 | No |
| [55] RF | UNSW-NB15 | 90.3 | No |
| ExplainDTC | UNSW-NB15 | 93.6 | Yes |
| SecureForest-RFE | UNSW-NB15 | 94.6 | Yes |
| RationaleNet | UNSW-NB15 | 93.7 | Yes |
| CNNShield | UNSW-NB15 | 93.6 | Yes |
In the following subsections, we employ multiple explainability techniques to elucidate how the architectures operate. The core objective is to clarify and justify their outcomes.
5.1 ExplainDTC
We start our investigation with ExplainDTC, a model capable of distinguishing between normal and attack behavior in network traffic data with high accuracy.
We also determine the top 15 features and calculate their relative importance using both the scikit-learn library and ELI5's Permutation Importance toolkit (Fig. 7). The feature importance metric in use considers the decrease in node impurity weighted by the probability of reaching that node. Our experiments show that both outputs yield very similar feature importances, with "sttl" (the source-to-destination time-to-live value) ranked as the most important feature. Our feature significance analysis is further supported by the decision tree visualization: the most crucial features, including "sttl," appear prominently in the upper levels of the tree. This supports the idea that these features exert a stronger influence on the classification process and highlights their importance for network traffic analysis.
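To make the workflow concrete, a minimal sketch of this two-way importance check is given below; the synthetic dataset, model settings, and feature names (f0, f1, ...) are stand-ins for the preprocessed UNSW-NB15 features, not the authors' exact pipeline.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from eli5.sklearn import PermutationImportance

# Stand-in for the preprocessed UNSW-NB15 feature matrix.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(20)])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

dtc = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

# Impurity-based importance: mean decrease in node impurity, weighted by
# the probability of reaching that node (scikit-learn's built-in metric).
mdi = pd.Series(dtc.feature_importances_, index=X.columns)
print(mdi.sort_values(ascending=False).head(15))

# ELI5 permutation importance: shuffle one column at a time on held-out
# data and record the resulting drop in the model's score.
perm = PermutationImportance(dtc, random_state=0).fit(X_test, y_test)
print(pd.Series(perm.feature_importances_, index=X.columns)
        .sort_values(ascending=False).head(15))
```

Agreement between the two rankings, as observed for "sttl", is a useful sanity check, since the two methods measure importance in different ways.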
Decision trees such as ExplainDTC provide explainability by nature, because the resulting trees can be visualized directly. They are relatively simple to understand and interpret, which makes them an easy choice for human analysts who need to understand how the model reaches its predictions. Our research demonstrates high accuracy in identifying Normal and Attack activity by using decision trees for network traffic analysis. The tree lets us investigate each decision level, along with the corresponding feature and splitting value for each condition. By evaluating these conditions for each network traffic sample, the decision tree algorithm directs the categorization process: it starts at the root of the tree and works its way down, evaluating each condition along the way. If a condition is met, the sample goes down the left branch; otherwise, it goes down the right branch. Furthermore, the classification prediction for each class is governed by the tree's maximum depth. As Figs. 8 and 9 illustrate, explainability decreases as the depth of the tree increases.
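Reusing `dtc` and `X` from the sketch above, the per-level conditions and tree diagrams of the kind shown in Figs. 8 and 9 can be reproduced along the following lines:

```python
from sklearn.tree import export_text, plot_tree
import matplotlib.pyplot as plt

# Text view: every decision level with its test feature and splitting
# value; samples satisfying "feature <= threshold" follow the left branch.
print(export_text(dtc, feature_names=list(X.columns), max_depth=3))

# Graphical view analogous to Figs. 8-9; deeper trees quickly become
# harder to read, which is the explainability trade-off noted above.
plot_tree(dtc, feature_names=list(X.columns),
          class_names=["Normal", "Attack"], max_depth=3, filled=True)
plt.show()
```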
Figure 10 illustrates the important features for the two classes in ExplainDTC. Among the features, "sttl", "synack", "sbytes", "dbytes", and "spkts" have the highest scores, indicating their significance in determining the class predictions. ExplainDTC is a binary classifier capable of distinguishing between two classes, "Normal" and "Attack"; therefore, its SHAP feature importance plot shows the importance scores of the features for both classes separately. Each feature is evaluated based on its contribution to the prediction of each class, indicating how much it influences the classification decision for that class.
Among these features, "sttl" stands out as the most important: it has the largest impact on the model's predictions. On average, a change in the value of "sttl" leads to a substantial shift in the predicted probability of the "Normal" class, with an average change of 28 percentage points (0.28 on the x-axis). Other features, such as "synack", "sbytes", "dbytes", and "spkts", also contribute significantly to the model's predictions, but their impacts are relatively lower than that of "sttl".
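A sketch of how such a per-class SHAP importance plot can be produced, reusing the fitted `dtc` and `X_test` from the earlier sketch (the list-of-arrays output shown here reflects older SHAP releases; newer ones return a single three-dimensional array):

```python
import shap

explainer = shap.TreeExplainer(dtc)
shap_values = explainer.shap_values(X_test)  # one array per class

# Mean |SHAP| per feature and per class: the x-axis is the average change
# in predicted probability attributed to each feature (e.g., 0.28 for
# "sttl" in Fig. 10).
shap.summary_plot(shap_values, X_test, plot_type="bar",
                  class_names=["Normal", "Attack"])
```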
The waterfall plots in Fig. 11 represent the local interpretation of ExplainDTC for the second and thirteenth instances in the test data using SHAP values. Starting from the base value of 0.361 at the bottom of plot a) for the second instance, which is the expected prediction (the mean of all predictions), the plot breaks down the prediction for the given instance; at the top of plot a) is the model's prediction for that instance. The model predicts a value of 0, corresponding to the "Normal" class. Analyzing the SHAP values, we observe that the feature "sbytes" has the largest contribution to the model's prediction, with a SHAP value of -0.27. Following that, "sttl," "spkts," "ct_srv_dst," and "sload" have SHAP values of -0.18, +0.12, -0.07, and +0.06, respectively. Here, "sbytes," "sttl," and "ct_srv_dst" make negative contributions to the prediction, while "spkts" and "sload" make positive contributions. Summing all the SHAP values, -0.27 - 0.18 + 0.12 - 0.07 + 0.06 - 0.02 and so on through the last feature value, gives \(f\left(x\right)-E\left[f\left(x\right)\right]\), resulting in a prediction of 0, i.e., the "Normal" class. Therefore, based on the contributions of the individual features, the model predicts that this instance of the test dataset belongs to the "Normal" class. In a similar manner, the model arrives at a prediction of 1, the "Attack" class, for instance thirteen, as shown in plot b).
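The corresponding waterfall view can be sketched as follows on the stand-in model; `exp[i, :, 1]` selects the contributions towards the second ("Attack") output for instance `i`, and the indexing assumes the newer SHAP `Explanation` API:

```python
import shap

explainer = shap.TreeExplainer(dtc)
exp = explainer(X_test)  # Explanation of shape (rows, features, classes)

i = 1  # e.g., the second test instance
# The plot starts at the base value E[f(x)] and adds each feature's
# SHAP value phi_j until the model output f(x) is reached:
#   sum_j phi_j = f(x) - E[f(x)]
shap.plots.waterfall(exp[i, :, 1])
```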
5.2 SecureForest-RFE
The important features for the two classes in SecureForest-RFE are shown in Fig. 12. The features with the highest scores, indicating their significance in determining the class predictions, are "sbytes", "sinpkt", "dttl", and "proto". Among these, "sbytes" stands out as the most important: it has the largest impact on the model's predictions. On average, a change in the value of "sbytes" leads to a considerable shift in the predicted probability of the "Normal" class, with an average change of 17 percentage points (0.17 on the x-axis).
The dependence plot in Fig. 13 for the class prediction of SecureForest-RFE exhibits a partially monotonic pattern between the feature of interest, "sload", and the interaction feature, "proto". When the feature value of "sload" lies between 0.0 and 1.5 (on the x-axis), larger values of "proto" (highlighted in red) lead to a decrease in the SHAP value of "sload" (-0.10 to -0.15). This decrease pushes the model's prediction towards the "Normal" class. Over the same range of "sload" (0.0-1.5 on the x-axis), smaller values of "proto" (highlighted in blue) produce both increases (0.10 to 0.20) and decreases (-0.10 to -0.20) in the SHAP value of "sload" for the majority of instances in that region; the decreases push the model's prediction towards the "Attack" class. For larger "sload" values (greater than 2), the impact of "proto" on the SHAP values of "sload" becomes less pronounced, suggesting that variations in "proto" have a weaker influence on the model's prediction for instances with higher "sload" values. In this region, a larger proportion of instances are classified as "Normal," and the "Attack" class is barely visible. Therefore, the relationship between "proto" and the model's prediction becomes less significant as "sload" increases.
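A hedged sketch of such a dependence plot, reusing `shap_values` and `X_test` from the earlier sketch; the placeholder feature names "f3" and "f7" stand in for "sload" and "proto":

```python
# x-axis: the feature of interest; y-axis: its SHAP value; colour: the
# interaction feature (red = high, blue = low), as in Fig. 13.
shap.dependence_plot("f3", shap_values[1], X_test, interaction_index="f7")
```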
A decision plot is a useful tool for presenting multiple features of a dataset in a local explanation. In Fig. 14, we display the decision plot for 100 observations of the Recursive Feature Elimination (RFE) test data, using the SecureForest-RFE classifier. The x-axis represents the model's predicted output; the plot visualizes the model output values and corresponding feature values for the 100 observations. The top of the plot indicates the probability of each observation belonging to the "Normal" class or the "Attack" class: positive values indicate predictions towards the "Attack" class, while negative values indicate predictions towards the "Normal" class. The y-axis lists the features, 19 in total for the RFE observations, ordered by descending importance calculated over the plotted observations. Based on this importance, the top five features are "sttl," "sjit," "sinpkt," "dinpkt," and "dbytes." These have higher absolute SHAP values than the other 14 features, indicating a stronger contribution to the model's prediction. Small line segments connect consecutive features; when the slope of the segment between two features is less steep (0-45 degrees for a positive slope or 135-180 degrees for a negative slope), i.e., has a smaller absolute value, the feature contributes strongly to the model prediction. This shows how the feature values push the prediction towards either the "Normal" or the "Attack" class.
At the top of the plot, each sample's predicted value is represented by a colored line striking the x-axis, with the line's color corresponding to the prediction value on a spectrum. In almost half of the observations, features shown in blue push the probability towards the left (the "Normal" class), while in the other half, features shown in red push the probability towards the right (the "Attack" class). Moving from the bottom to the top of the plot, the SHAP values for each feature are cumulatively added to the model's base value of 0.63, resulting in an output of either 0 or 1. This demonstrates how each feature contributes to the overall prediction.
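Under the same stand-in assumptions, a decision plot over the first 100 test observations can be drawn as follows:

```python
# Each line starts at the base value at the bottom and accumulates
# per-feature SHAP contributions from bottom to top, ending at the
# model output at the top (cf. Fig. 14).
base = explainer.expected_value[1]  # base value for the "Attack" class
shap.decision_plot(base, shap_values[1][:100], X_test.iloc[:100])
```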
Force plots for single "Normal" and "Attack" predictions by SecureForest-RFE are explained using Fig. 15 and Fig. 16.
The model accurately predicts the instance in Fig. 15 as an "Attack" with a probability of 1.00 (against a base value of 0.3607). The majority of the features tend to push the score towards 1; features such as "sinpkt", "dttl", "sbytes", "spkts", "dpkts", and "dbytes" have the most significant influence in classifying the data sample as "Attack." However, the feature "djit" plays a crucial role in driving the probability of the data sample towards "Normal." Similarly, our model identifies the observation in Fig. 16 as "Normal" traffic, with \(f\left(x\right)\) equal to 0 and a base value of 0.6393; most features push the decision towards "Normal," while only "djit" pushes it towards "Attack."
The SHAP force plot for multiple predictions in Fig. 17 demonstrates the model's ability to distinguish effectively between the "Normal" and "Attack" classes. The graphic combines 1000 cases from the test dataset, showing that the model offers informative justifications of the feature contributions for each instance and assisting in the comprehension of its classification judgments.
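Both the single- and the multi-instance force plots can be sketched in the same stand-in setting:

```python
shap.initjs()  # enables the interactive JS rendering in notebooks

# Single prediction (cf. Figs. 15-16): red arrows push the score from
# the base value towards "Attack", blue arrows towards "Normal".
shap.force_plot(explainer.expected_value[1],
                shap_values[1][0], X_test.iloc[0])

# Many predictions stacked side by side (cf. Fig. 17).
shap.force_plot(explainer.expected_value[1],
                shap_values[1][:1000], X_test.iloc[:1000])
```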
5.3 RationaleNet
We showcase some instances of the LIME Tabular Explainer output, highlighting the top 15 features. This illustrative dashboard effectively demonstrates the features and their respective weights that contributed to the correct classification of a network traffic record as "Class 0" ("Normal") for instance number 2335 in Fig. 18, and as "Class 1" ("Attack") for instance number 1033 in Fig. 19. Features highlighted in orange contribute to the "Attack" category, and those in blue to the "Normal" category. This visual dashboard offers comprehensive and reliable individual explainability for the predicted classifications, enabling cybersecurity analysts to conduct in-depth analyses and follow-up assessments of the reasoning behind specific network traffic classifications made by the model.
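A minimal sketch of such a LIME explanation is given below; the stand-in classifier's `predict_proba` takes the place of RationaleNet's prediction function, and the data are the synthetic stand-ins from the earlier sketches. Any classifier exposing a probability function can be explained this way.

```python
from lime.lime_tabular import LimeTabularExplainer

lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=["Normal", "Attack"],
    mode="classification",
)

# Explain one test instance with its top 15 features; in a notebook,
# exp.show_in_notebook() renders the dashboard view of Figs. 18-19.
exp = lime_explainer.explain_instance(
    X_test.values[0], dtc.predict_proba, num_features=15)
print(exp.as_list())  # (feature condition, weight) pairs
```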
RationaleNet is "black box" due to its complex internal workings, this tool significantly enhances the transparency of predictions. This increased transparency capacity facilitates future cybersecurity research, as analysts can exploit the insights provided by the dashboard to gain valuable understanding of the model's decision-making process. The combination of robust individual explainability and the advantages of neural net’s RationaleNet holds great potential for advancing the field of cybersecurity.
5.4 CNNShield
The feature importance of CNNShield, ordered from the highest to the lowest effect on the model's predictions, is displayed in Fig. 20a). The SHAP feature importance plot here reflects a single-class view, where the focus is on predicting one specific class, such as "Normal" or "Attack," rather than distinguishing between multiple classes; consequently, it shows the importance scores of the features for that single class. For CNNShield, the highest-scoring features are "ct_state_ttl", "sttl", "swin", and "ct_dst_sport_ltm". Among these, "ct_state_ttl" emerges as the most important, with an average change in the predicted absolute probability of 13 percentage points (0.13 on the x-axis). This indicates that variations in "ct_state_ttl" have a significant impact on the model's predictions, primarily influencing the probability of the class the model is designed to predict. Figure 20b) shows the top 20 features of the "Normal" class extracted through CNNShield. A higher feature value is indicated by red, and a lower feature value by blue. On the x-axis, a higher SHAP value to the right corresponds to a higher prediction value, i.e., the "Attack" class, and a lower SHAP value to the left corresponds to a lower prediction value, i.e., the "Normal" class.
This means that when the feature values of "ct_state_ttl", "sttl", "service", and "smean" are larger, their SHAP values correspond to a larger prediction value, so the model is more likely to label the data as the "Attack" class. The smaller these feature values (the bluer the color), the smaller their SHAP values, and the data is labeled as the "Normal" class from the perspective of those features. Conversely, the larger the values of "swin" and "ackdat" (the redder the color), the smaller their SHAP values; when these features take larger values, the model is more likely to consider the data "Normal".
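A sketch of how such a beeswarm summary can be generated follows. Since the exact explainer used for CNNShield is not restated here, the model-agnostic `KernelExplainer` is shown, with a stand-in prediction function where the trained network's probability output would go:

```python
import shap

# Stand-in predict function; for CNNShield this would be the trained
# network's probability output (e.g., the model's predict method).
predict_fn = lambda a: dtc.predict_proba(a)[:, 1]

background = shap.sample(X_train, 100)  # background sample for the explainer
kexpl = shap.KernelExplainer(predict_fn, background)
sv = kexpl.shap_values(X_test.iloc[:100])

# Beeswarm view as in Fig. 20b): each dot is one instance, coloured by
# feature value (red = high, blue = low); positive SHAP values push
# towards "Attack", negative towards "Normal".
shap.summary_plot(sv, X_test.iloc[:100], max_display=20)
```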
The instance in Fig. 21 is "Normal" traffic, and the model correctly detects it. The base value for the model is 0.65, and each feature contributes to the final prediction of Class 0, "Normal." The contributions of features such as "ct_state_ttl", "sttl", "dttl", and "tcprtt" are negative, whereas those of "is_ftp_login", "stcpb", and "dtcpb" are positive; both positive and negative values exist among the features, pulling the predicted probability in both directions. Among these, the features contributing most to classifying the sample as "Normal" are "ct_state_ttl", "swin", and "sttl."
The instance in Fig. 22 is an "Attack" and the model accurately detects it as an "Attack". The baseline value is 0.6197.
The SHAP force plot in Fig. 23 displays a combination of 1000 instances from the test dataset for the CNN model and demonstrates its ability to distinguish effectively between the "Normal" and "Attack" classes. In the first 200 samples, the prominence of blue values in the features "ct_state_ttl" and "sttl" indicates a tendency towards predicting the "Normal" class, representing normal traffic flow. However, from approximately sample 300 to nearly sample 900, red values in the features "ct_state_ttl", "swin", and "dttl" become more prominent, indicating a tendency towards predicting the "Attack" class. From around sample 900 to 1000, blue values again become more apparent in the features "stcpb", "ct_dst_srv", "swin", and "ackdat", suggesting a tendency towards predicting the "Normal" class. This trend is observed even though the prediction leans towards the "Attack" class for the majority of samples.
For the 934th instance, the model accurately detects the attack class in Fig. 24a). Likewise, the 1034th instance is an attack that is correctly predicted by our model, as represented in Fig. 24b).
The dependence plot in Fig. 25 for the class prediction of CNNShield reveals an approximately linear, positive relationship between the "service" feature and the interaction feature "sttl", suggesting that "service" and "sttl" frequently interact in influencing the model's prediction. When the feature value of "service" is less than 0 (on the x-axis), larger values of "sttl" (highlighted in red) lead to a decrease in the SHAP value of "service" (-0.10 to 0.05), pushing the model's prediction towards the "Normal" class. Conversely, when the feature value of "service" is around 1.5, larger values of "sttl" (also highlighted in red) result in an increase in the SHAP value of "service" (0.05 to 0.10), pushing the prediction towards the "Attack" class. For even larger values of "service" (from 3.0 to 4.0 on the x-axis), smaller values of "sttl" (highlighted in blue) increase the SHAP value of "service" (0.05 to 0.20), causing the prediction to lean further towards the "Attack" class.
5.5 Similarity Analysis of Predicted Result
It is advantageous to present cases from the training dataset that share commonalities with the test instance in question, as this improves comprehension of a model's decision-making process. In our study, we concentrate on the eighth test instance, which the model predicted as 1, an "attack." Figure 26a) displays similar instances from the training data, with their level of similarity denoted by the weight given in the last row. These tables also provide easily interpretable explanations by showing feature values alongside their corresponding weights. Figures 26a) and 26b) represent the twenty-three instances closest to the test instance. By analyzing the weights, we can determine that the instance listed under column 0 exhibits the highest similarity to the test instance, as indicated by its weight of 0.406798. This information equips analysts with greater confidence when making final decisions based on the system's output.
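The exact similarity weighting behind Fig. 26 is not restated here, so the following is a purely hypothetical sketch; an exponential kernel over Euclidean distance is one common way to produce weights in (0, 1] of the kind shown, again reusing the stand-in data from the earlier sketches.

```python
import numpy as np

# Hypothetical similarity lookup: not the paper's exact weighting.
test_x = X_test.values[7]                  # the eighth test instance
dists = np.linalg.norm(X_train.values - test_x, axis=1)
weights = np.exp(-dists / dists.std())     # similarity weights in (0, 1]

top = np.argsort(weights)[::-1][:23]       # twenty-three closest rows
table = X_train.iloc[top].T                # one column per neighbour
table.loc["weight"] = weights[top]         # similarity weight in the last row
print(table.iloc[:, :5])                   # preview the five closest
```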