Intelligent wireless sensor networks (WSNs) have emerged alongside advances in wireless communication, micro-electro-mechanical systems, microelectronics, signal processing, and computer networks [1]. A WSN is a heterogeneous system of small sensing devices equipped with general-purpose computing units [2]. WSNs are composed of hundreds or even thousands of small, wireless, self-organizing, low-power nodes that are deployed to monitor and control the environment [3],[4].
WSNs are currently widely employed in defense, aerospace, military, medical and health care, environmental monitoring, and industrial facilities, among other applications [5],[6],[7]. Future applications such as pollution observation, building security, highway traffic monitoring, wildfire monitoring, and water quality monitoring may also incorporate WSNs into their systems. WSNs offer numerous benefits, including the ability to transform raw data into valuable aggregated and categorized information [8].
Security concerns have become particularly severe in systems that use WSNs [9],[10],[11],[12],[13]. Indeed, security in WSNs presents a unique set of issues not encountered in other forms of wireless networks. WSNs can be made more secure with mechanisms such as cryptography, key management, and authentication. Nonetheless, these methods are not enough to thwart all foreseeable threats. WSNs are susceptible to a variety of threats because malicious nodes, that is, nodes that appear to be genuine members of the network but operate on behalf of a third party, can launch a wide range of attacks [14], [15]. An additional layer of security, such as an Intrusion Detection System (IDS), is therefore necessary [16],[17].
A correctly implemented IDS on a wired network can detect the misbehavior of participating nodes and notify other network nodes to take the necessary countermeasures. However, an IDS strategy developed for wired networks cannot easily be applied to WSNs because of their unique network features, which include limited processing power, battery, and memory. An IDS is a critical security mechanism against both insider and outsider attacks in WSNs [18]. It involves the detection of malicious nodes or misbehavior. When an IDS discovers a misbehaving sensor node, it seeks to isolate that node from the rest of the network.
IDSs are classified into two groups according to their detection method. Anomaly detection compares observed behavior against a baseline of normal activity, whereas signature detection matches traffic against known harmful patterns and therefore requires database updates to cover any novel attack patterns [19]. Anomaly-based IDSs have been a hot topic in IDS research because of their success in detecting previously unknown attacks [20].
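The contrast between the two detection families can be made concrete with a minimal sketch. It is illustrative only and is not taken from the cited works: the signature set, the z-score rule, the threshold of 3.0, and all names (`KNOWN_SIGNATURES`, `signature_alert`, `anomaly_alert`, `baseline_rates`) are assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical signature database of known-bad (protocol, port) patterns.
KNOWN_SIGNATURES = {("udp", 0), ("tcp", 31337)}


def signature_alert(protocol, port, signatures=KNOWN_SIGNATURES):
    """Signature detection: flag traffic that matches a stored pattern."""
    return (protocol, port) in signatures


def anomaly_alert(value, baseline, threshold=3.0):
    """Anomaly detection: flag a value that deviates strongly from the
    average of the baseline (assumed here to be a simple z-score test)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Assumed packets-per-second readings observed during normal operation.
baseline_rates = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0, 12.0, 9.0]

print(signature_alert("tcp", 31337))        # matches a stored signature
print(anomaly_alert(50.0, baseline_rates))  # far from the normal average
```

The signature check can only recognize patterns already in its database, while the anomaly check can flag unseen behavior at the cost of depending on a representative baseline.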
WSNs [21] have been developed as a result of recent advances in microelectronic systems technology, digital electronics, and wireless communications. A WSN is a self-organizing network composed of dozens to thousands of sensor nodes connected by wireless links [23]. These wireless sensors are small, low-powered, cost-effective, and versatile, and they communicate across short distances [16], [24]. The sensor nodes sense, gather, process, and communicate data autonomously [25]. WSNs are among the most important technologies, and they are being deployed at an unprecedented rate [26]. They have been deployed in a variety of settings and for a variety of goals, resulting in applications that include military, disaster management, habitat monitoring, and environmental monitoring [22].
On the other hand, WSNs impose severe constraints on node resources such as energy, memory, computing speed, and communication bandwidth because of their distributed nature, cost, size, and power limits. These constraints in turn affect sensor battery life, distributed signal processing efficiency, network security, and data processing [22]. Of these, the two most critical issues are the sensors' lifetime (i.e., their functioning period) and the network's security [10], [28].
Because of these network constraints, applications deployed on WSNs are difficult to secure. To fulfill their functions, these networks are typically located in remote and hazardous areas [23], [29], and such environments are frequently left unattended. As a result, WSNs lack physical protection, such as gateways or switches to monitor data flow, which leaves nodes open to compromise and the network insufficiently protected [13],[23]. Consequently, it is crucial to safeguard these networks from breaches and attacks, especially in applications that rely on security services, and effective security mechanisms are required. WSNs frequently have one or more centralized control units referred to as base stations. A base station often serves as a gateway to another network, acts as a data storage center, and provides an access point for user input. The base station is also referred to as the sink. The sensor nodes cooperate to build routing trees, each rooted at a base station. A base station has greater power and storage capacity than the other nodes in the sensor network. Typically, it has enough battery power to outlast the sensor nodes, sufficient memory to store cryptographic keys, more capable CPUs, and the ability to communicate with other WSNs [9].
Ensuring a high level of trust for critical applications that use WSNs is essential for protecting their infrastructure and data from intrusions. Abnormal activity and intrusions should therefore be detected using an IDS. In a WSN, sensors collect data from the environment in which they are deployed and then transmit it to the base station. This information must be protected from external attackers, and cryptographic security alone is not completely effective at securing it. As a result, a second layer of defense, such as an IDS, is necessary [24]. The IDS monitors network traffic and sends notifications to the base station if any sensor detects a malicious attack, as shown in Fig. 1. In the figure, the black circle denotes a sensor node, the red star denotes a cluster head, the white star denotes an intruder, D denotes the distance between the intruder and the cluster head, A denotes the sensor area, and R denotes the cluster area.
An IDS can be employed as an additional line of defense to limit potential attacks. Numerous attack types are possible in WSNs, including sinkhole attacks, packet-dropping attacks, and Sybil attacks [25]. Packet-dropping attacks, sometimes referred to as packet loss, are among the most disruptive and devastating threats to WSNs [32]. They interfere with the normal operation of a network by discarding received data packets or control messages rather than forwarding them to other nodes [33]. Feature dimensionality reduction is a strategy for identifying the most relevant features and using them to develop robust and accurate IDS models [26]. The purpose of this work is to develop and test a new intrusion detection approach that uses the firefly algorithm (FFA) and principal component analysis (PCA) for feature reduction and random forest (RF) and naive Bayes (NB) for classification, that models many forms of security intrusion, and that is suitable for execution on resource-constrained devices. The resulting IDS can be thought of as a robust decentralized decision support tool capable of providing critical information about potential security issues in WSNs. The significant contributions of this paper are summarized below.
- To propose a wrapper-based FFA feature dimensionality reduction for attack detection in WSNs.
- To propose a filter-based PCA feature dimensionality reduction for attack detection in WSNs.
- To develop and implement an IDS that meets WSN protocol criteria using the UNSW-NB15 dataset rather than a dataset that does not match real-world WSN scenarios.
- To perform classification with RF and NB models on the dimensionality-reduced data.
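As a rough illustration of the filter-based PCA reduction named in the contributions above, the sketch below centers a small feature matrix and projects it onto its first principal component. It is a minimal standard-library example and not the implementation used in this work: it extracts only one component, it uses power iteration as the eigenvector method, and the function names (`fit_first_component`, `project`) and the toy `samples` are assumptions.

```python
import math


def fit_first_component(rows, iterations=100):
    """Return (feature means, first principal component) for a list of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return means, v


def project(rows, means, component):
    """Reduce each row to one score along the given component."""
    return [sum((x - m) * c for x, m, c in zip(r, means, component))
            for r in rows]


# Toy data (assumed): feature 2 is twice feature 1, feature 3 is constant.
samples = [[1.0, 2.0, 5.0], [2.0, 4.0, 5.0], [3.0, 6.0, 5.0], [4.0, 8.0, 5.0]]
means, pc = fit_first_component(samples)
scores = project(samples, means, pc)
print(pc, scores)
```

In this toy case the three features collapse to a single score per sample, because the second feature is redundant with the first and the third carries no variance. A classifier such as RF or NB would then be trained on the reduced scores rather than on the original features.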
The remaining sections are organized as follows. Section 2 describes related work. Section 3 describes the proposed approach, and Section 4 presents the experimental results and comparative evaluation. Section 5 presents the conclusion and future work.
Related works
In WSNs, intrusion detection is commonly performed using machine learning (ML) techniques. An anomaly IDS powered by machine learning builds an explicit or implicit model of the investigated patterns and updates it periodically to optimize system performance based on prior results [7]. Using the hypergrid KNN method, the authors of [34] proposed a web-based methodology for detecting random flaws and cyberattacks. This solution reduces computational and communication complexity by redefining the anomaly detection region from a hypersphere to a hypercube. Garofalo et al. [27] used decision trees to construct a distributed model for detecting sinkhole attacks in WSNs. The authors generated both normal and attack traffic using network simulator 3 (NS3), and their IDS is formed of both local and central agents. Ma et al. [28] performed intrusion detection on the NSL-KDD dataset using spectral clustering (SC) and a deep neural network (DNN).
The authors of [29] proposed a distributed model for intrusion detection in WSNs based on fuzzy Q-learning and game theory. A game theory technique was used to identify and defend against intrusions at the sink node and the base station (BS), while a fuzzy Q-learning method was used to adapt the game-theoretic model to anticipate upcoming attacks. The authors of [30] proposed a methodology for detecting localization attacks in WSNs that employs a stacked denoising autoencoder to detect attacks on the localization process.
The authors of [31] proposed a model for identifying flooding and blackhole attacks that incorporates fuzzy C-means (FCM), one-class support vector machines (SVM), and sliding windows. They first standardized the test data using Z-score normalization, then used FCM to detect noisy data, one-class SVM to identify attack traffic that resembled normal traffic, and the sliding window technique to assess whether the data is under attack. The authors of [32] proposed a cluster-based WSN IDS. It employs two subsystems to identify intrusions at the cluster head (CH): random forest (RF) and enhanced density-based spatial clustering of applications with noise (E-DBSCAN). RF is used to identify known attacks, whereas E-DBSCAN is used to identify unknown attacks. Almomani et al. [41] developed a novel IDS dataset for WSNs (WSN-DS) by simulating five traffic scenarios: normal, blackhole, gray hole, flooding, and scheduling attacks.
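Two of the preprocessing steps mentioned above, Z-score normalization and a sliding-window decision, can be sketched as follows. The normalization formula is standard, but the window rule is an assumption: the cited work's exact decision criterion is not given here, so the sketch simply raises an alert when a window contains at least `min_hits` flagged samples. The names `z_score` and `sliding_window_alerts` are hypothetical.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def sliding_window_alerts(flags, window=3, min_hits=2):
    """Return the start index of every window holding >= min_hits flags.

    `flags` is a sequence of 0/1 per-sample decisions, for example the
    output of a classifier; the majority-style rule is an assumption.
    """
    alerts = []
    for start in range(len(flags) - window + 1):
        if sum(flags[start:start + window]) >= min_hits:
            alerts.append(start)
    return alerts


print(z_score([2.0, 4.0, 6.0]))
print(sliding_window_alerts([0, 1, 1, 0, 0, 1, 0]))  # -> [0, 1]
```

Aggregating per-sample decisions over a window trades a short detection delay for fewer alerts triggered by isolated misclassifications.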
Otoum et al. [33] compared ML-based and deep learning-based IDSs for WSNs. They concluded that while a deep learning-based IDS is more accurate than an ML-based IDS, it takes longer to identify threats. Tan et al. [34] addressed class imbalance with the synthetic minority oversampling technique (SMOTE) and then used the RF algorithm to perform intrusion detection on the KDDCup'99 data. Most researchers validate IDS models for WSNs offline using the KDD dataset. However, the KDD dataset has imbalanced classes, and this imbalance leads to inaccurate results. The authors of [35] proposed an RF-based model for detecting blackhole, flooding, scheduling, and gray hole attacks.
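The oversampling idea behind SMOTE, mentioned above as a remedy for class imbalance, can be illustrated with a simplified sketch. It creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. This is a standard-library approximation for illustration, not the cited authors' code or a library implementation; the function name `smote_like`, the default `k`, and the toy points are assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority samples.

    Requires at least two minority samples. Each synthetic point lies on
    the segment between a random sample and one of its k nearest neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class (assumed): four points on the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority_points, n_new=5)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class.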
Mansouri et al. [36] presented a centralized solution for detecting command injection attacks, response attacks, DoS attacks, and reconnaissance attacks using an artificial neural network (ANN). The authors employed gray wolf optimization (GWO) and Evolutionary System (ES) techniques to obtain optimal ANN weights. The authors of [37] proposed a distributed approach for detecting cyber threats in WSNs based on swarm intelligence and AI. Through theoretical analysis, they showed that AI with fluid intelligence achieves a high degree of accuracy and a low false alarm rate.
Nithiyanandam et al. [47] simulated a WSN using the NS2 network simulator and gathered network traffic under normal and attack settings. The authors proposed a strategy based on ant colony optimization (ACO) and particle swarm optimization (PSO) for highly accurate sinkhole attack detection. Sun et al. [48] proposed a distributed IDS for WSNs based on AdaBoost, the artificial fish swarm (AFS) algorithm, and a cultural algorithm (CA). Hierarchical AdaBoost is used to identify anomalies in sensor nodes, CHs, and sink nodes, while CA and AFS with backpropagation are used to detect misuse at the BS. The model is trained on the NSL-KDD dataset. Singh et al. [38] used a genetic algorithm (GA) to develop an energy-efficient IDS for clustered WSNs. While AI-based intrusion detection achieves high accuracy in WSNs, it does not scale easily and is susceptible to overfitting during training.
Sedjelmaci et al. [39] presented a hybrid approach for detecting network-layer attacks (selective forwarding, hello flood, wormhole, and blackhole) that combines specification-based and anomaly-based detection. Yan et al. [40] proposed a hybrid IDS that combines anomaly-based backpropagation networks (BPNs) with misuse-based intrusion detection algorithms. The authors applied the anomaly and misuse detection algorithms to the KDDCup'99 dataset. Using signature and anomaly detection, the authors of [52] proposed a distributed and lightweight technique for detecting energy depletion attacks. They detect anomalies using an artificial immune system inspired by human white blood cells.
Subba et al. [41] presented a hybrid, multi-layer approach to intrusion detection. While a hybrid approach to intrusion detection improves accuracy, it also adds complexity [42]. An analysis of intrusion detection methodologies in WSNs shows that the great majority of researchers employ machine learning algorithms. However, machine learning approaches require additional time for training and testing, as well as additional memory for deploying the model [42].
Some researchers evaluated the accuracy of their methods on network intrusion detection datasets, and the majority used KDDCup'99. However, KDDCup'99 was collected for wired rather than wireless networks and is therefore unsuitable for WSNs. Other researchers used network simulators to construct their own datasets and then performed intrusion detection on the simulated data. The literature also shows that most previous research addressed specific WSN attack types. Most researchers used ML methods to detect intrusions in WSNs, yet an ML technique requires more time and memory for training and testing, as well as additional memory on the sensor node where the model is deployed. There is therefore scope to construct a compact ML model for intrusion detection in WSNs that reduces the memory required to deploy it. Most available strategies focus on a single form of attack on a single layer of the WSN and pay no attention to attacks on other layers. Consequently, it is crucial to develop a cross-layer IDS capable of detecting a variety of threats across different WSN layers. Another key issue in the literature is that most of the datasets used in experimental studies are KDDCup'99 and NSL-KDD, which lack real-life features and cannot adapt to network changes. This is why most IDSs for WSNs are not suitable for use in a production environment.
Unlike the existing studies, we present an IDS for WSNs based on wrapper FFA and filter PCA feature dimensionality reduction. To address the time and memory requirements, we used the FFA and PCA approaches to eliminate redundant features and to shorten the training time of the RF and NB models. In addition, in contrast to existing studies that focused on a single attack on one WSN layer and used datasets that do not represent real-life WSN scenarios, we used a recent dataset that can adapt to network changes and has real-life properties. The attacks in this dataset include generic, exploits, DoS, fuzzers, backdoors, reconnaissance, and worm attacks across WSN layers.
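To give a sense of how a wrapper-style firefly search over feature subsets can work, the sketch below implements a generic binary firefly algorithm. It is an assumption-laden illustration and not the method evaluated in this paper: the parameters (`alpha`, `beta0`, `gamma`), the 0.5 threshold that turns positions into feature masks, and all names are hypothetical. In a real wrapper the fitness would be the performance of a classifier such as RF or NB trained on the selected features; here a placeholder `toy_fitness` stands in for it.

```python
import math
import random


def to_mask(position):
    """A feature is selected when its coordinate exceeds 0.5 (assumed rule)."""
    return [x > 0.5 for x in position]


def firefly_select(fitness, n_features, n_fireflies=8, n_iter=20,
                   alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Search for a feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    swarm = [[rng.random() for _ in range(n_features)]
             for _ in range(n_fireflies)]
    light = [fitness(to_mask(p)) for p in swarm]
    best = max(range(n_fireflies), key=lambda i: light[i])
    best_mask, best_score = to_mask(swarm[best]), light[best]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:
                    # Attractiveness decays with squared distance.
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    # Move firefly i toward brighter firefly j, plus noise.
                    swarm[i] = [
                        min(1.0, max(0.0, a + beta * (b - a)
                                     + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    light[i] = fitness(to_mask(swarm[i]))
                    if light[i] > best_score:
                        best_mask, best_score = to_mask(swarm[i]), light[i]
    return best_mask, best_score


RELEVANT = {0, 2}  # hypothetical indices of informative features


def toy_fitness(mask):
    """Placeholder for classifier accuracy: reward relevant features and
    penalize subset size."""
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in RELEVANT)
    return hits - 0.1 * sum(mask)


mask, score = firefly_select(toy_fitness, n_features=5)
print(mask, score)
```

The size penalty in the placeholder fitness mirrors the goal stated above of removing redundant features, so that smaller subsets are preferred when they perform equally well.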