The success of a Honeypot early detection system depends on the correct choice of the factors and features used to track attacks. This paper presents Honeypot EIDS technology using DL and SL algorithms because of their suitability for identifying attackers by extracting and collecting the most salient features from the attackers' performance logs. All models are implemented in the Python programming language with the open-source scikit-learn library.
Data used for testing the models should be easy to obtain for the proposed EIDS and should reflect the behavior of the host or network. Building a dataset from scratch is a complex and time-consuming process, so using a benchmark dataset shortens the diagnosis time. Because benchmark datasets are validated, they make the experimental results produced in laboratory research more convincing and allow the results of the proposed method to be compared with previous studies. To determine the most optimal and efficient detection model for the data stored by the Honeypot, the EIDS logs are used in the laboratory for this research to verify its results and accuracy. Three well-known datasets are used in this study: the NSL-KDD, CIC-IDS-2017, and Kyoto 2006 datasets, explained in the following sections.
Therefore, an executable implementation model is designed to classify the mentioned datasets according to Fig. 5; it covers steps such as importing the dataset, preprocessing the data, analyzing the data, and so on.
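Assuming a generic tabular dataset, the steps of this model (import, preprocessing, training, evaluation) can be sketched in Python with scikit-learn. The column names and the choice of a random-forest classifier are illustrative placeholders, not the paper's exact configuration:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def run_pipeline(df: pd.DataFrame, label_col: str = "label") -> float:
    """Import -> preprocess -> train -> evaluate on one tabular dataset."""
    # One-hot encode categorical features; keep numeric features as-is.
    X = pd.get_dummies(df.drop(columns=[label_col]))
    y = df[label_col]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y)
    # Scale features so that magnitude differences do not dominate.
    scaler = StandardScaler().fit(X_train)
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(scaler.transform(X_train), y_train)
    return accuracy_score(y_test, clf.predict(scaler.transform(X_test)))
```

Any of the benchmark datasets below can be fed through such a pipeline once its label column is identified.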
A. NSL-KDD
The NSL-KDD data comprise one training set and one test set of network connection records. This NSL-KDD version has 43 features: 41 describe the incoming traffic, one holds the normal-or-attack label, and the remaining one relates to the traffic-intensity (difficulty) score.
The most important attribute in this benchmark database is the label, which specifies whether a record is normal or an attack. The test set contains more attack types than the training set, including attacks unseen during training, so the data cannot be treated as if all attack types were known in advance. Given the attack types mentioned and the normal state, five classes are considered for this work: Normal, DoS, R2L, U2R, and Probe.
With these five labels in place, the preprocessing work is carried out, and the algorithms used in the DL and SL discussion are chosen so that the resulting process stays simple and is easy to test and scale.
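The five-class labeling described above can be implemented as a simple lookup from raw NSL-KDD attack names to the five classes. The mapping below covers only a representative subset of attack names, not the full NSL-KDD list:

```python
# Collapse raw NSL-KDD attack names into the five classes used in this study.
# Illustrative subset only; extend with the remaining NSL-KDD attack names.
ATTACK_TO_CLASS = {
    "normal": "Normal",
    "neptune": "DoS", "smurf": "DoS", "back": "DoS", "teardrop": "DoS",
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    "guess_passwd": "R2L", "ftp_write": "R2L", "imap": "R2L",
    "buffer_overflow": "U2R", "rootkit": "U2R", "loadmodule": "U2R",
}

def to_five_classes(labels):
    """Map raw record labels to Normal/DoS/Probe/R2L/U2R ('Unknown' if unmapped)."""
    return [ATTACK_TO_CLASS.get(label, "Unknown") for label in labels]
```

Unseen attack names in the test set fall through to "Unknown", which mirrors the presence of unknown attacks mentioned above.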
B. CIC-IDS 2017
CIC-IDS2017 is a dataset with 78 features and a respective class label covering 14 different attack types, such as brute-force, Denial of Service (DoS), and web attacks. These can be grouped so that CICIDS2017 counts eight categories: benign, brute-force, DoS, DDoS, web attack, infiltration, botnet, and port scan.
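The grouping from the 14 raw labels into the eight coarse categories could be sketched as below. The exact label spellings vary between copies of the dataset's CSV files, so treat the string patterns here as assumptions to verify against the local copy:

```python
# Group raw CICIDS2017 label strings into eight coarse categories.
# Pattern matching is an assumption; verify against your copy of the CSVs.
def to_category(label: str) -> str:
    l = label.lower()
    if l == "benign":
        return "Benign"
    if "ddos" in l:
        return "DDoS"
    if l.startswith("dos"):
        return "DoS"
    if "web attack" in l:          # checked before brute-force: some web
        return "Web Attack"        # attack labels also contain "brute force"
    if "patator" in l or "brute" in l:
        return "Brute-Force"
    if "infiltration" in l:
        return "Infiltration"
    if "bot" in l:
        return "Botnet"
    if "portscan" in l:
        return "Port Scan"
    return "Other"
```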
C. Kyoto 2006
The Kyoto 2006 dataset consists of actual network traffic logs extracted from Honeypot sensors. These logs contain data collected from different types of Honeypots and consist of 23 attributes plus a tag attribute. The log also includes both normal and abnormal traffic (various types of attacks). In this dataset, several columns are defined as primary columns covering known attacks: the identification of all attacks in the Label column, attacks detected with exploit code in the Ashula detection column, Malware attacks in the Malware detection column, and attacks detected by the IDS firewall in the IDS detection column.
D. EIDS Database
As logs are generated in the industrial network, IDS Snort builds up the dataset of the EIDS. The proposed EIDS database is a lightweight yet powerful tool that allows the system to detect malicious network traffic early. By defining flexible and robust rules, almost any threat that crosses the network can be identified. To meet these needs, a solution for processing the alert data of this large dataset is required.
Therefore, the CSV format is used for processing the alert data, as it is the most flexible and compatible method for data collection. To configure IDS Snort to use the CSV output format, add the following line to the snort.conf file:
output alert_csv: alert.csv default
This command configures IDS Snort to write a CSV log file named alert.csv to the configured log directory using the default output fields; 30 features can be extracted from IDS Snort, as listed in Tab. 2.
Table 2
Generated features for the EIDS database.

| Feature   | Feature   | Feature       |
| time      | icmpseq   | icmpid        |
| icmpcode  | date      | sig_generator |
| icmptype  | iplen     | dgmlen        |
| id        | tos       | ttl           |
| tcpwindow | tcpln     | tcpack        |
| tcpseq    | tcpflags  | ethlen        |
| ethdst    | ethsrc    | dstport       |
| dst       | srcport   | src           |
| proto     | msg       | sig_rev       |
| sig_id    | timestamp |               |
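An alert.csv file produced by the configuration above can be loaded into a labeled table as sketched below. The column list follows Snort's documented default field order for the CSV output plugin; it should be verified against the Snort version in use before relying on it:

```python
# Load a Snort alert.csv produced by `output alert_csv: alert.csv default`.
# Column order reflects Snort's documented default CSV fields (an assumption
# to verify against your Snort version; note Snort spells it "tcplen").
import io
import pandas as pd

SNORT_CSV_COLUMNS = [
    "timestamp", "sig_generator", "sig_id", "sig_rev", "msg", "proto",
    "src", "srcport", "dst", "dstport", "ethsrc", "ethdst", "ethlen",
    "tcpflags", "tcpseq", "tcpack", "tcplen", "tcpwindow", "ttl", "tos",
    "id", "dgmlen", "iplen", "icmptype", "icmpcode", "icmpid", "icmpseq",
]

def load_alerts(path_or_buffer) -> pd.DataFrame:
    """Read an alert.csv file (which has no header row) into a DataFrame."""
    return pd.read_csv(path_or_buffer, names=SNORT_CSV_COLUMNS, header=None)
```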
Honeypot EIDSs are used to detect cyberattacks in ICS networks, and various studies have therefore been conducted on high-performance datasets based on ML techniques. In the IDS field, well-known datasets such as NSL-KDD, CIC-IDS 2017, and Kyoto 2006 are available for evaluation. However, these datasets do not reflect the recent cyberattack trends addressed in the proposed research. For this reason, the EIDS dataset is refined from the same traffic data in this study, together with the latest Snort logs. In addition, the new dataset is evaluated by applying several ML techniques and comparing the datasets' classification results.
E. Metrics
Methods and criteria are needed to measure Accuracy, Recall (R), Precision (P), and F1-Score in order to evaluate the ML methods used in the proposed design and to reach the most optimal model for analyzing the data properties. The criteria used in this article are therefore briefly explained below, together with the relevant formulas and equations.
1) Accuracy
The accuracy parameter expresses the number of correct predictions made by the classifier divided by the total number of predictions it makes. That is, it is the ratio of correct diagnoses, True Positive (TP) + True Negative (TN), to the total data, TP + TN + False Positive (FP) + False Negative (FN). This criterion is effective for many real-world classification problems because it accounts for all of the data, both correctly classified (numerator) and misclassified, as in equation 1. The goal of the proposed method is to approach accuracy = 1, or 100%.
$$Accuracy=\frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
2) Recall (R)
The accuracy parameter is not suitable for imbalanced data, i.e., data whose numbers of positive and negative labels differ greatly, as they do in many real-world problems. This large imbalance makes the accuracy criterion inefficient, so a more objective benchmark is needed for measuring the accuracy and efficiency of the proposed classification algorithms. In such cases, it is better to focus on the number of TPs relative to the total number of positive samples. The R parameter serves this purpose and is defined as equation 2.
$$R=\frac{TP}{TP+FN} \quad (2)$$
3) Precision (P)
R alone can be misleading: a weak model that simply declares many samples positive can achieve a high R while producing many false positives. To address this, in addition to the recall criterion, another benchmark called Precision is defined: the ratio of TP samples to all samples declared positive, as in equation 3, so that the number of FPs is taken into account.
$$P=\frac{TP}{TP+FP} \quad (3)$$
4) F1-Score
It would be more convenient to combine the two criteria R and P into a single measure for classification algorithms rather than examining both simultaneously. A simple arithmetic mean is unsuitable: with a high R and a low P (or vice versa), the arithmetic mean can still grant a passing score to a weak algorithm. Therefore, the harmonic mean of R and P is used instead; it stays low unless both values are high.
According to equation 4, this harmonic mean of the two values R and P is known as the F1-Score and is equal to:
$$F1=2\,\frac{P\times R}{P+R} \quad (4)$$
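The four criteria of equations 1-4 can be computed directly with scikit-learn. The toy label vectors below (1 = attack, 0 = normal) are purely illustrative:

```python
# Compute Accuracy, Recall, Precision, and F1 (equations 1-4) for a
# binary attack/normal classification with scikit-learn.
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # ground truth: 1 = attack, 0 = normal
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]  # model predictions (TP=3, TN=3, FP=1, FN=1)

acc = accuracy_score(y_true, y_pred)   # (TP+TN)/(TP+TN+FP+FN)
rec = recall_score(y_true, y_pred)     # TP/(TP+FN)
pre = precision_score(y_true, y_pred)  # TP/(TP+FP)
f1 = f1_score(y_true, y_pred)          # 2*P*R/(P+R)
```

With these counts, all four metrics evaluate to 0.75, which matches equations 1-4 computed by hand.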
5) False-Negative Rate (FNR)
According to equation 5, the FNR captures the case when the sensor fails to detect malicious traffic and lets it pass as if it were normal. Such traffic is neither blocked nor logged, so no alert is generated, and the system administrator has no trace of the attack to investigate. This non-detection can have various causes: for example, the sensor signatures have not been updated and new signatures have not been received, the sensor settings have not been done correctly, or the malicious traffic uses a new method that has not yet been addressed. In general, having many FNs endangers the network's resources, so they must be identified, investigated, and managed.
$$\text{FNR} = \frac{FN}{TP+FN} \quad (5)$$
6) False-Positive Rate (FPR)
The FPR in equation 6 covers the case when the sensor wrongly flags healthy traffic as malicious. Based on the signatures applied to the system, this healthy traffic will be blocked and not allowed to pass, or, if logging actions are configured, it will generate logs and alerts. It is therefore difficult for the system administrator to trace the root cause of the resulting false alarms when checking the logs and alerts. In general, having many FPs in the network hurts network performance, and they must be identified, investigated, and managed.
$$\text{FPR} = \frac{FP}{FP+TN} \quad (6)$$
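Both rates follow directly from confusion-matrix counts. The sketch below uses the standard definitions FNR = FN/(TP+FN) and FPR = FP/(FP+TN):

```python
# FNR and FPR from confusion-matrix counts (standard definitions).
def fnr(tp: int, fn: int) -> float:
    """Fraction of actual attacks the sensor missed."""
    return fn / (tp + fn)

def fpr(fp: int, tn: int) -> float:
    """Fraction of benign traffic the sensor wrongly flagged."""
    return fp / (fp + tn)
```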
F. Monitoring and data mining application
This program comprises 1056 lines of code, plus supporting code such as web recall applications and algorithm connectors, to execute the learning-model algorithms within the EIDS project. As shown in Fig. 6, the design is convenient, simple, and user-friendly: CSV files can be uploaded directly from the EIDS system log storage, and real-time analysis can be performed to detect new attacks. The advantage of this is that it increases the reliability of early intrusion detection alongside the detection system and gives a deeper, more comprehensive understanding of the various attacks, allowing the behavior and future actions of attackers to be analyzed.