The research methodology adopted in this study follows the design science approach outlined by Alomari et al. (2023), encompassing three pivotal stages. Firstly, the selection, preprocessing, and partitioning of datasets, namely the Malware and Android Dataset, were meticulously executed to facilitate subsequent analysis. Secondly, feature selection was conducted based on correlations with the target attribute, followed by training deep learning models, including Dense and LSTM, using various feature subsets and data splitting configurations. Lastly, the performance of the models was rigorously assessed using metrics such as accuracy, training time, and precision.
The study incorporates two datasets with contrasting characteristics to provide comprehensive insights. The Malware Dataset, sourced from Kaggle, comprises 100,000 observations, evenly split between malware and benign samples. This dataset, created in a Unix/Linux-based virtual machine, features thirty-five attributes tailored for Android device classification. In contrast, the Android Dataset, originally curated by Tiwari (2018), contains feature vectors extracted from 15,036 Android applications: 5,560 apps are categorized as malware from the Drebin project, while the remaining 9,476 apps are benign. Figure 1 depicts the distribution of data points across the malware and benign classes within the Android malware dataset.
In this study, a variety of deep learning techniques were introduced and utilized. To train the deep learning models effectively on the two datasets, a crucial preprocessing step was undertaken: label-encoding the target (classification) column and addressing any special characters or missing values in the data. Given the distinct characteristics of each dataset, the preprocessing steps applied to them necessarily differed. After preprocessing, each dataset was split into separate training and test sets, with multiple splitting scenarios incorporated to allow comprehensive analysis. Before training the deep learning models, a feature selection process was applied to improve computational efficiency. The study employed various training scenarios, encompassing different splitting criteria, deep learning architectures, and the option to include or exclude feature selection during training. Figure 2 outlines the proposed methodology for both datasets.
During the preprocessing stage, several steps were undertaken to ensure data quality (see the sketch following this list):
- Special characters and missing values were addressed by replacing them with "NaN."
- As a result, the "Hash" attribute, which exclusively contained special characters, was dropped from the dataset.
- The target class (classification) was designated with labels: zero for benign instances and one for malware instances.
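For concreteness, a minimal pandas sketch of these preprocessing steps is given below; the file name, the set of special characters, and the raw label strings are assumptions, while the "Hash" and "classification" column names follow the description above.

```python
import numpy as np
import pandas as pd

# Load the raw data (file name is illustrative).
df = pd.read_csv("malware_dataset.csv")

# Replace entries consisting of special characters with NaN
# (the character set shown here is an assumption).
df = df.replace(r"^[?#@&*]+$", np.nan, regex=True)

# The "Hash" attribute contained only special characters, so it is dropped.
df = df.drop(columns=["Hash"])

# Encode the target column: 0 for benign, 1 for malware
# (the raw label strings are assumptions).
df["classification"] = df["classification"].map({"benign": 0, "malware": 1})
```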
Following the preprocessing phase, the dataset was partitioned into training and testing sets according to various scenarios (sketched in code after this list):
- 80% of the data was allocated for training, with the remaining 20% reserved for testing.
- Another scenario involved a split of 75% for training and 25% for testing.
- Lastly, a split of 70% for training and 30% for testing was also considered.
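The three splitting scenarios can be expressed compactly with scikit-learn, as sketched below; stratification and the fixed random seed are assumptions added for reproducibility.

```python
from sklearn.model_selection import train_test_split

X = df.drop(columns=["classification"])
y = df["classification"]

# One train/test pair per splitting scenario: 80/20, 75/25, and 70/30.
splits = {
    test_size: train_test_split(X, y, test_size=test_size,
                                stratify=y, random_state=42)
    for test_size in (0.20, 0.25, 0.30)
}
X_train, X_test, y_train, y_test = splits[0.20]
```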
Dealing with high-dimensional data presents a significant challenge in machine learning, increasing computational complexity and storage requirements. To mitigate this challenge, feature selection techniques prove invaluable by eliminating irrelevant and redundant data. This not only reduces computational overhead but also enhances learning accuracy and provides deeper insights into the model or data. Thus, a feature selection technique was employed in this study to identify the most pertinent features that bolster the model's predictive capabilities. Figure 3 visually illustrates the correlation-based feature selection approach adopted in this research (Dhal & Azad, 2021; Jie, Jiawei, Shulin, & Sheng, 2018).
Equation 1 is applied to calculate the correlations between all independent attributes and the target or dependent feature.
$$Corr_{x,y}=\frac{\sum_{i}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum_{i}\left(x_{i}-\bar{x}\right)^{2}\sum_{i}\left(y_{i}-\bar{y}\right)^{2}}} \quad (1)$$
where \(Corr_{x,y}\) is the correlation between feature \(x\) and the target feature \(y\), and \(\bar{x}\) and \(\bar{y}\) are the mean values of \(x\) and \(y\), respectively. After the required \(K\) features were obtained, a candidate list of features to drop was prepared. Several selection scenarios were created, considering that the absolute correlation values range between 0 and 1. The same methodology was applied to the second dataset, except for the selection step: owing to its larger number of columns (215), specific correlation thresholds were used to eliminate columns from consideration.
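A minimal pandas sketch of this correlation-based selection is given below; the threshold value is illustrative, since the study varied thresholds across scenarios, and the variable names follow the earlier sketches.

```python
# Absolute Pearson correlation of every feature with the target (Eq. 1).
corr_with_target = X_train.corrwith(y_train).abs()

# Keep features above an illustrative threshold; the remainder form
# the candidate drop list described above.
threshold = 0.1
selected = corr_with_target[corr_with_target > threshold].index
to_drop = corr_with_target[corr_with_target <= threshold].index

X_train_sel = X_train[selected]
X_test_sel = X_test[selected]
```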
Dense Layer Model. The dense layer model was constructed with varying configurations based on the characteristics of the datasets used. For scenarios involving the first dataset, hidden layers were designed with fifty neurons, while for the second dataset, with its larger attribute count of 215, one hundred neurons were employed. The input layer of the model was adaptable to different input sizes corresponding to the number of features retained by the feature selection process. Five hidden layers were incorporated, as no further improvement was observed beyond this configuration. Activation functions (AFs) play a crucial role in neural networks by enabling the learning of abstract features through non-linear transformations (Dubey, Singh, & Chaudhuri, 2022); hence, the non-linear "relu" activation function was applied in each of the five hidden layers. Finally, the output layer, the last dense layer, employed the "softmax" activation function. The model was intentionally kept simple to minimize the number of learnable parameters. Unlike previous studies that conducted experiments to determine the optimal number of neurons and hidden layers for a specific problem, the focus here was on streamlining the architecture and avoiding unnecessary complexity.
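For illustration, the described dense architecture could be sketched in Keras as follows; fifty neurons per hidden layer match the first dataset's scenarios, while the two-unit softmax output, optimizer, and loss are assumptions not specified above.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_dense_model(n_features: int, n_neurons: int = 50) -> keras.Model:
    """Dense model sketch: an input sized to the selected features,
    five "relu" hidden layers, and a "softmax" output layer."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for _ in range(5):
        model.add(layers.Dense(n_neurons, activation="relu"))
    # Two-unit softmax output for the benign/malware classes (assumption).
    model.add(layers.Dense(2, activation="softmax"))
    model.compile(optimizer="adam",  # optimizer and loss are assumptions
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```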
Long Short-Term Memory (LSTM) Model. When training deep neural networks, a significant challenge arises from the vanishing or exploding gradient problem, which impedes the learning of long-term dependencies. In response to this issue, the LSTM (Long Short-Term Memory) architecture was introduced. LSTM networks are specifically engineered to alleviate the vanishing or exploding gradient problem and excel at capturing and learning long-term dependencies in sequential data (Houdt, Mosquera, & Nápoles, 2020). In the proposed LSTM model, the initial dense layer was substituted with an LSTM layer, utilizing a "relu" activation function. However, the remaining dense layers and the output layer remained unchanged from the previous dense model. This substitution led to an increase in the number of learnable parameters and consequently extended the training time.
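A corresponding sketch of the LSTM variant is shown below; treating each tabular row as a length-one sequence is an assumption of this sketch, as the original input shaping is not described.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_lstm_model(n_features: int, n_neurons: int = 50) -> keras.Model:
    """LSTM variant sketch: the first dense layer is replaced by an
    LSTM layer with "relu" activation; the remaining dense layers and
    the output layer match the dense model above."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(1, n_features)))  # length-1 sequences
    model.add(layers.LSTM(n_neurons, activation="relu"))
    for _ in range(4):
        model.add(layers.Dense(n_neurons, activation="relu"))
    model.add(layers.Dense(2, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Inputs must be reshaped to (samples, timesteps, features), e.g.:
# X_lstm = X_train_sel.to_numpy().reshape(-1, 1, X_train_sel.shape[1])
```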
Training Scenarios. During the training phase, a series of experiments were conducted, exploring various scenarios that involved feature selection, the inclusion of an LSTM layer, different feature selection thresholds, and varied data splitting criteria. For the first dataset, the training scenarios included:
- Training a deep learning model with a dense layer, using both the original dataset and different subsets of selected features. This involved four feature groups in addition to the original dataset, resulting in five unique scenarios.
- Modifying the deep learning model by adding an LSTM layer and training it with the first set of features.
- Training a deep learning model with three different data splitting criteria, resulting in two additional scenarios.
For the Android Malware dataset, the training scenarios were as follows:
- Training a dense layer model using the original dataset and three distinct subsets of selected features, resulting in four scenarios.
- Incorporating an LSTM layer into the deep learning model and training it with the original dataset.
- Training a deep learning model with various data splitting criteria.
In total, twelve different training scenarios were executed to evaluate the effects of utilizing different datasets, employing various feature selection methods, implementing diverse data splitting criteria, and employing different deep learning architectures.
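A simple driver loop over such a scenario grid might look like the sketch below; the scenario dictionary, epoch count, and validation split are all illustrative assumptions, reusing the model builders from the earlier sketches.

```python
import time

# Hypothetical scenario grid; each entry maps a name to one
# train/test configuration produced by the sketches above.
scenarios = {"80/20, all features": (X_train, X_test, y_train, y_test)}

results = []
for name, (X_tr, X_te, y_tr, y_te) in scenarios.items():
    model = build_dense_model(X_tr.shape[1])
    start = time.time()
    model.fit(X_tr.to_numpy(), y_tr.to_numpy(),
              epochs=10, validation_split=0.1, verbose=0)  # epochs assumed
    train_time = time.time() - start
    _, test_acc = model.evaluate(X_te.to_numpy(), y_te.to_numpy(), verbose=0)
    results.append((name, test_acc, train_time))
```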
Evaluation Criteria. The evaluation step serves as the final phase, where the performance of the model is assessed using various metrics, including validation accuracy, training time, precision, recall, and F1-score. During training, validation accuracy is computed on a holdout (validation) set to monitor the model's performance. Test accuracy, by contrast, is calculated after training is complete and measures the trained model's ability to correctly classify new instances as either malware or benign. Precision, recall, and F1-score were derived from four basic counts, where TP, FP, TN, and FN denote true positives, false positives, true negatives, and false negatives, respectively. These metrics are computed as follows:
- Precision: The precision is determined by dividing the number of true positives (TP) by the sum of true positives and false positives (TP + FP). It is computed using the equation below:
$$Precision=\frac{TP}{TP+FP} \quad (2)$$
- Recall: The recall is calculated by dividing the number of true positives (TP) by the sum of true positives and false negatives (TP + FN). It is computed using the equation below:
$$Recall=\frac{TP}{TP+FN} \quad (3)$$
- F1-score: The F1-score is the harmonic mean of precision and recall. It is computed using the equation below:
$$F1\text{-}score=2\times\frac{Precision\times Recall}{Precision+Recall} \quad (4)$$
TP (true positives) refers to cases where the model correctly identifies malware samples as malware. FN (false negatives) represents cases where the model incorrectly labels malware samples as benign. TN (true negatives) denotes instances where the model correctly identifies benign samples as benign. FP (false positives) signifies cases where the model incorrectly labels benign samples as malware. Optimal performance is achieved when TP and TN are at their highest values and, equivalently, when FP and FN are at their lowest, indicating that the model accurately classifies both malware and benign samples.

Precision, also known as confidence in the data mining literature, is the proportion of predicted malware samples that are truly malware; it indicates the model's accuracy in identifying malware instances among the predicted positives. Recall, also referred to as sensitivity in psychology, is the proportion of actual malware samples that are correctly predicted as malware; it measures the model's ability to find all true malware instances among the actual positives (Powers, 2011). A high precision value therefore indicates that the model rarely misclassifies benign samples as malware, i.e., it produces few false positives.

When both precision and recall are high, the F1-score is also high. The F1-score combines precision and recall into a single, balanced metric that accounts for both false positives and false negatives; a high F1-score indicates a model that achieves a good balance between precision and recall, effectively managing both types of misclassification.
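These four counts and the three derived metrics map directly onto a confusion matrix, as the short sketch below illustrates; `model`, `X_test`, and `y_test` are assumed to come from the earlier sketches.

```python
from sklearn.metrics import confusion_matrix

# Predicted class labels from the softmax output.
y_pred = model.predict(X_test.to_numpy(), verbose=0).argmax(axis=1)

# For binary labels, ravel() yields the four basic counts in this order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

precision = tp / (tp + fp)                          # Eq. 2
recall = tp / (tp + fn)                             # Eq. 3
f1 = 2 * precision * recall / (precision + recall)  # Eq. 4
```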