Intelligent Methods in Phishing Website Detection: A Systematic Literature Review

doi:10.21203/rs.3.rs-2518632/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

"Phishing" is a well-known cyberattack in which Internet users are targeted and directed to a fake website, similar to a legitimate and valid one. In such attacks, users are deceived into entering their sensitive information, such as passwords and credit card details, into these fake websites, which can be subject to further abuse by attackers, such as money and identity theft. Phishing has been causing problems for end users in network security for nearly three decades. In recent years, with the expansion of the Internet, it has become one of the most significant security issues in cyberspace, which needs to be addressed. To this end, researchers have provided many approaches to detect phishing websites, among which intelligent-based solutions have attracted more attention due to their adaptability to new samples. This research investigates intelligent methods for detecting phishing websites by examining 71 selected papers using a Systematic Literature Review (SLR) approach. It starts with an overview of phishing, including history, life cycle, statistics, and causes of user entrapment. Then, it presents kinds of methods for phishing website detection, as well as the steps of implementing machine learning methods, including data collection, feature extraction and selection, model creation, and evaluation. Next, it examines intelligent approaches to detecting phishing websites and compares them with their advantages and limitations, and finally, it discusses several challenges in this field to pave the way for further work.

Keywords: cybersecurity, phishing, phishing website detection, intelligent, machine learning

With the expansion of the Internet and the rapid growth of communication technologies, the number of active users on the web is increasing day by day. As shown in Fig. 1, active Internet users in 2021 made up about 62% of the world's population (Statista Research Department, 2022). In addition, the spread of COVID-19 from late 2019 changed our lifestyle: many organizations turned to providing electronic services, and most people's daily activities, such as business, banking, and social communication, moved to the Internet (Tang and Mahmoud, 2021; Rameem Zahra et al., 2022). Such conditions have made Internet users more vulnerable to cyberattacks. By committing crimes such as money and identity theft, attackers endanger network security not only for users but also for organizations, which may lead to economic and reputational losses (Mohammad et al., 2015a).

"Phishing" is a type of cyberattack based on social engineering (i.e. network communication techniques) to trap Internet users, where attackers use technical tricks to deceive them into revealing their confidential and personal information (such as login and credit card details) (Tang and Mahmoud, 2021; Mohammad et al., 2015a). In other words, phishers exploit human vulnerabilities rather than software vulnerabilities to carry out these attacks and do not even need much technical skill (Cranor et al., 2007; Gupta et al., 2018). Thus, a system may be technically secure, yet end users are inadvertently deceived into revealing their information, which ultimately compromises the overall security of the system (Khonji et al., 2013). The nature of phishing attacks is also reflected in their name: in the 1990s, the term "phishing" was derived from the word "fishing", with its first letter replaced by the initial "ph" of "phreak", a term for phone hackers (Oxford Dictionaries 1990).

Phishing has been a security threat in cyberspace for nearly three decades and is considered the most serious such threat today (Jain and Gupta, 2021). Since these attacks first emerged, attackers have continually improved their methods and designed new types. In one of the most common phishing techniques, attackers pose as a reputable organization and build a fake website with the same design as that organization's legitimate website. They then encourage users, through various communication channels such as email, SMS, Twitter, Facebook, and e-Fax, to visit this website to update or verify their personal information (Gupta et al., 2021; Chiew et al., 2018). Because such attacks are simple to execute, the number of phishing websites has increased in recent years, reaching a peak in 2022 (APWG, 2022). On the other hand, most users do not have a proper understanding of phishing threats and do not know how such attacks are carried out or how complex they can be (Wu et al., 2006). All in all, there is a critical need for an effective mechanism to protect users from phishing attacks.

As shown in Fig. 2, methods for detecting phishing websites are broadly organized into two categories: promotional-educational solutions, which detect phishing attacks manually, and technical solutions, which detect them automatically. Technical solutions can in turn be divided into two categories, comparison-based methods and intelligent methods, each of which includes different approaches (Fig. 2). The purpose of this study is to review intelligent methods based on Machine Learning (ML) for detecting phishing websites and to investigate the challenges in this field.

The rest of this paper is organized as follows: Section 2 provides an overview of phishing attacks, including their history, life cycle, statistics, and the reasons why phishing works. Section 3 presents the method used to select the 71 examined papers. Section 4 presents related work, solutions for phishing website detection, and the implementation process of intelligent methods, and Section 5 examines some recent intelligent approaches. Section 6 discusses the current challenges, and finally, Section 7 concludes the paper.

2.1. History

The first known phishing attack dates back to 1996, when hackers stole America On-Line (AOL) users' credentials by accessing their passwords. The first public mention of phishing on the Internet was also made in January 1996, on the "alt.2600" hacker newsgroup; then, in 1997, a media publication first warned users about the threat. In 1998, phishers were using Internet newsgroups and forums as a vector to attack and trap victims (Ollmann, 2004). After 2000, as users became aware of such attacks, phishers switched to sending phishing emails because the sender was difficult to track this way. Moreover, emails can easily be distributed to a large number of users (Gupta et al., 2021; Alabdan, 2020). Email is still the most important vector, and today it is used to send links to phishing websites that are designed to look exactly like legitimate ones.

According to the life cycle of attacks through phishing websites shown in Fig. 3, the stages that phishers go through are as follows:

1- Targeting: Each phishing attack depends on the phishers' purpose, as they use different approaches to target an organization or potential victims in order to improve the attack's chances of success and to arouse less suspicion (Alabdan, 2020). These approaches can broadly be divided into two categories (Oest et al., 2020). The first category is "spear phishing", where the phisher targets specific people or groups and poses as an organization or website that the potential victims already know. Such an attack is called "whaling" when phishers target prominent employees, such as senior executives or other high-ranking staff of a company, to steal sensitive information through their privileged access to such data. This requires obtaining information about the target, which can be done in different ways (Chiew et al., 2018; Ollmann, 2004). For example, one way is to sniff the target's browser: by measuring access times while analyzing URLs, cookies, and the Domain Name System (DNS) cache, phishers can learn which websites the target uses most often (Felten and Schneider, 2000). The second category is the "large-scale" attack, where phishers attack a wide range of users by targeting a reputable organization or website.

2- Website Creation: At this stage, phishers create a fake website suited to their purpose that closely resembles a legitimate website. There are several ways to build such a website without much programming skill. For example, the "HTTrack" tool produces a complete copy of a website, through which everything needed to build the fake (such as source code, images, and scripts) can be downloaded in full (HTTrack, 2017). As another example, the "SET" tool in Kali Linux can quickly create a cloned version of a website (Ramadhan, 2017). It is worth noting that phishing attacks usually do not recreate the entire website; only the login pages that request user information are forged (Kalaharsha and Mehtre, 2021).

3- Website Distribution: After targeting and creating a website, phishers lead the target user or users to visit it using direct or indirect methods. In direct methods, phishers use social engineering techniques to persuade users to visit the website and reveal their information, sending them the website link through different vectors (i.e. communication channels) such as email, SMS, and e-Fax, as well as voice messages, instant messages, and posts on social networks. It should be noted that most phishing attacks begin with sending mass emails (Gupta et al., 2018). According to the report on cyberattacks in 2021 published by Avanan, 5% of all emails are phishing emails (Avanan, 2021). In indirect methods, attackers use various technical approaches to lead users to websites. For example, some techniques may be used to manipulate search engine indexing, so that potential victims who search for a specific website may click on the phishing link in the search results, believing it to be legitimate (Nagunwa, 2014; Chiew et al., 2018). In another approach, attackers exploit vulnerabilities in websites to inject scripts into them, so that the victim visits a legitimate website but an element of the page, such as the login form, points to a fake website (Ollmann, 2004). Moreover, phishers may sometimes forge the URL of a legitimate website in such a way that a user who makes a typo, or mistakenly enters the name of a website they have only heard, is redirected to a fake website that looks like the original (Chiew et al., 2018).

4- Defraudation: After the user visits the website and enters their information, the phisher gains access to it and may commit various crimes, such as credit and identity theft, and sometimes even sells this information on the Internet black market (Mohammad et al., 2015a). In addition, some users use the same login information for multiple accounts, in which case phishers may gain access to those accounts as well (Tang and Mahmoud, 2021).

2.2. Statistics

In 2021, the "Internet Crime Complaint Center" reported that it had received nearly 850,000 complaints of cyberattacks from the American public. About 40% of them involved types of phishing attacks, the highest count among cybercrimes, with damages estimated at more than 44 million dollars (IC3, 2021). The number of complaints received by this center over 6 consecutive years, related to various types of phishing attacks, is shown in Fig. 4.

The Anti-Phishing Working Group (APWG) is a not-for-profit industry association focused on reducing phishing scams, founded in 2003. The statistics published by this association from 2019 to 2021 are shown in Fig. 5, where the number of unique phishing websites increased by a factor of approximately 1.9 each year. However, the number of phishing emails with a unique subject fell by a factor of almost 2.1 from 2020 to 2021, from which it can be concluded that most phishing emails had duplicate subjects. Moreover, according to the latest statistics, APWG observed nearly 3.4 million unique phishing websites in the first three quarters of 2022. This is the worst year for phishing that APWG has ever observed (APWG, 2022).

Furthermore, this association examines the extent to which different industries are targeted in these attacks, and the reports in Q1-Q4 2021 show that financial institutions, which include banks, have become the biggest target of phishers, accounting for 24% of attacks (Fig. 6) (APWG, 2022).

2.3. Why Phishing Works

Attackers exploit the vulnerabilities of end users to carry out many cyberattacks, including phishing, which has made users the weakest link in the security chain (Khonji et al., 2013). The question that arises here is why phishing works and why people are trapped by fake websites. The rest of this section examines some of the research conducted on this question.

One of the first papers on why users are trapped by phishing websites was published in 2006. In that study, participants were asked to examine several websites and determine which ones were fake. The results show that there is no correlation between a person's technical skill and performance, which means that even professional users may be deceived. In addition, most users rely on the content of the website as a detection criterion, and the security indicators and warning signs of browsers often play no effective part in their decision-making. Finally, the researchers grouped the reasons why users are deceived and trapped by phishing websites into three dimensions: lack of computer and security knowledge, visual deception, and bounded attention to security indicators (Dhamija et al., 2006).

In another study in 2015, it was found that most users make decisions based on the appearance and content of the website: participants spent only 6% of their time checking security indicators (in the Chrome browser) and 85% of their time viewing website content. However, the researchers found that the more time people spent looking at elements of the Chrome browser, the better they performed in detection (Alsharnouby et al., 2015). Thus, this study confirmed the results of the 2006 study.

In a 2021 study, researchers sought to find out whether users' performance at detecting fake websites had improved over time. In line with the results of previous studies, the participants achieved an average success rate of 69%. The results also showed that almost all users check the website content, and the few users who relied on this strategy alone achieved 44% success. However, the 80% of participants who also evaluated the website address performed much better, achieving 75% success in detecting fake websites. Those who paid attention to security indicators in the browser in addition to the previous two strategies achieved 80% success (Loxdal et al., 2021).

It's worth noting that in all these studies, users were tested to detect fake websites, so they probably did it better than users who are attacked daily without any preparation.

In this research, a systematic review taking a critical view of intelligent approaches to phishing website detection has been conducted. As mentioned earlier, phishers are creative in carrying out such attacks, so phishing websites change over time and detecting them has become a dynamic problem. For this reason, countermeasures need continual improvement. In recent years, with the expansion and development of intelligent methods, approaches of this kind have attracted the attention of many researchers. Therefore, the time frame for reviewing papers is set between 2018 and 2022, and the strategy adopted to conduct this research is based on the studies of Kitchenham & Brereton (2013), Shahrivar et al. (2018), and Do et al. (2022). Accordingly, the research procedure can be described as follows: posing research questions, searching for studies, examining research criteria, selecting relevant papers, and analyzing the findings.

3.1. Research Questions

To direct the research in the investigation of intelligent approaches (based on ML) in phishing website detection, the following questions have been raised:

RQ1: What intelligent methods are used to detect phishing websites, and how are they implemented?

RQ2: What types of approaches have been more successful and what are the advantages and disadvantages of them?

RQ3: What challenges still exist in this field and what are the proposed future research directions?

3.2. Searching for Studies

To answer the RQs and obtain the most relevant studies, the following search string was run against four databases, namely Scopus, IEEE Xplore, Web of Science (WOS), and ScienceDirect: "phishing" AND "detection" AND ("machine learning" OR "ml") AND (website OR URL).

The database search was conducted in November 2022, and a total of 1,402 studies published between 2018 and 2022 were retrieved; after removing duplicates, this number was reduced to 933. Figures 7 and 8 show the distribution of these studies by year of publication and by type, respectively.

3.3. Selecting Relevant Papers

In the initial selection, 140 papers were identified by reviewing the titles and abstracts of the studies. Then, to obtain those most relevant to the research objectives, the full texts were evaluated and scored against the Quality Assessment (QA) questions listed below, and 59 papers were selected. Finally, backward snowballing added 12 studies, giving the 71 papers examined in this paper. The procedure for selecting these papers is shown in Fig. 9, and the quality of each paper, scored "1" for "yes", "0.5" for "somewhat", and "0" for "no" on each QA question, is presented in Table 4 of the appendix.

QA1: Is the algorithm used described?

QA2: Are the selected features and how to extract them explained?

QA3: Is there innovation in the use of algorithms or features?

QA4: Are the tests and evaluations of the model performed completely?

QA5: Are limitations of the approach and challenges explored?

QA6: Has future work been proposed?
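
The scoring scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `quality_score` and the example answers are hypothetical, and the text does not state a cut-off score for inclusion, so none is assumed here.

```python
# Scores per answer, as described in the text.
SCORES = {"yes": 1.0, "somewhat": 0.5, "no": 0.0}

def quality_score(answers):
    """Sum the scores of the six QA answers (QA1..QA6) for one paper."""
    if len(answers) != 6:
        raise ValueError("expected one answer per QA question (6 in total)")
    return sum(SCORES[a] for a in answers)

# Made-up answers for a single paper, for illustration only.
example = ["yes", "yes", "somewhat", "yes", "no", "somewhat"]
print(quality_score(example))  # 4.0

# Selection counts reported in the text: 59 papers after quality
# assessment plus 12 from backward snowballing.
print(59 + 12)  # 71
```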

4.1. Related Works

So far, and especially in recent years, various papers related to phishing attacks have been published. However, since these attacks still exist and are constantly changing, there is always a need for studies that examine the latest countermeasures in phishing detection and present the existing challenges. This paper is conducted for this purpose and is compared in Table 1 with the most closely related works, which are described below.

Basit et al. (2021) provided an overview of phishing attacks and their detection methods. They investigated communication channels and targeted devices in phishing attacks along with attackers' approaches and countermeasures in four categories: ML-based, DL-based, scenario-based, and hybrid. Then in each category, a comparison of approaches is presented, and finally, the existing challenges are discussed (Basit et al., 2021).

Kalaharsha and Mehtre (2021) briefly reviewed the types of phishing attacks and methods of detecting phishing websites. In particular, they investigated the most widely used ML algorithms and discussed the existing challenges (Kalaharsha and Mehtre, 2021).

Jain and Gupta (2021) provided a comprehensive review of phishing attacks, detection methods, and existing challenges. They investigated phishing attacks by computer and mobile and countermeasures by educational, ML-based, search engine-based, list-based, visual similarity-based, and mobile phishing detection approaches. They also compared some approaches from 2015 to 2016 with their advantages and limitations and presented the existing challenges (Jain and Gupta, 2021).

Tang and Mahmoud (2021) reviewed phishing website detection methods with a focus on ML-based methods. They presented the process of implementing ML models and the available resources for collecting the URLs of legitimate and phishing websites. Moreover, they analyzed the advantages and limitations of some recent approaches based on the type of ML algorithm in three categories: single, hybrid, and DL algorithms, and finally discussed challenges (Tang and Mahmoud, 2021).

Do et al. (2022) presented a systematic and comprehensive review of DL methods in detecting phishing attacks and reviewed 81 papers, as well as the advantages and disadvantages of DL models. Furthermore, this paper discusses the various issues of these approaches and offers suggestions to overcome them in future research (Do et al., 2022).

Catal et al. (2022) performed a systematic literature review (SLR) to investigate the results of DL approaches for phishing detection. They examined 43 journal articles (until 2020) in detail and presented DL algorithms in brief. Moreover, challenges and open solutions in DL-based phishing detection models were provided in this paper.

4.2. Phishing Detection Methods

Since it is not possible to control the entire Internet and all websites from the server side, most existing countermeasures for phishing website detection are applied on the client side (Gupta et al., 2018). These methods are broadly organized into two main groups: promotional-educational solutions, aimed at increasing the awareness of end users, and technical solutions, aimed at developing software-based approaches.

Promotional-educational solutions use educational content such as workshops and games to reduce the number of victims of phishing attacks by increasing users' awareness of how to face and detect them (Gupta et al., 2018; Sahingoz et al., 2019). Research results show that this kind of content could reduce users' willingness to enter their sensitive information on phishing websites by 40%. However, users could not detect 28% of phishing attacks because these websites closely resembled their legitimate targets. In addition, some users did not remember the features of legitimate websites. For these reasons, although user education is necessary, it is not sufficient, and such methods alone will not be effective (Sheng et al., 2010).

According to the above explanations, technical solutions are preferred for creating a decision support system (Gupta et al., 2018; Sahingoz et al., 2019). These solutions are classified into two categories: "Comparison-based" and "Intelligent" methods.

4.2.1. Comparison-based methods

In this method, components of the suspicious website (under detection) are compared somehow with detected legitimate websites, then its status is determined. Approaches based on comparison fall into three categories: list-based, heuristic-based, and visual similarity-based approaches.

a. List-Based Approaches

List-based approaches determine the status of websites by comparing them to blacklists and whitelists that contain information about previously detected phishing and legitimate websites, respectively (such as URLs, IP addresses, and domain names) (Wang et al., 2019). These approaches can achieve low false-positive rates (i.e. few legitimate instances incorrectly classified as phishing) and are simple to implement and fast to execute. However, the main drawback of most of them is an inability to detect unknown or zero-hour attacks (i.e. attacks not seen previously). Besides, lists need to be updated frequently, which requires human intervention and verification. Because of these limitations, it is recommended to combine list-based methods with others that can mitigate zero-hour attacks while keeping false-positive rates low (Khonji et al., 2013; Do et al., 2022).

For example, the "Google Safe Browsing" Application Programming Interface (API) provided by Google allows client applications to check whether a particular URL is present in blacklists that are constantly updated by Google (Khonji et al., 2013). Another example is "PhishNet", which generates almost all possible variants of a blacklisted URL using various heuristic techniques. Then, if a generated URL corresponds to an active website and is at the same time similar to a legitimate one, it is added to the blacklist (Prakash et al., 2010). "Automated Individual White List (AIWL)" is another approach, based on whitelists, that stores the Login User Interface (LUI) and IP address of websites that the user has previously logged into. It then warns the user if they submit their credentials to an untrusted website that is not on this list (Cao et al., 2008).
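
The core of a list-based check is a simple set lookup, as the following minimal Python sketch shows. It is not the Google Safe Browsing API or any of the cited systems: the list entries, the function names, and the choice to match on the hostname only are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Illustrative entries only; real lists are far larger and updated continuously.
BLACKLIST = {"login-secure-update.example", "198.51.100.7"}
WHITELIST = {"bank.example"}

def hostname(url):
    """Return the lower-cased host part of a URL ('' if there is none)."""
    return (urlsplit(url).hostname or "").lower()

def list_based_verdict(url):
    """Classify a URL using only the two lists."""
    host = hostname(url)
    if host in BLACKLIST:
        return "phishing"
    if host in WHITELIST:
        return "legitimate"
    # Zero-hour case: a site on neither list cannot be decided by lists alone.
    return "unknown"

print(list_based_verdict("http://198.51.100.7/login"))    # phishing
print(list_based_verdict("https://bank.example/account"))  # legitimate
print(list_based_verdict("https://new-site.example/"))     # unknown
```

The "unknown" branch makes the zero-hour limitation concrete: any website that has not yet been listed falls through, which is why list-based checks are usually combined with other methods.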

b. Heuristics-Based

Heuristics-based approaches rely on the characteristics of phishing attacks: they extract, examine, and analyze various features of the website's structure, such as URL-based and HTML-based features, to detect fake and suspicious websites. Unlike list-based approaches, heuristic approaches are effective against zero-hour attacks. However, they tend to cause higher false-positive rates (Khonji et al., 2013).

One of the best-known heuristic approaches is the "Carnegie Mellon Anti-phishing and Network Analysis (CANTINA)" tool, which obtains the top five terms on the website by calculating "Term Frequency-Inverse Document Frequency (TF-IDF)" and submits them to a search engine. If the domain name of the website matches the domain name of one of the top results, the website is considered legitimate; otherwise, it is considered phishing (Zhang et al., 2007). "SpoofGuard" is another example, an extension for the Internet Explorer browser that uses heuristic methods to detect anomalies in the HTML content of websites. If the heuristic results exceed a threshold (a preset value), it alerts the user. One of the things this method checks is whether the website address is similar to a whitelisted address. Additionally, for links on the page, it checks whether the URL written in the text is the same as the URL it refers to. It also assigns specific values to password fields to increase the level of caution, because they may be misused to forge login forms (Chou et al., 2004).
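
The TF-IDF step described for CANTINA can be sketched as follows. This is a simplified illustration rather than the tool's actual implementation: the tokenizer, the reference corpus, and the function names are assumptions, and the search engine query is left out, so `domain_matches` simply receives the result domains as input.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text into lower-case alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())

def top_tfidf_terms(page_text, corpus_texts, k=5):
    """Return the k terms of page_text with the highest TF-IDF weight."""
    page_tokens = tokenize(page_text)
    tf = Counter(page_tokens)
    # The page itself is counted as a document, so every term has df >= 1.
    docs = [set(tokenize(t)) for t in corpus_texts] + [set(page_tokens)]
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in docs if term in d)
        scores[term] = (count / len(page_tokens)) * math.log(len(docs) / df)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def domain_matches(page_domain, result_domains):
    """Final CANTINA-style check: is the page's domain among the top results?"""
    return page_domain in result_domains

corpus = ["the weather today is sunny", "the news of the day", "login to the forum"]
page = "paypal login paypal account verify the account"
print(top_tfidf_terms(page, corpus, k=3))  # ['account', 'paypal', 'verify']
```

Common words such as "the" appear in every document, so their inverse document frequency is zero and they drop to the bottom of the ranking, leaving terms that characterize the page.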

c. Visual Similarity-Based

Visual similarity-based approaches attempt to identify fake websites by comparing their visual appearance with that of legitimate websites in terms of content such as page layout and page style. Like heuristic-based approaches, they can mitigate zero-hour attacks and tend to have high false-positive rates. On the other hand, because snapshots of websites must be stored, these approaches require large storage space and incur higher computational costs (Khonji et al., 2013).

For example, Chen et al. (2009) presented an approach that detects phishing websites with distinctive key features in website images. In this approach, a snapshot of each suspicious website is taken and converted to grayscale. Then, using the Harris-Laplace algorithm, its key features are extracted and matched with the key features of the photos of legitimate websites in the list, and if it has a high match (more than 60%), the website is considered phishing (Chen et al., 2009).

4.2.2. Intelligent (ML based)

In ML-based approaches, phishing website detection is treated as a classification task with two classes, "phishing" and "legitimate". ML algorithms are trained on a dataset of features extracted from websites that have already been identified and labeled as phishing or legitimate, and the resulting models are used to classify unseen websites and predict whether a website is phishing. These approaches can achieve low false-positive rates and detect zero-hour phishing attacks, and they adapt better to new types of phishing attacks, so they are generally preferred (Buber et al., 2017; Khonji et al., 2013).

In general, creating and implementing an ML-based model involves five steps (Tang and Mahmoud, 2021; Ali, 2017): data collection, feature extraction, feature selection, model creation, and model evaluation. The rest of this section examines each of these steps.

a. Data Collection

Since phishing website detection can be considered a classification problem, the training phase requires data that are correctly labeled as "fraudulent" or "legitimate". For this purpose, it is possible to use public resources that include the addresses of previously identified websites. For example, the PhishTank and OpenPhish websites are among the best-known sources of phishing URLs (Tang and Mahmoud, 2021). To collect legitimate URLs, there are sources such as Common Crawl, DMOZ, and Yandex (Bahnsen et al., 2017; Sahingoz et al., 2019; Zuraiq et al., 2019). In addition, some sources have collected URLs of both legitimate and phishing websites and made them publicly available for greater convenience. One example is the "Ebbu2017" dataset, which includes 36,400 legitimate and 37,175 phishing URLs (Sahingoz et al., 2019). Another is the dataset published by the University of New Brunswick in Canada called "ISCX-URL-2016", which includes more than 35,000 legitimate and almost 10,000 phishing URLs (Mamun et al., 2016).

b. Feature Extraction

After collecting the dataset, including phishing and legitimate websites, the next step is to process these data and extract the required features and information from them (Buber et al., 2017). Various features, such as URL-based and HTML-based features, can be extracted from a website to detect whether it is fake. The quality of these features plays an essential role in the performance of intelligent models, and the way they are extracted plays an essential role in their response speed (Ali, 2017; Gupta et al., 2021).
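
As a rough illustration of URL-based feature extraction, the Python sketch below derives a handful of features from a URL string using only the standard library. The particular features and their names are assumptions chosen for the example, not the feature set of any dataset or paper discussed here.

```python
import ipaddress
from urllib.parse import urlsplit

def url_features(url):
    """Extract a few illustrative URL-based features as a dictionary."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        ipaddress.ip_address(host)   # raises ValueError if host is not an IP
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parts.scheme == "https",
    }

print(url_features("http://192.0.2.10/login@verify"))
print(url_features("https://secure-login.example.com/"))
```

Features of this kind need no network access, which is one reason URL-based features are attractive when response speed matters; HTML-based features would additionally require fetching and parsing the page.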

Some researchers have provided datasets containing extracted features for use in related research areas. It is worth noting that building an efficient, real-time phishing detection system on these public datasets still depends on providing a feature extraction process for determining the status of new websites (Tang and Mahmoud, 2021). One of the most frequently used datasets is the one published in the UCI Machine Learning Repository in 2015, which contains 11,055 instances with 30 features (Mohammad et al., 2015b). This dataset became the basis for phishing detection in ML-based approaches (Buber et al., 2017). In addition, in 2018, a dataset was published on Mendeley containing 10,000 samples with 48 features, which used PhishTank and OpenPhish as sources of phishing websites and Alexa and Common Crawl as sources of legitimate ones (Tan, 2018). It should be noted that the Alexa website, which provided web page ranking services, ended its activity on May 1, 2022.

c. Feature Selection

It is impractical to train a model using all features available on websites. Therefore, there is a need for a feature selection step, which is the selection process of a feature subset that can effectively describe the whole input instances while reducing overfitting and increasing model performance. Various feature selection algorithms fall into three categories: filter, wrapper, and embedded methods (Tang and Mahmoud, 2021).

Filter methods are applied before classification: they score and rank the features using ranking algorithms and then select those whose score is higher than a threshold. The advantages of feature ranking are light computation and the prevention of overfitting. However, the selected subset may not be optimal (Chandrashekar and Shahin, 2014).
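
A filter method can be sketched as "score each feature, rank, keep those above a threshold". The Python example below uses the absolute Pearson correlation between a feature and the label as the ranking score; this choice of score, the toy data, and the threshold of 0.5 are assumptions for illustration, since the text does not prescribe a particular ranking algorithm.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def filter_select(columns, labels, threshold):
    """Rank features by |correlation with the label| and keep those above threshold.

    columns maps a feature name to its list of values, one per instance.
    """
    scores = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, score) for name, score in ranked if score > threshold]

# Toy data: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "host_is_ip": [1, 1, 0, 0, 0, 0],
    "url_length": [90, 75, 80, 30, 25, 40],
    "uses_https": [1, 0, 1, 0, 1, 0],
}
selected = filter_select(columns, labels, threshold=0.5)
print([name for name, _ in selected])  # ['url_length', 'host_is_ip']
```

No classifier is trained at any point, which is what keeps filter methods cheap; the price is that each feature is judged in isolation, so the chosen subset may not be the best one for a given model.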

Wrapper methods, unlike filter methods, rely on classification (model building): they select the features that give the best performance with the chosen ML algorithm. At each stage, a wrapper method selects a subset of features, evaluates the model's performance on it, and finally keeps the best subset (Ali, 2017). Wrapper methods generally fall into two groups: sequential selection and heuristic search. Sequential selection algorithms start with an empty set (or the complete set) of features and then add features (or remove features) to improve model performance. In heuristic search algorithms, evaluation is performed on different feature subsets to achieve the best performance. It is important to point out that wrapper methods are prone to overfitting and have a high computational complexity, because each subset of features must go through the model training and testing phase (Chandrashekar and Shahin, 2014).
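
Sequential forward selection, one of the wrapper strategies described above, can be sketched as follows. The nearest-centroid classifier, the single validation split, and all names are assumptions chosen to keep the example self-contained; any ML algorithm and evaluation scheme could take their place.

```python
def centroid_accuracy(train, valid, features):
    """Accuracy on `valid` of a nearest-centroid classifier using only `features`.

    train and valid are lists of (feature_dict, label) pairs.
    """
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += x[f]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        # Choose the class whose centroid is closest (squared Euclidean distance).
        return min(centroids, key=lambda y: sum(
            (x[f] - c) ** 2 for f, c in zip(features, centroids[y])))

    return sum(predict(x) == y for x, y in valid) / len(valid)

def forward_selection(train, valid, candidates, score=centroid_accuracy):
    """Start from an empty set and greedily add the feature that helps most."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        trials = [(score(train, valid, selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:   # no remaining feature improves the model
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Toy data: feature "a" separates the classes, feature "b" is noise.
train = [({"a": 0.0, "b": 5}, 0), ({"a": 0.1, "b": 1}, 0),
         ({"a": 1.0, "b": 4}, 1), ({"a": 0.9, "b": 2}, 1)]
valid = [({"a": 0.2, "b": 0}, 0), ({"a": 0.8, "b": 6}, 1),
         ({"a": 0.05, "b": 6}, 0), ({"a": 0.95, "b": 0}, 1)]
print(forward_selection(train, valid, ["b", "a"]))  # (['a'], 1.0)
```

Every candidate subset requires training and scoring a model, which illustrates the computational cost noted in the text. Scoring against a single small validation split, as done here for brevity, also shows how such a search can overfit to that split.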

Embedded methods perform feature selection as part of the model training process without dividing the data into training and testing sets. These methods select the best features using ranking algorithms and model performance evaluation and try to compensate for the disadvantages of filter and wrapper methods (Chandrashekar and Shahin, 2014).
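The filter and wrapper ideas described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any cited work: the filter score used here is the absolute Pearson correlation between a feature and the label, the wrapper is a greedy sequential forward selection, and the names `filter_select`, `forward_select`, and `evaluate` are hypothetical.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def filter_select(rows, labels, threshold):
    """Filter method: keep features whose |correlation| with the label
    exceeds the threshold (assumed scoring rule, chosen for simplicity)."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels))
              for j in range(n_features)]
    return [j for j, s in enumerate(scores) if s > threshold]


def forward_select(n_features, evaluate):
    """Wrapper method: greedy sequential forward selection.

    `evaluate(subset)` is a caller-supplied function that trains and scores
    a model on the given feature indices and returns a value to maximize.
    """
    selected, best = [], float("-inf")
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        score, j = max((evaluate(selected + [j]), j) for j in candidates)
        if score <= best:
            break  # no remaining feature improves the score
        selected.append(j)
        best = score
    return selected


# Toy data: feature 0 follows the label, feature 1 is noise, feature 2 is constant.
rows = [[1, 0, 5], [1, 1, 5], [0, 0, 5], [0, 1, 5]]
labels = [1, 1, 0, 0]
kept = filter_select(rows, labels, threshold=0.5)
```

The wrapper calls `evaluate` once per candidate subset, which is where the high computational cost mentioned above comes from.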

d. Model Creation

Once the dataset with the main features is collected, it is divided into a training and a testing set that usually contain 80% and 20% of the total instances, respectively. The creation of the model consists of two phases: training and testing. In the training phase, the classifier (ML algorithms) is trained using the training set to classify and predict the label of websites. Section 5 examines different types of these algorithms. In the testing phase, classifier performance is evaluated using the testing set. Then if the results are acceptable, the model can be used in real-world applications. Otherwise, some techniques can be applied to improve the classification (readjusting the model parameters or further processing the data) (Ali, 2017).
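As a minimal sketch of the 80%/20% division mentioned above, the following function shuffles the instances and splits them. The function name, the fixed seed, and the use of plain shuffling (rather than, for example, a split that preserves class proportions) are assumptions made for illustration.

```python
import random


def train_test_split(instances, test_ratio=0.2, seed=0):
    """Shuffle a copy of the data and return (train, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(100))
```

The classifier is then fitted on `train` only, and `test` is held back for the evaluation phase.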

e. Model Evaluation

In any two-class classification problem, only four classification outcomes are possible, as shown in the confusion matrix in Table 2. This matrix comprises four basic measures (Do et al., 2021):

True Positive (TP) is the number of instances that are correctly detected as phishing.

True Negative (TN) is the number of instances that are correctly detected as legitimate.

False Positive (FP) is the number of legitimate instances that are incorrectly detected as phishing.

False Negative (FN) is the number of phishing instances that have incorrectly been detected as legitimate.

Table 2

Confusion matrix
		Predicted Labels
		phishing (Positive)	legitimate (Negative)
True Labels	phishing (Positive)	True Positive (TP)	False Negative (FN)
True Labels	legitimate (Negative)	False Positive (FP)	True Negative (TN)

Based on this matrix, various metrics can be used to evaluate the performance of the model, the most common of which are defined by the following equations (Do et al., 2021):

i. “False Negative Rate (FNR)” indicates the proportion of phishing instances that are incorrectly classified as legitimate, relative to all phishing instances:

FNR =\(\frac{FN }{FN + TP}\) (1)

ii. “False Positive Rate (FPR)” indicates the proportion of legitimate instances that are incorrectly classified as phishing, relative to all legitimate instances:

FPR =\(\frac{FP }{FP + TN}\) (2)

iii. “Accuracy” measures the ratio of correctly detected phishing and legitimate instances to the total number of instances. It is one of the most popular metrics for evaluating ML algorithms and indicates the overall effectiveness of the model. In phishing detection, accuracy shows how well a classifier distinguishes between the phishing and legitimate classes. It can be misleading when the dataset is imbalanced, but when the classes are balanced it gives valuable insight (Do et al., 2021). It is calculated as follows:

Accuracy =\(\frac{TP + TN }{TP + TN + FN + FP}\) (3)

iv. “Precision” represents the ratio of correctly detected phishing instances to the total number of instances detected as phishing. Precision is a valuable measure when the dataset is imbalanced or when accuracy alone is not sufficient for decision-making by security experts (Do et al., 2021). It is calculated as follows:

Precision =\(\frac{TP }{TP + FP}\) (4)

v. “Recall” denotes the rate of correctly detected phishing instances to the total number of existing phishing instances and shows the ability of the model to detect phishing attacks, as shown below:

Recall =\(\frac{TP }{TP + FN}\) (5)

vi. “F1-Score” is the harmonic mean of precision and recall. It represents the balance between these two metrics, so they no longer need to be compared separately. F1-Score can be used to estimate the overall performance of the model (its ability to detect phishing attacks), and a high value means that both false positives and false negatives are low. This metric is calculated from the following equation:

F1-Score =\(\frac{2\times Precision\times Recall }{Precision + Recall}\) (6)
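Equations (1) to (6) can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only; the function names and the convention of returning 0.0 when a denominator is zero are assumptions, not part of the reviewed works.

```python
def confusion_counts(y_true, y_pred, positive="phishing"):
    """Count TP, TN, FP, FN with `positive` as the phishing class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def safe_div(a, b):
    """Return a / b, or 0.0 when the denominator is zero (assumed convention)."""
    return a / b if b else 0.0


def metrics(tp, tn, fp, fn):
    """Equations (1)-(6) expressed in terms of the four basic counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "FNR": safe_div(fn, fn + tp),
        "FPR": safe_div(fp, fp + tn),
        "Accuracy": safe_div(tp + tn, tp + tn + fn + fp),
        "Precision": precision,
        "Recall": recall,
        "F1": safe_div(2 * precision * recall, precision + recall),
    }


# Imbalanced toy case: 5 phishing and 95 legitimate instances,
# and a degenerate classifier that labels everything as legitimate.
y_true = ["phishing"] * 5 + ["legitimate"] * 95
y_pred = ["legitimate"] * 100
imbalanced = metrics(*confusion_counts(y_true, y_pred))
```

In this toy case the accuracy is 0.95 while recall and F1-Score are 0, which illustrates why accuracy alone can be misleading on imbalanced data.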

So far, researchers have presented many intelligent approaches to detect phishing websites, which are examined in the rest of this section in three categories: traditional algorithms, deep learning, and fuzzy logic.

5.1. Traditional algorithms

When models are built with traditional ML algorithms, feature extraction and selection require human expertise and are carried out separately from classification, so the two cannot be combined in a single phase to improve model performance (Do et al., 2022). Some algorithms frequently used in phishing website detection are introduced below, including Naive Bayes, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF) (Do et al., 2022; Kalaharsha and Mehtre, 2021; Tang and Mahmoud, 2021).

a. Naive Bayes is a simple probabilistic algorithm based on Bayes' theorem. It assumes that features are independent of each other, i.e., that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature (Kaur and Oberai, 2014). One of the advantages of this algorithm is that it requires little training data, and due to its simplicity, it is often used in text classification and spam detection (Kalaharsha and Mehtre, 2021; Kaur and Oberai, 2014).

b. KNN is a non-parametric classification algorithm that makes predictions by finding similar labeled data points. To predict the class of a given instance x, the k training instances with the shortest distance to x are found, and the majority label among them is assigned to x. Different distance functions are used depending on the data type, such as the Euclidean distance for continuous values and the Hamming distance for discrete values. This algorithm has no training process, and each prediction may take a long time, so it is not suitable for a real-time environment, especially if the input data size is large (Kalaharsha and Mehtre, 2021; Tang and Mahmoud, 2021).

c. SVM is one of the most popular classifiers for solving linear and non-linear problems. The main idea of this algorithm is to find the optimal discriminating hyperplane between classes with the maximum margin from the closest points of each class (support vectors) (Abu-Nimeh et al., 2007). In the case of binary classification, SVM finds an (n-1)-dimensional hyperplane (n is the number of features) that separates the data points into two classes (Tang and Mahmoud, 2021). Although SVM is very powerful and widely used in classification problems, it has a high computational cost in training and is not well suited to large datasets. Also, SVM is sensitive to noisy data; it is therefore prone to overfitting and does not perform well when there is a lot of noise (Abu-Nimeh et al., 2007).

d. DT is one of the most popular algorithms used in classification and regression. Each node and branch in the decision tree represents a feature and its value, respectively; the first node is called the root, and the terminal nodes are known as leaves, which provide the result. The classifier repeatedly divides the training dataset until it reaches a leaf node, which holds the label. It is worth noting that simpler tree structures tend to perform better. When a tree grows too deep, it is likely to overfit the training samples, so that small changes in them may lead to large changes in the model (Kalaharsha and Mehtre, 2021; Tang and Mahmoud, 2021).

e. RF is an ensemble classifier consisting of decision trees that are built on randomly selected sets of training samples; to predict the label of a given input, the decisions of these trees are combined by averaging or majority voting (Tang and Mahmoud, 2021). Random forest handles a large number of features in a dataset and produces an unbiased estimate of the generalization error during the forest construction process. It can also estimate missing data well. In addition, combining the decisions of the individual trees reduces the problem of overfitting. The main drawback of this algorithm is its lack of repeatability, because the process of creating a forest is random. Furthermore, the final model and subsequent results are hard to interpret because the model involves many independent decision trees (Abu-Nimeh et al., 2007).
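Of the algorithms above, KNN is simple enough to sketch in full. The following is a minimal illustration of the procedure in item b, with Euclidean and Hamming distances; the function names and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Distance for continuous feature vectors."""
    return math.dist(a, b)


def hamming(a, b):
    """Distance for discrete feature vectors: number of differing positions."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, x, k=3, distance=euclidean):
    """Return the majority label among the k training instances closest to x.

    `train` is a list of (features, label) pairs. There is no training step:
    every prediction scans the whole training set.
    """
    nearest = sorted(train, key=lambda item: distance(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy continuous data (hypothetical feature values).
train = [((0, 0), "legitimate"), ((0, 1), "legitimate"),
         ((5, 5), "phishing"), ((6, 5), "phishing"), ((5, 6), "phishing")]
label = knn_predict(train, (5, 4))
```

Because each call sorts all training instances by distance, prediction time grows with the size of the training set, which is the real-time limitation noted in item b.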

As examples of using traditional ML algorithms, Korkmaz et al. (2020) extracted 48 URL-based features from a dataset including 83,857 URLs, 40,668 phishing websites, and 43,169 legitimate websites. They used eight different ML algorithms for classification, and the best result was obtained with the RF algorithm, with an accuracy of 94.59% (Korkmaz et al., 2020). Sahingoz et al. (2019) applied Natural Language Processing (NLP) to extract 27 features from URLs and compared the performance of seven different classifiers on a dataset containing 73,575 URLs, achieving 97.98% accuracy with RF (Sahingoz et al., 2019). Jain and Gupta (2018) presented an approach that achieves 99.09% accuracy with the RF algorithm by extracting 19 features based on the URL and source code, on a dataset including 4,059 websites (Jain and Gupta, 2018). Gupta et al. (2021) proposed a model that extracts only nine lexical URL-based features, independent of third-party services, and evaluated the performance of four classifiers on a dataset containing 19,964 URLs, including 9,964 phishing samples and 10,000 legitimate samples. With RF, the accuracy and response time of this model are reported as 99.57% and 51 milliseconds, respectively (Gupta et al., 2021).

5.2. Deep Learning (DL)

DL is a subset of ML in which the architecture is built on neural networks. In recent years, deep learning models have become a promising solution in phishing detection and are preferable for creating a real-time prediction system. The features that distinguish this approach from traditional algorithms include the ability to learn and adapt to data, to extract features automatically from raw data (such as URLs), and to explore hidden correlations between them. However, DL models are less interpretable than traditional algorithms, and it is often impossible to explain the logic behind the assumptions, decisions, and conclusions that a neural network makes. It is therefore hard to understand the correlation between the input features and the output results and, consequently, to diagnose the cause of underlying errors. In addition, training the model requires a longer time and more training samples than traditional algorithms, and these samples may be expensive and time-consuming to collect (Do et al., 2022).

There are various algorithms in DL; three of the most frequently used in phishing detection, namely Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM), are introduced below.

a. CNN is a type of Feed-Forward Neural Network (FFNN), in which there is no feedback between nodes. The CNN architecture consists of input, hidden, and output layers, where the hidden layers usually contain convolutional, pooling, and fully-connected layers (Tang and Mahmoud, 2021). This network can effectively extract features from raw data and handle complex tasks; it is also scalable and requires relatively little training time (Do et al., 2021). Due to its compatibility with multidimensional data, this network has achieved tremendous success in computer vision problems and is broadly used in image processing and classification. In addition, it has been used in several studies in cybersecurity, especially for the detection of phishing websites, because websites contain multidimensional data such as text, images, or both. However, this network architecture requires high computational power and a large dataset when dealing with image data (Do et al., 2021; Do et al., 2022).

b. RNN is derived from FFNN and introduces a concept of time to the model. A prominent attribute of this network's architecture is the feedback connection, which enables the network to update the current state based on the current input data and past states (Yu et al., 2019). In this network, the links between the nodes form a directed graph along a time sequence, which allows the network to exhibit temporal dynamic behavior (Kalaharsha and Mehtre, 2021); this means that the input at time t-1 can affect the output at time t through these links (Lipton et al., 2015). One of the successful applications of RNN is text mining, because it can recognize and process patterns in sequential input (such as text) (Tang and Mahmoud, 2021). However, this network is unable to learn correlations between data more than 5 or 10 steps apart (Bahnsen et al., 2017).

c. LSTM is a neural network architecture designed to solve the problems of RNNs and to handle sequential data with long-term dependencies. This network includes a loop structure between neurons in each layer and can maintain the continuity of information and the correlation between input data over a distance of more than 1000 steps (Yu et al., 2019; Do et al., 2021; Bahnsen et al., 2017). LSTM has a high learning ability and can explore features automatically, without manual extraction. It is also powerful in dealing with complex, high-dimensional data (Su, 2020). Despite these advantages, training an LSTM takes longer than training other DL algorithms. In addition, it can only handle forward information and does not consider backward information; this issue can be addressed by using a Bidirectional Long Short-Term Memory (BiLSTM) network (Do et al., 2021).
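As a minimal illustration of the feedback connection described for RNN in item b, a common textbook form of the recurrent update can be written as follows. It is not taken from the cited works; the symbols \(W_x\), \(W_h\), and \(b\) and the choice of \(\tanh\) as the activation are assumptions.

\(h_t = \tanh(W_x x_t + W_h h_{t-1} + b)\)

Here \(x_t\) is the input at time t and \(h_{t-1}\) is the hidden state from the previous step, so information from time t-1 reaches the output at time t only through \(h_{t-1}\).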

For example, Adebowale et al. (2020) presented a model based on LSTM and CNN using 35 combined features based on text, frame, and image of websites, which classified websites into phishing, suspicious, and legitimate. Accuracy on a dataset containing one million URLs and 10,000 photos is estimated to be 93.28%, and the response time is 25 seconds (Adebowale et al., 2020). In another research, Wang et al. (2019) proposed a model called PDRCNN based on RNN and CNN, which first encodes the received URL information in a two-dimensional tensor and extracts its global features with RNN, and then feeds it to CNN to classify. PDRCNN on a dataset including 500,000 URLs achieved a detection accuracy of 97% and consumed 4426 seconds for training and 0.4 milliseconds for classifying per URL (Wang et al., 2019). Tang and Mahmoud (2021) developed an approach as a browser extension that extracts the semantic features of URLs in the form of a matrix by NLP, which has achieved 99.18% accuracy by combining RNN and Gated Recurrent Unit (GRU) networks on a dataset containing 120,000 URLs (Tang and Mahmoud, 2021).

5.3. Fuzzy Logic

Fuzzy logic provides a framework for modeling of approximate reasoning methods, which enable logical reasoning and decision-making in an uncertain and imprecise environment with incomplete information (such as human conversation with natural language). Fuzzy logic is based on the theory that absolute "black and white" does not exist in the real world and calculates intermediate truth values of propositions. One of the main applications of this approach is processing and quantifying fuzzy linguistic variables whose values are words and mental concepts (Zadeh, 1988; Abuzuraiq et al., 2020). For this purpose, fuzzy systems can be used, which are created based on the knowledge of human experts in the field in question (Montazer and ArabYarmohammadi, 2015). In addition, to benefit from the advantages of neural networks, such as learning ability, they can be combined with fuzzy systems, which are named neuro-fuzzy systems (Sharma, 2016).

For example, Abdul-Hussein et al. (2022) proposed a fuzzy logic-based approach that extracts six numerical features of websites and then maps each of them to a corresponding linguistic value, which is one of three values: High (H), Medium (M), and Low (L). Finally, websites are classified by applying a set of "if-then" rules optimized by Differential Evolution (DE). They achieved 97.6% accuracy on a dataset including 20,000 websites (Abdul-Hussein et al., 2022). Adebowale et al. (2019) presented a model based on the Adaptive Neuro-Fuzzy Inference System (ANFIS), which extracts 35 combined features based on the text, frames, and images of websites and classifies them into phishing, suspicious, and legitimate. They evaluated this model on a dataset including 13,000 websites and achieved an accuracy of 98.55% and an FPR of 1.45% (Adebowale et al., 2019).
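The mapping from numerical features to the linguistic values High, Medium, and Low, followed by "if-then" rules, can be sketched as below. This is a toy illustration only: the two input scores, the membership functions, and the rule table are hypothetical and are not the features or rules of the cited approaches.

```python
def fuzzify(x):
    """Map a score in [0, 1] to membership degrees in Low, Medium, High.

    Assumed membership functions: Low falls from 1 at 0 to 0 at 0.5,
    Medium is a triangle peaking at 0.5, High rises from 0 at 0.5 to 1 at 1.
    """
    x = min(1.0, max(0.0, x))
    return {
        "L": max(0.0, 1.0 - 2.0 * x),
        "M": 1.0 - abs(2.0 * x - 1.0),
        "H": max(0.0, 2.0 * x - 1.0),
    }


# Hypothetical rule table: (URL value, content value) -> class.
RULES = [
    (("H", "H"), "phishing"), (("H", "M"), "phishing"), (("M", "H"), "phishing"),
    (("M", "M"), "suspicious"), (("H", "L"), "suspicious"), (("L", "H"), "suspicious"),
    (("L", "L"), "legitimate"), (("L", "M"), "legitimate"), (("M", "L"), "legitimate"),
]


def classify(url_score, content_score):
    """Fire every rule and return the class with the strongest support."""
    u, c = fuzzify(url_score), fuzzify(content_score)
    strength = {"phishing": 0.0, "suspicious": 0.0, "legitimate": 0.0}
    for (url_value, content_value), label in RULES:
        fired = min(u[url_value], c[content_value])     # fuzzy AND
        strength[label] = max(strength[label], fired)   # fuzzy OR over rules
    return max(strength, key=strength.get), strength


label, strength = classify(0.9, 0.8)
```

A score can belong partly to two linguistic values at once (for example, 0.25 is half Low and half Medium), which is how such a system tolerates imprecise inputs.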

The comparison of accuracy obtained in some of the selected approaches in this research along with their advantages and limitations is given in Table 3.

Table 3

Intelligent approaches
Algorithm	Type	Dataset	Advantages	Limitations	Acc (%)
RF (Zhu et al., 2022)	Traditional	UCI dataset 30 features	- Using the multi-objective evolution algorithm to increase accuracy and recall metrics - Comparing with five different public datasets	- No feature extraction process - Not examining the response time	98.37
CNN + Highway Deep Pyramid (Zheng et al., 2022)	DL	Imbalanced dataset containing 420,000 instances with ratio 1:5 (legitimate vs phishing) based on character-level and word-level features	- Getting raw URLs as input	- Not using balancing methods - Not examining the response time - Max URL length up to 120 characters	98.3
MLP + RNN + CNN (Yu et al., 2022)	DL	balanced dataset containing 6,000 instances with features based on URL, HTML, text, and image	- Hybrid architecture and features	- Not examining the response time - Using powerful processor (GPU) - Max URL length up to 16 characters and HTML text up to 256 - Using small dataset	97.75
RF (Wei & Sekiya, 2022)	Traditional	58,645 legitimate 88,647 phishing with 14 features selected from 111	- Using three algorithms for feature selection	- Not presenting feature extraction process - Not examining the response time	97
Hybrid network (Wang & Chen, 2022)	DL	ISCX-URL2016 dataset	- High accuracy - Getting raw URLs as input - Integrating convolution branches (local correlation analysis) and transformer (encoding)	- Not examining the response time - Max URL length up to 200 characters	99.77
CNN + LSTM (Shaiba et al., 2022)	DL	Ebbu2017 dataset based on character-level features	- Using optimization algorithm for parameter-tuning	- Not examining the response time	99.01
LightGBM (Sanchez-Paniagua et al., 2022)	Traditional	67,000 phishing 67,000 legitimate Based on 54 features: 21 URL, 8 HTML, 14 hybrid, and web technology	- Instances of legitimate and phishing include 62% and 41% of login pages, respectively - 27 new features (including web technology) - Fast in feature extraction (43.56 milliseconds after loading website)	- Dependent on English language (in 5 copyright features)	97.95
RF (Rao et al., 2022)	Traditional	5,400 phishing 5,000 legitimate Feature extraction through tokenization and lemmatization of the domain-specific name in the source code	- Using 5 different word embedding algorithms - High accuracy - Response in 1.56 seconds - Implemented as a plugin	- Inefficient when not accessing the source code - Dependent on the English language	99.34
SVM and NB (Orunsolu et al., 2022)	Traditional	Balanced dataset containing 5,000 instances with 15 features	- High accuracy	- Small dataset - Dependent on third-party services - Without any new feature - Not examining the response time	99.96
KNN (Minocha & Singh, 2022)	Traditional	UCI dataset Selecting 27 features	- Using Modified Equilibrium Optimizer (MEO) for feature selection	- No feature extraction process - Not examining the response time	97.46
DT (Marimuthu et al., 2022)	Traditional	9,500 legitimate 13,500 phishing with 20 features	- Implementation as a plugin	- Using Alexa ranking	99.4
Hybrid classifier (Hevapathige & Rathnayake, 2022)	Traditional	476,000 legitimate 273,000 malwares (including 142,000 phishing) 53 URL-based features	- Combining 6 classification algorithms - Large dataset	- Not examining the response time - Using powerful processor (GPU) - Low accuracy - High computational complexity - Not tuning the parameter	95.14
MLP (Alsaedi et al., 2022)	DL	428,000 legitimate 223,000 malwares (including 94,000 phishing) with features based on URL, Google Search, and Whois	- Classification of each group of features with RF and final decision making with MLP	- Dependent on third-party services - Not examining the response time - No comparing with existing methods - Not explaining the features	96.8
Fuzzy system (Abdul-Hussein et al., 2022)	Fuzzy Logic	Balanced dataset containing 20,000 instances with 6 features	- Optimizing the set of rules with Differential Evolution algorithm	- Extracting a feature based on page rank from the Alexa	97.6
XGBoost (Das Guptta et al., 2022)	Traditional	Balanced dataset containing 6,000 instances from ISCX-URL2016 with 15 URL-based and 10 HTML-based features (hyperlink in the source code)	- High accuracy	- Small dataset	99.17
RF (Bustio-Martinez et al., 2022)	Traditional	Balanced dataset containing 52,000 instances with 9 features (6 new)	- High accuracy - Response in 100 milliseconds - Using feature selection algorithm (among 46 features)	- Need to implement as a plugin and evaluate the model with different datasets	99.57
LRCN + GCN (Ariyadasa et al., 2022)	DL	Balanced dataset containing 50,000 instances with automatic feature extraction based on URL and HTML	- Combining LSTM and CNN to check URL and using GCN to check HTML - Response in 1.8 seconds	- Low accuracy - Using powerful processor (Xeon with 4 cores) - No comparison with existing methods	96.42
Deep Autoencoder (Alqahtani et al., 2022)	DL	UCI dataset 30 features	- Using Artificial Algae algorithm to remove unimportant samples - Using Invasive Weed Optimization algorithm for parameters-tuning - Highest accuracy on UCI	- Not examining the response time and complexity of model - No feature extraction process	99.28
RNN + GRU (Tang and Mahmoud, 2021)	DL	Balanced dataset containing 120,000 instances	- High accuracy - Feature extraction with NLP - Implementation as a browser extension	- Max URL length up to 200 characters - Not supporting short URLs	99.18
CNN + Bi-LSTM (Ray & Kusshwaha, 2021)	DL	7,500 instances with character and word level features + 15 manual features from URL	- Combining CNN and Bi-LSTM with manual features	- Dependent on third-party services including Alexa in manual features - Small dataset - Not examining the response time and complexity of model	97.5
DNN + BiLSTM (Ozcan et al., 2021)	DL	Balanced dataset containing 28,000 instances 27 NLP (old) features with DNN and character embedding with LSTM	- Combining two networks in the output layer - High accuracy - Examining the computing time	- Extracting one feature from Alexa - Not examining the response time - Small dataset - Max URL length up to 150 characters	99.21
CNN (Mourtaji et al., 2021)	DL	Imbalanced dataset containing 30,000 legitimate 10,000 phishing with 37 features	- Using combined features and black lists - Comparing several algorithms	- Dependent on third party services and target website address - Long running time (4 hours)	97.94
RF (Lakshmanarao et al., 2021)	Traditional	Imbalanced data set containing 393,000 legitimate 146,000 phishing with URL features by using Hashing Vectorizer	- Applying three techniques for text feature extraction (TF-IDF Vectorizer, Count Vectorizer, and Hashing Vectorizer) - Implemented as a WebApp - No limit on the number of URLs' characters	- Not examining the response time	97.5
GB + RF (Indrasiri et al., 2021)	Traditional	Balanced dataset containing 75,000 with selecting 22 features (from 46)	- Hybrid classifier	- Two features dependent on third-party services - Computing time 170 seconds	98.27
PART (Barraclough et al., 2021)	Traditional	10,000 legitimate 20,500 malwares with 3,000 features	- High accuracy	- Many and unclear features	99.33
MLP (Deval et al., 2021)	DL	38,500 legitimate 40,000 phishing with 11 features (5 new)	- Presenting the first collaborative approach with the ability to add and remove features for the first time	- Dependent on English language (copyright feature) - Not examining the response time	95–97
RF (Gupta et al., 2021)	Traditional	20,000 instances from ISCX-URL2016 with 9 URL-based features	- High accuracy - short response time (51 milliseconds) - Limited features	- Not evaluating the robustness of the model with different datasets - Old dataset	99.57
RF (Gandorta and Gupta, 2020)	Traditional	2,500 phishing 2,700 legitimate With 20 features based on URL, source code, and page rank	- High accuracy - Comparison of six different algorithms	- Small dataset - Not evaluating the robustness of the model - Dependent on third party services	99.5
RF (Stobbs et al., 2020)	Traditional	20,000 legitimate 10,000 phishing with 27 features	- Using optimization algorithms for tuning parameter and feature selection - High accuracy	- Dependent on third party services - Not examining the response time	99.33
LSTM (Somesha et al., 2020)	DL	3,500 instances with 10 features: 3 URL-based, 6 HTML-based, and 1 page rank	- High accuracy	- Small dataset - Dependent on third party services (Alexa) - Not examining the response time	99.57
ensemble-based (Sameen et al., 2020)	Traditional	Balanced dataset containing 100,000 with lexical features based on URL and HTML	- Containing AI-generated phishing URLs in dataset - Detecting tiny URLs using DeepPhish - Calculating computational complexity	- Not examining the response time	98
RF (Sadique et al., 2020)	Traditional	Imbalanced data set containing 60,000 legitimate 38,000 phishing with 36 features	- Prioritizing feature extraction according to their computing time - Collecting legitimate websites from PhishTank	- Low accuracy - Dependent on third party services - No comparison with existing methods - Not explaining features	87
RF (Nagunwa et al., 2020)	Traditional	Imbalanced data set containing 9,000 phishing 1,700 legitimate with 20 features + black list + sensitive words (e.g., login)	- Detecting tiny URLs	- Dependent on third party services - 8.5 seconds per website	98.45
RNN (Feng & Yue, 2020)	DL	800,000 legitimate 760,000 phishing with 17 URL-based features	- High accuracy - Large dataset - Automatic feature extraction	- Using powerful processor (GPU)	99
Gradient Boost (Arora & Misra, 2020)	Traditional	5,500 legitimate 5,000 phishing with 11 URL-based features	- Accuracy obtained without 3 features dependent on third-party services: 98.42% - Response in 2 seconds	- Small dataset - Not evaluating the robustness of the model	99.93
CNN (Aljofey et al., 2020)	DL	158,000 phishing 161,000 legitimate based on character-level features	- Evaluating robustness of model (extraction of four different groups of features to compare test results on multiple sets) - Classification time 0.47 milliseconds	- Max URL length up to 200 characters	95.02
LSTM + CNN (Adebowale et al., 2020)	DL	Dataset containing one million URLs and 10,000 phishing pictures with 35 features based on text, frame, and images	- Large dataset - Combination of two networks - Using hybrid features	- Low accuracy - High response time (25 seconds) - Dependent on third-party services - Extract features based on source code only in JavaScript language	93.28
RNN + CNN (Wang et al., 2019)	DL	Balanced dataset containing 490,000 instances	- Large dataset - Evaluating the robustness of the model - Getting raw URLs as input - Combining two neural networks	- Low accuracy - Long response time (40 sec) - Max URL length up to 255 characters	95.79
ANFIS (Adebowale et al., 2019)	Fuzzy Logic	13,000 instances: 5,000 phishing 2,000 suspicious 6,000 legitimate with 35 features based on text, frame, and images	- High accuracy - Using hybrid features	- Dependent on third-party services - Extracting features based on source code only in JavaScript language	98.55
LR (Yan et al., 2019)	Traditional	Balanced dataset containing 800,000 instances	- Large dataset - Automatic feature extraction by Stacked Denoising Autoencoders network - Testing time for per URL 0.083 milliseconds	- Using powerful processor (Xeon with 16 cores + GPU) - Long training time (85 minutes)	98.25
RF (Sahingoz et al., 2019)	Traditional	Ebbu2017 dataset with 40 features	- Extracting 27 features using NLP	- Dependent on third-party services (Alexa) - Not examining the response time	97.98
RF (Jain and Gupta, 2018)	Traditional	2,100 phishing 1,900 legitimate with 19 features based on URL and source code	- Comparing five different algorithms - Response in 5.8 seconds	- Small dataset - Some features based on comparison with reputable websites	99.09

In this section, following the limitations observed by examining each approach, the main challenges in detecting phishing websites are described.

6.1. Dataset

The dataset is one of the main components of any intelligent approach, and its role is even more pronounced in phishing website classification, because phishers constantly try to improve their methods and design new websites that existing detection approaches cannot detect. As investigated in (Sanchez-Paniagua et al., 2022), older models tested on newer data no longer achieve their previous performance and become less accurate over time. For this reason, it is necessary to use the most up-to-date labeled websites when presenting detection approaches. Such approaches can also be designed with continuous and automatic retraining to maintain or even increase performance over time, for example by storing newly received samples. It is worth noting that, in practice, each user visits far fewer fake websites than legitimate ones. Therefore, it is suggested to use methods for balancing datasets or generating new samples, such as a Generative Adversarial Network (GAN).
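As a minimal sketch of dataset balancing, the function below applies random oversampling, which duplicates randomly chosen minority-class instances until all classes have the same size. This is a deliberately simple stand-in for the balancing methods mentioned above and is not a GAN; the function name and the fixed seed are assumptions.

```python
import random


def random_oversample(instances, labels, seed=0):
    """Duplicate minority-class instances until every class is equally large."""
    rng = random.Random(seed)
    by_label = {}
    for x, y in zip(instances, labels):
        by_label.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_label.values())
    out_x, out_y = [], []
    for y, xs in by_label.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy imbalanced data: 2 phishing and 8 legitimate instances.
instances = list(range(10))
labels = ["phishing"] * 2 + ["legitimate"] * 8
balanced_x, balanced_y = random_oversample(instances, labels)
```

Balancing should be applied to the training set only, so that duplicated samples do not leak into the testing set.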

6.2. Feature Selection

So far, researchers have presented and evaluated various features to detect phishing websites, and choosing the most effective ones is considered one of the most significant challenges in this field, because the type of features and the way they are extracted affect the performance and response time of the model. For this reason, approaches that use optimization algorithms in feature selection have achieved better results. The features that have been broadly used in recent approaches are URL-based (lexical and NLP-based), source code-based (such as HTML), dependent on third-party services (such as page rank, domain lifetime, DNS records, network traffic), and page content-based (text and images).

URL-based features can be extracted in two ways: manually, by extracting lexical features from the URL in a separate stage before classification, or automatically, with the help of neural networks and NLP methods. In general, approaches that use URL-based lexical features have a lower computational cost and a shorter response time. However, these types of features may not cover all the characteristics of phishing well, and phishers have a much easier task in bypassing them. The results of (Shirazi et al., 2019) show that phishers can bypass phishing detection mechanisms by manipulating up to four features. Therefore, it is impossible to define such features with certainty, and it is suggested to present more approaches based on fuzzy logic, since they can overcome ambiguity. In addition, all these approaches must be able to detect the original URL of websites, which may sometimes be hidden from the user by URL shortener services (tiny URLs). Such challenges can be seen in other types of features as well. For example, features based on text or source code that depend on a specific language are no longer effective against websites that are written in another language or that use images and embedded objects instead. In addition, features whose extraction depends on third-party services may lead to system instability; for example, approaches that extract the page rank feature from the Alexa website have lost their effectiveness since that website was retired.

In light of the above, feature selection for detecting phishing websites is an ongoing task: the feature set must be updated regularly, and new, efficient features must be discovered.
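
As a rough illustration of search-based feature selection, the sketch below uses greedy forward selection, a simple wrapper method. It stands in for the metaheuristic optimizers used in the reviewed papers and is not taken from any of them. The toy data, the scoring rule, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Toy data (hypothetical): three binary features per sample; label 1 = phishing.
X = [[1, 0, 1], [1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 1]]
y = [1, 1, 0, 0, 1, 0]


def accuracy_with(features: Sequence[int]) -> float:
    """Score a feature subset with a deliberately simple rule:
    predict phishing when any selected feature is set."""
    correct = 0
    for row, label in zip(X, y):
        predicted = 1 if any(row[f] for f in features) else 0
        correct += int(predicted == label)
    return correct / len(y)


def forward_select(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> Tuple[List[int], float]:
    """Add one feature at a time, keeping it only if the score improves."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        scored = [(score(selected + [f]), f) for f in candidates]
        # On ties, max() keeps the first candidate, i.e. the lowest index.
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining candidate improves the score
        selected.append(top_feature)
        best_score = top_score
    return selected, best_score


if __name__ == "__main__":
    print(forward_select(n_features=3, score=accuracy_with, max_features=3))
```

In practice the score would come from cross-validated performance of the actual classifier, and a greedy search of this kind can miss feature combinations that a global optimizer would find.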

6.3. Parameter Tuning

Another factor that influences the training and performance of intelligent classification models is the set of values assigned to the parameters of the algorithm, because the algorithm achieves its best possible results only with an optimal set of them. Since there is no specific, formulated method for finding these values, most researchers tune the model parameters manually, through repeated experiments and trial and error on different combinations of values. However, the few approaches that use optimization algorithms for parameter tuning, such as (Shaiba et al., 2022), (Alqahtani et al., 2022), and (Stobbs et al., 2020), show that this process can be facilitated and better results obtained.
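
The difference between manual trial and error and an automated search can be sketched with an exhaustive grid search, the simplest automated baseline. The cited papers use more sophisticated optimization algorithms, which this sketch does not reproduce. The validation data, the two parameters, and the threshold rule are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical validation set: ((url_length, num_dots), label), label 1 = phishing.
VALIDATION = [
    ((20, 1), 0), ((25, 3), 0), ((30, 1), 0),
    ((60, 2), 1), ((75, 3), 1), ((90, 5), 1),
]


def evaluate(params: Dict[str, int]) -> float:
    """Accuracy of a toy rule: flag a URL when both thresholds are met."""
    correct = 0
    for (length, dots), label in VALIDATION:
        flagged = length >= params["length_threshold"] and dots >= params["min_dots"]
        correct += int(int(flagged) == label)
    return correct / len(VALIDATION)


def grid_search(
    param_grid: Dict[str, List[int]],
    score: Callable[[Dict[str, int]], float],
) -> Tuple[Dict[str, int], float]:
    """Try every combination in the grid and keep the best-scoring one."""
    names = sorted(param_grid)
    best_params: Dict[str, int] = {}
    best_score = float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:  # strict comparison keeps the first best found
            best_params, best_score = params, current
    return best_params, best_score


if __name__ == "__main__":
    grid = {"length_threshold": [10, 50, 100], "min_dots": [1, 3, 6]}
    print(grid_search(grid, evaluate))
```

The cost of a grid search grows multiplicatively with the number of parameters and candidate values, which is one reason optimization algorithms are attractive for this step.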

6.4. Response Time

Phishing websites are often short-lived. Moreover, one of the main reasons users get trapped is inattention to security issues, so they may reveal the requested information shortly after visiting the website. Response time, the interval between submitting the URL of a website and receiving its classification, therefore plays a vital role in phishing detection systems. To be efficient and effective, an anti-phishing system should respond within a very short time, which is affected by various factors, including the time needed for feature extraction and for classification by the chosen algorithm. However, most papers have not considered or examined response time.
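
One way to report response time is to time the feature extraction and classification stages separately. The sketch below assumes a two-stage pipeline; `extract_features` and `classify` are hypothetical placeholders, not components of any reviewed system.

```python
import time
from typing import Callable, Dict, Tuple


def extract_features(url: str) -> Dict[str, object]:
    """Placeholder extractor (hypothetical): two cheap lexical features."""
    return {"url_length": len(url), "has_at_symbol": "@" in url}


def classify(features: Dict[str, object]) -> str:
    """Placeholder classifier (hypothetical): a fixed hand-written rule."""
    suspicious = features["has_at_symbol"] or features["url_length"] > 75
    return "phishing" if suspicious else "legitimate"


def timed_detect(
    url: str,
    extract: Callable[[str], Dict[str, object]],
    classify_fn: Callable[[Dict[str, object]], str],
) -> Tuple[str, Dict[str, float]]:
    """Return the predicted label and per-stage wall-clock times in seconds."""
    start = time.perf_counter()
    features = extract(url)
    after_extraction = time.perf_counter()
    label = classify_fn(features)
    end = time.perf_counter()
    timings = {
        "extraction_s": after_extraction - start,
        "classification_s": end - after_extraction,
        "total_s": end - start,
    }
    return label, timings


if __name__ == "__main__":
    print(timed_detect("http://example.com@evil.example/login",
                       extract_features, classify))
```

A single measurement is noisy, so in an evaluation the timings would normally be averaged over many URLs and would include any network lookups that the features require.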

7. Conclusion

This paper presented a systematic review of phishing website detection, focusing mainly on intelligent approaches, which have attracted the attention of many researchers. With the development of ML methods, these approaches have achieved many successes, including more than 98% accuracy in some cases. However, the problem still faces major challenges, because none of these approaches can prevent all vulnerabilities. The most common limitations are the use of small and old datasets; the lack of evaluation of model robustness, computational complexity, and response time; and the failure to use dataset balancing methods and optimization algorithms for feature selection and parameter setting. In addition, since attackers continually improve their attack strategies, future work needs to pay more attention to retraining and automatic parameter tuning in these approaches.

References

Abdul-Hussein RM, Mohammed AH, Kadhim AA (2022) ‘Detecting Phishing Cyber Attack Based on Fuzzy Rules and Differential Evaluation’, TEM Journal, ISSN: 2217–8309, Vol: 11, Page: 543–551
Abu-Nimeh S, Nappa D, Wang X, Nair S (2007) ‘A comparison of machine learning techniques for phishing detection’, In Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit, Page: 60–69
Aburrous MM, Hossain A, Thabatah F, Dahal K (2008) ‘Intelligent phishing website detection system using fuzzy techniques’, In 2008 3rd International Conference on Information and Communication Technologies: From Theory to Applications, IEEE, Page:1–6
Abuzuraiq A, Alkasassbeh M, Almseidin M (2020) ‘Intelligent methods for accurately detecting phishing websites’, In 2020 11th International Conference on Information and Communication Systems (ICICS), IEEE, Page:085–090
Adebowale MA, Lwin KT, Sanchez E, Hossain MA (2019) ‘Intelligent web-phishing detection and protection scheme using integrated features of Images, frames and text’, Expert Systems with Applications, Vol:115, Page: 300–313
Adebowale MA, Lwin KT, Hossain MA (2020) ‘Intelligent phishing detection scheme using deep learning algorithms’. Journal of Enterprise Information Management
Adewole KS, Akintola AG, Salihu SA, Faruk N, Jimoh RG (2019) ‘Hybrid rule-based model for phishing URLs detection’, In International Conference for Emerging Technologies in Computing, Page: 119–135
Alabdan R (2020) Phishing Attacks Survey: Types, Vectors, and Technical Approaches. Future Internet, Vol: 12(10), Page: 1–37
Al-Ahmadi S, Alotaibi A, Alsaleh O (2022) ‘PDGAN: Phishing Detection with Generative Adversarial Networks’, IEEE Access, Vol: 10, Page: 42459–42468
Ali W (2017) Phishing website detection based on supervised machine learning with wrapper features selection. International Journal of Advanced Computer Science and Applications, Vol:8(9), Page: 72–78
Aljofey A, Jiang Q, Qu Q, Huang M, Niyigena JP (2020) ‘An Effective Phishing Detection Model Based on Character Level Convolutional Neural Network from URL’, Electronics, Vol: 9(9), 1514
Almuhaideb AM, Aslam N, Alabdullatif A, Altamimi S, Alothman S, Alhussain A, Alissa KA (2022) ‘Homoglyph Attack Detection Model Using Machine Learning and Hash Function’, Journal of Sensor and Actuator Networks, Vol: 11(3), 54
Alotaibi B, Alotaibi M (2021) ‘Consensus and majority vote feature selection methods and a detection technique for web phishing’, Journal of Ambient Intelligence and Humanized Computing, Vol: 12(1), Page: 717–727
Alqahtani H, Alotaibi SS, Alrayes FS, Al-Turaiki I, Alissa KA, Aziz ASA, Al Duhayyim M (2022) ‘Evolutionary Algorithm with Deep Auto Encoder Network Based Website Phishing Detection and Classification’, Applied Sciences-Basel, Vol: 12(15). doi:10.3390/app12157441
Alsaedi M, Ghaleb FA, Saeed F, Ahmad J, Alasli M (2022) ‘Cyber Threat Intelligence-Based Malicious URL Detection Model Using Ensemble Learning’, Sensors, Vol: 22(9). doi:10.3390/s22093373
Alsariera YA, Adeyemo VE, Balogun AO, Alazzawi AK (2020) ‘Ai meta-learners and extra-trees algorithm for the detection of phishing websites’, IEEE Access, Vol: 8, Page: 142532–142542
Alsharnouby M, Alaca F, Chiasson S (2015) ‘Why phishing still works: User strategies for combating phishing attacks’, International Journal of Human-Computer Studies, ISSN: 1071–5819, Vol: 82, Page: 68–82
Anupam S, Kar AK (2021) ‘Phishing website detection using support vector machines and nature-inspired optimization algorithms’, Telecommunication Systems, Vol: 76(1), Page: 17–32
APWG (2022) Phishing Activity Trends Report. 1st quarter 2019 to 3rd quarter 2022. https://apwg.org/trendsreports/ (Accessed 4 Dec. 2022)
Archana Janani K, Vetriselvi V, Parthasarathi R, Rao SV (2019) ‘An Approach to URL Filtering in SDN’, In International Conference on Computer Networks and Communication Technologies, Page: 217–228
Ariyadasa S, Fernando S, Fernando S (2022) ‘Combining Long-Term Recurrent Convolutional and Graph Convolutional Networks to Detect Phishing Sites Using URL and HTML’, IEEE Access, Vol: 10, Page: 82355–82375. doi:10.1109/ACCESS.2022.3196018
Arora V, Misra M (2020) ‘A Novel Machine Learning Methodology for Detecting Phishing Attacks in Real Time’, LNCS, Vol. 12386, Page: 39–54
Avanan (2021) ‘1H Cyber Attack Report’, https://www.avanan.com/resources/white-papers/1h-cyber-attack-report (Accessed 26 Aug. 2022)
Bahnsen AC, Bohorquez EC, Villegas S, Vargas J, González FA (2017) ‘Classifying phishing URLs using recurrent neural networks’, In 2017 APWG symposium on electronic crime research (eCrime), IEEE, Page:1–8
Balogun AO, Akande NO, Usman-Hamza FE, Adeyemo VE, Mabayoje MA, Ameen AO (2021) ‘Rotation Forest-Based Logistic Model Tree for Website Phishing Detection’, In International Conference on Computational Science and Its Applications, Page: 154–169
Barraclough PA, Fehringer G, Woodward J (2021) ‘Intelligent cyber-phishing detection for online’, Computers & Security, Vol: 104. doi:10.1016/j.cose.2020.1
Basit A, Zafar M, Liu X, Javed AR, Jalil Z, Kifayat K (2021) ‘A comprehensive survey of AI-enabled phishing attacks detection techniques’, Telecommunication Systems, Vol: 76(1), Page: 139–154
Buber E, Demir O, Sahingoz O (2017) ‘Feature selections for the machine learning based detection of phishing websites’, International Artificial Intelligence and Data Processing Symposium (IDAP), Page: 1–5
Bustio-Martinez L, Alvarez-Carmona MA, Herrera-Semenets V, Feregrino-Uribe C, Cumplido R (2022) ‘A lightweight data representation for phishing URLs detection in IoT environments’, Information Sciences, Vol: 603, Page: 42–59. doi:10.1016/j.ins.2022.04.059
Cao Y, Han W, Le Y (2008) ‘Anti-phishing based on automated individual white-list’, In Proceedings of the 4th ACM workshop on Digital identity management, Page: 51–60
Catal C, Giray G, Tekinerdogan B, Kumar S, Shukla S (2022) ‘Applications of deep learning for phishing detection: a systematic literature review’, Knowledge and Information Systems, Vol: 64, Page: 1457–1500. doi:10.1007/s10115-022-01672-x
Chandrashekar G, Sahin F (2014) ‘A survey on feature selection methods’, Computers & Electrical Engineering, Vol:40(1), Page:16–28
Chen YH, Chen JL (2019) ‘Machine Learning Mechanisms for Cyber-Phishing Attack’. IEICE TRANS. INF. & SYST
Chen KT, Chen JY, Huang CR, Chen CS (2009) ‘Fighting phishing with discriminative keypoint features’, IEEE Internet Computing, Vol: 13(3), Page: 56–63
Chiew KL, Yong KSC, Tan CL (2018) ‘A survey of phishing attacks: Their types, vectors and technical approaches’, Expert Systems with Applications, ISSN: 0957–4174, Vol: 106, Page: 1–20
Chiew KL, Tan CL, Wong K, Yong KS, Tiong WK (2019) ‘A new hybrid ensemble feature selection framework for machine learning-based phishing detection system’, Information Sciences, Vol: 484, Page: 153–166
Chou N, Ledesma R, Teraguchi Y, Boneh D, Mitchell JC (2004) ‘Client-side defense against web-based identity theft’. in NDSS, The Internet Society
Cranor L, Egelman S, Hong J, Zhang Y (2007) ‘Phinding Phish: Evaluating Anti-Phishing Tools’, In Proceedings of The 14th Annual Network and Distributed System Security Symposium (NDSS '07)
Das Guptta S, Shahriar KT, Alqahtani H, Alsalman D, Sarker IH (2022) ‘Modeling Hybrid Feature-Based Phishing Websites Detection Using Machine Learning Techniques’, Annals of Data Science. doi:10.1007/s40745-022-00379-8
De Souza CHM, Lemos MOO, Silva FSD, Alves RLS (2020) ‘On detecting and mitigating phishing attacks through featureless machine learning techniques’, Internet Technology Letters, Vol: 3(1). doi:10.1002/itl2.135
Deval SK, Tripathi M, Bezawada B, Ray I (2021) ‘X-Phish: Days of Future Past. Adaptive Privacy Preserving Phishing Detection’
Dhamija R, Tygar JD, Hearst M (2006) ‘Why Phishing Works’, In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Page: 581–590. New York, NY, USA: ACM
Ding Y, Luktarhan N, Li K, Slamu W (2019) ‘A keyword-based combination approach for detecting phishing webpages’, computers & security, Vol: 84, Page: 256–275
Do NQ, Selamat A, Krejcar O, Herrera-Viedma E, Fujita H (2022) ‘Deep Learning for Phishing Detection. Taxonomy, Current Challenges and Future Directions’, IEEE Access
Do NQ, Selamat A, Krejcar O, Yokoi T, Fujita H (2021) ‘Phishing webpage classification via deep learning-based algorithms: an empirical study’, Applied Sciences, Vol: 11(19)
Felten EW, Schneider MA (2000) ‘Timing attacks on web privacy’, In Proceedings of the 7th ACM conference on computer and communication security, Page: 25–32. New York, NY, USA: ACM
Feng T, Yue C (2020) ‘Visualizing and interpreting RNN Models in URL-based phishing detection’
Gandotra E, Gupta D (2020) ‘Improving Spoofed Website Detection Using Machine Learning’, Cybernetics and Systems
Gupta BB, Arachchilage NAG, Psannis KE (2018) ‘Defending against phishing attacks: taxonomy of methods, current issues and future directions’, Telecommunication Systems, ISSN: 1018–4864, Vol: 67(2), Page: 247–267
Gupta BB, Yadav K, Razzak I, Psannis K, Castiglione A, Chang W (2021) ‘A novel approach for phishing URLs detection using lexical based machine learning in a real-time environment’, Computer Communications, Vol: 175, Page: 47–57
He S, Li B, Peng H, Xin J, Zhang E (2021) ‘An effective cost-sensitive XGBoost method for malicious URLs detection in imbalanced dataset’. IEEE Access, Vol: 9, Page: 93089–93096
Hevapathige A, Rathnayake K (2022) ‘Super Learner for Malicious URL Detection’
HR MG, MV A (2020) ‘Development of anti-phishing browser based on random forest and rule of extraction framework’, Cybersecurity, Vol: 3(1), Page: 1–14
Huang Y, Yang Q, Qin J, Wen W (2019) ‘Phishing URL detection via CNN and attention-based hierarchical RNN’, In 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), Page: 112–119
HTTrack (2017) ‘Website Copier - Free Software Offline Browser’, https://www.httrack.com/ (Accessed 26 Aug. 2022)
IC3 (2021) Federal Bureau of Investigation. https://www.ic3.gov/Media/PDF/AnnualReport/2020_IC3Report.pdf (Accessed 10 Nov. 2022)
Indrasiri PL, Halgamuge MN, Mohammad A (2021) ‘Robust Ensemble Machine Learning Model for Filtering Phishing URLs: Expandable Random Gradient Stacked Voting Classifier (ERG-SVC)’, IEEE Access, Vol: 9, Page: 150142–150161
Jain AK, Gupta BB (2021) ‘A survey of phishing attack techniques, defense mechanisms and open research challenges’, Enterprise Information Systems, Page: 1–39
Jain AK, Gupta BB (2018) ‘Towards detection of phishing websites on client-side using machine learning based approach’, Telecommunication Systems, Vol:68(4), Page:687–700
James L (2005) Phishing Exposed. Published by Syngress
Kaur G, Oberai EN (2014) ‘A review article on Naive Bayes classifier with various smoothing techniques’, International Journal of Computer Science and Mobile Computing, Vol: 3(10), Page: 864–868
Kalaharsha P, Mehtrea BM (2021) ‘Detecting Phishing Sites - An Overview’, arXiv 2021, arXiv:2103.12739
Kitchenham B, Brereton P (2013) ‘A systematic review of systematic review process research in software engineering’, Information and software technology, Vol: 55(12), Page: 2049–2075
Khonji M, Iraqi Y, Jones A (2013) ‘Phishing detection: a literature survey’, IEEE Communications Surveys & Tutorials, Vol: 15(4), Page: 2091–2121
Lakshmanarao A, Babu MR, Krishna MMB (2021) ‘Malicious URL Detection using NLP, Machine Learning and FLASK’
Lipton ZC, Berkowitz J, Elkan C (2015) ‘A critical review of recurrent neural networks for sequence learning’, arXiv preprint arXiv:1506.00019.
Loxdal J, Andersson M, Hacks S, Lagerström R (2021) ‘Why Phishing Works on Smartphones: A Preliminary Study’, In Proceedings of the 54th Hawaii International Conference on System Sciences, ISSN: 1530–1605
Mahdavifar S, Ghorbani AA (2020) ‘DeNNeS: deep embedded neural network expert system for detecting cyberattacks’, Neural Computing and Applications, Vol: 32(18), Page: 14753–14780
Mamun MSI, Rathore MA, Lashkari AH, Stakhanova N, Ghorbani AA (2016) ‘Detecting Malicious URLs Using Lexical Analysis’, Network and System Security. Springer International Publishing, Page: 467–482
Marimuthu SK, Gopalasamy KS, Ben-Othman J (2022) ‘Intelligent antiphishing framework to detect phishing scam: A hybrid classification approach’, Software - Practice and Experience, Vol: 52(2), Page: 459–481. doi:10.1002/spe.3031
Minocha S, Singh B (2022) ‘A novel phishing detection system using binary modified equilibrium optimizer for feature selection’, Computers & Electrical Engineering, Vol: 98, 107689. doi:10.1016/j.compeleceng.2022.107689
Mohammad RM, Thabtah F, McCluskey L (2015a) ‘Tutorial and critical analysis of phishing websites methods’, Computer Science Review, Vol: 17, Page: 1–24
Mohammad RM, Thabtah F, McCluskey L (2015b) ‘Phishing Websites Dataset’, UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets/phishing+websites (Accessed 6 Aug. 2022)
Montazer GA, ArabYarmohammadi S (2015) ‘Detection of phishing attacks in Iranian e-banking using a fuzzy–rough hybrid system’, Applied Soft Computing, Vol: 35, Page: 482–492
Mourtaji Y, Bouhorma M, Alghazzawi D, Aldabbagh G, Alghamdi A (2021) ‘Hybrid Rule-Based Solution for Phishing URL Detection Using Convolutional Neural Network’. Wireless Communications & Mobile Computing. doi:10.1155/2021/8241104
Nagunwa T (2014) Behind Identity Theft and Fraud in Cyberspace: The Current Landscape of Phishing Vectors. International Journal of Cyber-Security and Digital Forensics (IJCSDF), Vol: 3, Page: 72–83
Nagunwa T, Naqvi S, Fouad S, Shah H (2020) ‘A Framework of New Hybrid Features for Intelligent Detection of Zero Hour Phishing Websites’, Vol: 951, Page: 36–46
Oest A, Zhang P, Wardman B, Nunes E, Burgis J, Zand A, Thomas K, Doupé A, Ahn GJ (2020) ‘Sunrise to Sunset: Analyzing the End-to-end Life Cycle and Effectiveness of Phishing Attacks at Scale’, In Proceedings of the 29th USENIX Security Symposium, Page: 361–377
Ollmann G (2004) ‘The Phishing Guide - Understanding & Preventing Phishing Attacks’. IBM Internet Security Systems
Orunsolu AA, Sodiya AS, Akinwale AT (2022) ‘A predictive model for phishing detection’, Journal Of King Saud University-Computer And Information Sciences, Vol: 34(2), Page: 232–247. doi:10.1016/j.jksuci.2019.12.005
Ozcan A, Catal C, Donmez E, Senturk B (2021) A hybrid DNN–LSTM model for detecting phishing URLs. Neural Comput Appl. doi:10.1007/s00521-021-06401-z
Oxford Dictionaries (1990) http://www.oxforddictionaries.com/definition/english/phishing (Accessed 2 Mar. 2022)
Patil DR, Patil JB (2006) ‘Survey on Malicious Web Pages Detection Techniques’, International Journal of u- and e- Service, Science and Technology, Vol: 8, Page: 195–206
Pham C, Nguyen LA, Tran NH, Huh EN, Hong CS (2018) ‘Phishing-aware: A neuro-fuzzy approach for anti-phishing on fog networks’, IEEE Transactions on Network and Service Management, Vol: 15(3), Page: 1076–1089
Prakash P, Kumar M, Kompella RR, Gupta M (2010) ‘Phishnet: predictive blacklisting to detect phishing attacks’, In 2010 Proceedings IEEE INFOCOM, IEEE, Page: 1–5
Rao RS, Pais AR (2019) ‘Detection of phishing websites using an efficient feature-based machine learning framework’, Neural Computing and Applications, Vol: 31(8), Page: 3851–3873
Ramadhan BF (2017) Kali Linux: Social Engineering Toolkit. https://linuxhint.com/kali-linux-set/ (Accessed 26 Aug. 2022)
Rameem Zahra S, Ahsan Chishti M, Iqbal Baba A, Wu F (2022) ‘Detecting Covid-19 chaos driven phishing/malicious URL attacks by a fuzzy logic and data mining based intelligence system’, Egyptian Informatics Journal, Vol: 23(2), Page: 197–214. doi:10.1016/j.eij.2021.12.003
Rao RS, Umarekar A, Pais AR (2022) ‘Application of word embedding and machine learning in detecting phishing websites’, Telecommunication Systems, Vol: 79(1), Page: 33–45. doi:10.1007/s11235-021-00850-6
Ray KS, Kusshwaha R (2021) ‘Detection of Malicious URLs Using Deep Learning Approach’, Lecture Notes in Networks and Systems, Vol: 163, Page: 189–212
Saravanan P, Subramanian S (2020) ‘A framework for detecting phishing websites using GA based feature selection and ARTMAP based website classification’, Procedia Computer Science, Vol: 171, Page: 1083–1092
Sadique F, Kaul R, Badsha S, Sengupta S (2020) ‘An Automated Framework for Real-time Phishing URL Detection’, Paper presented at the 2020 10th Annual Computing and Communication Workshop and Conference (CCWC)
Sahingoz OK, Buber E, Demir O, Diri B (2019) ‘Machine learning based phishing detection from URLs’, Expert Systems with Applications, Vol: 117, Page: 345–357
Sameen M, Han K, Hwang SO (2020) ‘PhishHaven - An Efficient Real-Time AI Phishing URLs Detection System’, IEEE Access, Vol: 8, Page: 83425–83443. doi:10.1109/ACCESS.2020.2991403
Sanchez-Paniagua M, Fernandez EF, Alegre E, Al-Nabki W, Gonzalez-Castro V (2022) ‘Phishing URL Detection: A Real-Case Scenario Through Login URLs’, IEEE Access, Vol: 10, Page: 42949–42960. doi:10.1109/ACCESS.2022.3168681
Sanchez-Paniagua M, Fidalgo E, Alegre E, Alaiz-Rodriguez R (2022) ‘Phishing websites detection using a novel multipurpose dataset and web technologies features’, Expert Systems with Applications, Vol: 207. doi:10.1016/j.eswa.2022.118010
Shahrivar S, Elahi S, Hassanzadeh A, Montazer G (2018) ‘A business model for commercial open source software: A systematic literature review’, Information and software technology, Vol: 103, Page: 202–214
Shaiba H, Alzahrani JS, Eltahir MM, Marzouk R, Mohsen H, Hamza MA (2022) ‘Hunger Search Optimization with Hybrid Deep Learning Enabled Phishing Detection and Classification Model’, Computers Materials & Continua, Vol: 73(3), Page: 6425–6441. doi:10.32604/cmc.2022.031625
Shirazi H, Bezawada B, Ray I (2018) ‘"Kn0w Thy Doma1n Name": Unbiased Phishing Detection Using Domain Name Based Features’, In Proceedings of the 23rd ACM on symposium on access control models and technologies, Page: 69–75
Shirazi H, Bezawada B, Ray I, Anderson C (2019) ‘Adversarial sampling attacks against phishing detection’, LNCS, Vol: 11559, Page: 83–101
Shaikh A, Shabut A, Hossain A (2016) ‘A literature review on phishing crime, prevention review and investigation of gaps’, In 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA), Page: 9–15
Sharma MK (2016) Neuro-Fuzzy Systems: A Hybrid Intelligent Approach. International Journal OF Engineering Sciences & Management Research
Sheng S, Wardman B, Warner G, Cranor L, Hong J, Zhang C (2009) ‘An empirical analysis of phishing blacklists’.
Sheng S, Holbrook M, Kumaraguru P, Cranor LF, Downs J (2010) ‘Who falls for phish? A demographic analysis of phishing susceptibility and effectiveness of interventions’, In Proceedings of the SIGCHI conference on human factors in computing systems, Page: 373–382
Sonowal G, Kuppusamy KS (2018) ‘MMSPhiD: a phoneme based phishing verification model for persons with visual impairments’, Information & Computer Security, Vol: 26(5), Page: 613–636. doi:10.1108/ICS-12-2017-0091
Somesha M, Pais AR, Rao RS, Rathour VS (2020) ‘Efficient deep learning techniques for the detection of phishing websites’, Sadhana-Academy Proceedings In Engineering Sciences, Vol: 45(1). doi:10.1007/s12046-020-01392-4
Statista Research Department (2022) Number of internet users worldwide from 2005 to 2021. https://www.statista.com/statistics/273018/number-of-internet-users-worldwide/ (Accessed 26 Nov. 2022)
Stobbs J, Issac B, Jacob SM (2021) ‘Phishing Web Page Detection Using Optimised Machine Learning’, Paper presented at the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)
Su Y (2020) ‘Research on website phishing detection based on LSTM RNN’, In 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), IEEE, Vol: 1, Page: 284–288
Tan CL (2018) Phishing Dataset for Machine Learning: Feature Evaluation. Mendeley. https://data.mendeley.com/datasets/h3cgnj8hft/1 (Accessed 5 Nov. 2022)
Tang L, Mahmoud QH (2021) ‘A Survey of Machine Learning-Based Solutions for Phishing Website Detection’, Machine Learning and Knowledge Extraction, Vol: 3, Page: 672–694
Tang L, Mahmoud QH (2021) ‘A Deep Learning-Based Framework for Phishing Website Detection’, IEEE Access, Vol:10, Page: 1509–1521
Van Dooremaal B, Burda P, Allodi L, Zannone N (2021) ‘Combining text and visual features to improve the identification of cloned webpages for early phishing detection’, In The 16th International Conference on Availability, Reliability and Security, Page: 1–10
Vinayakumar R, Soman KP, Poornachandran P (2018) ‘Evaluating deep learning approaches to characterize and classify malicious URLs’, Journal of Intelligent & Fuzzy Systems, Vol: 34(3), Page: 1333–1343
Vrbančič G, Fister Jr I, Podgorelec V (2018) ‘Swarm intelligence approaches for parameter setting of deep learning neural network: case study on phishing websites classification’, In Proceedings of the 8th international conference on web intelligence, mining and semantics, Page: 1–8
Wang W, Zhang F, Luo X, Zhang S (2019) ‘Pdrcnn: precise phishing detection with recurrent convolutional neural networks’. Security and Communication Networks
Wang C, Chen Y (2022) TCURL: Exploring hybrid transformer and convolutional neural network on phishing URL detection. Knowledge-Based Syst Vol: 258. doi:10.1016/j.knosys.2022.109955
Wei Y, Sekiya Y (2022) ‘Feature Selection Approach for Phishing Detection Based on Machine Learning’, Proceedings of the International Conference on Applied CyberSecurity (ACS) 2021. ACS 2021. Lecture Notes in Networks and Systems, Vol: 378, Page: 61–70
Wu M, Miller RC, Garfinkel SL (2006) ‘Do security toolbars actually prevent phishing attacks?’, In Proceedings of the SIGCHI conference on Human Factors in computing systems, Page: 601–610
Yan H, Zhang X, Xie J, Hu C (2019) ‘Detecting malicious urls using a deep learning approach based on stacked denoising autoencoder’, Vol: 960, Page: 372–388
Yang P, Zhao G, Zeng P (2019) ‘Phishing website detection based on multidimensional features driven by deep learning’ IEEE access, Vol: 7, Page: 15196–15209
Yazhmozhi VM, Janet B (2019) ‘Natural language processing and Machine learning based phishing website detection system’, In 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC), Page: 336–340, IEEE
Yerima SY, Alzaylaee MK (2020) ‘High accuracy phishing detection based on convolutional neural networks’, In 2020 3rd International Conference on Computer Applications & Information Security (ICCAIS), IEEE, Page: 1–6
Yu X (2020) Phishing websites detection based on hybrid model of deep belief network and support vector machine. In IOP Conference Series: Earth and Environmental Science, Vol: 602(1), IOP Publishing
Yu S, An C, Yu T, Zhao Z, Li T, Wang J (2022) ‘Phishing Detection Based on Multi-Feature Neural Network’, Paper presented at the 2022 IEEE International Performance, Computing, and Communications Conference (IPCCC)
Yu Y, Si X, Hu C, Zhang J (2019) ‘A review of recurrent neural networks: LSTM cells and network architectures’, Neural Computation, Vol:31(7), Page:1235–1270
Zadeh LA (1988) Fuzzy logic. Computer, Vol: 21(4), Page: 83–93
Zhang Y, Hong J, Cranor L (2007) ‘Cantina: a content-based approach to detecting phishing web sites’, In Proceedings of the 16th International Conference on World Wide Web, Page: 639–648
Zheng F, Yan Q, Leung VCM, Yu FR, Ming Z (2022) ‘HDP-CNN: Highway deep pyramid convolution neural network combining word-level and character-level representations for phishing website detection’, Computers & Security, Vol: 114. doi:10.1016/j.cose.2021.102584
Zhu E, Chen Z, Cui J, Zhong H (2022) MOE/RF: A Novel Phishing Detection Model based on Revised Multi-Objective Evolution Optimization Algorithm and Random Forest. IEEE Trans Netw Serv Manage. doi:10.1109/TNSM.2022.3162885
Zhu E, Chen Y, Ye C, Li X, Liu F (2019) ‘OFS-NN: an effective phishing websites detection model based on optimal feature selection and neural network’, IEEE Access, Vol: 7, Page: 73271–73284
Zuraiq AA, Alkasassbeh M (2019) ‘Phishing detection approaches’, In 2019 2nd International Conference on new Trends in Computing Sciences (ICTCS), Page: 1–6. IEEE

Table 4 is available in the Supplementary Files section.

No competing interests reported.

Appendix.docx
