To evaluate the dataset, several traditional machine learning and neural network models were developed. The experimental settings and details of these models are explained in this section.
4.1 Machine Learning Classifiers
We assessed the effectiveness of seven traditional learning classifiers, namely Decision Tree (DT), Passive Aggressive (PA), Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), AdaBoost (AB) and Naive Bayes (NB) [22][23]. The details of these approaches are given below.
1) Decision Tree Classifier
It is a well-known machine learning classifier that follows a tree-like structure to perform detection. We used a decision tree algorithm trained on a labeled fake and real news dataset, where the features and their corresponding labels are used to build the tree. Decision tree classifiers have the advantage of being easy to interpret and explain, which can be important in applications such as fake news detection where transparency and accountability matter [24].
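The following is a minimal sketch of this setup, assuming TF-IDF features and scikit-learn's DecisionTreeClassifier; the toy articles, label convention (1 = fake, 0 = real) and hyper-parameters are illustrative assumptions rather than the paper's exact configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier

# Toy labeled articles; 1 = fake, 0 = real (assumed label convention).
texts = ["breaking: miracle cure found overnight",
         "parliament passes new budget bill",
         "celebrity secretly replaced by clone",
         "city council approves road repairs"]
labels = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()                  # turn raw text into numeric features
X = vectorizer.fit_transform(texts)

clf = DecisionTreeClassifier(random_state=42)   # tree built from features and labels
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["miracle budget clone"])))
```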
2) Naïve Bayes Classifier
It is a commonly used algorithm for text classification tasks. For misinformation identification, the Naive Bayes classifier works by analyzing the text of news articles and determining the probability that a given article is authentic or false based on the occurrence of certain words or phrases [25].
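A small sketch of this idea is given below, assuming word-count features and scikit-learn's MultinomialNB; the toy articles and label convention are illustrative only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_texts = ["shocking secret they hide from you",
               "official report released today",
               "you won't believe this one trick",
               "council approves annual budget"]
train_labels = [1, 0, 1, 0]   # 1 = fake, 0 = real (assumed convention)

vec = CountVectorizer()
X = vec.fit_transform(train_texts)            # word-occurrence counts as features
nb = MultinomialNB().fit(X, train_labels)

# Per-class probabilities for a new article, derived from word occurrences
print(nb.predict_proba(vec.transform(["shocking official trick"])))
```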
3) Logistic Regression
Logistic regression is a supervised learning algorithm used for binary classification problems. This classifier works by modeling the likelihood of the output variable (e.g., fake or real) given the input features. The output of the classifier is a probability score between 0 and 1, with scores near 0 suggesting a high likelihood of the news article being real and scores near 1 suggesting a high likelihood of the news article being fake.
4) Random Forest
A Random Forest [24] is a supervised learning classifier used for both classification and regression tasks. In simple terms, a Random Forest classifier works by creating multiple decision trees, each trained on a different random news sample, and then merging the outcomes of these trees to produce the final output. This technique increases the model's accuracy while reducing the effects of overfitting.
5) SVM Classifier
Finding a hyperplane that most effectively separates the data points into different groups is the basic goal of the Support Vector Machine (SVM). The hyperplane is selected so that the margin, which is the distance between the hyperplane and the closest data points, is maximized. By applying a kernel function to map the samples into a higher-dimensional space, SVM is capable of handling both linear and non-linear classification problems.
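As a brief illustration of the kernel choice (not the paper's exact configuration), the sketch below fits scikit-learn's SVC with a linear and an RBF kernel on synthetic data standing in for vectorized articles.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic features standing in for vectorized news articles
X, y = make_classification(n_samples=60, n_features=8, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)   # maximum-margin linear hyperplane
rbf_svm = SVC(kernel="rbf").fit(X, y)         # kernel maps samples to a higher-dimensional space

print(linear_svm.score(X, y), rbf_svm.score(X, y))
```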
6) Passive Aggressive Classifier
A Passive Aggressive (PA) classifier can be used for fake news identification by training the classifier on a labeled news article dataset, in which each article is categorized as either real or fake. The algorithm then learns to differentiate between real and fake news based on attributes such as the language used, the sources cited, and the structure of the article. During training, the algorithm uses a passive aggressive update strategy to adjust its weights and biases to better classify each article.
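A minimal sketch of this online update strategy is shown below, assuming hashed text features and scikit-learn's PassiveAggressiveClassifier; the batches and label convention are placeholders.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier

vec = HashingVectorizer(n_features=2**10)
pa = PassiveAggressiveClassifier(random_state=42)

# Mini-batches of labeled articles; 1 = fake, 0 = real (assumed convention)
batches = [(["unverified claim goes viral", "minister opens new hospital"], [1, 0]),
           (["aliens endorse local candidate", "court publishes final ruling"], [1, 0])]

for texts, labels in batches:
    # partial_fit applies the passive-aggressive update on each batch:
    # no change for correct predictions, an aggressive weight correction on mistakes
    pa.partial_fit(vec.transform(texts), labels, classes=[0, 1])

print(pa.predict(vec.transform(["unverified viral claim"])))
```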
7) AdaBoost
AdaBoost is an ensemble learning algorithm that merges various base classifiers to build a meta-classifier. It gives higher weights to the examples that were misclassified by the previous weak classifiers, so it focuses on the examples that are difficult to classify.
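The short sketch below illustrates this boosting behaviour with scikit-learn's AdaBoostClassifier on synthetic data; the data and number of estimators are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic features standing in for vectorized articles
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# The default weak learner is a one-level decision tree (a "decision stump");
# each boosting round up-weights the examples earlier stumps misclassified.
ab = AdaBoostClassifier(n_estimators=50, random_state=0)
ab.fit(X, y)
print(ab.score(X, y))
```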
4.2 Deep Learning Classifiers
A few neural networks, namely the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Bi-LSTM, are implemented in this research. The models are built using the Adam optimizer with a binary cross-entropy loss function and a learning rate of 0.001. The final output layer uses the sigmoid activation function. These neural networks are trained for 10 epochs with a batch size of 64.
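The following Keras sketch reflects this shared training configuration (Adam with learning rate 0.001, binary cross-entropy, sigmoid output, 10 epochs, batch size 64); the vocabulary size, sequence length, layer sizes, and placeholder data are assumptions made only to keep the example self-contained.

```python
import numpy as np
from tensorflow.keras import layers, models, optimizers

vocab_size, max_len = 5000, 100   # assumed vocabulary size and sequence length

model = models.Sequential([
    layers.Embedding(vocab_size, 64),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),       # sigmoid output layer
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.001),   # Adam, lr = 0.001
              loss="binary_crossentropy",                       # binary cross-entropy loss
              metrics=["accuracy"])

X = np.random.randint(0, vocab_size, size=(256, max_len))   # placeholder token ids
y = np.random.randint(0, 2, size=(256,))                    # placeholder labels
model.fit(X, y, epochs=10, batch_size=64, verbose=0)        # 10 epochs, batch size 64
```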
1) RNN
A recurrent neural network (RNN) is a type of neural network designed to handle sequential data such as time series or natural language. Unlike feedforward neural networks, which process each input independently, RNNs maintain an internal state that allows them to process sequences of inputs and capture temporal dependencies.
Layers used in the RNN model are listed below; a sketch follows the list.
- Embedding Layer: Pass each token through an embedding layer to convert it into a dense vector representation. This can be a pre-trained embedding or an embedding layer that learns the embeddings during training.
- Recurrent Layer: Pass the sequence of embeddings through a recurrent layer (e.g., LSTM, GRU) that processes the sequence and captures the interrelationships between words.
- Attention Layer: Add an attention mechanism on top of the recurrent layer to highlight important parts of the input sequence. This can strengthen the model's ability to recognize the cues that distinguish false from authentic news.
- Dense Layers: Pass the final hidden state of the recurrent layer through one or more dense layers to make a prediction. The output can be a binary classification (fake or real) or a probability score indicating the likelihood of the input being fake news.
- Training: Train the neural network on a labeled news dataset, optimizing a suitable loss function such as binary cross-entropy.
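A hedged Keras sketch of this layer stack (embedding, recurrent layer, attention, dense output) is shown below; the use of GRU for the recurrent layer, the simple self-attention formulation, and all layer sizes are assumptions, as the paper does not fix these choices.

```python
from tensorflow.keras import layers, models

vocab_size, max_len = 5000, 100   # assumed sizes

inputs = layers.Input(shape=(max_len,))
x = layers.Embedding(vocab_size, 64)(inputs)         # embedding layer: dense token vectors
x = layers.GRU(64, return_sequences=True)(x)         # recurrent layer over the sequence
x = layers.Attention()([x, x])                       # simple self-attention over hidden states
x = layers.GlobalAveragePooling1D()(x)               # pool the attended sequence
outputs = layers.Dense(1, activation="sigmoid")(x)   # dense layer: fake/real probability

rnn_model = models.Model(inputs, outputs)
rnn_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
rnn_model.summary()
```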
2) LSTM
The LSTM architecture was created to tackle the issue of vanishing gradients in traditional RNNs, which can make it difficult to train these networks on long sequences of data. The key innovation of the LSTM is the addition of memory cells, which can store information over long periods of time and selectively forget or remember information as needed.
Our strategy is to train an LSTM model on the developed dataset of news articles, both real and fake, and to use this model to classify new articles as real or fake. The LSTM model takes in the text of the article and produces a probability score indicating the likelihood that the article is authentic or not.
The dropout unit receives the output of the embedding layer and performs calculations as shown in Fig. 11. The sigmoid activation function was assigned to the main output layer. The model was trained for 10 epochs with a batch size of 64. This model's final (average) accuracy is 91.8%.
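A minimal sketch consistent with this description (embedding output fed to a dropout unit, then an LSTM, then a sigmoid output layer) is given below; the dropout rate and hidden size are assumptions.

```python
from tensorflow.keras import layers, models

lstm_model = models.Sequential([
    layers.Embedding(5000, 64),             # embedding layer
    layers.Dropout(0.2),                    # dropout applied to the embedding output
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),  # sigmoid output layer
])
lstm_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```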
3) BiLSTM
A Bi-LSTM model is a subtype of the recurrent neural network (RNN) that employs two LSTM layers, one of which analyses the input sequence in the forward direction and the other in the backward direction. The outputs of the two LSTM layers are concatenated to produce a final output that takes into account both the previous and future context of the input sequence.
This model has been found to be effective at detecting fake news due to its ability to capture both previous and future context. The Bi-LSTM unit receives the output of the embedding layer and performs calculations as shown in Fig. 12. The sigmoid activation function was assigned to the main output layer. The model was trained for 10 epochs with a batch size of 64. This model's final (average) accuracy is 89.9%.
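A short sketch of this variant follows: Keras' Bidirectional wrapper runs one LSTM forward and one backward over the sequence and concatenates their outputs. Layer sizes are assumptions.

```python
from tensorflow.keras import layers, models

bilstm_model = models.Sequential([
    layers.Embedding(5000, 64),
    layers.Bidirectional(layers.LSTM(64)),   # forward + backward LSTMs, outputs concatenated
    layers.Dense(1, activation="sigmoid"),
])
bilstm_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```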
4.3 Ensemble Learning Approach
To create the best possible detection mechanism, we construct an ensemble approach that integrates multiple learning classifiers or models. The resulting detection model is called a meta-model. When compared to the base learners alone, this meta-model performs more effectively. The primary types of ensemble methods are bagging, boosting, and stacking. For both the machine learning and deep learning classifiers, we have adopted the ensemble stacking approach.
Ensemble learning approaches can be effective for the detection of false news, as they combine the predictions of multiple classifiers to improve accuracy and reduce overfitting.
4.4 Stacking Approach
In this approach, a classifier known as the meta-classifier receives the final outcomes of the base classifiers as input and seeks to discover the best way to combine these results to produce an improved output.
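The sketch below illustrates this stacking idea with scikit-learn's StackingClassifier; the particular base learners, the logistic-regression meta-classifier, and the synthetic data are illustrative assumptions rather than the paper's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Synthetic features standing in for vectorized articles
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("rf", RandomForestClassifier(random_state=0))],
    final_estimator=LogisticRegression(),   # meta-classifier combining base outputs
)
stack.fit(X, y)
print(stack.score(X, y))
```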
1) Ensemble Machine Learning Approach
We trained seven traditional machine learning algorithms, namely Decision Tree (DT), Passive Aggressive (PA), Logistic Regression (LR), Support Vector Machine (SVM), AdaBoost (AB), Random Forest (RF) and Naive Bayes [22][23][25], on our developed news dataset and examined their performance as shown in Fig. 13. Each classifier was implemented using the scikit-learn Python library.
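In the spirit of this comparison, the loop below fits and scores the seven scikit-learn classifiers on synthetic data; the feature matrix and hyper-parameters are assumptions and do not reproduce the paper's exact settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, PassiveAggressiveClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifiers = {
    "DT": DecisionTreeClassifier(random_state=0),
    "PA": PassiveAggressiveClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "AB": AdaBoostClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "NB": GaussianNB(),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))   # per-classifier accuracy on held-out data
```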
2) Ensemble Deep Learning Approach
We combined three neural networks and trained them on the developed dataset as shown in Fig. 14.
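One way to stack the three networks, shown as a hedged sketch below, is to use their predicted probabilities on held-out articles as input features for a logistic-regression meta-classifier; the placeholder probabilities and labels stand in for real network outputs, and the exact combination used in the paper may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder probability outputs of the three trained networks on validation articles
rnn_probs = rng.random(200)
lstm_probs = rng.random(200)
bilstm_probs = rng.random(200)
y_val = rng.integers(0, 2, size=200)   # placeholder validation labels

# Stack the three networks' predictions as features for a meta-classifier
meta_features = np.column_stack([rnn_probs, lstm_probs, bilstm_probs])
meta_clf = LogisticRegression().fit(meta_features, y_val)
print(meta_clf.predict(meta_features[:5]))
```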