RQ1. Which algorithm is most commonly used for emotion detection in images?
Table 4 shows which deep learning algorithms the surveyed studies use for emotion recognition.
Table 4
Algorithm by Research Paper
Algorithm | References |
CNN | [17], [18], [19], [20], [21], [22], [5], [23], [24], [25], [26], [27], [28], [4], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38] |
CNN + GoogLeNet | [39], [40], [41], [42] |
CNN + VGG-19 | [43], [44], [27], [28], [5], [42] |
CNN + VGG-19 + ResNet-50 | [45], [46], [4], [28], [41], [47] |
CNN + FCN | [48], [49] |
TDNN | [50] |
R-CNN | [51], [52], [53] |
CBAM + ResNet | [54], [55] |
CNN + RNN (LSTM/BiLSTM) | [11], [56], [57], [58], [59], [60], [61] |
DNN | [62], [63], [5] |
SVM + VGG | [64] |
SHCNN | [65] |
Based on the collected research on emotion recognition from facial expressions, most studies utilize a Convolutional Neural Network (CNN) [66]. VGG, ResNet-34, GoogLeNet, and R-CNN are all architectures developed from the CNN [66][67][68]. However, a CNN's ability to detect emotions from faces still depends on the dataset, architecture, model, and training techniques. Therefore, to improve detection accuracy, some of the aforementioned studies combine CNN with related techniques such as FCN, DBN, image edge computing, transfer learning, CRBM, and CBAM. The Fully Convolutional Network (FCN) is itself a development of VGG [68][69][36][35].
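For orientation, the sketch below shows the kind of baseline CNN classifier these studies build on. The 48x48 grayscale input and seven emotion classes are illustrative assumptions (common for datasets such as FER2013), not details taken from any single cited paper.

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    """Minimal CNN for facial expression classification (illustrative only)."""
    def __init__(self, num_classes: int = 7):  # seven emotion classes is an assumption
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  # 48x48 grayscale input assumed
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32 x 24 x 24
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 64 x 12 x 12
        )
        self.classifier = nn.Linear(64 * 12 * 12, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = EmotionCNN()(torch.randn(8, 1, 48, 48))  # batch of 8 face crops
print(logits.shape)  # torch.Size([8, 7])
```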
SHCNN uses Leaky ReLU to avoid the "dead ReLU" problem, which can yield better convergence on the dataset [65]. CNN can also be combined with CRBM and transfer learning; this combination addresses the complexity of feature extraction in the target dataset. Pre-training with CRBM helps overcome content differences among datasets, while replacing the fully connected layer of the CNN with a CRBM during the transfer learning stage enhances the ability to recognize abstract features, particularly for facial expression recognition in environments with complex backgrounds. Experimental results indicate that this hybrid transfer learning approach effectively improves feature recognition on the target dataset [35].
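To illustrate the Leaky ReLU point: a standard ReLU outputs zero for every negative input, so a unit stuck in the negative range receives no gradient and "dies", whereas Leaky ReLU keeps a small negative slope so gradients still flow. The sketch below uses PyTorch's default slope of 0.01, which is not necessarily the value SHCNN uses.

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

relu = nn.ReLU()
leaky = nn.LeakyReLU(negative_slope=0.01)  # 0.01 is the framework default, not SHCNN's value

print(relu(x))   # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000])  negatives zeroed -> no gradient
print(leaky(x))  # tensor([-0.0200, -0.0050, 0.0000, 0.5000, 2.0000]) negatives scaled -> gradient survives
```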
CBAM, the Convolutional Block Attention Module, is an attention mechanism designed to enhance a CNN's ability to capture important characteristics and interactions within images [70]. TDNN, the Time Delay Neural Network, is a neural network architecture used in signal processing and speech recognition [71]. Besides the CNN's dominance in emotion recognition across these studies, the Recurrent Neural Network (RNN), specifically its LSTM variant, is also used to build models that recognize emotions from facial expressions [57].
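A compact sketch of CBAM's two attention steps, following the module design described in [70]: channel attention from pooled descriptors passed through a shared MLP, then spatial attention from a 7x7 convolution. The reduction ratio of 16 and the 7x7 kernel are the original paper's defaults, assumed here.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: avg- and max-pooled descriptors through a shared MLP."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        w = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        return x * w

class SpatialAttention(nn.Module):
    """Spatial attention: a 7x7 conv over stacked channel-wise avg and max maps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class CBAM(nn.Module):
    """CBAM applies channel attention, then spatial attention, to a feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))

feats = torch.randn(1, 64, 12, 12)   # a CNN feature map
print(CBAM(64)(feats).shape)         # same shape, re-weighted: torch.Size([1, 64, 12, 12])
```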
RQ2. Which algorithm performs best in image processing for recognizing emotions from facial expressions?
Table 5 presents the average emotion recognition accuracy reported for each algorithm.
Table 5
Average Emotion Recognition Accuracy by Algorithm
Algorithm | Average emotion recognition accuracy | References |
CNN | 97.75% | [37] |
CNN + GoogLeNet | 75.09% | [39] |
CNN + VGG-19 | 95.35% | [43] |
CNN + VGG-19 + ResNet-50 | 95.39% | [46], [4] |
CNN + FCN | 80.60% | [49] |
TDNN | 90.00% | [50] |
R-CNN | 82.38% | [72] |
CBAM + ResNet | 88.27% | [55] |
CNN + RNN (LSTM/BiLSTM) | 99.43% | [59] |
According to Table 5, the emotion recognition algorithm with the lowest accuracy is the combination of CNN and GoogLeNet. Its lower accuracy may stem from the quantity of training and testing data used, which can degrade GoogLeNet's performance [39]. On the other hand, the algorithm with the highest accuracy is CNN + RNN (LSTM/BiLSTM). Its authors augment the image data before analysis, which improves the performance of the trained model and helps address overfitting [59].
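As an illustration of the data augmentation just mentioned, the sketch below builds a typical image augmentation pipeline with torchvision; the specific transforms and parameters used in [59] may well differ.

```python
from torchvision import transforms

# Illustrative augmentation pipeline; the exact transforms in [59] are not specified here.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),               # mirrored faces keep the same label
    transforms.RandomRotation(degrees=10),                # small pose variation
    transforms.ColorJitter(brightness=0.2, contrast=0.2), # lighting variation
    transforms.ToTensor(),
])
# Applying `augment` to each training image yields varied samples per epoch,
# enlarging the effective training set and reducing overfitting.
```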
Based on the analysis above, several factors can influence the accuracy of algorithms for analyzing human emotions: the quantity of training and testing data, data augmentation, and features of the LSTM architecture such as the global feature attention layer. A soft-attention mechanism in a deep learning architecture lets the model focus on the important parts of an image or other input. An attention distribution determines the weight assigned to each hidden state the model produces; these weights are then used to compute a weighted average of the hidden states, which reflects the most relevant information in the processed image or data and improves the model's classification [59].
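A minimal sketch of the soft-attention computation just described: one relevance score per hidden state, a softmax attention distribution, and a weighted-average context vector. Dot-product scoring against a learned query vector is one common parameterization, assumed here for illustration only.

```python
import torch
import torch.nn.functional as F

def soft_attention(hidden_states: torch.Tensor, query: torch.Tensor):
    """Weighted average of hidden states under a softmax attention distribution.

    hidden_states: (seq_len, dim) outputs of, e.g., an LSTM
    query:         (dim,) attention query vector (illustrative parameterization)
    """
    scores = hidden_states @ query      # one relevance score per hidden state
    weights = F.softmax(scores, dim=0)  # attention distribution (sums to 1)
    context = weights @ hidden_states   # weighted average -> most relevant information
    return context, weights

h = torch.randn(5, 16)                  # 5 hidden states of size 16
ctx, w = soft_attention(h, torch.randn(16))
print(w.sum().item())                   # 1.0
```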
RQ3. What are the applications of deep learning in image processing for emotion recognition?
Table 6 presents the applications of deep learning algorithms reported in the relevant paper references.
Table 6
Application of Algorithm Based on Research
Application | Forms of Application | References |
Education | Emotion recognition is used to determine students' interest in a particular subject. | [11], [73] |
Robotics | Face and emotion recognition embedded into a robot | [13], [61] |
Automotive | Monitoring the driver's emotional state while driving | [74] |
From the table above, several applications of facial detection technology in everyday life can be further explained as follows:
In the field of education, research divides into two parts based on the online and onsite learning models. In the onsite model, surveillance cameras installed inside the classroom record the actions and expressions of students. Guiping Yu applied a more comprehensive method to identify students' emotions, utilizing information from their faces, body movements, and contextual cues to enhance facial emotion recognition. Face identification and pre-processing were performed on a dataset of images. To gather continuous video data, surveillance cameras were installed in the classrooms where the students were present, and frames were extracted from these videos at a fixed rate (frames per second, FPS). The facial images were then cropped and underwent pre-processing steps such as face localization, alignment, grayscale conversion, and scale normalization. Because the captured images can be low-quality or noisy, these pre-processing steps are crucial for the expression recognition system. Compared to static images, detecting faces in video surveillance scenarios presents greater challenges [11].
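A minimal sketch of the pre-processing pipeline this paragraph describes (frame sampling, face localization, cropping, grayscale conversion, and scale normalization) using OpenCV. The Haar cascade detector, the file name classroom.mp4, the one-frame-per-second sampling, and the 48x48 target size are all illustrative assumptions, not details from [11].

```python
import cv2

# Stock Haar cascade as a stand-in face detector; [11] does not necessarily use this one.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("classroom.mp4")            # hypothetical surveillance recording
step = max(int(cap.get(cv2.CAP_PROP_FPS)), 1)      # sample ~1 frame/second (assumption)

frame_idx, faces_out = 0, []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % step == 0:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)    # grayscale conversion
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
            face = gray[y:y + h, x:x + w]                 # crop (face localization)
            faces_out.append(cv2.resize(face, (48, 48)))  # scale normalization (size assumed)
    frame_idx += 1
cap.release()
```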
On the other hand, in the online learning model, activity is recorded as usual through the camera on each student's device, and the recorded data is then analyzed and identified. Research in this area was also conducted by Swadha Gupta and colleagues, who designed a system that uses facial emotions to detect student engagement in real-time scenarios. The application scenario includes the following steps:
- First, facial emotion information captured by the camera of each student's device is used to evaluate online student engagement.
- Face detection is performed using a pre-trained Faster R-CNN model.
- A modified landmark extractor called MFACEXTOR extracts 470 facial landmark points, or key points.
- For real-time learning scenarios, deep learning models such as Inception-V3, VGG19, and ResNet-50 classify student emotions such as anger, sadness, happiness, neutrality, and others using the softmax function.
- An engagement evaluation algorithm is proposed that uses the output of the facial emotion classification to determine an engagement index.
- The system determines online student engagement based on the engagement index value.
This research focuses on using facial emotions to detect student engagement in real-time learning scenarios, employing various deep learning models and algorithms for emotion classification and engagement evaluation [73].
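The engagement evaluation algorithm of [73] is not reproduced here, but the general idea of mapping softmax emotion probabilities to an engagement index can be sketched as follows; the emotion-to-weight mapping below is entirely hypothetical.

```python
# Hypothetical weights: how strongly each emotion signals engagement.
# These values are illustrative, not the mapping used in [73].
ENGAGEMENT_WEIGHTS = {
    "happiness": 1.0,
    "neutrality": 0.7,
    "surprise": 0.6,
    "sadness": 0.3,
    "anger": 0.1,
}

def engagement_index(probs: dict) -> float:
    """Weighted sum of softmax emotion probabilities -> engagement score in [0, 1]."""
    return sum(ENGAGEMENT_WEIGHTS[e] * p for e, p in probs.items())

# Example softmax output from the emotion classifier for one face:
softmax_output = {"happiness": 0.55, "neutrality": 0.25, "surprise": 0.05,
                  "sadness": 0.10, "anger": 0.05}
print(round(engagement_index(softmax_output), 3))  # 0.79 -> compared against a threshold
```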
Furthermore, in the field of robotics, Tzuu-Hseng S. Li et al. underscore the crucial role of emotion recognition in advancing human-robot interaction (HRI). The authors elaborate on how emotions, involving cognitive appraisal, body language, action tendencies, expressions, and feelings, are integral elements of human interaction, allowing individuals to convey thoughts without words. To address the challenge of classifying facial expressions under different conditions, they propose the Facial Action Coding System (FACS) as a practical solution. FACS measures human facial movement based on muscle actions, decomposing facial expressions into component actions that can be applied further. The study uses the six basic emotions (happiness, anger, disgust, fear, sadness, and surprise) as the foundation for emotion recognition. To enhance human-robot interaction, the authors propose an emotion recognition system based on deep neural networks: a Convolutional Neural Network (CNN) trained on static images, and a Long Short-Term Memory (LSTM) network to capture temporal and contextual information in dynamic facial expressions. Transfer learning is introduced to overcome the limitations of traditional machine learning. The research results in a CNN- and LSTM-based model for facial emotion recognition that incorporates transfer learning and is validated through experiments with a humanoid robot [61].
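A minimal sketch of the CNN-plus-LSTM structure described above: a small CNN extracts features from each frame, and an LSTM aggregates them over time before a six-class (basic emotions) prediction. All layer sizes, the 48x48 grayscale frames, and the clip length are illustrative assumptions, not the architecture of [61].

```python
import torch
import torch.nn as nn

class CNNLSTMEmotion(nn.Module):
    """Per-frame CNN features fed to an LSTM; the final hidden state classifies the clip.
    Layer sizes and the six-class output (basic emotions) are illustrative choices."""
    def __init__(self, num_classes: int = 6):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),                    # 32 * 12 * 12 = 4608 for 48x48 input
        )
        self.lstm = nn.LSTM(input_size=4608, hidden_size=128, batch_first=True)
        self.head = nn.Linear(128, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, 1, 48, 48)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)  # CNN on every frame
        _, (h_n, _) = self.lstm(feats)                        # temporal context
        return self.head(h_n[-1])

logits = CNNLSTMEmotion()(torch.randn(2, 10, 1, 48, 48))  # 2 clips of 10 frames
print(logits.shape)  # torch.Size([2, 6])
```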
Beyond education and robotics, deep learning for emotion recognition is also applied in the automotive field. Mira Jeong and other researchers build on an earlier technology, the Advanced Driver Assistance System (ADAS), with a research focus on autonomous vehicles. Autonomous vehicles benefit from a Driver State Monitoring (DSM) system. DSM research falls into three types: 1) analysis of driving patterns based on movement data; 2) evaluation of psychophysiological states based on sensor information; and 3) analysis of in-vehicle images from camera sensors. The last method uses facial images captured by cameras installed in the vehicle to identify the driver's condition. This image-based DSM approach aims to comprehensively recognize the driver's emotional state and prevent accidents caused by fatigue or drowsiness, while also making driving more comfortable. The driver's status, as monitored by DSM, can also indicate the appropriate timing for transitioning control from autonomous mode to manual mode when needed. Additionally, detecting the driver's facial expressions in autonomous vehicles can prevent passengers from getting motion sickness and create a more comfortable journey by adjusting the vehicle's ambiance according to the driver's facial expressions [74].
Thus, deep learning technology has a tremendous positive impact on human life, and with further development, emotion recognition technology in particular will undoubtedly benefit even more areas of life.