In recent years, China's education sector has undergone extensive reform and development, and the number of newly enrolled students has increased significantly. For the relevant functional departments, large data storage capacity and diverse collection channels make it possible to use data mining techniques to discover relationships within the data more quickly and easily. This, in turn, provides a scientific and rational reference for decision-making and enhances the credibility of national education policies [1].
In today's information era, the continuous development of information technology provides solutions for various educational institutions [2]. Traditional statistical methods are insufficient to fully explore the latent patterns in multidimensional data such as student information, academic performance, classroom behavior, and psychological states. By contrast, technologies such as machine learning, deep learning, and image processing can support feature extraction and analysis of complex high-dimensional data. By using a Convolutional Neural Network (CNN) model to analyze and predict student performance, educators can use the interpretability these methods offer to understand differences in learning progress among students in a particular class. This makes it possible to formulate differentiated teaching plans, optimize the allocation of educational resources, and improve education quality. In-depth analysis and evaluation of campus data related to student performance, together with the identification of attribute features that are valuable for performance analysis and prediction, has therefore become a current research focus.
As one of the earliest research directions in educational data mining, student performance prediction has been extensively studied by numerous scholars who have made significant contributions. The methods used in these studies can be categorized into two main types: traditional machine learning-based approaches and deep learning-based approaches.
Grade Prediction Based on Traditional Machine Learning Methods. Wang et al. [3] combined students' gender, educational background data, subject background, and online learning behavior data, inputting these into a decision tree to predict academic performance. Xiaoli Wang et al. [4] proposed a weighted naive Bayes classification method based on mutual information and Bayesian classification algorithms to predict students' computer proficiency test scores. Ashenafi et al. [5] extracted features from students' classroom discussion questions and answer ratings that capture student activity information, using a multiple linear regression model to predict course grades. Jintao Hu [6] utilized decision trees to analyze all course grades, identifying the impact of basic course grades on specific professional course grades, and used the generated rules to establish a predictive model for professional course grades. As an ensemble learning meta-algorithm, Bagging was used by Hillebrand et al. [7] to reduce generalization error and improve prediction accuracy by combining different models. Ahmed [8] proposed using the GBDT algorithm to predict college students' performance in final exams.
Grade Prediction Based on Deep Learning Methods. Okubo [9] proposed a method for predicting students' final grades by inputting log data stored in educational systems, such as attendance, video watching, and reports, into a recurrent neural network. Kalyani [10] applied convolutional neural networks (CNN) to predict student performance, considering that CNNs can mimic the human brain's behavior in analyzing and processing information, which helps solve problems beyond human capabilities. Hongjiang Cao [11] addressed the sequential nature of students' historical grades and the forgetting that occurs during the learning process by introducing an LSTM network to model students' knowledge structure states. According to the experimental results, integrating emotional and behavioral features significantly improved the accuracy of the grade prediction model. Qu et al. [12] constructed a grade prediction framework with an attention mechanism, which adjusts the weights of partial time behaviors and behavior patterns. Experiments demonstrated that this model achieves higher accuracy than other methods. Aljaloud [13] tackled the issue of complex features and high processing difficulty in online learning systems by using a CNN to extract learning features from time-series data, which were then fed into an LSTM neural network for grade prediction, demonstrating the feasibility of this method. Tao Fang [14] proposed an Att-LSTM model that selects key information from a large number of input features, focusing on the features that significantly impact student performance, to predict grades.
Related technologies
Principles of Data Processing
The existing data are processed as follows. First, student information is anonymized by replacing real names with specific non-numeric strings. Letter codes are used as identifiers to indicate whether a student has certain special attributes, and values are assigned accordingly. Second, after the dataset is imported, a new mapping dictionary is created to replace all non-numeric strings in the dataset with numerical values. After replacement, a data cleaning function checks the converted numerical data for missing and abnormal values, issues timely warnings based on the check results, and makes corrections to ensure data quality. Finally, one-hot encoding is applied to the verified data. The purpose is to convert the relevant feature columns into numerical values that can be processed by deep learning algorithms, to maintain the independence of feature columns, and to enhance the model's understanding ability. The principle of one-hot encoding is shown in Fig. 1.
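The three processing steps described above (mapping, cleaning, one-hot encoding) can be sketched in plain Python as follows. This is a minimal illustration and not the pipeline used in this study: the records, column names, letter codes, the valid score range of 0 to 100, and the choice to correct invalid values with the mean of the valid ones are all assumptions made for the example.

```python
# Hypothetical, already anonymized records; all names and codes are illustrative.
records = [
    {"student_id": "S001", "gender": "F", "scholarship": "Y", "score": 86.0},
    {"student_id": "S002", "gender": "M", "scholarship": "N", "score": None},
    {"student_id": "S003", "gender": "F", "scholarship": "N", "score": 140.0},
]

# Mapping dictionary: replaces non-numeric strings with integer codes.
MAPPING = {"gender": {"F": 0, "M": 1}, "scholarship": {"N": 0, "Y": 1}}


def apply_mapping(rows, mapping):
    """Return copies of rows with mapped columns replaced by integer codes."""
    return [
        {k: mapping[k][v] if k in mapping else v for k, v in row.items()}
        for row in rows
    ]


def clean_scores(rows, column="score", low=0.0, high=100.0):
    """Warn about missing or out-of-range values and correct them.

    Assumption: invalid values are replaced with the mean of the valid ones,
    and at least one valid value exists.
    """
    valid = [
        r[column] for r in rows
        if r[column] is not None and low <= r[column] <= high
    ]
    fill = sum(valid) / len(valid)
    cleaned, warnings = [], []
    for row in rows:
        value = row[column]
        if value is None or not (low <= value <= high):
            warnings.append(f"{row['student_id']}: invalid {column}={value!r}")
            row = {**row, column: fill}
        cleaned.append(row)
    return cleaned, warnings


def one_hot(rows, column, categories):
    """Replace one categorical column with one indicator column per category."""
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


mapped = apply_mapping(records, MAPPING)
cleaned, warnings = clean_scores(mapped)
encoded = one_hot(cleaned, "gender", categories=[0, 1])
```

Each category receives its own indicator column, so the encoding does not impose an artificial order on the categories, which is the independence property mentioned above.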
Convolutional Neural Network
This study employs the Convolutional Neural Network (CNN) algorithm for deep learning analysis. Because CNNs extract implicit features efficiently and capture spatial structure information hierarchically, some researchers use them to extract latent features from students' learning activities [15]. This research uses the feedforward CNN as the foundational structure for student performance prediction analysis. The model consists of an input layer, a convolution layer, a pooling layer, a fully connected layer, and an output layer, as shown in Fig. 2.
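To make the layer sequence concrete, the following is a minimal sketch of one feedforward pass through the five stages named above, written in plain Python. It is not the model of this study: the one-dimensional input, the ReLU activation, the softmax output, and every weight and size are assumptions chosen only for illustration.

```python
import math


def conv1d(signal, kernel, bias=0.0):
    """Convolution layer: valid 1-D cross-correlation with a single kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Pooling layer: non-overlapping max pooling; a partial tail is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def dense(values, weights, biases):
    """Fully connected layer: one weight row per output unit."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Output layer: converts logits to class probabilities."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, kernel, fc_weights, fc_biases):
    """Input -> convolution -> pooling -> fully connected -> output."""
    x = relu(conv1d(features, kernel))
    x = max_pool1d(x)
    return softmax(dense(x, fc_weights, fc_biases))


# Illustrative values only: six input features and two output classes.
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [1.0, 0.0, 1.0]
fc_weights = [[0.1, 0.0], [0.0, 0.1]]
fc_biases = [0.0, 0.0]

probabilities = forward(features, kernel, fc_weights, fc_biases)
```

In practice the kernel and fully connected weights would be learned by backpropagation in a deep learning framework; the fixed values here only show how data flows through the layers.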