Study Design
The research employed a quantitative, cross-sectional analytical study design. This approach was chosen to examine the relationship between predictor variables (demographics, academic performance, study habits) and the assessment outcomes of pre-clinical MBBS/BDS students at a specific point in time. The quantitative nature of the study facilitated statistical modelling and the application of machine learning algorithms for objective measurement and analysis. The study was planned in four phases, as shown in Figure 2.
Setting
The study was conducted at two institutions in Rawalpindi, Pakistan: Riphah International University and its affiliated Islamic International Medical College. These institutions were selected based on their accessibility, willingness to participate, and the diversity of their pre-clinical MBBS/BDS student populations, which enhanced the generalizability of the findings.
Sample Selection
The study population comprised all 4th-year MBBS and BDS students enrolled at Riphah International University during the data collection period, capturing the full spectrum of academic abilities and backgrounds. The impact of sample size on model accuracy has been explored using various methods, including curve-fitting, cross-validation, and linear discriminant analysis (12–14), but results have been inconsistent owing to factors such as the bias-variance tradeoff, over-sampling, choice of algorithm, and feature-set size. Although reducing the number of samples can, in theory, increase variance, removing the samples that contribute most to data variance may instead decrease it. The study therefore employed a census sampling technique, including every eligible 4th-year MBBS/BDS student. Sample size determination was guided by a comprehensive literature review and the guidelines of Rajput et al. (2023), resulting in the inclusion of 144 students in the final analysis.
Inclusion Criteria:
- Enrollment: Students must be actively enrolled in the 4th year of either the MBBS or BDS program at Riphah International University. This criterion ensured that the participants were at a similar stage in their academic journey, allowing for meaningful comparisons and predictions of assessment outcomes.
Exclusion Criteria:
- Incomplete Questionnaire: Students who failed to complete all mandatory items in the questionnaire were excluded from the study. This measure was implemented to maintain data integrity and ensure the reliability of the analysis. Incomplete questionnaires could introduce bias and compromise the accuracy of the predictive models.
This sample selection process aimed to create a homogeneous yet representative sample of pre-clinical medical and dental students, enhancing the validity and generalizability of the study's findings.
Ethical Approval
Prior to data collection, the study protocol underwent rigorous review and received ethical approval from the Institutional Review Board (IRB) of Riphah International University (Appendix 1). This ensured that the research adhered to stringent ethical guidelines and safeguarded the rights and well-being of the participants.
Questionnaire Development Process
A structured questionnaire was meticulously developed to gather data on factors influencing assessment outcomes (Appendix 2). The process involved:
- Literature and Expert Review: An initial draft was created based on extensive literature review and then scrutinized by experts in medical education, artificial intelligence, and questionnaire design. Their feedback ensured content validity and clarity of questions.
- Pilot Testing: The questionnaire was pilot-tested with a small sample of 15 4th-year MBBS/BDS students to assess its clarity, identify ambiguities, and estimate completion time. Feedback from the pilot phase led to further refinements in question phrasing and response options.
- Academic Records: In addition to the questionnaire data, relevant academic records were obtained from the examination departments of the respective institutions. These records included exam scores, attendance records, and other pertinent academic performance data, and were collected with strict adherence to confidentiality protocols to protect student privacy.
These methodological steps, particularly the careful questionnaire development and pilot testing, contributed significantly to the validity and reliability of the study. By ensuring the collection of high-quality data and employing robust validation techniques, the research aimed to generate findings that are both accurate and generalizable to the broader context of medical education.
Data Utilization in Python
The data, organized and anonymized in Microsoft Excel, was imported into the Python programming language (version 3.9.1) environment for further analysis and model development. This transition was facilitated by the pandas library, a tool renowned for its data manipulation and analysis capabilities.
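The import step can be sketched as follows. This is a minimal illustration, not the study's actual script: the workbook name `student_data.xlsx` and the column names are assumptions, and a small inline DataFrame stands in for the real sheet.

```python
# Minimal sketch of moving the anonymized Excel sheet into Python with
# pandas (file name and column names are assumptions, not the study's own).
import pandas as pd

# In the study, the workbook would be loaded directly, e.g.:
# df = pd.read_excel("student_data.xlsx")

# Small inline stand-in with the kind of columns described in the text.
df = pd.DataFrame({
    "surrogate_id": ["S001", "S002", "S003"],
    "attendance_pct": [92.5, 78.0, 85.5],
    "exam_score": [74.0, 61.5, 88.0],
})

print(df.dtypes)
print(df.describe())
```

Loading the sheet into a DataFrame makes the subsequent cleaning, feature engineering, and model-fitting steps straightforward to express with the libraries listed below.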
Libraries and Tools
The analysis and model development leveraged the capabilities of several key Python libraries, each playing a crucial role in achieving the research objectives:
- NumPy: This fundamental library for numerical computations provided the necessary tools for efficient array manipulation and mathematical operations, enabling seamless handling and transformation of the student data.
- Pandas: This powerful data analysis library offered data structures and functions for data cleaning, manipulation, and exploration. It streamlined the process of organizing and preparing the dataset for model development.
- Scikit-learn: This comprehensive machine-learning library provided a wide range of algorithms and tools for model development, training, and evaluation. It facilitated the implementation of various classification algorithms, including Random Forest, AdaBoost, Logistic Regression, Support Vector Machine (SVM), and XGBoost, enabling a comparative analysis of their performance in predicting student outcomes.
- Matplotlib: This versatile plotting library was used to create visualizations of the data and model results. It enabled the generation of informative graphs and charts, such as ROC curves, confusion matrices, and feature importance plots, facilitating the interpretation and communication of the findings.
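As an illustration of how these libraries work together, the sketch below trains a Random Forest on synthetic data and renders a feature importance plot of the kind described above. The feature names are placeholders, not the study's actual predictors.

```python
# Illustrative feature-importance plot (synthetic data; feature names are
# placeholders, not the study's real predictor variables).
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=144, n_features=5, random_state=0)
names = ["attendance", "quiz_avg", "study_hours", "sleep_hours", "prior_gpa"]

rf = RandomForestClassifier(random_state=0).fit(X, y)
importances = rf.feature_importances_  # normalized to sum to 1

fig, ax = plt.subplots()
ax.barh(names, importances)
ax.set_xlabel("Mean decrease in impurity")
ax.set_title("Random Forest feature importance (illustrative)")
fig.savefig("feature_importance.png", bbox_inches="tight")
```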
Machine Learning Model Development
The core of this study lies in the application of machine learning algorithms to identify patterns within the data and predict student assessment outcomes. Five algorithms were employed:
- Random Forest: An ensemble learning method that constructs multiple decision trees and outputs the mode of the classes (classification) or mean prediction (regression) of the individual trees. Its ability to handle high-dimensional data and resistance to overfitting makes it suitable for predicting student outcomes in medical education.
- AdaBoost (Adaptive Boosting): An ensemble technique that combines multiple weak learners to create a strong learner. It iteratively trains weak learners, focusing on misclassified instances in each round. AdaBoost's adaptability makes it suitable for predicting student outcomes where various factors interplay.
- Logistic Regression: A statistical model that predicts the probability of a binary outcome based on one or more predictor variables. Its simplicity and interpretability make it a popular choice for predicting binary outcomes, such as pass/fail, in educational settings.
- Support Vector Machine (SVM): A supervised learning model that finds the optimal hyperplane separating data into different classes. SVM's ability to handle high-dimensional and non-linear data makes it suitable for predicting student outcomes where complex relationships may exist.
- XGBoost (Extreme Gradient Boosting): An advanced implementation of the gradient boosting algorithm. It builds an ensemble of decision trees sequentially, with each tree correcting the errors of the previous ones. XGBoost's high performance and efficiency make it suitable for handling large and complex datasets in medical education.
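A comparative run of the five algorithms can be sketched as below. The data is a synthetic stand-in sized like the study's 144-student sample, and the hyperparameters are scikit-learn defaults rather than the study's tuned settings; if the `xgboost` package is unavailable, the sketch falls back to scikit-learn's gradient boosting as a stand-in.

```python
# Sketch of comparing the five classifiers via 5-fold cross-validation.
# Synthetic data stands in for the 144-student dataset; defaults, not the
# study's tuned hyperparameters.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

try:
    from xgboost import XGBClassifier
    xgb = XGBClassifier(eval_metric="logloss")
except ImportError:  # fall back to sklearn's gradient boosting as a stand-in
    from sklearn.ensemble import GradientBoostingClassifier
    xgb = GradientBoostingClassifier(random_state=42)

X, y = make_classification(n_samples=144, n_features=10, random_state=42)

models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(random_state=42),
    "XGBoost": xgb,
}

cv_means = {name: cross_val_score(m, X, y, cv=5, scoring="accuracy").mean()
            for name, m in models.items()}
for name, score in cv_means.items():
    print(f"{name}: {score:.3f}")
```

Running all five under the same cross-validation protocol is what makes the comparative analysis described above meaningful: each algorithm sees the same folds and is scored on the same metric.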
Feature Engineering and Model Optimization
The raw data was transformed to enhance the models' predictive power:
- Data Preprocessing: Data cleaning addressed missing values, outliers, and inconsistencies. Missing values were imputed, and outliers were handled to minimize their influence. Data was normalized to ensure consistent input ranges for the models.
- Feature Selection: Techniques like recursive feature elimination and feature importance analysis were used to identify the most impactful predictors of student outcomes, reducing dimensionality and focusing on salient factors.
- Feature Creation: New features, such as interaction terms, were created to capture additional information and potentially enhance predictive accuracy.
- Model Training and Validation: Models were trained and validated using a portion of the dataset and k-fold cross-validation to ensure robustness and generalizability.
- Hyperparameter Tuning: GridSearchCV was used to optimize model parameters based on performance metrics.
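The steps above can be chained as a single scikit-learn pipeline, which keeps imputation, scaling, and feature selection inside the cross-validation loop. The sketch below is illustrative: the parameter grid, the choice of logistic regression as the final estimator, and the fold count are assumptions, not the study's actual configuration.

```python
# Sketch of the preprocessing -> feature selection -> tuning chain.
# Grid values and the final estimator are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=144, n_features=12, random_state=1)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),         # missing values
    ("scale", StandardScaler()),                          # normalization
    ("select", RFE(LogisticRegression(max_iter=1000))),   # recursive feature elimination
    ("model", LogisticRegression(max_iter=1000)),
])

grid = GridSearchCV(
    pipe,
    param_grid={"select__n_features_to_select": [4, 8],
                "model__C": [0.1, 1.0]},
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=1),
    scoring="accuracy",
)
grid.fit(X, y)
best_score = grid.best_score_
print(grid.best_params_, round(best_score, 3))
```

Fitting the preprocessing inside each fold, rather than on the full dataset beforehand, avoids leaking information from the validation folds into the training data.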
Achieving Objective 2: Establishing Model Validity and Reliability
The study employed a multi-faceted approach to establish the validity and reliability of the AI-based models:
- Validity: Assessed through k-fold cross-validation and multiple evaluation metrics (accuracy, precision, recall, F1-score, MCC, AUC-ROC, AUC-PRC). This ensured the models generalized well to unseen data and accurately predicted student outcomes.
- Reliability: Evaluated through repeated k-fold cross-validation runs and the use of MCC, demonstrating consistent performance and a strong correlation between predicted and actual outcomes.
- Bias Control: Strategies were implemented to mitigate potential biases, including selection bias, sampling bias, algorithmic bias, and overfitting, ensuring fairness and transparency in the AI system.
These rigorous methods ensured that the developed AI models were not only accurate and reliable but also fair, transparent, and ethically sound, thus contributing to their trustworthiness and potential for effective implementation in medical education.
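The metric suite named above can be computed with `sklearn.metrics` as sketched below; the data is a synthetic hold-out split, not the study's, and a single train/test split stands in for the repeated cross-validation runs.

```python
# Sketch of the evaluation-metric suite (accuracy, precision, recall, F1,
# MCC, AUC-ROC, AUC-PRC) on a synthetic hold-out split.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, roc_auc_score,
                             average_precision_score)

X, y = make_classification(n_samples=144, n_features=10, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=7)

clf = RandomForestClassifier(random_state=7).fit(X_tr, y_tr)
pred = clf.predict(X_te)
prob = clf.predict_proba(X_te)[:, 1]  # probability of the positive class

metrics = {
    "accuracy": accuracy_score(y_te, pred),
    "precision": precision_score(y_te, pred),
    "recall": recall_score(y_te, pred),
    "f1": f1_score(y_te, pred),
    "mcc": matthews_corrcoef(y_te, pred),          # robust to class imbalance
    "auc_roc": roc_auc_score(y_te, prob),
    "auc_prc": average_precision_score(y_te, prob),  # PR-curve summary
}
for k, v in metrics.items():
    print(f"{k}: {v:.3f}")
```

Reporting threshold-based metrics (accuracy, F1, MCC) alongside ranking metrics (AUC-ROC, AUC-PRC) gives a fuller picture than any single score, particularly when the two achievement classes are imbalanced.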
Methodology to Achieve Objective 3: Predicting Assessment Outcomes
The final objective of the study was to utilize the developed AI models to predict the assessment outcomes of pre-clinical MBBS/BDS students. This involved applying the best-performing model, identified in Objective 2, to the preprocessed dataset and generating predictions for each student.
Data Preparation and Prediction
The preprocessed dataset, which included the relevant predictor variables and the target variable (high achiever or low achiever), was fed into the chosen AI model. The model, trained on historical data, utilized its learned patterns and relationships to generate predictions for each student in the dataset. These predictions indicated the likelihood of each student being classified as a 'high achiever' or a 'low achiever' based on their individual characteristics and academic performance.
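Generating per-student likelihoods from a fitted classifier can be sketched as follows; synthetic data and a Random Forest stand in for the study's preprocessed dataset and chosen best model.

```python
# Sketch of per-student predictions (synthetic stand-in data; in the study
# the tuned best model and the real preprocessed dataset would be used).
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=144, n_features=8, random_state=3)
model = RandomForestClassifier(random_state=3).fit(X, y)

results = pd.DataFrame({
    "surrogate_id": [f"S{i:03d}" for i in range(len(X))],
    "p_high_achiever": model.predict_proba(X)[:, 1],  # likelihood estimate
    "predicted_class": model.predict(X),  # 1 = high achiever, 0 = low achiever
})
print(results.head())
```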
Classification and Interpretation
Students were classified into two categories based on their assessment outcomes: high achievers and low achievers. High achievers were those who scored more than 70% in their assessments, while students scoring 70% or below were classified as low achievers. A new column was added to the Excel sheet to label each student as either a high achiever (1) or a low achiever (0). This binary classification was used as the target variable for the machine learning models.
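Deriving that label column is a one-liner in pandas, as the hedged sketch below shows (the column names are assumptions; the 70% cut-off is the one stated above).

```python
# Sketch of deriving the binary target from assessment scores.
# Column names are assumptions; the 70% cut-off is from the text.
import pandas as pd

df = pd.DataFrame({"surrogate_id": ["S001", "S002", "S003"],
                   "assessment_pct": [82.0, 55.0, 70.0]})

# 1 = high achiever (scored more than 70%), 0 = low achiever (70% or below)
df["high_achiever"] = (df["assessment_pct"] > 70).astype(int)
print(df)
```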
The confusion matrix, a table that compares the model's predictions to the actual outcomes, was generated to assess the accuracy of the classification.
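A minimal confusion-matrix computation with scikit-learn, on toy labels rather than the study's predictions:

```python
# Minimal confusion-matrix sketch (toy labels, not the study's results).
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual: 1 = high achiever
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions

cm = confusion_matrix(y_true, y_pred)  # rows: actual, columns: predicted
tn, fp, fn, tp = cm.ravel()
print(cm)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```

The four cells separate correct predictions (true positives and true negatives) from the two kinds of error, which is what makes the table more informative than a single accuracy figure.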
Identification of At-Risk Students and High Achievers
Based on the model's predictions, students classified as 'low achievers' were identified as potentially at-risk. These students were flagged for further attention and potential interventions to improve their academic performance. Conversely, students classified as 'high achievers' were recognized for their strong academic potential.
Anonymity and Confidentiality
Throughout the prediction and classification process, the anonymity and confidentiality of student data were strictly maintained. Surrogate IDs were used to represent individual students, ensuring that their personal identities remained protected.
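One simple way to implement the surrogate-ID step is sketched below; the column names are assumptions, and a real pipeline would keep the name-to-ID mapping in a separate, access-controlled location.

```python
# Sketch of replacing direct identifiers with surrogate IDs (column names
# are assumptions; the mapping would be stored separately and secured).
import pandas as pd

raw = pd.DataFrame({"name": ["Student A", "Student B", "Student C"],
                    "exam_score": [74.0, 61.5, 88.0]})

raw["surrogate_id"] = [f"S{i+1:03d}" for i in range(len(raw))]
anonymized = raw.drop(columns=["name"])       # strip direct identifiers
mapping = raw[["name", "surrogate_id"]]       # kept apart, access-controlled
print(anonymized)
```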