In this study, we found that OSCE clinical reasoning results were lower than CAT results across four cohorts, with a linear trend toward improvement from the first to the last cohort analyzed. The TeleOSCE results in the pandemic cohort (2020) were similar to those of the previous cohort (2018). There was a significant difference between the third- and fourth-year OSCEs in the two cohorts analyzed, although the trends in these results were opposite and the components of clinical reasoning did not change equivalently between them. Concerning the validity of the CAT, the instrument's internal consistency was high in this sample.
This study shows that medical students' performance in patient-centered communication is higher than their performance in clinical reasoning in the fourth-year OSCE. These results could be attributed to early familiarization with communication skills, which correspond to transversal competencies developed since before university entrance. Patient-centered communication shares elements with the empathic communication deployed in other contexts, which are re-signified for medical practice. In contrast, the complexity of clinical reasoning, a specific process that relies on medical knowledge, may make it difficult to achieve at early stages of the curriculum.
However, these patient-centered communication results are better than those of a similar OSCE applied in 2015 at the end of medical training in Chile [6, 7]. This may result from the adaptation of students and teaching strategies to the assessment system, following the idea that “assessment guides learning” [5]. Some of the items that show improvement reflect the model of shared decision-making between clinician and patient, which was previously highlighted in young physicians in Chile [6, 7].
In line with other studies, we confirm that a remote OSCE is a valid instrument for assessing communication and clinical reasoning skills [8, 11, 12].
In our study, students' performance in intermediate-level OSCEs was not negatively affected during or after the pandemic, as was also described in an internal medicine clerkship [12], even with the reduction in hospital practice hours. On the other hand, these results contrast with studies that report lower outcomes in pandemic cohorts [10] and with the claim that the pandemic had a greater impact on low-income countries [9].
In this study, the cohort with more hours of high-fidelity simulation showed the best OSCE clinical reasoning performance of the five years analyzed. Our results support the idea that simulation-based practice provides better opportunities for students to achieve the expected results at this curricular level [8]. The coherence between the teaching strategy and the assessment system may explain the result in this cohort, although another explanation could be the adoption of strategic performance during OSCEs [5].
This study also revealed that the changes between the third and fourth years differed between the two cohorts compared, and that the elements within the tools used to assess patient-centered communication and clinical reasoning changed in a non-equivalent manner between cohorts over the years.
In contrast with the previously applied compensatory approach [7], our results support component-by-component analysis as a better alternative, because global scores do not capture all the details of the exam.
This is aligned with recommendations to assess the OSCE as a whole system while analyzing each station independently. One implication of these findings is that post-OSCE decisions to guide the curriculum should be based not only on global scores and compensatory grading at a single OSCE, but also on the analysis of the components of each competence, on trends of change, and on the relation of these changes to curricular opportunities. From our perspective, making high-impact decisions at this level of the curriculum solely on the basis of a compensatory approach is questionable [13]. On the contrary, the educative value of a system that provides details of both communication competence and clinical reasoning should be encouraged [5].
Finally, concerning the validity evidence of the CAT, our reliability results are similar to those reported previously [6, 14], confirming that the instrument can be used at this level of the curriculum. Caution is needed regarding the passing score: although the mean results are better than those previously described in Chile [6, 7], they are still far from the results described in the original article, which assessed experienced Canadian physicians [14].
While this study provides information on the performance of medical students assessed in intermediate OSCEs, some limitations must be considered. Given the restrictions on clinical practice, we could not correlate the OSCE results with a reliable clinical assessment. The nature of theoretical clinical reasoning practice and the implementation of high-fidelity simulation in the fourth year or earlier in the curriculum were not described, and there are no systematic data on the implementation and impact of these practices. The absence of information about the instructors' and clinical supervisors' backgrounds and experience also limits the depth of the comparative analysis. In addition, administering the OSCE using written registers of the clinical reasoning process does not capture the interaction during the station, potentially limiting a comprehensive interpretation of the results if videos are not systematically analyzed. Finally, the findings may not be universally applicable, as the study is limited to a single medical school in Chile. Differences in curriculum, in instructors' and clinical tutors' experience and credentials, and in OSCE organization experience could affect the generalizability of the results to other institutions.
However, one of the strengths of this study is that it compares cohorts and analyzes progression at intermediate levels of the curriculum, examining communication and reasoning in a complementary but independent manner, which is infrequent in the medical education literature. Another strength is the use of the CAT, an instrument that has been validated in multiple languages and that appears suitable, valid, and reliable when used in undergraduate OSCEs.