This study of the impact of using an abdominal simulator to train third-year medical students in the palpatory aspects of the physical examination of the abdomen showed that the training produced significant improvements in the calibration of the depth of the students’ abdominal palpation and in the thoroughness of their examination of the abdominal organs. This improvement was larger than that produced by the training and clinical experiences the students had had on their previous rotations. The study makes a contribution because no previously published study has shown the effects of abdominal simulator training on these two skills.
This study involved both main types of simulation-based research described by Cheng et al. (2014): “(1) studies that assess the efficacy of simulation as a training methodology and (2) studies where simulation is used as an investigative methodology” (p. 1091). The AbSim simulator provided the training, and its measurements before and after training provided the objective measure of changes in competence.
The AbSim abdominal simulator used here has three distinctive features: a) it simulates the feel of the abdomen; b) it measures the student’s palpation, both position and depth; and c) it provides these measurements as both concurrent and summary feedback for the student. To the best of our knowledge, it is the only commercially available abdominal simulation trainer with these useful features and with software specifically designed to take the learner through a sequence of experiences that build up competence. Still, it is not completely realistic. It is filled with air, not liquid, so it does not make informative sounds when tapped. The resistance to a probe is provided by a taut rubber sheet, not by variously shaped organs suspended in liquid. Its “skin” is that of a slender person, so it does not provide experience palpating through various thicknesses of fat.
The training procedure includes 5 of the 12 features that McGaghie et al. (2010) identified as promoting learning: “(i) feedback; (ii) deliberate practice; (iii) curriculum integration [in that the students may well need it in their current clinical placement]; … (v) simulation fidelity; … (xi) instructor [researcher] training.” It can be viewed as a way to support the development of higher levels of performance through supervised practice of each part of a skill in isolation and subsequent reintegration with the execution of the whole skill (Ericsson, 2004, 2015).
To address Bewley and O’Neil’s (2013) observation that much research assessing methods of medical education is methodologically weak, we aimed for: precise definition of the goals of the training in terms of particular subskills of competence; accurate and relevant measurements of the skill; effective training that provides the opportunity to practice the skill with concurrent as well as summative feedback; the best feasible sample; and an adequate design for evaluation of the effect of studying with the simulator.
The study’s sample was good in that over 80% of the students in the third-year medical school class participated by filling out the questionnaires at the beginning and end of the family medicine clerkship month, and over 60% of these participants actually took the training, although the proportion doing so declined during the year. The observations were taken over the course of 11 months, so the results are applicable to students at all points in the educational calendar. However, practical considerations (the rights of research participants and an already overcommitted curriculum) precluded requiring all students to participate. Hence participation was voluntary, as was whether participants studied with the simulator. We did not randomize students to studying or not studying with the abdominal simulator, and we had no palpation competence measures for the nonusers of the simulator. Further, to minimize burden (which could have reduced participation), no measurement using the simulator was obtained at the end of the month, so the only competence data were obtained in one hour-long session at the beginning of the month, before and immediately after the training.
For each individual studying with the simulator we had multiple measures before and after the training, and the measures were taken at a fine grain (dots at locations) and summarized in multiple ways (overall, for particular regions and particular organs, and with separate assessments of light and deep palpation technique). While we found highly statistically significant improvements in the quality of the palpation examination immediately after training (including the thoroughness of the students’ coverage, and the calibration of their light and deep palpation, overall and in most regions of the abdomen), we do not know how long these effects would last.
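To make concrete how such fine-grained measurements can be summarized, the sketch below shows one way per-press palpation samples (location plus depth) might be aggregated into coverage and calibration summaries, overall and per region. This is a hypothetical illustration only: the region labels, depth thresholds, and data structures are invented for exposition and are not the AbSim software’s actual data model or scoring algorithm.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Sample:
    region: str      # e.g., "RUQ" (right upper quadrant) or "epigastric"
    depth_mm: float  # indentation depth registered by the sensor pad

# Hypothetical region list and target depth ranges, invented for illustration;
# these are not AbSim's actual calibration standards.
REGIONS = ["RUQ", "LUQ", "RLQ", "LLQ", "epigastric"]
LIGHT_RANGE = (5.0, 15.0)   # assumed acceptable light-palpation depths (mm)
DEEP_RANGE = (30.0, 50.0)   # assumed acceptable deep-palpation depths (mm)

def summarize(samples: list[Sample], depth_range: tuple[float, float]) -> dict:
    """Aggregate fine-grained presses into coverage and calibration summaries."""
    lo, hi = depth_range
    by_region: defaultdict[str, list[float]] = defaultdict(list)
    for s in samples:
        by_region[s.region].append(s.depth_mm)

    # Thoroughness: fraction of defined regions the student palpated at all.
    coverage = sum(1 for r in REGIONS if by_region[r]) / len(REGIONS)

    # Calibration: per region and overall, the share of presses whose depth
    # falls inside the target range.
    per_region = {r: sum(lo <= d <= hi for d in depths) / len(depths)
                  for r, depths in by_region.items() if depths}
    overall = sum(lo <= s.depth_mm <= hi for s in samples) / len(samples)
    return {"coverage": coverage, "overall": overall, "per_region": per_region}

# Example: a student who skips the epigastric area and once presses too lightly.
presses = [Sample("RUQ", 3.0), Sample("RUQ", 8.0), Sample("LUQ", 10.0),
           Sample("RLQ", 12.0), Sample("LLQ", 6.0)]
print(summarize(presses, LIGHT_RANGE))
# -> coverage 0.8 (4 of 5 regions), overall 0.8, RUQ calibration 0.5
```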
An incidental observation was that, on average, female students started off palpating more lightly than male students (who, for their part, more often palpated “too deeply”), perhaps because women on average weigh less or are more careful about not hurting people. The differences between men and women were somewhat reduced by the training.
One particular location that almost all students failed to palpate, even after training, was the epigastric area, i.e., the esophageal/gastric junction in the upper middle abdomen. Perhaps this is exceptionally difficult to explore adequately in real patients as well, suggesting that instructors should give special attention to this area. However, it may be a feature of this particular simulator: perhaps the sensors are not as sensitive in that region, which is near the edge of the sensor pad.
In addition to the simulator’s objective measurements, the students provided their assessments of the training (those who chose to do it) and of the general concept of simulator training (all participants). Despite the substantial improvements in objectively measured competence, there was no difference between the end-of-clerkship evaluations of participants who did and did not experience the abdominal simulator training. We had expected a difference. One interpretation is that participants who did not get simulation training chose not to because of difficulty in scheduling it (e.g., from their off-campus clinical site) rather than because of disinterest. An alternative interpretation would be that the training made little impression three weeks later.
Use of the abdominal simulator to measure student performance. In addition to studying the effect of the simulator’s tutorial, the other main focus of the study was the utility and validity of the simulator’s measurements. We asked whether the students’ previous clerkship rotations, especially surgery and internal medicine, produced better performance as measured by the abdominal simulator prior to the students’ training. Our results indicate that the initial measure of palpation competence was essentially level throughout the year; neither rotating through specialties in which abdominal palpation is necessary, nor the cumulative total of all previous clerkships, produced much improvement. We had expected that prior experience would improve student competence and that this improvement would be reflected in the simulator’s measures.
Assuming the measures are valid, what does this finding imply? Perhaps students acquire accurate palpatory calibration on each clerkship and then rapidly lose it when it is not reinforced daily. Perhaps this particular aspect of the skill is little emphasized elsewhere, because it is not viewed as the primary skill needed in the physical examination of real patients’ abdomens. Third-year medical students may hear about this element of the physical examination of the abdomen on their rotations, but not have much opportunity to practice it. There was a large range in the amount of previous experience students anecdotally reported to the researcher, depending on the particular clinics and physicians they shadowed. On the final questionnaire in this study they also reported a wide range of chances to palpate patient abdomens on the family medicine clerkship. A few students who exhibited clear competence, as observed by the researcher and measured by the simulator, explained that they had previously worked as emergency medical technicians or nurses. This underscores the important role of the opportunity to practice on numerous patients, presumably with feedback.
What does the increase in student performance after training prove, given that the simulator’s measurements are naturally tied to the particular calibration standards embodied in the simulator? At minimum, it indicates that the students learned, at least temporarily, to adjust their palpation of the simulator in accord with the instructions (indeed, it may be a very accurate measure of this particular subcomponent of abdominal examination competence), but it does not prove that they have learned the skill permanently, nor that they are competent to examine the wide range of body types, nor that they have mastered the broad range of examination skills taught by other physicians.
However, the fact that the very experienced students did well on the simulator’s measures suggests it captures an essential part of an integrated competence. Taken together with the flat initial scores across the year, this implies that most medical students are not mastering the integrated skill of the physical exploration of the abdomen during their third-year rotations. Many of the students reported that they had not previously been guided to focus in such detail upon their palpatory technique, and so they paid close attention to the simulator’s feedback on how they were doing.
Is there value in having the simulator train students to perform the palpatory exam to a very particular standard (calibrating to the level set by one expert) when patient body types and complaints may require a different range of pressures and techniques? We would argue that there is benefit in learning to execute a skill in a particular way, because in doing so one becomes aware of the range of possibilities. It does not stop the physician from recalibrating when the situation demands it; rather, it facilitates the ability to do so.
In conclusion, although we cannot be certain that the simulator’s assessment of competence is generally valid beyond confirming that students immediately adjust their calibration as the simulator instructs, there is considerable indirect evidence that it is. Presumably the students learn how it is possible to calibrate, and they become aware of depth and coverage of palpation as important elements of their palpatory exam, whether or not the particular depth and coverage to which they were trained here is the best one in general.
Limitations. An ideal study for assessing training with a simulator would be able to tell us the cost per long-term, stable improvement in competence, as compared with the costs and improvements of alternative training methods. The costs of training as done in this study would include buying or leasing the simulator and the time of the researcher who guided the student through the procedure. We did not test whether unsupervised students, following only the instructions of the tutorial program, would have the motivation to repeat the examination enough times to improve their skills.
Although we showed that the training produced improvement between the pre-training and immediate post-training measures of the particular subskills (depth and coverage), these are just part of the total skill of the physical examination of the abdomen, and other subskills merit supervised practice as well.
A major weakness, as we have acknowledged, is the non-randomized nature of the study, and the fact that even among study participants we had measures only from those who chose to arrange a session with the abdominal simulator. Thus we have competence measures only for those who were interested in getting the training. We cannot compare the competence of participants who studied with the simulator with that of those who did not, nor with that of non-participants (who did not even fill out the questionnaires). Hence one possibility that our non-randomized, non-comparative design cannot rule out is that only the least competent students chose to study, as self-remediation. The fact that fewer participants opted to spend the hour with the simulator as the year went by is consistent with that alternative explanation. It is less of a commitment for a program to provide a simulator as remediation than as a requirement for all students.
Our study design also could not tell us how long the effects of the training last. We do not have competence measures from the end of the month, so we cannot compare the effects of the simulator with the effects of the clinical experiences offered by the family medicine, internal medicine, and surgery clerkships. We did not investigate the difference between expensive, guided individual use of the simulator and less costly group instruction, or individual sessions guided only by the simulator’s computer program. Nor did we quantify the cost of the simulator per student: because this study focused on individual measurement, it used researchers rather than teachers, and did not compare individual instruction with team or group exercises with a simulator.
The AbSim simulator was used both to provide the training and to assess student performance on the trained skill. To allay concern about simply “training for the test,” it would be better to have an independent assessment of change in competence. To assess the accuracy of the simulator’s assessments, it could be used before and after a different form of training, and/or compared with an alternative, valid measurement of competence.