To the authors’ knowledge, this is the first study to evaluate the reliability of echocardiographic measurements by assessing their concordance with those of two board-certified operators, here considered the gold standard. Furthermore, this is the first study to include a large number of operators from multiple institutions with different levels of experience.
As already reported in the literature, our results show that intra-operator reliability is better than inter-operator reliability (Dukes-McEwan et al. 2002): only M-IVSs and M-LVWs show moderate intra-operator repeatability (M-IVSs = 0.71, M-LVWs = 0.73), while the other parameters range from good to excellent. In contrast, the inter-operator correlation coefficient does not reach excellent values for any echocardiographic parameter, and for the measurements that show poor reproducibility, such as M-IVSs (0.38), M-LVWs (0.44) and M-LVWd (0.27), interchangeability between operators is not recommended. The concordance correlation coefficients with the GS confirm the poor reliability of these parameters and of M-IVSd, and the violin plots, which graphically represent the GS-O% distribution, show that operators tend to overestimate their measurements compared with the GS; M-IVSs, M-IVSd, M-LVWs, and M-LVWd are the parameters with the highest deviation from the GS. These results indicate that some of the M-mode-derived parameters are the least reliable ones, in line with what Chetboul et al. (2003) reported for the cat in a study involving operators with different levels of experience. However, our results are in contrast with the study of Dukes-McEwan et al. (2002), which reported the lowest coefficients of variation for M-mode measurements. The authors hypothesize that this difference could be due to the varied experience of the operators in this study compared with that of Dukes-McEwan et al. (2002), in which only two experienced echocardiographers were involved.
In fact, measurements of the interventricular septum and the left ventricular free wall in systole and in diastole are greatly influenced by the alignment of the cursor, the blood-tissue interface, myocardial contractility, and the presence of the papillary muscles and the chordae tendineae, as well as by interference from the right ventricle for the interventricular septum and from the hyperechoic area of the pericardium for the left ventricular free wall. In contrast with measurements of the interventricular septum and the left ventricular free wall, the internal diameters of the left ventricle in systole and diastole show moderate to good inter-operator reproducibility (M-LVIDd = 0.73, M-LVIDs = 0.76) and good concordance with the GS (M-LVIDd = 0.81, M-LVIDs = 0.78). Furthermore, the cumulative curve of M-LVIDd shows a small deviation between the GS and the operators’ measurements (28% and 82% of operators show a deviation of 0.05 and 0.15 from the GS, respectively). The violin plots of M-LVIDd and M-LVIDs show that operators tend to underestimate these parameters, but the deviation from the GS measurement is low. The reliability of these parameters is fundamental, since they allow the operator to assess important aspects of the left ventricle, such as its systolic function and its degree of dilation.
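As a rough illustration of how the cumulative-curve figures quoted above can be read, the following Python sketch computes the share of operators whose measurement lies within a given deviation from the GS. It assumes that the deviation is the absolute difference from the GS value expressed as a fraction of that value, and that the curve is cumulative (at or below each threshold). The function name `fraction_within` and the sample numbers are hypothetical and are not data from this study.

```python
def fraction_within(gs_value, operator_values, thresholds=(0.05, 0.15)):
    """Share of operators whose relative deviation from the GS value
    is at or below each threshold (assumed definition, see lead-in)."""
    if gs_value == 0:
        raise ValueError("GS value must be non-zero to compute a relative deviation")
    if not operator_values:
        raise ValueError("at least one operator measurement is required")
    # Relative deviation of each operator's measurement from the GS value.
    deviations = [abs(v - gs_value) / abs(gs_value) for v in operator_values]
    n = len(deviations)
    return {t: sum(d <= t for d in deviations) / n for t in thresholds}


# Hypothetical example: one GS measurement and eight operator measurements.
gs = 40.0
operators = [39.0, 41.0, 37.0, 44.0, 45.0, 30.0, 40.5, 33.0]
shares = fraction_within(gs, operators)
print(shares)
```

With these made-up values, three of the eight operators fall within 0.05 of the GS and six of the eight within 0.15; a parameter whose curve rises quickly at small thresholds is one on which operators agree closely with the GS.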
The results show a poor inter-operator correlation coefficient for 2D-LVVs (0.11), yet the intra-operator correlation coefficient is good for this parameter (0.79); this could indicate that each operator tends to repeat the same error when measuring 2D-LVVs. The concordance correlation coefficient that compares the operators with the GS confirms the poor reliability of this parameter (0.47), and its cumulative curve shows the highest deviation from the GS of all the parameters (7.5% and 23.8% of operators show a deviation of 0.05 and 0.15 from the GS, respectively). The violin plot shows that operators tend to overestimate 2D-LVVs and confirms the high deviation from the GS. These results agree with the study of Dukes-McEwan et al. (2002), which reports a moderate to high coefficient of variation for 2D-LVVs (24.14%). A more recent study reported a high inter-observer intraclass correlation coefficient for this parameter; however, in the design of that study, images were obtained by a single operator and measurements were then made by 3 expert operators (Visser et al. 2019). Errors made by different operators in the measurement of 2D-LVVs could be due to the blood-tissue interface, which is more difficult to detect in systole than in diastole. Attention must be paid in theoretical and practical courses to teaching how to measure this parameter correctly, since overestimating it could lead to misinterpretation of the left ventricular systolic function obtained with 2D echocardiography. By contrast, moderate inter-operator reproducibility is demonstrated for 2D-LVLs (0.60), 2D-LVLd (0.71), and 2D-LVVd (0.63). The concordance correlation coefficients confirm a moderate agreement with the GS for these 2D-derived parameters.
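The pattern discussed above for 2D-LVVs, in which a measurement is repeatable yet agrees poorly with the GS, is what a concordance coefficient is designed to expose. The sketch below assumes the concordance correlation coefficient is Lin's coefficient, which is an assumption because the formula is not stated in this section; the function names and the sample values are hypothetical. It shows that a constant bias leaves the Pearson correlation at 1 while lowering the concordance.

```python
from math import sqrt
from statistics import fmean


def _moments(x, y):
    """Means, population variances and covariance of two paired series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired series with at least two values")
    mx, my = fmean(x), fmean(y)
    n = len(x)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation: sensitive to scatter, blind to systematic bias."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / sqrt(sxx * syy)


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (assumed definition):
    2*cov / (var_x + var_y + (mean_x - mean_y)^2)."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical values: the operator reads exactly 1 unit above the GS every time.
gs_values = [1.0, 2.0, 3.0, 4.0]
operator_values = [2.0, 3.0, 4.0, 5.0]

print(pearson_r(gs_values, operator_values))  # perfectly correlated
print(lin_ccc(gs_values, operator_values))    # penalised for the constant bias
```

In this made-up case the operator is perfectly consistent, so the correlation is 1, but the concordance drops to 5/7 because every reading is shifted away from the GS.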
The inter-operator correlation coefficient does not reach excellent values for any parameter, but it shows good reproducibility for several measurements, such as the aortic annulus (0.80) and the aortic root measurements obtained in the left parasternal left ventricular outflow view (VLS = 0.89, STJ = 0.84, AA = 0.79). A moderate inter-operator correlation coefficient is reported for the pulmonic valve annulus (0.67). Although the aortic root measurements (VLS, STJ and AA) have good inter-operator correlation coefficients, the analysis of the concordance with the GS shows that only VLS reaches an excellent value (0.92), while STJ confirms a good agreement with the GS (0.80) and AA reaches only a moderate value (0.57). Measurement of the pulmonic valve annulus confirms a moderate concordance with the GS (0.65), and the aortic annulus remains a reliable measurement, with a good concordance correlation coefficient with the GS (0.85). Cumulative curves confirm that VLS is the parameter with the smallest deviation between the GS and the operators’ measurements (48.7% and 93.4% of operators show a deviation of 0.05 and 0.15 from the GS, respectively), while violin plots show that operators tend to underestimate AVA, PVA, VLS, STJ and AA. Based on these results, measurement of the aortic annulus is interchangeable between operators; this may be because it is measured in early systole, while the valve is open, so the blood-tissue interface is clear. In contrast, AA is more difficult to obtain correctly because it is a derived measurement: it is acquired at a distance from the sinotubular junction equal to the diameter of the sinotubular junction itself. The reliability of the aortic annulus and root measurements demonstrated in this study is very important, since measurement and evaluation of the aorta are pivotal for the diagnosis of congenital heart diseases.
The moderate values obtained for the pulmonic valve, by contrast, suggest that greater attention should be paid to teaching how to obtain this parameter, since interchangeability in its measurement cannot yet be confirmed.
Lastly, the analysis of the answers gathered regarding the operators’ experience shows that only the number of echocardiographic examinations performed per month significantly reduces the variability with respect to the GS. Therefore, specialist clinical activity, more than acquired theoretical knowledge, affects the reproducibility of the echocardiographic examination.
This study has some limitations. Firstly, data were acquired during practical courses held by the GSs, so the operators could have been influenced by what they had learnt during the course. This may have induced a degree of standardization among the operators that would otherwise have been absent. Secondly, not all the echocardiographic measurements normally used in clinical practice were included in this study, so it would be interesting to evaluate these other measurements in the future using the same scheme proposed here. Lastly, two different ultrasound machines were used and different dogs were enrolled in the study; together with the different experience of the operators involved, this could have been responsible for the large differences between the inter-operator correlation coefficients and the concordance correlation coefficients. However, the fixed effects of the variables “dog” and “operator” were included in all the analyses, so these factors are accounted for; furthermore, the different experience of the operators must be regarded as a strength, because it reflects a real population of veterinary echocardiographers.
In conclusion, this is the first study to assess the reproducibility and repeatability of numerous echocardiographic parameters through the analysis of data acquired from a large number of operators with different levels of experience and through comparison with two board-certified operators. Furthermore, this is the first time that the effect of the operators’ experience, both theoretical and practical, on the reliability of the echocardiographic examination has been assessed. The results show that M-IVSs, M-IVSd, M-LVWs, M-LVWd, and 2D-LVVs are the least reliable parameters, since they show a high deviation from the GS and/or a poor inter-operator correlation coefficient. Attention must be paid in theoretical and practical courses to teaching how to measure these parameters correctly. Conversely, M-LVIDd, M-LVIDs, AVA, VLS, and STJ are the most reliable echocardiographic measurements, demonstrating that all the operators are able to correctly assess left ventricular systolic function and dilation, as well as to precisely evaluate the aortic annulus and root, which are fundamental aspects for the diagnosis of congenital and acquired heart diseases. Finally, specialist clinical activity affects the reliability of the echocardiographic examination.