We conducted a psychometric analysis using a sample of 1638 Austrian citizens who completed an online COVID-19 symptom checklist on up to 20 days during a period of restrictive country-wide COVID-19 measures. After the first confirmed cases of COVID-19 in Austria on 25 February 2020, the Austrian government ordered nationwide infection control measures from 16 March 2020 onwards. Public life in Austria remained severely affected until the first easing measures were implemented in mid-April 2020 [14]. The self-reported online COVID-19 symptom checklist used in the present study was therefore available from 22 March to 30 April 2020.
We developed the checklist based on the WHO symptom descriptions for COVID-19 [15] and included fever, fatigue, cough, dry cough/no sputum production, pain in limbs, sore throat, headache, shortness of breath, chills, vomiting, diarrhea, nasal congestion, sneezing, sniffles/rhinitis and smell and taste disorders (Supplement Table A). Response options and scores for each item were as follows: ‘Yes’ (scoring 1), ‘No’ (0) and ‘I can’t say that’ (3). For the analysis, we dichotomized the items by collapsing ‘No’ and ‘I can’t say that’ because we assumed that both answers indicated that the participant had not experienced the symptom in question. In addition, we asked the participants to state gender, age group, highest completed education, current smoking status, body height and weight, whether any type of COVID-19 test had been conducted and, if so, what the result of this test was, whether any comorbidities existed (nervous system, cardiovascular, gastrointestinal, liver, kidney, oncologic, high blood pressure and/or diabetes), whether the participant was taking immunosuppressive medication and whether the participant was pregnant (females only). We used a self-reported code consisting of numbers and letters, derived from first-name initials, the initials of relatives’ first names and birth months, to allocate multiple assessments to the correct participants while preserving anonymity. Due to the psychometric nature of this study, only complete cases were included. The relevant ethical committees approved the study (Medical University of Vienna 1379/2020, Medical University of Innsbruck 1076/2020 and ethical committee of the region Vorarlberg).
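The dichotomization step can be illustrated with a minimal Python sketch. It is not the code used in the study; the function names and the example responses are hypothetical, and only the response codes (1, 0, 3) are taken from the description above.

```python
# Response codes as described in the text:
# 'Yes' = 1, 'No' = 0, "I can't say that" = 3.
YES, NO, CANT_SAY = 1, 0, 3


def dichotomize(response: int) -> int:
    """Collapse 'No' and "I can't say that" into 0; keep 'Yes' as 1."""
    if response not in (YES, NO, CANT_SAY):
        raise ValueError(f"unexpected response code: {response}")
    return 1 if response == YES else 0


def dichotomize_checklist(responses: dict) -> dict:
    """Apply the dichotomization to every item of one completed checklist."""
    return {item: dichotomize(code) for item, code in responses.items()}


# Hypothetical checklist with one response of each kind.
example = {"fever": YES, "cough": NO, "headache": CANT_SAY}
print(dichotomize_checklist(example))
```

Raising an error on unknown codes is a design choice of this sketch; it keeps incomplete or malformed records from silently entering a complete-case analysis.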
Fit to the Rasch measurement model
Overall and item-based fit to the Rasch model was explored in a series of dichotomous models [16] using two different data sets: one containing the questionnaire as filled in for the first time when the participants entered the study, and another in which a symptom was recorded as affirmed by a participant if it was ticked at least once during the period of nationwide restrictive measures. We used raw scores without weighting [17] and determined the hierarchy of items based on their location parameters. The item location parameter reflects how likely an item is to be affirmed; because items differ in this likelihood, they can be ordered into a hierarchy. Likewise, a hierarchy of persons can be established based on the likelihood that a person affirms more or fewer items.
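As an illustration of the two data sets and of how location parameters order the items, the following Python sketch may help. It assumes already dichotomized daily assessments and uses the standard dichotomous Rasch formula; the function names, the example person and the item locations are hypothetical and are not estimates from the study.

```python
import math


def first_assessment(days: list) -> dict:
    """Data set 1: the checklist as filled in on the first day of participation."""
    return dict(days[0])


def ever_affirmed(days: list) -> dict:
    """Data set 2: a symptom counts as affirmed if ticked on at least one day."""
    return {item: int(any(day[item] == 1 for day in days)) for item in days[0]}


def rasch_probability(theta: float, b: float) -> float:
    """Probability of affirming an item under the dichotomous Rasch model.

    theta is the person location and b the item location, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_hierarchy(locations: dict) -> list:
    """Items ordered from most to least likely to be affirmed (ascending location)."""
    return sorted(locations, key=locations.get)


# Hypothetical participant with three daily assessments (already dichotomized).
days = [
    {"fever": 0, "cough": 1},
    {"fever": 1, "cough": 0},
    {"fever": 0, "cough": 0},
]
print(first_assessment(days))
print(ever_affirmed(days))

# Hypothetical item locations in logits (not results of the study).
hypothetical_locations = {"fever": 0.4, "vomiting": 2.1, "fatigue": -1.2}
print(item_hierarchy(hypothetical_locations))
```

In this parameterization an item with a lower location is more likely to be affirmed by a person at any given location, which is why ascending order yields the hierarchy from most to least frequently affirmed.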
Item fit residuals between −2.5 and +2.5 with non-significant F-tests represented individual item fit. Non-significant chi-squared values were interpreted as fit to the latent trait. Local dependency between items was determined using residual correlations based on a cut-off of 0.2 above the mean [18]. To assess the instrument’s item-based internal consistency and reliability, we compared Cronbach’s alpha with the person separation index (PSI). The PSI refers to the reproducibility of relative measure location and indicates whether a scale is able to distinguish between people with higher and lower levels of the concept measured by the instrument [19]; in general, a PSI ≥ 0.7 indicates that the instrument is sufficiently suitable for group comparisons.
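Two of the quantities mentioned above can be sketched in a few lines of Python: Cronbach’s alpha computed from raw item scores, and the flagging of locally dependent item pairs whose residual correlation exceeds the mean residual correlation by more than 0.2. This is a simplified illustration under stated assumptions, not the RUMM2030 or R routines used in the study; the data layout, function names and example values are hypothetical.

```python
from statistics import mean, pvariance


def cronbach_alpha(rows: list) -> float:
    """Cronbach's alpha; rows holds one list of item scores per person."""
    k = len(rows[0])
    item_variances = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def flag_local_dependency(residual_correlations: dict, margin: float = 0.2) -> list:
    """Return item pairs whose residual correlation exceeds mean + margin.

    residual_correlations maps each item pair (off-diagonal entries only)
    to the correlation of the two items' Rasch residuals.
    """
    cutoff = mean(residual_correlations.values()) + margin
    return sorted(pair for pair, r in residual_correlations.items() if r > cutoff)


# Hypothetical dichotomous scores of four persons on three items.
scores = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
print(cronbach_alpha(scores))

# Hypothetical residual correlations for three item pairs.
residuals = {
    ("cough", "dry cough"): 0.45,
    ("chills", "fever"): 0.05,
    ("cough", "fever"): -0.10,
}
print(flag_local_dependency(residuals))
```

The sketch uses population variances throughout; using sample variances consistently for items and totals gives the same alpha because the correction factor cancels in the ratio.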
Unidimensionality
To test unidimensionality, we used an approach proposed by Smith [20] and combined principal component analysis of the item residuals with a series of t-tests to assess whether subsets of items whose residuals loaded positively or negatively resulted in different estimates of person parameters. These sets of items were chosen to maximize the contrast between them and were thus most likely to violate the assumption of unidimensionality [21].
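The t-test step of this approach can be sketched as follows. The sketch assumes that, for each person, two location estimates with standard errors are already available (one from the positively loading and one from the negatively loading item subset). The critical value of 1.96 is an assumption of this illustration, chosen because it is the usual two-sided 5% threshold; the text above does not state the criterion that was applied, and all names and values are hypothetical.

```python
import math


def proportion_significant_t(estimates: list, critical: float = 1.96) -> float:
    """Share of persons whose two location estimates differ significantly.

    estimates holds one tuple (theta_pos, se_pos, theta_neg, se_neg) per person,
    i.e. the person estimate and its standard error from each item subset.
    """
    significant = 0
    for theta_pos, se_pos, theta_neg, se_neg in estimates:
        t = (theta_pos - theta_neg) / math.sqrt(se_pos ** 2 + se_neg ** 2)
        if abs(t) > critical:
            significant += 1
    return significant / len(estimates)


# Hypothetical estimates (logits) for four persons.
hypothetical_estimates = [
    (0.5, 0.6, 0.4, 0.6),
    (2.0, 0.5, -1.0, 0.5),
    (-0.2, 0.7, 0.1, 0.7),
    (1.0, 0.8, 0.9, 0.8),
]
print(proportion_significant_t(hypothetical_estimates))
```

A small proportion of significant tests is commonly read as support for unidimensionality; the exact threshold used for that judgment is a convention and is not taken from the text above.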
Differential item functioning
Differential item functioning was assessed separately for each item by comparing the expected responses to a specific item between respondents from different sub-groups who shared the same likelihood to affirm a certain number of items. The sub-groups were built based on gender (female, male, divers/other), age group (10 sub-groups listed in Table 1), highest completed education (6 levels listed in Table 1), COVID-19 test status (pos/neg/no test), comorbidities (yes/no), immunosuppressive medication (yes/no), pregnancy (yes/no), current smoking status (yes/earlier, but not now/never) and body mass index (BMI; above versus below median). As only seven participants indicated ‘divers/other’ for gender, we did not include this category in the differential item functioning analysis and compared only female and male participants. If differential item functioning was apparent in an item for a personal factor with more than two categories, we determined between which sub-groups these differences occurred using post hoc analysis of the residual means.
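The idea behind differential item functioning can be illustrated descriptively: among respondents with the same raw score, the proportion affirming an item should not depend on sub-group membership. The Python sketch below only tabulates these proportions and is not the formal, residual-based test applied in the study; the record layout, the function name and the example data are hypothetical.

```python
from collections import defaultdict


def affirmation_rates_by_stratum(records: list, item: str) -> dict:
    """Proportion affirming `item` for each (raw score, sub-group) cell.

    Each record is a dict with the keys 'group', 'total' (raw sum score)
    and the item name (dichotomized response, 0 or 1).
    """
    counts = defaultdict(lambda: [0, 0])  # (total, group) -> [affirmed, n]
    for record in records:
        cell = counts[(record["total"], record["group"])]
        cell[0] += record[item]
        cell[1] += 1
    return {key: affirmed / n for key, (affirmed, n) in counts.items()}


# Hypothetical respondents who all share a raw score of 3.
hypothetical_records = [
    {"group": "female", "total": 3, "headache": 1},
    {"group": "female", "total": 3, "headache": 1},
    {"group": "male", "total": 3, "headache": 0},
    {"group": "male", "total": 3, "headache": 1},
]
print(affirmation_rates_by_stratum(hypothetical_records, "headache"))
```

Large and consistent differences between sub-groups within the same score stratum would point to an item that warrants a formal test; with samples as small as in this toy example, such differences carry no evidential weight.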
Table 1
Participant characteristics
Personal factors/characteristics | Frequencies |
---|---|
Total number of participants N (%) | 1638 (100%) |
Gender n (%) | |
Female | 1088 (66.4%) |
Male | 543 (33.2%) |
Divers/other | 7 (0.4%) |
Pregnancy (women only) n (%) | 17 (1.6% of the women) |
Age groups n (%) | |
0–9 Years | 21 (1.3%) |
10–19 Years | 31 (1.9%) |
20–29 Years | 421 (25.7%) |
30–39 Years | 451 (27.5%) |
40–49 Years | 262 (16.0%) |
50–59 Years | 264 (16.1%) |
60–69 Years | 137 (8.4%) |
70–79 Years | 40 (2.4%) |
80–89 Years | 10 (0.6%) |
≥ 90 Years | 1 (0.1%) |
Highest education n (%) | |
Unfinished compulsory education | 37 (2.3%) |
Completed compulsory education | 30 (1.8%) |
Completed apprenticeship | 140 (8.5%) |
Completed post-secondary non-tertiary education | 453 (27.7%) |
Completed first stage of tertiary education | 718 (43.8%) |
Completed second stage of tertiary education | 260 (15.9%) |
COVID-19 tested n (%) | |
Positive | 16 (1%) |
Negative | 187 (11.4%) |
Not tested | 1435 (87.6%) |
Comorbidities n (%) | |
Yes | 359 (21.9%) |
No | 1279 (78.1%) |
Immunosuppressive medication n (%) | |
Yes | 45 (2.7%) |
No | 1593 (97.3%) |
Note. N (%) = total number of participants (percentage); n (%) = number of participants (percentage)
Person-item targeting
Person-item targeting was inspected graphically using a person-item map.
Transformation to a metric interval scale
Based on the logit scale from the Rasch model, we transformed the raw sum scores into a metric scale. If differential item functioning existed for a personal factor, we split the respective item and performed a separate metric transformation for each sub-group. We used the differences between these metric scales to adjust the scores for the respective personal factor, e.g. for people with and without comorbidities. All analyses were performed with Microsoft Excel, RUMM2030, or the eRm and ltm packages in R (www.r-project.org).
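A transformation of this kind can be sketched as a linear rescaling of the logit estimate that belongs to each raw sum score. The sketch below is an assumption-laden illustration: the target range of 0 to 100, the function name and the logit values are hypothetical and are not taken from the study, which does not state the range of its metric scale here.

```python
def logits_to_metric(score_to_logit: dict, lo: float = 0.0, hi: float = 100.0) -> dict:
    """Linearly rescale the logit estimate of each raw score to the range [lo, hi]."""
    min_logit = min(score_to_logit.values())
    max_logit = max(score_to_logit.values())
    span = max_logit - min_logit
    return {
        score: lo + (logit - min_logit) * (hi - lo) / span
        for score, logit in score_to_logit.items()
    }


# Hypothetical raw-score-to-logit table for a short scale (not study results).
hypothetical_logits = {0: -4.0, 1: -2.0, 2: -1.0, 3: 0.0, 4: 4.0}
print(logits_to_metric(hypothetical_logits))
```

In the hypothetical table, equal steps in the raw score correspond to unequal steps on the transformed scale, which is the reason for working with the logit-based metric rather than with raw sums. For an item that was split because of differential item functioning, the same function could be applied to each sub-group’s own score-to-logit table, provided the sub-group estimates are anchored on a common logit metric beforehand.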