We conducted several online pandemic survey packages for different groups, covering both the general national population and specific populations. For instrument evaluation, we selected data from two surveys that used the same three instruments and whose data were entered on approximately the same date.
This study used a quantitative approach with a cross-sectional survey design. A total of 765 respondents from West Java Province, Indonesia, participated. Respondents were recruited by convenience sampling and completed an online survey distributed via a mailing list or online platform. With regard to ethical considerations, the participants’ consent to take part in this study was obtained before they filled in the questionnaire. The front page of the online questionnaire stated that respondents were free to choose whether to participate in the survey. Participation was strictly voluntary and anonymous; thus, by completing the questionnaire, the respondents provided their consent. The demographic profiles of the respondents are shown in Table 1.
Table 1
Demographic Data of Respondents (N = 765)

| Demographics | Frequency | Percentage |
|---|---|---|
| Gender | | |
| Male | 433 | 56% |
| Female | 332 | 44% |
| Age | | |
| Less than 21 years | 79 | 10% |
| 21–40 years | 380 | 49% |
| 41–55 years | 221 | 29% |
| More than 55 years | 85 | 11% |
| Marital Status | | |
| Single | 191 | 25% |
| Married | 532 | 69% |
| Divorced | 42 | 6% |
Instruments
Three instruments were used in this study: the SRQ (Self-Reporting Questionnaire), a PTSD (post-traumatic stress disorder) instrument, and the CESD-10 (Center for Epidemiological Studies Depression Scale). The Self-Reporting Questionnaire was developed by the WHO to screen for psychiatric disturbance. In Indonesia, the 20-item version (SRQ-20) is widely used in primary health centers, hospitals, and the Ministry of Health to screen for psychiatric symptoms such as neurotic symptoms, substance abuse, and psychotic symptoms. The SRQ-20 is widely used for screening because it is easy to use, takes little time to complete, and is self-reported with two response choices (Yes or No) [10-17].
The PTSD instrument measures psychological conditions and consists of five items with two response choices (Yes or No). In clinical settings, it is used to help health professionals identify emotional disturbance in daily practice. The diagnostic code for PTSD is F43.10; the disorder is characterized by persistent re-experiencing symptoms, persistent avoidance of stimuli associated with the trauma, and persistent symptoms of increased arousal that were not present before the trauma [18,19]. Meanwhile, the CESD-10 is a screening tool for depression consisting of 10 items, each with five response choices (0 to 4).
The measurement model
Rasch model analysis was employed in this study to analyze the data collected from the three questionnaires. This model was chosen because it can provide accurate and precise measurements of the latent trait, here mental health. According to Wright and Mok, a good and valid measurement model has to follow five measurement principles for human science: it must (a) yield a linear measure, (b) overcome missing data, (c) provide a precision estimate, (d) discover outliers or misfits, and (e) be replicable [20]. Compared with other measurement models, the Rasch model fulfills these principles.
The Rasch model belongs to a larger family of measurement models, called item response theory (IRT), which transforms raw ordinal data, using probability and logarithms, into equal-interval scale data expressed in logits (log-odds units). The Rasch model has been used widely to analyze psychometric data in many fields, such as educational research, language assessment, and health sciences [21-25]. Rasch model analysis also provides an effective alternative for investigating the psychometric properties of cognitive and non-cognitive instruments and for addressing response bias [26,27].
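The logit transformation and the dichotomous Rasch model can be illustrated with a minimal sketch. The function names and the symbols `theta` (person measure) and `b` (item difficulty) are illustrative choices, not taken from the software used in this study.

```python
import math


def logit(p: float) -> float:
    """Convert a probability (0 < p < 1) into log-odds units (logits)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def rasch_probability(theta: float, b: float) -> float:
    """Dichotomous Rasch model.

    Probability that a person with measure `theta` endorses ("Yes") an item
    with difficulty `b`, both expressed in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


if __name__ == "__main__":
    # A person located 0.7 logits above an item (hypothetical values).
    p = rasch_probability(theta=1.2, b=0.5)
    print(f"P(Yes) = {p:.3f}, log-odds = {logit(p):.3f}")
```

The log-odds of endorsing an item equal the difference between the person measure and the item difficulty, which is what places persons and items on the same interval scale.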
All collected raw data were entered into a Microsoft Excel file and then checked with the Rasch measurement software WINSTEPS version 3.73 for data validation and cleaning. No missing data were found. Rasch analysis was performed using the Rasch Rating Scale Model (RSM), an extension of the dichotomous Rasch model developed by Andrich [24].
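As a rough illustration of how the RSM extends the dichotomous case, the sketch below computes category probabilities for one person and one item. It assumes the standard Andrich formulation with thresholds shared across items; the function name, argument names, and example values are hypothetical, and the estimation procedure used by WINSTEPS is not reproduced here.

```python
import math
from typing import List, Sequence


def rsm_category_probabilities(
    theta: float, b: float, taus: Sequence[float]
) -> List[float]:
    """Category probabilities under the Rating Scale Model.

    theta: person measure (logits)
    b:     item difficulty (logits)
    taus:  step thresholds shared by all items; len(taus) + 1 categories
    """
    # Exponent for category k is the cumulative sum of (theta - b - tau_j)
    # over the first k steps; category 0 has exponent 0.
    exponents = [0.0]
    for tau in taus:
        exponents.append(exponents[-1] + (theta - b - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical five-category item (scored 0 to 4) with four thresholds.
    probs = rsm_category_probabilities(0.3, -0.2, [-1.5, -0.5, 0.5, 1.5])
    for category, p in enumerate(probs):
        print(category, round(p, 3))
```

With a single threshold fixed at zero, the function reduces to the dichotomous Rasch model, which is the sense in which the RSM is an extension of it.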
To validate the three instruments used in this study, the content validity and internal consistency reliability of each scale were determined. The first stage was an instrument-level evaluation, in which data fit to the model, the chi-square test, item reliability, item separation, and Cronbach’s alpha were examined. The unidimensionality requirement was investigated using Principal Components Analysis of the Rasch measures and residuals. The data can be considered fundamentally unidimensional if the Rasch measures explain a sufficiently high percentage of the variance (at least 20% for dichotomous data, which applies to the SRQ-20 and PTSD questionnaires, and at least 40% for polytomous data, which applies to the CESD-10 questionnaire) and the eigenvalue of the first residual component of the unexplained variance is less than 3 [28].
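The unidimensionality rule stated above can be written as a short check. This is only a sketch of the decision rule; the function and argument names are hypothetical, and the input values would come from the software output.

```python
def is_unidimensional(
    explained_variance_pct: float,
    first_residual_eigenvalue: float,
    polytomous: bool,
) -> bool:
    """Apply the thresholds described in the text.

    explained_variance_pct:    variance explained by the Rasch measures (%)
    first_residual_eigenvalue: eigenvalue of the first residual component
    polytomous:                True for rating-scale data, False for Yes/No data
    """
    minimum_explained = 40.0 if polytomous else 20.0
    return (
        explained_variance_pct >= minimum_explained
        and first_residual_eigenvalue < 3.0
    )


if __name__ == "__main__":
    # Hypothetical values for a dichotomous instrument.
    print(is_unidimensional(27.5, 2.1, polytomous=False))
```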
Rating scale analysis was a further examination, used to check how effectively the rating scale presented to respondents functioned. Linacre has suggested several criteria for diagnosing a malfunctioning empirical rating scale [29]. A rating scale can be considered to function well when: a) there are at least 10 observations in each category, b) the average person measures by category increase monotonically with the rating scale, c) the outfit MnSq is less than 1.5 for each step, and d) step difficulties advance by no less than 1.4 and no more than 5 logits [30].
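The four criteria above can be sketched as a simple diagnostic routine. The function name, argument names, and example values are hypothetical; the thresholds are those stated in the text.

```python
from typing import List, Sequence


def rating_scale_problems(
    counts: Sequence[int],
    average_measures: Sequence[float],
    outfit_mnsq: Sequence[float],
    step_difficulties: Sequence[float],
) -> List[str]:
    """Return the criteria (a to d) that a rating scale fails to meet."""
    problems = []
    # a) at least 10 observations in each category
    if any(c < 10 for c in counts):
        problems.append("a) fewer than 10 observations in a category")
    # b) average person measures increase monotonically across categories
    pairs = zip(average_measures, average_measures[1:])
    if any(later <= earlier for earlier, later in pairs):
        problems.append("b) average measures do not increase monotonically")
    # c) outfit MnSq below 1.5 everywhere
    if any(m >= 1.5 for m in outfit_mnsq):
        problems.append("c) outfit MnSq of 1.5 or more")
    # d) adjacent step difficulties advance by 1.4 to 5 logits
    gaps = [b - a for a, b in zip(step_difficulties, step_difficulties[1:])]
    if any(g < 1.4 or g > 5.0 for g in gaps):
        problems.append("d) step advance outside 1.4 to 5 logits")
    return problems


if __name__ == "__main__":
    # Hypothetical five-category scale.
    print(rating_scale_problems(
        counts=[50, 120, 200, 150, 60],
        average_measures=[-1.2, -0.4, 0.3, 1.1, 2.0],
        outfit_mnsq=[1.1, 0.9, 1.0, 0.95, 1.2],
        step_difficulties=[-2.4, -0.8, 0.7, 2.5],
    ))
```

An empty list indicates that none of the four criteria is violated.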
Because the number of respondents was more than 500, the quality of each item was checked using two indices [22, 31]: the Outfit MnSq and the point-measure correlation (Pt-Measure Corr). The Wright map for each instrument was displayed to show the spread of item difficulties and respondent abilities comprehensively, as can be seen in Figures 1, 2 and 3.
The presence of items that functioned differently across groups, a form of response bias, was explored for each demographic variable of the respondents using differential item functioning (DIF) analysis. A moderate item DIF was considered to be present if the difference in item difficulty between groups of a demographic variable (such as male and female) fulfilled three criteria: a t-value of less than −2.0 or more than 2.0, a DIF contrast of less than −0.5 or more than 0.5, and a p (probability) value of less than 0.05 [22, 23, 32].
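The three DIF criteria can be combined into one decision rule, sketched below. The function and argument names are hypothetical; the statistics themselves would be taken from the DIF output of the analysis software.

```python
def has_moderate_dif(t_value: float, dif_contrast: float, p_value: float) -> bool:
    """Flag an item as showing moderate DIF between two groups.

    All three criteria from the text must hold:
    |t| > 2.0, |DIF contrast| > 0.5 logits, and p < 0.05.
    """
    return abs(t_value) > 2.0 and abs(dif_contrast) > 0.5 and p_value < 0.05


if __name__ == "__main__":
    # Hypothetical item compared between male and female respondents.
    print(has_moderate_dif(t_value=-2.6, dif_contrast=-0.62, p_value=0.01))
```

Requiring all three conditions together guards against flagging items whose group difference is statistically significant but too small to matter in practice, which is a particular concern with a sample of this size.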