Setting
The two surveys (i.e., pre-survey and formal survey) were carried out in eight hospitals in Shanxi Province, China. These hospitals were the First Hospital of Shanxi Medical University, the Second Hospital of Shanxi Medical University, Shanxi Cancer Hospital, the 264 Hospital of Chinese People's Liberation Army (PLA), the 17th Hospital of the Chinese Railway, the People’s Hospital of Gaoping City, the People’s Hospital of Zezhou City, and the Fourth People’s Hospital of Linfen City.
Sample
Before collecting samples, investigators contacted the relevant departments of the target hospitals and communities to obtain support from hospital staff and community workers. The study was publicized through posters in hospital departments and communities, and documents introducing the survey were distributed. From July 2015 to September 2015, patients diagnosed with GC were recruited. The inclusion criteria were as follows: patients who had been diagnosed with GC and were over 18 years old. The exclusion criteria were as follows: patients with other serious diseases; patients with disturbance of consciousness; and patients who were unable to understand or complete the questionnaire for any reason. We simultaneously selected healthy subjects who lived in the same communities as the patients. Healthy subjects met the following criteria: they were not suffering from other diseases of the digestive system, other malignant tumors, or mental illness; they were similar in age to the patients with GC; and they volunteered to participate in the investigation.
Development and formation of GC-PROM
The GC-PROM was developed in three phases[21], and details of each phase are described below. Figure 1 presents a flowchart of the three-phase development process.
Phase 1: Identification of conceptual framework and items
Literature searches and patient interviews
Literature searches were carried out in online databases using keywords such as PRO measure, PRO scale, PRO instruments, and gastric cancer. Based on the FDA’s principles for PROMs and the search results, we established a conceptual framework for the GC-PROM comprising four domains and 13 subdomains. We conducted face-to-face interviews with ten patients with GC. Researchers recorded the interviewees’ original words as closely as possible. After the interviews, all information was sorted and an initial item pool was developed.
Cognitive test and expert consultation
Ten other hospitalized patients with GC took part in a cognitive test of the questionnaire. The group included seven men and three women, with an average age of 54 years. We also sought the views of experts. In the final step, we integrated the views of experts and patients to modify the items and develop the draft version of the GC-PROM.
Scale scoring
Items were answered on five-point Likert scales scored from zero to four points. The scale included positive items (higher scores indicating higher QoL) and negative items (higher scores indicating lower QoL). For convenience of calculation, positive items were recoded as the original score plus one point, and negative items were recoded as five minus the original score[22].
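The recoding rule can be illustrated with a minimal Python sketch. The function name `recode_item` and its `positive` flag are hypothetical and introduced only for illustration; the arithmetic follows the rule stated above.

```python
def recode_item(raw_score: int, positive: bool) -> int:
    """Recode a raw 0-4 Likert response using the rule stated in the text.

    `recode_item` and `positive` are illustrative names, not part of the
    GC-PROM itself.
    """
    if raw_score not in range(5):
        raise ValueError("raw_score must be an integer from 0 to 4")
    # Positive items: original score plus one point (0..4 -> 1..5).
    # Negative items: five minus the original score (0..4 -> 5..1).
    return raw_score + 1 if positive else 5 - raw_score


positive_recoded = [recode_item(s, positive=True) for s in range(5)]
negative_recoded = [recode_item(s, positive=False) for s in range(5)]
print(positive_recoded)  # [1, 2, 3, 4, 5]
print(negative_recoded)  # [5, 4, 3, 2, 1]
```

After recoding, both item types lie on a one-to-five scale on which a higher recoded score corresponds to a more favorable response.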
Phase 2: Formation of initial and final scales using two item-selection processes
During the formation of the GC-PROM, seven methods were used to select items through two item-selection processes. The first six methods were based on classical test theory (CTT). Item response theory (IRT) was used as the seventh method; one IRT model, Samejima’s Graded Response Model, was the preferred method for statistically analyzing patients’ latent traits[23]. An item was considered for selection if it was retained by six or more methods. In the pre-survey, an item’s practical significance was considered before deletion: if the item was meaningful in practice, it was temporarily retained and screened again in the formal survey. The item was finally removed if deletion was still suggested in the formal survey.
Statistical methods
Seven methods were used to evaluate the items:
- When the standard deviation (SD) of an item was ≤ 1, the corresponding item was deleted[24].
- We deleted items whose factor loadings in the exploratory factor analysis were low (< 0.4) or close to their loadings on other factors[25].
- An item was considered for deletion when the Pearson correlation coefficient for the item and its subdomain was < 0.60 or the Pearson correlation coefficient for the item and another subdomain was > 0.50[25].
- An item was considered for deletion when the corrected item-total correlation was < 0.50 and the item’s deletion increased the value of Cronbach’s alpha coefficient[24].
- Items with low test-retest correlation coefficients (< 0.6) were removed[26].
- In the discrimination analysis, the scores of patients and healthy subjects on each item were compared using a t-test. Deletion was recommended for items with P values > 0.05[23].
- In the Graded Response Model, the item parameter values indicating deletion were as follows: item discrimination parameter (a) < 0.4 or difficulty parameter (b) outside the range (–3, 3)[27].
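The voting rule described in Phase 2 (an item is selected when at least six of the seven methods retain it) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example values are hypothetical, only three of the seven per-method checks are spelled out, and the verdicts of the remaining methods are supplied directly as booleans.

```python
from typing import Mapping


def retained_by_sd(sd: float) -> bool:
    # The text deletes an item when its SD is <= 1.
    return sd > 1.0


def retained_by_retest(r: float) -> bool:
    # The text removes items with a test-retest correlation < 0.6.
    return r >= 0.6


def retained_by_discrimination(p_value: float) -> bool:
    # The text recommends deletion when the t-test P value is > 0.05.
    return p_value <= 0.05


def is_selected(verdicts: Mapping[str, bool], min_votes: int = 6) -> bool:
    """Select an item when at least `min_votes` of the seven methods retain it."""
    if len(verdicts) != 7:
        raise ValueError("expected one verdict per method (seven in total)")
    return sum(verdicts.values()) >= min_votes


# Hypothetical item: only the test-retest criterion suggests deletion.
verdicts = {
    "sd": retained_by_sd(1.12),
    "factor_analysis": True,
    "subdomain_correlation": True,
    "cronbach_alpha": True,
    "retest": retained_by_retest(0.55),
    "discrimination": retained_by_discrimination(0.01),
    "irt": True,
}
print(is_selected(verdicts))  # True (six of seven methods retain the item)
```

Keeping each method's verdict as a separate boolean makes it easy to report which criteria flagged an item before its practical significance is reviewed.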
Phase 3: Evaluation of measurement properties
The properties of the final GC-PROM version were assessed by using data from a formal investigation.
Evaluation of reliability
The internal consistency of the GC-PROM was assessed using Cronbach’s alpha coefficients for the 13 subdomains. Generally, a value greater than 0.70 indicates good internal consistency[28].
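For reference, Cronbach's alpha for a subdomain with k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). A minimal sketch using only the Python standard library is given below; the function name `cronbach_alpha` and the small response matrix are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subdomain.

    `responses` holds one row per respondent, each row listing that
    respondent's recoded scores on the subdomain's items.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four respondents, three items.
example = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 4, 4],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))  # 0.875
print(alpha > 0.70)     # True
```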
Evaluation of validity
Content validity. The relevant literature, subjects’ opinions, and expert views were consulted to establish content validity, which represents how well the items capture the concept of interest[29].
Construct validity. Confirmatory factor analysis was used to examine the structure of the GC-PROM. The standardized factor loadings for an item should be greater than 0.5[30].
Discriminant validity. Discriminant validity is the ability of an instrument to measure a difference between two groups. The t-test was used to compare differences between patients with GC and healthy subjects, with the significance level set at P < 0.05[31].
Evaluation of feasibility
Feasibility mainly reflects the acceptability of the GC-PROM. The return and response rates of the questionnaires were evaluated against a general requirement of 85%. The questionnaire completion time was generally less than half an hour. We also examined the proportion of missing data and the maximum endorsement frequencies[32].
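The feasibility indicators can be computed along the following lines. This sketch rests on assumptions that the text does not spell out: the response rate is taken as returned over distributed questionnaires, missing answers are coded as `None`, and the maximum endorsement frequency of an item is taken as the largest share of respondents choosing any single response option. All names and numbers are hypothetical.

```python
from collections import Counter


def response_rate(returned: int, distributed: int) -> float:
    # Assumed definition: returned questionnaires over distributed ones.
    return returned / distributed


def missing_proportion(answers) -> float:
    # Share of respondents with no answer (None) on this item.
    return sum(a is None for a in answers) / len(answers)


def max_endorsement_frequency(answers) -> float:
    # Largest share of non-missing answers falling on one response option.
    observed = [a for a in answers if a is not None]
    counts = Counter(observed)
    return max(counts.values()) / len(observed)


# Hypothetical answers of ten respondents to one item (raw 0-4 scores).
item_answers = [0, 1, 1, 2, None, 1, 3, 1, 4, 1]

print(response_rate(90, 100) >= 0.85)                       # True
print(missing_proportion(item_answers))                     # 0.1
print(round(max_endorsement_frequency(item_answers), 3))    # 0.556
```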
Interpretation of PRO results: Minimal clinically important difference (MCID)
The MCID was used to support the clinical interpretation of a change in GC-PROM score[33]. Methods for estimating the MCID mainly include the effect size (ES), standard error of measurement (SEM), standardized response mean, and reliable change index (RCI)[34]. In this article, we used the SEM and RCI to estimate the MCID.
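The exact formulas used are not stated here. As an illustration under the commonly used definitions, the SEM is SD × sqrt(1 − reliability), and an RCI-based threshold at the 95% level is 1.96 × sqrt(2) × SEM. The sketch below assumes those definitions; the function names and the input values (SD of 10, reliability of 0.84) are hypothetical and are not study results.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    # SEM = SD * sqrt(1 - reliability), with reliability such as Cronbach's alpha.
    return sd * math.sqrt(1 - reliability)


def rci_threshold(sd: float, reliability: float, z: float = 1.96) -> float:
    # Smallest score change treated as reliable at the chosen z level:
    # z * sqrt(2) * SEM, where sqrt(2) * SEM is the standard error of a difference.
    return z * math.sqrt(2) * standard_error_of_measurement(sd, reliability)


# Hypothetical inputs for illustration only.
sem_value = standard_error_of_measurement(sd=10.0, reliability=0.84)
rci_value = rci_threshold(sd=10.0, reliability=0.84)
print(round(sem_value, 2))  # 4.0
print(round(rci_value, 2))  # 11.09
```

Under these definitions the SEM-based estimate is always smaller than the RCI-based one, so the two can be read as a lower and an upper bound on the change considered meaningful.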