Participants and procedure
Setting
Singapore is a multi-ethnic country whose major ethnic groups are Chinese (74%), Malay (13%) and Indian (9%) (11). Among the population aged 15 and above in Singapore, 94.7% are literate in English, Chinese or both languages, and 15.6% are literate in Chinese only (12).
Translation and cultural adaptation
Guidelines for the POS family of measures were used for translation of the Chinese version and cultural adaptation of both patient versions (13). The procedure for the Chinese version was as follows: conceptual definition or equivalence, forward-translation, back-translation, expert review, cognitive interviewing, and proofreading. In the first phase, staff and researchers working in palliative care identified conceptual definitions and equivalence of key concepts. Next, forward-translation was conducted by two researchers independently from English to Chinese. A third person facilitated discussions to produce a preliminary Chinese version. This preliminary version was then back-translated by another two researchers independently from Chinese to English. In the fourth phase, review was performed by an expert panel comprising two palliative care doctors, one palliative care nurse, one medical social worker and one health outcomes researcher, in consultation with the original developers of IPOS to refine the phrasing of each question. Subsequently, 12 English-speaking and 12 Chinese-speaking patients with advanced solid tumours completed cognitive interviewing via semi-structured interviews. Feedback on the clarity of questions was obtained from the participants and used to revise question phrasing, to ensure that items could be clearly understood in both languages within the Singapore context. The finalised versions were reviewed by the researchers and developers of IPOS.
Validation
Inpatients were recruited from Singapore General Hospital; community patients were recruited from National Cancer Centre clinics, Dover Park Hospice and Assisi Day Hospice. The inclusion criteria were: i) at least 21 years old, ii) diagnosed with advanced cancer, defined as stage 4 solid tumour, iii) able to communicate in either English or Chinese, iv) aware of their advanced cancer diagnosis and v) able to give informed consent. There were no event-based or other criteria such as hospital admission or time from diagnosis. Healthcare staff managing the patients who consented to participate were recruited for evaluation of the staff version of IPOS in English. This study was approved by the Singhealth Centralised Institutional Review Board (CIRB) (Reference number 2018/2086). All data collection was done in accordance with relevant guidelines and regulations.
Data were collected at 2 timepoints: baseline, and follow-up 2 to 5 days later for inpatients or 7 to 21 days later for community patients. At baseline, patients completed the patient version of IPOS, Functional Assessment of Cancer Therapy – General (FACT-G) and Edmonton Symptom Assessment System - revised (ESAS-r) (14, 15). At the follow-up timepoint, patients completed the patient version of IPOS and answered a question on whether they felt their main problems or concerns had changed since baseline (Global change question). Patients chose either the English or Chinese version of IPOS and used the same version for both timepoints. At both timepoints, a staff member (either nurse or doctor) involved in the patient’s care completed the staff version of IPOS, and reported the time needed to complete the questionnaire as well as the perceived utility of the questionnaire in patient management. All staff used the English staff version of IPOS.
Measurement Tools
IPOS
A 17-item questionnaire comprising 3 subscales: Physical Symptoms subscale (10 items), Emotional Symptoms subscale (4 items) and Communication and Practical Issues subscale (3 items) (4). In the patient versions, each item was scored on a 5-point Likert-type scale from 0 (best) to 4 (worst). Staff versions included the same questions with an additional option of "cannot assess". In addition, responses could be marked as “not applicable” or “don’t want to tell”. Total and subscale scores were calculated by summing item scores, with higher scores indicating poorer outcomes.
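The scoring rule described above can be sketched as follows. This is a minimal illustration, not the scoring code used in the study: the item positions assigned to each subscale are hypothetical (only the number of items per subscale is stated here), and the sketch assumes that a questionnaire containing any non-numeric or missing answer is left unscored, which is one reading of the missing-data rule.

```python
from typing import Optional, Sequence, Union

# Hypothetical item positions: only the item counts per subscale (10, 4, 3)
# are given in the text, not which items belong to which subscale.
SUBSCALES = {
    "physical": range(0, 10),
    "emotional": range(10, 14),
    "communication_practical": range(14, 17),
}

Response = Union[int, str, None]


def score_ipos(responses: Sequence[Response]) -> Optional[dict]:
    """Return subscale and total sums, or None if any item is not a 0-4 score."""
    if len(responses) != 17:
        raise ValueError("expected 17 item responses")
    for r in responses:
        # Missing values and answers such as "not applicable" or
        # "cannot assess" make the questionnaire unscorable in this sketch.
        if isinstance(r, bool) or not isinstance(r, int) or not 0 <= r <= 4:
            return None
    scores = {
        name: sum(responses[i] for i in positions)
        for name, positions in SUBSCALES.items()
    }
    scores["total"] = sum(responses)
    return scores
```

With 17 items scored 0 to 4, the total ranges from 0 to 68, and the three subscales range from 0 to 40, 0 to 16 and 0 to 12 respectively.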
Functional Assessment of Cancer Therapy – General (FACT-G)
A 27-item quality of life measure that comprises four subscales of Physical Wellbeing, Social Wellbeing, Emotional Wellbeing and Functional Wellbeing (14). Each item was measured on a 5-point Likert-type scale. Higher scores indicate better functioning. This was used to assess construct validity.
Edmonton Symptom Assessment System-revised (ESAS-r)
A 9-item measure with visual analogue scales (scored from 0–10) for pain, shortness of breath, nausea, depression, activity, anxiety, wellbeing, drowsiness, and appetite (15). Higher total summed scores indicate higher symptom burden. This was also used to assess construct validity.
Global change question
At follow-up, patients were asked “Since the questionnaire was last completed, thinking about your main problems and concerns, would you say that: things have got much better, things have got a little better, there has been no change, things have got a little worse, or things have got much worse”. This was used to determine the patients included in the sample to assess test-retest reliability.
Staff feedback
Staff were asked to report the amount of time taken to complete the questionnaire on a 3-point scale (< 5 minutes, 5–10 minutes, > 10 minutes), and their opinion on the relevance of the staff IPOS for assessing patient outcomes on a 4-point Likert scale (very relevant, slightly relevant, slightly irrelevant, and very irrelevant).
Demographic and clinical data were collected from medical records. English and Chinese versions of patient responses were analysed separately.
Descriptive Statistics
For total and subscale scores, floor and ceiling effects were calculated as the proportions of respondents with the lowest and highest possible scores respectively, and proportions of up to 15% were deemed acceptable (16).
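As a minimal sketch of this check, the function below computes the share of respondents at the lowest and highest possible score and compares each with the 15% threshold. The function and parameter names are illustrative, and treating exactly 15% as acceptable is an assumption.

```python
from typing import Sequence


def floor_ceiling(
    scores: Sequence[float],
    min_score: float,
    max_score: float,
    threshold: float = 0.15,
) -> dict:
    """Proportion of respondents at the lowest and highest possible score."""
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    floor = sum(s == min_score for s in scores) / n
    ceiling = sum(s == max_score for s in scores) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        # Assumption: a proportion equal to the threshold still counts as acceptable.
        "acceptable": floor <= threshold and ceiling <= threshold,
    }
```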
Validity
To determine structural validity, we conducted a confirmatory factor analysis (CFA) for patient and staff IPOS responses at baseline using the pre-determined subscales of IPOS (5). Responses were treated as ordered categorical data. We hypothesized that using the three pre-determined subscales of IPOS (Physical Symptoms subscale, Emotional Symptoms subscale, Communication and Practical Issues subscale) with each item loaded onto one subscale, the goodness-of-fit indices would be within acceptable limits (Comparative fit index [CFI] and Tucker-Lewis-Index [TLI] of more than 0.90 and Root Mean Square Error of Approximation [RMSEA] of less than 0.08) (17).
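For orientation, the three fit indices can be computed from the chi-square statistics of the fitted and baseline (independence) models using their textbook definitions, as sketched below. This is an illustration only: estimators for ordered categorical data usually report scaled versions of these indices, so values from CFA software may differ, and the function names are not taken from any particular package.

```python
import math


def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n):
    """Textbook CFI, TLI and RMSEA from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, 0.0)
    denom = max(d_base, d_model)
    cfi = 1.0 - d_model / denom if denom > 0 else 1.0
    ratio_base = chi2_baseline / df_baseline
    ratio_model = chi2_model / df_model
    # Undefined when the baseline ratio equals 1; not handled in this sketch.
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)
    # Uses N - 1 in the denominator; some software uses N instead.
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    return {"cfi": cfi, "tli": tli, "rmsea": rmsea}


def fit_is_acceptable(fit):
    """Apply the thresholds stated above: CFI and TLI > 0.90, RMSEA < 0.08."""
    return fit["cfi"] > 0.90 and fit["tli"] > 0.90 and fit["rmsea"] < 0.08
```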
Known-group validity was evaluated using Student’s t-test comparing total and subscale scores between patient responses obtained in the inpatient vs community settings at baseline. We hypothesized that inpatients would be more unwell and have more problems and concerns than community patients. Therefore, we anticipated that inpatients would have higher IPOS scores than community patients.
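The comparison can be sketched with the equal-variance (Student's) form of the t statistic. The sketch returns only the statistic and its degrees of freedom; obtaining a p-value would need a t-distribution from a statistics package. The group names are illustrative.

```python
import math
from statistics import mean, variance


def students_t(group_a, group_b):
    """Equal-variance (Student's) t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df
```

Passing inpatient scores as `group_a` and community scores as `group_b` gives a positive statistic when inpatients score higher, which is the hypothesized direction.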
Construct validity was tested by correlating IPOS subscales with the respective total and subscale scores of ESAS-r and FACT-G, using Pearson’s correlation coefficients (r) and baseline data. We hypothesized that there would be correlations of |r| ≥ 0.3 between the following scores, as they measure similar themes, as found in previous studies (5):
- IPOS Physical Symptoms subscale vs patient ESAS-r Total and FACT-G Physical Wellbeing subscale
- IPOS Emotional Symptoms subscale vs patient ESAS-r Total and FACT-G Emotional Wellbeing subscale
- IPOS Communication and Practical Issues subscale vs FACT-G Social Wellbeing subscale
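The hypothesis test for each pair of scores can be sketched as below. The threshold is applied to the absolute value of r because the expected sign differs by comparator: higher IPOS scores indicate poorer outcomes while higher FACT-G scores indicate better functioning, so those correlations are expected to be negative. Function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def meets_hypothesis(r, threshold=0.3):
    """True when the correlation is at least the threshold in magnitude."""
    return abs(r) >= threshold
```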
Reliability
Internal consistency was evaluated by calculating Cronbach's alpha for the total and subscales using staff and patient responses at baseline. Following the original validation study, we adopted a lower threshold (0.60 instead of the normally accepted 0.80). Because of the non-redundant nature of IPOS, we expected less agreement between individual questions in each subscale, as each question assessed a different aspect of QOL (5).
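Cronbach's alpha follows the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of the summed score), where k is the number of items. A minimal sketch, assuming complete cases only and an illustrative function name:

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("each respondent needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

The result for a subscale would then be compared with the 0.60 threshold adopted here.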
Inter-rater reliability between patient and staff IPOS was tested by calculating the intraclass correlation coefficient (ICC) using an analysis of variance (ANOVA) estimator for total and subscale scores.
For patients who responded "no change" to the Global change question, patient IPOS scores at baseline and follow-up were used to evaluate test-retest reliability. This was done by calculating the ICC using an ANOVA estimator for total and subscale scores.
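The ICC form is not specified here, so the sketch below assumes a one-way random-effects, single-measurement ICC estimated from ANOVA mean squares. The same function would serve for patient versus staff scores (inter-rater) and baseline versus follow-up scores (test-retest). The record field names in `stable_pairs` are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC from ANOVA mean squares (assumed ICC form)."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(pairs[0])
    if k < 2 or any(len(p) != k for p in pairs):
        raise ValueError("each subject needs the same number (>= 2) of scores")
    grand = sum(sum(p) for p in pairs) / (n * k)
    means = [sum(p) / k for p in pairs]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for p, m in zip(pairs, means) for x in p
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def stable_pairs(records):
    """Keep (baseline, follow-up) scores for patients reporting 'no change'."""
    return [
        (r["baseline"], r["follow_up"])
        for r in records
        if r["global_change"] == "no change"
    ]
```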
Healthcare workers’ acceptability
Proportions were calculated for staff responses on the amount of time spent to complete the questionnaire and their opinion on the relevance of the staff IPOS for assessing patient outcomes.
Sample Size
To establish construct validity, a sample size of 113 per language was needed to detect a Pearson's correlation coefficient of at least 0.3 between the summative symptom assessment score in IPOS and ESAS-r, with 90% power at 5% two-sided type 1 error.
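A sample size of this kind can be reproduced with the Fisher z approximation, n = ((z_{1-α/2} + z_{power}) / atanh(r))² + 3. The method used for the calculation is not stated, so the approximation is an assumption; with r = 0.3, 90% power and a two-sided 5% type 1 error it gives 113, consistent with the figure reported.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, power=0.90, alpha=0.05):
    """Sample size to detect correlation r (two-sided), Fisher z approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 1.28 for 90% power
    # atanh(r) is the Fisher z transform of the target correlation.
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


print(n_for_correlation(0.3))  # 113
```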
Missing Data
IPOS and ESAS-r responses with multiple answers for a single item, a missing response, or a “not applicable”, “don’t want to tell” or “cannot assess” response for any of the 17 closed-ended questions were removed from analysis (18). For FACT-G, missing or "not applicable" responses were replaced with the mean of the respective subscale, provided that the subscale was at least 50% completed (19).
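The FACT-G rule can be sketched as follows, with missing or "not applicable" items represented as None. The function name is illustrative, and returning None for a subscale with fewer than 50% of items answered is an assumption about how such subscales were handled.

```python
def impute_subscale(items, min_completed=0.5):
    """Replace missing items (None) with the mean of the answered items.

    Returns None when fewer than min_completed of the items were answered.
    """
    answered = [v for v in items if v is not None]
    if not items or len(answered) / len(items) < min_completed:
        return None
    subscale_mean = sum(answered) / len(answered)
    return [subscale_mean if v is None else v for v in items]
```

For example, a seven-item subscale with answers 4, 2, 3, 3, 3 and two missing items has each gap filled with 3.0, the mean of the five answered items.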