The present study had two purposes: (i) to determine the data quality, validity and internal consistency reliability of the PIPEQ-CEM used for national measurements in Norway, and (ii) to create a short version of the instrument in order to reduce respondent burden and cognitive load.
The psychometric testing produced good evidence for data quality, internal consistency and construct validity. The PIPEQ-CEM was originally developed using a standardized and comprehensive process, but was adapted according to developments in data collection procedures [5–7]. The EFA and CFA results corroborated previous findings and indicated that the PIPEQ-CEM discriminates between different aspects of experiences, with the following three scales: (i) structure and facilities, (ii) patient-centred interactions, and (iii) outcomes. The scales had excellent psychometric properties, and the PIPEQ-CEM was also considered relevant as a basis for identifying quality indicators. As recommended by Kilbourne et al. [1], mental healthcare quality measures need to be validated across the Donabedian spectrum of structure, process and outcome.
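As an illustration of the internal consistency assessment, the following minimal Python sketch computes Cronbach's alpha for each of the three scales. The data file and item groupings are hypothetical placeholders, not the actual PIPEQ-CEM item assignments.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of item responses (rows = respondents)."""
    items = items.dropna()                       # complete cases only
    k = items.shape[1]                           # number of items in the scale
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the scale sum
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical item groupings for the three scales (names are illustrative only).
scales = {
    "structure_facilities": ["item_1", "item_2", "item_3"],
    "patient_centred_interactions": ["item_4", "item_5", "item_6", "item_7"],
    "outcomes": ["item_8", "item_9"],
}

responses = pd.read_csv("pipeq_cem_responses.csv")  # hypothetical data file
for name, cols in scales.items():
    print(name, round(cronbach_alpha(responses[cols]), 2))
```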
The literature review revealed a lack of similar studies, which makes it difficult to compare our results with those of others. A previous systematic review indicated that the most salient experiences of mental health inpatients that inform the provision of high-quality services are high-quality relationships; averting negative experiences of coercion; a healthy, safe and enabling physical and social environment; and authentic experiences of patient-centred care [18]. Another recent review identified 75 PREMs available for mental healthcare; while 24 were designed for inpatient and residential settings, the measures differed in scope, content and psychometric robustness [2]. The most-represented dimensions were interpersonal relationships, respect and dignity, access and coordination of care, drug therapy, information, psychological care, and the care environment, all of which are also included in the PIPEQ-CEM. A previous national study using the PIPEQ-OS assessed the importance of different types of patient-reported predictors of outcome assessments among mental health inpatients. The results indicated that the most important structure and process variables for patient outcome assessments were related to patient-centred interactions [19].
The relatively small proportion of “not applicable” responses and the low percentage of omitted answers suggest good acceptability and indicate that the questions are relevant to most patients. However, one of the major disadvantages of the PIPEQ-CEM reported by employees at the psychiatric institutions was the burden associated with completing it. Concerns have been raised regarding the cognitive abilities and motivation of patients, and employees have emphasized the need for a shorter questionnaire that is appropriate for patients with a wide range of literacy levels. The present study identified a seven-item short form that provides an efficient basis for brief yet comprehensive measurement in future surveys. The short form includes questions on whether the treatment is adapted to the patient's situation, whether the therapists/staff understand the patient's situation, whether the patient has enough time for discussions and contact with the therapists/staff, whether the patient feels safe, whether the activities are satisfactory, whether the help and treatment improve the patient's understanding of their mental health issues, and whether the help and treatment are satisfactory overall. The present results illustrate the detailed information about an instrument that can be obtained by combining EFA, CFA and IRT. Some information from the approaches overlapped, providing triangulated evidence of item quality, while other information was unique to each method. IRT provided item-level detail that informed the revision of the scale, and the full and short versions were strongly correlated.
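The agreement between the two versions can be illustrated with a minimal sketch that scores both forms and computes their Pearson correlation. The file and item names below are hypothetical; the sketch only shows the general scoring logic, not the actual PIPEQ-CEM item set.

```python
import pandas as pd

# Hypothetical column names; the seven short-form items are a subset of the full item set.
full_items = [f"item_{i}" for i in range(1, 22)]   # e.g. a 21-item full form (illustrative)
short_items = ["item_3", "item_5", "item_7", "item_9", "item_12", "item_15", "item_21"]

responses = pd.read_csv("pipeq_cem_responses.csv")  # hypothetical data file

# Mean item score per respondent for each version.
full_score = responses[full_items].mean(axis=1)
short_score = responses[short_items].mean(axis=1)

# Pearson correlation between the full and short versions.
print(round(full_score.corr(short_score), 2))
```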
The national patient-experience surveys in Norway aim to systematically measure user experiences with healthcare structures and processes of care, as a basis for quality improvement, healthcare service management, free patient choice and public accountability. Previous studies have indicated that two barriers to using patient survey results are delays in disseminating results and a lack of sufficiently specific information at the relevant levels of healthcare [20–22]. The PIPEQ-CEM results in Norway are published only weeks after the reporting period, and reports are distributed to all units with a sufficient number of responses. Apart from a study protocol with continuous PREMs and patient-reported outcome measures (PROMs) for elective hip and knee arthroplasty [23], we could not find any research studies of large-scale or national programmes for continuous measurement of patient healthcare experiences. The PIPEQ-CEM represents a novel, feasible and time-efficient approach, collecting large-scale data and rapidly reporting results using web-based administration methods.
Patients with severe mental illness and substance use disorders are often considered vulnerable, and higher rates of mental disorders are associated with social disadvantage, especially low income, low educational and occupational status, and financial strain [24]. This population is also confronted with persistent gaps in access to and receipt of mental healthcare, with major challenges including inadequate treatments and underused guidelines, variation in healthcare among geographical regions, stigma and discrimination, and poor adherence to treatment by patients [1, 2, 4]. These studies demonstrate the importance of systematic measurement of patient experiences in mental healthcare. Although measuring patient experiences is an accomplishment in itself that might lead to quality improvement, it is necessary to make the right choices when designing reliable interventions to improve patient experiences. The PIPEQ-CEM provides feedback in specific areas, and the results can be used to monitor performance and identify departments where quality should be improved from the patient perspective.
The three scales were empirically based, but it is essential that user-experience survey tools and methods provide feedback that is sufficiently specific and actionable. Further research should address the relevance of the results for local quality-improvement work in healthcare services, targeting specific experiences, and the timely publication and sharing of results in a manner consistent with the patient experience. The short version of the instrument presented here can be used in settings where respondent burden and cognitive load are crucial issues, but further research is needed since the item selection was driven by data alone. This research should involve an expert panel of patients and healthcare professionals to assess priorities.
The appraisals of a patient may differ throughout their hospital stay, and so interpreting the scales would benefit from standardized timing. However, data collection at discharge is more time-consuming: the NIPH would have to establish contact at all levels, all institutions would need new routines for data collection, and data collection would not be restricted to a specific day or week. Continuous communication between the NIPH and each institution would also be needed to report on how the data collection is progressing. Moreover, it is harder to reach patients who drop out of treatment. Even though the number of patients in the surveys has been increasing over time, many patients are still not included. To obtain representative and useful data, all patients should be invited to participate. Future surveys should combine the existing on-site approach with a post-discharge approach for outpatients. The surveys are currently anonymous. Obtaining background data from the Norwegian Patient Registry would allow us to develop follow-up routines and implement post-discharge surveys to supplement the on-site surveys, enabling non-response analysis and case-mix adjustment.
Web-based surveys have many advantages, but a major limitation is that they exclude those with poor digital literacy. The number of responses might have been larger if patients had also had the option to respond on paper. Pen-and-paper questionnaires add complexity and resource demands and will not be available on-site, but the national infrastructure might be used for future post-discharge surveys among patients not reached on-site, and for follow-up surveys of inpatients who responded on-site.
Previous research has concluded that personal contact during recruitment and data collection may increase the response rate, but there is some concern that on-site data collection is associated with different responses. On-site data collection might increase the number of responses and thus the representativeness of the data, but research indicates that on-site approaches result in more favourable responses than mailed surveys [25–27]. We will assess this in future research, especially to identify a method for adjusting for mode effects when comparing results obtained using different data collection modes.
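One possible approach to such adjustment, sketched below under assumed variable names, is to include the collection mode as a covariate in a regression model of the scale score, so that unit comparisons can be based on mode-adjusted estimates. This is an illustration only, not the method used in the national programme.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical analysis file: one row per respondent with a scale score, the data
# collection mode (e.g. on-site vs post-discharge) and available case-mix variables.
df = pd.read_csv("pipeq_cem_scores.csv")

# Linear model with collection mode as a covariate; the mode coefficient estimates the
# mode effect, and comparisons can be based on mode-adjusted predictions.
model = smf.ols("scale_score ~ C(mode) + age + C(sex)", data=df).fit()
print(model.summary())
```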
The present study has highlighted the use of IRT as an important tool for developing and validating scales, and how its applications can provide richer and more accurate descriptions of performance at the item and scale levels, and allow fielding fewer questions to participants without a loss of measurement precision. However, single items are normally less reliable than scales [28], and the psychometric properties and relevance of the short form of the instrument require further evaluation.
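To illustrate the IRT reasoning behind retaining fewer items, the sketch below sums item information functions for a simple two-parameter logistic (2PL) model and converts the total information into a standard error of measurement. The PIPEQ-CEM items are polytomous and would typically be modelled with a graded response model; the 2PL form and the item parameters shown here are purely hypothetical and serve only to illustrate the information concept.

```python
import numpy as np

def item_information_2pl(theta, a, b):
    """Fisher information of a 2PL item: I(theta) = a^2 * P(theta) * (1 - P(theta))."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

theta = np.linspace(-3, 3, 61)  # latent trait grid

# Hypothetical discrimination (a) and difficulty (b) parameters for seven retained items.
params = [(1.8, -0.5), (2.1, 0.0), (1.6, 0.4), (2.3, -0.2), (1.9, 0.8), (1.7, -1.0), (2.0, 0.3)]

# Test information is the sum of item information; higher information means smaller
# standard errors, which is how a short form can retain precision where most respondents lie.
test_info = sum(item_information_2pl(theta, a, b) for a, b in params)
standard_error = 1.0 / np.sqrt(test_info)
print(round(float(standard_error.min()), 2), round(float(standard_error.max()), 2))
```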
Strengths and limitations
One strength of this study was that the domains and items were derived using a standardized, comprehensive process. Furthermore, the large national sample included responses from 70% of all inpatient units in Norway. The short version of the instrument will hopefully reduce dropout rates and improve the coverage of patients with poor cognitive skills.
A potential source of bias in this study was that response rates and background data on non-respondents were unavailable. Future data collection efforts should aim to include such information and to predict the hypothetical experiences of non-respondents in order to estimate how response rates affect patient-experience data. Further research should compare respondents and non-respondents to assess whether they have different experiences. Case-mix adjustment is important for fair comparisons across different healthcare sections, and more evidence is needed on its impact. The test–retest reliability of the questionnaire should also be evaluated in order to determine both short- and long-term reliability. Furthermore, the generalizability of the results to all inpatient departments in Norway is uncertain.
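If registry-based background data become available, case-mix-adjusted unit results could, for example, be approximated by indirect standardization, as sketched below. The variable names are hypothetical and this is not the adjustment model used in the present study.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical analysis file with one row per respondent: a scale score, the treating
# unit, and case-mix variables such as those obtainable from the Norwegian Patient Registry.
df = pd.read_csv("pipeq_cem_scores.csv")

# Indirect standardization sketch: model the score from case mix only, then express each
# unit as the grand mean plus its mean residual (what the unit scores beyond its case mix).
fit = smf.ols("scale_score ~ age + C(sex) + C(diagnosis_group)", data=df).fit()
df["residual"] = fit.resid
adjusted = df["scale_score"].mean() + df.groupby("unit")["residual"].mean()
print(adjusted.sort_values().round(1))
```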