The respondents
The majority of the respondents were female (67%, compared to 47% nationwide) and the mean age (53 years) was slightly above the national average (50 years). Half of the interviewed GPs (n=6) worked in a group practice; the others worked in a duo practice (n=4) or in a community health center (n=2). One GP practice also included a local pharmacy. Eight GPs had heard of the PCPR before the invitation. A minority of the GPs had downloaded and used the PCPR before the interview.
According to a few GPs, patient feedback is the most important source of feedback they receive. Direct feedback provided by patients, including compliments, critiques, or complaints, is always taken into account and has the most impact. Less direct, impersonal feedback from patients includes the outflow of patients over the years and feedback provided in patient surveys. However, one GP mentioned that the extent of impersonal patient feedback is insufficient to make valid statements on the quality of delivered care.
R1: “Were patients satisfied and did they appreciate the service, that is actually the most important source of feedback indicating whether I did a good job or not.”
The majority of the GPs were familiar with benchmarking as a general feedback tool. However, regional peer-to-peer sessions with affiliated general practices, in which patient cases are discussed, were used most often and were considered the most essential and helpful. Reviewing deviations and abnormalities with colleagues gives GPs deeper insight into, and a more critical attitude towards, their own clinical behavior.
GPs were interested in and curious about the benchmark feedback provided by the PCPR. Two GPs expressed a certain lack of trust in the source of the PCPR. The role of health insurance companies as a supplier of data was addressed, leading to a skeptical attitude among these respondents. The main expectation was that the PCPR would provide GPs with insight into their clinical practice and allow them to examine whether they perform better than others. Nevertheless, several respondents declared beforehand that the PCPR would not change their clinical behavior.
R2: “Are a hundred colleagues providing too many treatments or am I providing too few? Hmmm, who knows?”
Content
An important observation by GPs was that the current information in the PCPR hardly presents data on clinical behavior or actual performance. Most information represents total or average costs, usually of healthcare providers following up on referrals. However, GPs feel that they are not able to influence the average costs of mental healthcare or medical specialist care. Two GPs explicitly expressed the wish to receive more specific feedback on their own clinical behavior regarding common disorders and the influence they have on the health of their population.
R1: “I would like to see more figures on my referrals for separate conditions. Current information is mostly cost-related, so price increases have an effect. However, I can’t control those costs.”
R7: “It is unclear to me how these figures might help the patient sitting at the other side of my desk.”
GPs miss information on health gains as a possible parameter of how well they perform. With the availability of data on actual health outcomes as a consequence of GPs’ activities, it would be easier to compare one situation with another. For example, does the health of the GP’s population improve when the GP makes more home visits as compared to longer consultations in the office? Are health outcomes better when GPs perform minor surgery themselves, or when patients are referred to medical specialist care? And what are the consequences of strict gatekeeping versus more active referral patterns?
R11: “We strive for high quality healthcare, but feedback usually reflects euros or percentages, instead of health gains. For how many people suffering from high blood pressure did my prescription result in a lower blood pressure? And how many of my patients have not had a stroke as a consequence? How do I relate to peers?”
Reliability and validity
The majority of the figures appeared evident to the respondents and seemed well understood and correctly interpreted. In particular, diagrams containing population characteristics and care utilization of patients, subdivided per age group, were recognized and matched the respondents’ perception of their clinical practice.
R3: “We have fewer elderly patients than average, so that could partly explain the difference.”
Although respondents understood the majority of the feedback provided by the PCPR and could provide explanations for all data presented in the figures, not all explanations were based upon a correct interpretation. For example, eight of the twelve GPs indicated that differences and deviations from the expected rates were explained by casemix differences, even though the figures were already corrected for the population characteristics of their specific practice. According to the GPs, this could, for example, explain differences in applying diagnostics or minor surgery. Some of the respondents disagreed with the data and therefore did not accept the presented feedback.
R1: “Then my thoughts are: is it correct? Is it correct for our population?”
R5: “This seems to be untrue; I do not know how the benchmark is developed, this difference is too big. So, I have my reservations.”
The organization that calculates and distributes the PCPR is mistrusted because it is associated with the health insurance companies and is therefore not seen as an independent organization. In addition, the trustworthiness of the casemix-adjusted ‘expected population’ in the benchmark is questioned. The data are sometimes difficult for GPs to interpret, yet the GPs did not ask for help to interpret the data or for guidance in improving their clinical behavior. Several respondents indicated that they would be more alert to their registration process because the PCPR showed that they had registered their claims incorrectly. Three of the twelve GPs indicated that they may have claimed too few of the specific interventions they performed, which would explain a value lower than expected.
R4: “This number strikes me. I think this is due to under-registration on our part.”
R6: “We should look at our administrative process. Probably we sometimes forget to submit declarations. That is no medical difference, but purely administrative.”
Timeliness was an issue for half of the respondents. In the PCPR that was presented in January 2018, the information had a reference date of July 2016.
R4: “The longer ago, the more difficult it is to apply in practice. I would appreciate feedback on my figures from last year somewhere within the first three to four months of the current year. That would really help me.”
Usability
Respondents were careful in stating implications of the feedback for their clinical behavior. They would rather describe scenarios than commit to actions to change clinical behavior. A few GPs speculated that in the future they might use a different approach for specific patients, or focus on particular patient groups. Some respondents indicated that they might change their clinical behavior by increasing or decreasing the number of minor surgeries or primary diagnostic applications. However, as guidelines or standards of care are not available, or are not taken into account in the PCPR, it is impossible to decide what the preferred action should be.
R5: “If you are higher than expected at one year and lower the year after, then I think: ‘the average seems good, there’s no reason to work differently’. If I have a consultation with a patient who has been coughing for half a year, I do not think ‘oh, I’ve requested too many X-rays so far, let’s not refer this patient.’ You still look at the patient and his needs.”
R12: “Suppose I request a huge number of hip scans although there is no scientific evidence, but with all colleagues acting the same way. The presentation of a casemix corrected expected value based on averages would not lead to an adjustment of my clinical behavior.”
The PCPR generally presents data on the practice level, at which more than one GP may be active. More individualized feedback was supported by a majority of respondents. GPs did recognize, however, that they are partly responsible for the fact that a breakdown to the individual level is not possible, as claims data are filed per practice.
R6: “The figures per GP in this practice are indistinct. All patients are registered on [NAME OF PRINCIPAL GP]. So, if I request diagnostics, it will appear under his name”.
R9: “In this practice, two days a week another GP is employed. He requests tests on my practice code”.
Assessment tool based on interviews to systematically analyze the PCPR
From the interviews with GPs, recurring criteria emerged that can be categorized into three themes identified as decisive for the perceived effectiveness of performance feedback aimed at GPs. (1) Content: does the performance feedback refer to actual healthcare delivery activities? (2) Reliability and validity: is the information recognizable and reliable for the addressed person? (3) Usability: is the performance feedback accompanied by an articulation of (behavioral) performance goals and/or an action plan?
Following these three key themes from the interviews, we constructed an assessment tool to systematically analyze all 34 tables and graphs in the PCPR, operationalizing the key criteria into numerous variables. The complete assessment tool is shown in Appendix 3; the assessment itself is provided in a separate document, Appendix 4.
Assessment of the Praktijkspiegel
Content
Figure 3 demonstrates that of the 34 tables and graphs, 21% (7/34) reflect actual clinical behavior, while 62% (21/34) present data on average or total costs of patients in the GP practice. Other tables and graphs, for example, compare demographic characteristics of the patients in the practice to the regional average. Tables and graphs that reflect clinical behavior provide feedback on the number of applications for diagnostics (3/34, 9%), the number of consultations (2/34, 6%), the number of interventions (2/34, 6%), and the number of referrals (2/34, 6%). In 32% (11/34) of the tables and graphs, the PCPR provides feedback on costs that are partially or completely related to activities of the GP. Costs that are largely the result of treatment choices of medical specialists after referral by the GP are presented in 41% (14/34) of the tables and graphs. GPs are not able to influence the treatment choices of medical specialists and the associated costs. Some tables or graphs on costs associated with mental healthcare present the total costs of care delivered in a primary care setting (mental health practice assistant) and in specialized mental health facilities combined.
Reliability and validity
The PCPR summarizes claims data of the population of a GP’s practice per calendar year. In 4 tables and graphs, data regarding the total number of registered people are presented. In 30 tables and graphs, a subset of the registered people is presented, with “high cost patients” excluded. High cost patients are those with mental healthcare costs of €10.000 or more, or total healthcare costs of €22.500 or more.
In three descriptive tables and graphs it is stated that (all) registered patients are included; in the other 31 tables and graphs it is not clearly described to what specific (sub)population the table or graph refers, or whether outliers are excluded.
In most (28/34) tables and graphs a casemix correction is applied, but this is not directly clear from the titles or explanatory notes; the casemix correction method is explained in an appendix to the PCPR. Actual values or percentages are compared to the expected values for a practice with the same population characteristics.
The data presented are either one year old (15/34, 44%) or two years old (19/34, 56%).
Usability
One table and graph presents feedback for the individual GPs within a group practice; all other 33 tables and graphs use the GP practice as the level of aggregation. All but one also used benchmarks; Figure 4 shows the benchmarks that were used. In most tables and graphs (28/34), the casemix-corrected reference population is presented as the benchmark, often (18/34) alongside the values of the practice’s own clinical performance in other years.
In some cases, as Figure 5 shows, data are presented for one or more subgroups of patients of the GP practice. These subgroups allow further analysis of healthcare delivery or costs for specific patient groups. In 20/34 tables and graphs, no subgroups are presented. In 14/34 tables and graphs, a breakdown by age group, gender, healthcare use, or socio-economic status (based on aggregated statistical data of the average regional income) is displayed, with one table and graph combining age and gender.