This study used an online survey to field a DCE comparing the preferences of samples of the public from two jurisdictions (Scotland and Sweden). The Scottish DCE was conducted in 2016 [30]; this study replicated that DCE in Sweden to enable the comparison.
The DCE design and analysis are reported in line with published guidance [20, 29]. The online survey comprised four sections: an initial page of narrative introducing key concepts and the rationale for the sharing and use of anonymised linked data; questions about attitudes to the sharing and use of anonymised linked data; the choice-sets that formed the DCE; and socio-demographic questions.
Conceptualising the choice question
The DCE used two ‘unlabelled’ alternatives to present scenarios describing the type and use of anonymised linked data. Respondents were asked to select which of the two alternatives they preferred. They could also indicate that neither alternative was acceptable, which gave them the option to ‘opt-out’. Figure 1 shows an example choice-set.
Attribute and level selection
Each alternative scenario was described with five attributes and plausible levels (see Table 1). For all but one attribute, there were four levels; the remaining attribute (research purpose) had three levels as there was no meaningful fourth level. Details of the identification and generation of the five attributes and their levels have been published previously in relation to the original Scottish DCE [30]. Briefly, the attributes were chosen as the characteristics of sharing and using linked data that were of most concern to the public, based on qualitative research [31] and a systematic review of the literature on public attitudes to linked data [12]. The levels were chosen to represent a range of actual or potential variations in these attributes and were set within realistic and meaningful ranges to represent how linked data could potentially be shared and used. The wording of the attributes and levels was refined through iterations and engagement with members of an existing public involvement panel (the Farr Institute Scotland Public Panel).
Experimental design
There were 768 (4⁴ × 3¹) unique profiles possible from the chosen attributes and levels, which could create 294,528 different combinations for the choice-tasks. To reduce this unmanageable number of potential alternatives, a main effects design was generated using Sawtooth Software [32], with each respondent allocated to one of 40 blocks, each containing six choice-sets. The paired alternatives selected by the software were reviewed to remove irrational or implausible choice-sets. Pilot testing in Scotland revealed that using twelve choice-sets resulted in respondent fatigue; hence, each respondent was presented with six choice-sets in a random order in the main survey [30].
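The counts quoted above can be checked with a short calculation. The sketch below assumes that "combinations for the choice-tasks" means unordered pairs of distinct profiles; the variable names are illustrative only and are not part of the original design files.

```python
from math import comb

# Levels per attribute as described in the text: four attributes with
# four levels each and one attribute (research purpose) with three.
levels_per_attribute = [4, 4, 4, 4, 3]

# Full factorial: multiply the number of levels across attributes.
n_profiles = 1
for n_levels in levels_per_attribute:
    n_profiles *= n_levels

# Assumption: a choice-task pairs two distinct profiles and the order
# in which they are shown does not matter.
n_pairs = comb(n_profiles, 2)

print(n_profiles, n_pairs)
```

Under that assumption the calculation reproduces both figures reported in the text, which illustrates why a reduced main effects design was needed.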
Survey design and piloting
The DCE was embedded in an online survey, as described earlier. The Swedish DCE used the same design as that used previously in Scotland, with appropriate changes to reflect differences in how the health care systems are organised. Forward and backward translation was conducted by an independent organisation and validated by bilingual members of the research team. Each survey was tested in qualitative pilot interviews in each country with a convenience sample of 20 people of varying ages and genders. The aims of the qualitative pilot were to ensure respondents understood the instructions and the language used, and to test how they interacted with the survey and how long they took to complete it. Minor changes were made to the ordering and wording of some questions in Scotland for improved clarity [30]; these were carried forward to the Swedish survey, where no additional changes were needed.
Study population and sampling frame
The study population comprised adult (18 years and over) members of the public from two selected example jurisdictions (Scotland and Sweden). Scotland was chosen as an exemplar because National Health Service (NHS) Scotland is a publicly funded health care system that has the capacity to share and use anonymised linked data. Sweden was chosen as a comparator because the use of linked data is relatively more common there, with large national registries integrating health and other social data used to answer a range of research questions. The two jurisdictions have comparable universal healthcare coverage, delivered by national (the NHS in Scotland) and local (county councils in Sweden) providers, respectively.
For a DCE, the required sample size depends on the number of choice-sets, the number of alternatives in a choice set, and the number of levels attached to an attribute [33]. Given these characteristics, and the objectives to explore preference heterogeneity and compare the responses between Scotland and Sweden, a sample of 1,000 respondents from each country was deemed more than sufficient for this study. In this DCE, the power calculation for sample size suggested by Orme would indicate a minimum sample size of 167 [33]. This power calculation, however, does not make allowances for investigations into preference heterogeneity nor the difference in preferences between Sweden and Scotland. A published review of sample sizes in DCEs found that, out of 505 healthcare DCE studies, only six had sample sizes of over 1,000 [34].
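The minimum of 167 is consistent with the rule of thumb commonly attributed to Orme, N ≥ 500c / (t × a), where c is the largest number of levels for any attribute, t the number of choice-sets per respondent and a the number of alternatives per choice-set. The sketch below is illustrative: the function name is hypothetical, and it assumes the opt-out is not counted as an alternative (a = 2), which is the setting that reproduces the reported figure.

```python
from math import ceil


def orme_min_sample(max_levels: int, n_tasks: int, n_alternatives: int) -> int:
    """Rule-of-thumb minimum sample size, N >= 500 * c / (t * a)."""
    return ceil(500 * max_levels / (n_tasks * n_alternatives))


# Values taken from the design described in the text: at most four
# levels per attribute and six choice-sets per respondent.
# Assumption: two alternatives per choice-set, excluding the opt-out.
n_min = orme_min_sample(max_levels=4, n_tasks=6, n_alternatives=2)

print(n_min)
```

The rule covers main effects only, which is why a larger sample is needed for subgroup and between-country comparisons.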
The DCE was sent to a sample of adult members of the public in the two countries (Scotland and Sweden). The sample was identified using an international market research company, Ipsos [35] (called Ipsos Mori in the UK), which provides access to members of online panels [36]. Participants were members of the Ipsos international panel (called i-Say) who had volunteered to take part in regular market research surveys. Panellists received regular invitations from Ipsos to participate in surveys and were free to decide whether to complete any individual survey. Panellists were selected at random and invited to take part by email, with quotas set on key demographic variables (age, gender, and working status), with the aim of achieving a sample of 1,000 people in each country who were representative of the population on these criteria. The Scottish survey was conducted in August 2016 and the Swedish survey in June 2017; both were live for 14 days, until the quotas were filled.
Screening questions were used at the start of the survey, based on the attributes and levels. For example, respondents were asked which of the levels in the attribute “the purpose of research” was closest to their view, along with the option to select that “data linkage should not be permitted under any circumstances”. Respondents selecting the latter were routed out and did not complete the DCE [30]. It was hypothesised that these respondents would always select the opt-out option. Removing them from the sample ensured that DCE respondents did not fundamentally object to data linkage and thus allowed an investigation of the nuanced public preferences for conducting research with linked data.
Analysis
Choice data from the DCE were analysed using discrete choice models. All attributes were categorical and were dummy coded relative to a base level (Table 1) that was deemed to be the ‘worst’. The primary analysis estimated preferences separately for the sample of respondents from each country using a conditional logit model. To further compare data between Scotland and Sweden, a pooled conditional logit model was estimated with interaction terms between a dummy variable that identified the respondent’s nationality (1 = Scottish) and each attribute level. To account for differences in scale, a pooled heteroskedastic conditional logit model with these same interactions was also estimated [37, 38]. The scale parameter was allowed to vary by the respondent’s nationality. To identify the scale term, preferences over one attribute had to be restricted to be equal across countries. This attribute (purpose of the research) was selected because its interaction terms were not statistically significant in the pooled conditional logit model. All analyses were completed using Stata 13 [39].
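The coding scheme for the pooled model can be illustrated with a minimal sketch. The analysis itself was run in Stata; the Python below only shows how dummy coding against a base level and the country interactions fit together. The attribute name, the level labels and both function names are hypothetical placeholders and are not taken from Table 1.

```python
def dummy_code(attribute, level, levels, base):
    """Return 0/1 indicators for every non-base level of one attribute."""
    if level not in levels:
        raise ValueError(f"unknown level {level!r} for {attribute}")
    return {
        f"{attribute}={lv}": int(lv == level)
        for lv in levels
        if lv != base
    }


def with_country_interactions(dummies, scottish):
    """Add one interaction term per dummy, using scottish = 1 or 0."""
    row = dict(dummies)
    for name, value in dummies.items():
        row[f"scottish x {name}"] = value * scottish
    return row


# Placeholder levels; "A" stands in for the base ('worst') level.
levels = ["A", "B", "C", "D"]
row = with_country_interactions(
    dummy_code("oversight", "C", levels, base="A"), scottish=1
)
print(row)
```

In this layout the main-effect coefficients describe the Swedish sample (scottish = 0), and each interaction coefficient describes how the Scottish preference for that level differs from it.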
The probability of an individual finding a specific scenario acceptable was calculated by estimating the expected observable utility of an alternative and comparing it with the expected utility of another. A ‘typical’ linked data scenario was defined as one in which university researchers or health service staff use linked health records for general public benefit, any profit is invested in public services, and the process is overseen by the relevant public services. Two further scenarios were then specified: a best-case scenario (the most risk-averse, in which only university researchers use linked data from health records for the benefit of the people whose data are being used, no profit is made, and the process is overseen by a non-governmental body) and a worst-case scenario (in which university researchers, health service staff, government and commercial researchers use health data linked to social care, education, employment and private sector data for research with any purpose, and the profit is kept by those carrying out the research, who also oversee the process). Preference heterogeneity was investigated using a split-sample analysis that compared the probabilities of scenarios being acceptable.
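One common way to turn estimated utilities into such a probability is the binary logit form, P = exp(V_scenario) / (exp(V_scenario) + exp(V_reference)). The text does not spell out the exact formula used, so the sketch below is an assumption about the general approach. The coefficient values and level names are hypothetical and are not results from this study, and the reference utility (for example, the opt-out) is assumed to be normalised to zero.

```python
from math import exp


def utility(coefficients, active_levels, constant=0.0):
    """Observable utility: constant plus coefficients of the active levels."""
    return constant + sum(coefficients[name] for name in active_levels)


def acceptance_probability(v_scenario, v_reference=0.0):
    """Binary logit probability of choosing the scenario over the reference."""
    return 1.0 / (1.0 + exp(v_reference - v_scenario))


# Hypothetical coefficients, for illustration only.
coefficients = {"who=university_only": 0.40, "profit=none": 0.25}

v = utility(coefficients, ["who=university_only", "profit=none"])
p = acceptance_probability(v)
print(round(p, 3))
```

A scenario whose utility equals the reference utility has a probability of 0.5, so scenarios can be compared by how far their probabilities sit above or below that point.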