Study setting and period
The study setting includes 29 out of 34 Chinese provinces, excluding Hong Kong, Macau, Taiwan, Tibet and Hainan. This study utilized the China Labor-force Dynamics Survey (CLDS), which covers a series of relevant topics, such as family, education, employment, and health. CLDS was selected to conduct this study mainly because of its uniquely rich coding of occupations that enabled us to accurately identify healthcare workers, as we illustrate below. The Fee-For-Service scheme has been dominant in health care payment in China, where health services are unbundled and paid according to the actual volume of services provided. The Fee-For-Service payment scheme is criticized for incentivizing providers to induce unnecessary medical services, especially on drugs and examinations with high profit margins. Analysis of this study was conducted from May 2017 to December 2017.
Study design and data source
We used coarsened exact matching (CEM) to directly compare health care utilization and expenditure between patients affiliated and not affiliated with healthcare professionals. In the absence of applicable policy changes or other natural experiments, quasi-experimental matching methods are a viable strategy for causal inference. Prevalent matching methods, such as propensity score matching, often improve balance between the treated and control groups on some variables while worsening it on others. In other words, there is no guarantee of any level of overall imbalance reduction in any given data set [28]. CEM, by contrast, ensures that the imbalance between the matched treated and control groups will not exceed the level chosen ex ante by the user. The bound on balance for one covariate can be studied and improved in isolation, because changing it has no effect on the maximum imbalance of any other covariate [29]. King et al. (2011) further demonstrate that CEM dominates other matching methods in its ability to reduce imbalance, model dependence, estimation error, bias, variance, mean square error, and other criteria [30].
In the absence of well-established cost-benefit or risk-benefit analyses to assess the value of health care services, healthcare professionals' own demand for health care can be regarded as an important benchmark for judging supplier-induced demand (SID). In addition, we matched only on patient information, which helps mitigate the concern that differences in utilization are driven by the demand side and strengthens our confidence in the existence of SID.
The data were drawn from the China Labor-force Dynamics Survey (CLDS) conducted in 2014. CLDS is an open-access database and the first national longitudinal social survey targeted at the labor force in China, covering a series of topics, such as demographic characteristics, family, education, employment, work history, income, migration, and health (http://css.sysu.edu.cn/) [31,32]. A multistage stratified cluster random sampling method was used, and the subjects of CLDS were laborers (all family members aged 15–64) randomly selected from 29 provinces in China. The survey is conducted every two years and has accumulated three waves of data to date (2012, 2014, and 2016). All investigators were trained before the investigation and monitored during it. Computer-assisted personal interviewing (CAPI) technology was adopted to control data quality. The study was performed in 2017 using the 2014 wave then available, in which more than 800 investigators collected 401 village questionnaires, 14214 family questionnaires and 23594 individual questionnaires.
Occupation information on each family member was collected in CLDS, and occupations were classified using the codes of the fifth National Census. Following previous studies, our analysis defined patients affiliated with healthcare professionals as patients who were themselves healthcare workers or who had at least one family member who was a healthcare professional (see the healthcare professional list in Appendix 1). Patients not affiliated with healthcare professionals were defined as patients who were not healthcare workers and had no family member who was a healthcare professional. In total, we identified 806 individuals affiliated with healthcare professionals and 22788 individuals not affiliated with healthcare professionals for analysis.
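The affiliation rule above can be illustrated with a minimal sketch. The occupation codes, field names, and records below are hypothetical placeholders; they are not the Appendix 1 list or actual CLDS variable names.

```python
# Hypothetical occupation codes standing in for the healthcare
# professional list in Appendix 1 (not the real census codes).
HEALTHCARE_CODES = {"H01", "H02"}


def flag_affiliated(records):
    """Map person_id -> True if the person or any member of the same
    household holds a healthcare occupation, else False."""
    households_with_professional = {
        r["household_id"]
        for r in records
        if r["occupation_code"] in HEALTHCARE_CODES
    }
    return {
        r["person_id"]: r["household_id"] in households_with_professional
        for r in records
    }


# Illustrative records: household A contains one healthcare worker,
# household B contains none.
records = [
    {"person_id": 1, "household_id": "A", "occupation_code": "H01"},
    {"person_id": 2, "household_id": "A", "occupation_code": "X10"},
    {"person_id": 3, "household_id": "B", "occupation_code": "X10"},
]
flags = flag_affiliated(records)
```

Under this rule, persons 1 and 2 are both flagged as affiliated (the worker and the family member), while person 3 is not.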
Variables
The CLDS collected information on outpatient use in the two weeks preceding the survey and inpatient use in the year preceding the survey to measure health care utilization; these recall periods were used to limit recall bias in retrospective investigation [33]. We therefore generated four outcome variables for analysis: outpatient proportion (0-No, 1-Yes), outpatient expenditure (continuous variable measured in CNY), inpatient proportion (0-No, 1-Yes), and inpatient expenditure (continuous variable measured in CNY).
A series of socio-demographic variables that might be associated with health care utilization were considered for inclusion in the matching. The variables were chosen based on a literature review and data availability (see detailed definitions in Table 1) [4,8–10]. For the matching on health care utilization, we used a set of general variables: a health status indicator (self-reported question, 1-Healthy, 0-Fair/Unhealthy), coverage of health insurance (1-Yes, 0-No), living in urban or rural areas (1-Urban, 0-Rural), age (1-Aged 60 and over, 0-Aged under 60), gender (1-Male, 0-Female), educational attainment (0-Primary school, 1-Middle school, 2-High school and above), access to healthcare (log of time to the nearest medical facility) and economic status (log of household consumption per capita). For the expenditure analysis, we additionally used outpatient hospital tier (0-Primary, 1-Non-primary), inpatient hospital tier (0-Primary, 1-Secondary, 2-Tertiary) and inpatient reason (0-Else, 1-Disease, 2-Rehabilitation, 3-Fertility).
Analysis and Interpretation
We employed coarsened exact matching (CEM) to better balance the distributions of the covariates between the comparison groups and thereby reduce bias [28,29,34,35]. A key property of CEM, compared with propensity score matching (PSM), is that CEM fixes the maximum imbalance through an ex ante choice specified by the user, i.e., the user decides how the observed characteristics are to be coarsened. The user does not need to conduct further balance checking or restrict the data to common support, as PSM requires [28,29,34–36]. The matching approach helped to identify counterparts for patients affiliated with healthcare professionals based on observable pre-treatment characteristics. The general covariates were included in the matching for health care utilization; we further included hospital tier in the matching for per-outpatient expenditure, and hospital tier and inpatient reason in the matching for yearly inpatient expenditure. Overall, we carried out three coarsened exact matching processes in the study.
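The mechanics of CEM can be sketched as follows: coarsen each covariate into bins, match exactly on the resulting bin signature, discard strata that lack either group, and reweight the matched controls. This is a minimal illustration and not the Stata routine used in the study; the cutpoints, variable names, and data are assumptions made for the example.

```python
from collections import defaultdict


def coarsen(value, cutpoints):
    """Index of the bin containing value, given ascending cutpoints."""
    return sum(value >= c for c in cutpoints)


def cem_weights(units, coarsening):
    """Return one CEM weight per unit.

    units: list of dicts with a 0/1 'treated' key plus covariates.
    coarsening: {covariate: cutpoints, or None to match exactly}.
    Treated units in matched strata get weight 1; controls get
    (m_C / m_T) * (m_T_s / m_C_s); unmatched units get 0.
    """
    strata = defaultdict(list)
    for i, u in enumerate(units):
        key = tuple(
            u[name] if cuts is None else coarsen(u[name], cuts)
            for name, cuts in coarsening.items()
        )
        strata[key].append(i)

    # Keep only strata containing both treated and control units.
    matched = []
    for members in strata.values():
        t = [i for i in members if units[i]["treated"] == 1]
        c = [i for i in members if units[i]["treated"] == 0]
        if t and c:
            matched.append((t, c))

    total_t = sum(len(t) for t, _ in matched)
    total_c = sum(len(c) for _, c in matched)

    weights = [0.0] * len(units)
    for t, c in matched:
        for i in t:
            weights[i] = 1.0
        for i in c:
            weights[i] = (len(t) / len(c)) * (total_c / total_t)
    return weights


# Illustrative data: 'urban' is matched exactly, while the log of
# consumption is coarsened at hypothetical cutpoints 8.5 and 10.0.
units = [
    {"treated": 1, "urban": 1, "log_consumption": 9.2},
    {"treated": 0, "urban": 1, "log_consumption": 9.4},
    {"treated": 0, "urban": 1, "log_consumption": 9.1},
    {"treated": 1, "urban": 0, "log_consumption": 8.1},
    {"treated": 0, "urban": 0, "log_consumption": 8.3},
    {"treated": 0, "urban": 0, "log_consumption": 10.5},
]
coarsening = {"urban": None, "log_consumption": [8.5, 10.0]}
weights = cem_weights(units, coarsening)
```

In this toy example the last control falls in a stratum with no treated unit and is pruned with weight 0. The remaining control weights sum to the number of matched controls, so the weighted control group mirrors the stratum distribution of the treated group.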
After the matching, 7722 patients not affiliated with healthcare professionals and 677 affiliated patients were retained for the analysis of health care utilization; 387 non-affiliated and 32 affiliated patients were retained for the analysis of per-outpatient expenditure; and 195 non-affiliated and 31 affiliated patients were retained for the analysis of yearly inpatient expenditure. The balance check (Appendix 2) confirms that there were no statistically significant differences between the two groups.
Theoretically, with everything else equal, patients affiliated with healthcare professionals may use less health care and incur lower health care costs than non-affiliated patients because they have more information or higher health literacy. The difference in outcomes between the matched groups was regarded as supplier-induced demand and was assessed using 2-tailed t-tests with a significance threshold of P < 0.05. We converted health care expenditure using the average exchange rate in 2014 (100 USD = 614.28 CNY). Furthermore, we checked the robustness of our results using weighted regression analysis. All analyses were performed in Stata version 13.0 (Stata Corp LP, College Station, Texas, USA).
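As a rough illustration of the comparison step, the sketch below computes a matching-weighted difference in mean expenditure and converts it with the exchange rate stated above. The outcomes, treatment flags, and weights are invented for the example; the sketch is not the Stata code used in the study and omits the significance test.

```python
CNY_PER_USD = 6.1428  # from the stated rate, 100 USD = 614.28 CNY


def cny_to_usd(amount_cny):
    """Convert CNY to USD at the 2014 average exchange rate."""
    return amount_cny / CNY_PER_USD


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_difference(outcomes, treated, weights):
    """Weighted mean outcome of treated units minus that of controls.

    Units with weight 0 (unmatched) contribute nothing to either mean.
    """
    t = [(y, w) for y, d, w in zip(outcomes, treated, weights) if d == 1]
    c = [(y, w) for y, d, w in zip(outcomes, treated, weights) if d == 0]
    mean_t = weighted_mean([y for y, _ in t], [w for _, w in t])
    mean_c = weighted_mean([y for y, _ in c], [w for _, w in c])
    return mean_t - mean_c


# Hypothetical expenditures in CNY, treatment flags (1 = affiliated),
# and matching weights; the last unit is unmatched.
outcomes = [300.0, 500.0, 700.0, 200.0, 400.0, 900.0]
treated = [1, 0, 0, 1, 0, 0]
weights = [1.0, 0.75, 0.75, 1.0, 1.5, 0.0]

diff_cny = weighted_difference(outcomes, treated, weights)
diff_usd = cny_to_usd(diff_cny)
```

Here the affiliated group spends 250 CNY less on average than its weighted counterparts. The same point estimate is the coefficient on the treatment indicator in a weighted regression of the outcome on that indicator alone, which is one way to read the weighted regression robustness check.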