1. Data sources
The data consist of two parts. The first is panel statistical data for 2017–2019, drawn mainly from official sources: the Dalian Health Financial Annual Report, the Dalian Statistical Yearbook, the China National Health Accounts Report, and the Dalian Health Accounts Report. The second is patients’ medical expenses, which were collected from medical institutions in Dalian through a sampling survey.
2. Study sample
Multistage stratified cluster random sampling was used in this study. Samples at each stage were selected by lottery-style drawing (prefecture-level cities) and by computer program (streets, communities, and towns). Three strata were included and, to ensure that the sample was representative, the sampling proportion was 1/3 for each stratum [30]. In the first stratum, sample areas were chosen from Dalian city according to the completeness of their health information management systems and their level of economic development. Nine districts were selected as sample areas: Zhongshan, Xigang, Sha Hekou, Lv Shunkou, Gan Jingzi, Jinzhou, Wa Fangdian, Pu Landian, and Zhuanghe. In addition, Dalian municipal medical institutions and public health institutions were included. In the second stratum, one general hospital, one maternal and child healthcare hospital, one Center for Disease Control and Prevention (CDC), and one traditional Chinese medicine hospital were selected from each region. Five community health service centers, 20 township health centers, 17 clinics or outpatient departments, and three villages were selected from each district. In the third stratum, a total of 484 health and medical facilities and professional public health institutions were sampled in the selected areas according to the category and administrative level of each institution. Valid data were obtained from 408 institutions, including 48 primary health centers and community service centers, 312 village clinics and clinics, five maternal and child health care centers, five traditional Chinese medicine hospitals, eight specialized hospitals, 22 general hospitals, and 13 public health institutions (five maternal and child health institutions, six CDCs, and two other public health institutions). The remaining 76 institutions were excluded because their data were severely incomplete or lacked sufficient integrity.
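The 1/3 sampling proportion per stratum can be illustrated with a minimal sketch of a single selection stage. This is not the procedure used in the study: the sampling frame, the institution identifiers, the function names, and the random seed below are all hypothetical, and the sketch shows only simple random selection within each stratum, not the full multistage cluster design.

```python
import random


def sample_stratum(units, fraction, rng):
    """Draw a simple random sample of roughly `fraction` of one stratum."""
    k = max(1, round(len(units) * fraction))
    return rng.sample(units, k)


def stratified_sample(frame, fraction=1 / 3, seed=2019):
    """Sample every stratum at the same proportion.

    `frame` maps a stratum label to the list of unit identifiers in it.
    The seed is fixed only so that the illustration is reproducible.
    """
    rng = random.Random(seed)
    return {
        stratum: sample_stratum(units, fraction, rng)
        for stratum, units in frame.items()
    }


# Hypothetical sampling frame; the identifiers are invented for illustration.
frame = {
    "general_hospital": [f"GH{i:02d}" for i in range(9)],
    "community_health_center": [f"CHC{i:02d}" for i in range(15)],
    "village_clinic": [f"VC{i:02d}" for i in range(30)],
}

sampled = stratified_sample(frame)
for stratum, units in sampled.items():
    print(stratum, len(units), sorted(units))
```

Because the same proportion is applied within every stratum, each institution type is represented in the sample in proportion to its share of the frame.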
The basic information for all outpatients and inpatients included age, gender, disease, type of medical institution, insurance type, season, expenditure, region, etc. A total of 12,899,830 valid samples were collected, including 4,215,603 in 2017, 4,304,902 in 2018, and 4,379,325 in 2019. The collected sample data were cleaned and screened according to the codes of the International Classification of Diseases, 10th Revision (ICD-10).
Some patients in the survey data had multiple diagnoses. In this study, only patients whose first diagnosis was an NCD were selected; other complications were not considered. A total of 8,104,233 valid NCD records were selected: 2,637,681 in 2017, 2,678,359 in 2018, and 2,788,194 in 2019.
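The first-diagnosis rule described above can be sketched as a simple filter. The record layout, the field names, and in particular the list of ICD-10 prefixes are assumptions made for illustration; the prefixes are a placeholder and do not reproduce the NCD definition applied in the study.

```python
# Placeholder ICD-10 prefixes, for illustration only (assumption):
# "C" = malignant neoplasms, "I" = circulatory diseases, "E1" = diabetes.
NCD_PREFIXES = ("C", "I", "E1")


def first_diagnosis_is_ncd(record):
    """Return True when the first listed diagnosis matches an NCD prefix.

    Secondary diagnoses are ignored, mirroring the rule that complications
    are not considered.
    """
    diagnoses = record["diagnoses"]
    return bool(diagnoses) and diagnoses[0].upper().startswith(NCD_PREFIXES)


# Hypothetical records; the field names are invented for this sketch.
records = [
    {"id": 1, "year": 2017, "diagnoses": ["I10", "E11.9"]},    # kept
    {"id": 2, "year": 2018, "diagnoses": ["J18.9", "I25.1"]},  # NCD only secondary
    {"id": 3, "year": 2019, "diagnoses": ["E11.9"]},           # kept
    {"id": 4, "year": 2019, "diagnoses": []},                  # no diagnosis
]

ncd_records = [r for r in records if first_diagnosis_is_ncd(r)]
print([r["id"] for r in ncd_records])  # → [1, 3]
```

Record 2 is dropped even though it carries an NCD code, because that code is not the first diagnosis.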
3. Quality control and data management
The collected data were classified and coded according to ICD-10. Data extraction, auditing, cleaning, and calculation followed the basic accounting guidelines of the System of Health Accounts 2011 (SHA 2011) [31]. Participants in the data-cleaning process were trained by the National Health Commission of China. All data were entered electronically into a data terminal connected with STATA version 15.1 (StataCorp, College Station, TX, USA).
4. Analyses of influencing factors of hospitalization curative care expenditure for NCDs
A total of 827,513 inpatient records were extracted from all valid NCD records. Expenditures did not follow a Gaussian distribution but were log-normally distributed. Multiple stepwise regression was used to analyze the influencing factors. The independent variables were year, age, gender, length of stay, medical insurance (yes or no), surgery (yes or no), and institution level. The significance level was 0.05 for variable entry and 0.10 for variable removal. All statistical analyses were performed using STATA version 15.1 (StataCorp, College Station, TX, USA).
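The entry and removal thresholds can be illustrated with a minimal sketch of stepwise selection. It is written in Python rather than STATA and is not the authors' analysis. Several points are assumptions: expenditure is log-transformed before fitting (suggested by the log-normal distribution but not stated in the text), p-values use a large-sample normal approximation instead of the t distribution, and the data, coefficients, and variable names are synthetic.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_pvalues(columns, y):
    """Fit OLS with an intercept; return two-sided p-values per predictor."""
    names = list(columns)
    n, k = len(y), len(names) + 1
    X = [[1.0] + [columns[name][i] for name in names] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    pvals = {}
    for j, name in enumerate(names, start=1):
        z = beta[j] / math.sqrt(sigma2 * inv[j][j])
        pvals[name] = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return pvals


def stepwise(columns, y, p_enter=0.05, p_remove=0.10, max_steps=100):
    """Alternate forward entry (p < p_enter) and backward removal (p > p_remove)."""
    selected = []
    for _ in range(max_steps):
        changed = False
        best_name, best_p = None, p_enter
        for name in columns:
            if name in selected:
                continue
            trial = {c: columns[c] for c in selected + [name]}
            p = ols_pvalues(trial, y)[name]
            if p < best_p:
                best_name, best_p = name, p
        if best_name is not None:
            selected.append(best_name)
            changed = True
        if selected:
            pvals = ols_pvalues({c: columns[c] for c in selected}, y)
            worst = max(selected, key=lambda c: pvals[c])
            if pvals[worst] > p_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected


# Synthetic data (assumption): log-normal expenditure driven by two predictors.
rng = random.Random(0)
columns = {"length_of_stay": [], "surgery": [], "unrelated": []}
log_y = []
for _ in range(500):
    los = float(rng.randint(1, 30))
    surgery = float(rng.random() < 0.5)
    expenditure = math.exp(8.0 + 0.05 * los + 0.6 * surgery + rng.gauss(0, 0.5))
    columns["length_of_stay"].append(los)
    columns["surgery"].append(surgery)
    columns["unrelated"].append(rng.gauss(0, 1))
    log_y.append(math.log(expenditure))  # regress on the log scale

selected = stepwise(columns, log_y)
print(selected)
```

Setting the removal threshold above the entry threshold, as in the text, keeps a variable that has just entered from being removed in the same step.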