Data collection
The Chinese government established a routine reporting system for selected infectious diseases during the 1950s, and data are available for 31 provinces in mainland China (about 1.4 billion people). This system was web-based since 2003 and operates through administrative grading responsibility and territorial management13. Hong Kong, Macau, and Taiwan were not included in the present analysis because of a lack of data availability.
The newly notified TB cases examined in this study were extracted from the notifiable infectious disease report database, which is open and available from the public health and science data center of the Chinese Center for Disease Control and Prevention14. These data include the number of newly notified TB cases, and patient status as sputum smear-positive, sputum smear-negative, sputum not done, and sputum culture-positive TB. Due to the high risk of TB transmission, patients with smear-positive TB receive the greatest attention. Therefore, in addition to considering newly notified TB cases, we also examined newly smear-positive TB cases from 1997 to 2018 at the provincial level. In addition, we have contacted with National Center for Tuberculosis Control and Prevention to verify the data before analysis. According to the discussion, the annual data were generally correct and consistent with the data reported to the World Health Organization (WHO).
To explain factors attributed to disparities in disease burden in different provinces, four social-economic variables were described: (a) economic levels: gross domestic product per capita (PGDP); (b) demographic characteristic: population density (persons per square kilometers; PD); (c) accessibility to and coverage of health facilities: including number of beds in medical institutions (NBMI) and number of medical personnel (NMW). The health service data were collected from the China Health Statistical Yearbook15, and other socio-economic variables were obtained from the China Statistical Yearbook16
Statistical analysis
The provinces were divided into three regions (eastern, central, and western regions), as defined by the National Bureau of Statistics of China and according to sociodemographic status. The eastern region included Beijing, Fujian, Guangdong, Hainan, Hebei, Jiangsu, Liaoning, Shandong, Shanghai, Tianjin, and Zhejiang; the central region included Anhui, Heilongjiang, Henan, Hunan, Hubei, Jiangxi, Jilin, and Shanxi; and the western region included Chongqing, Gansu, Guangxi, Guizhou, Inner Mongolia, Ningxia, Qinghai, Shaanxi, Sichuan, Tibet, Xinjiang, and Yunnan17.
Annual notification rate was defined as the number of newly notified cases of TB per 100,000 people. For each province, the numbers of newly notified and smear-positive cases were calculated. For analysis of TB in the eastern, central, and western regions, the numbers of newly notified and smear-positive TB cases from all provinces within each region were calculated. Population data were from the China Statistical Yearbook16.
Joinpoint regression models were used to analyze the annual percentage changes of newly notified cases and their significance18. This analysis, which is often used to describe changes in trends, fits a series of joined linear models of the natural logarithm of annual incidence, using calendar year as an independent variable19. In this analysis, the response variable was the natural logarithm of the notification rate, and the independent variable was the year of notification. A maximum of five joinpoints were used for estimation, as suggested by the program developers, and the Bayesian information criterion (BIC) was used to select the model with the best fit. Once a joinpoint was determined, the annual percentage changes and 95% confidence intervals (CIs) of each period (segment) was determined. An annual percentage change was considered to be significant when its 95% CI did not include zero (p < 0.05). In describing trends, the terms “increase” and “decrease” were used when the slope (annual percentage change) was significant. The term “stable” was used to refer to a non-significant change (i.e., a stable notification rate or a notification rate that was unreported or only reported sporadically).
Then, spatial auto-correlation analysis was used to examine the spatial heterogeneity of annual percentage changes during the most recent period (i.e., after the last joinpoint if a joinpoint was identified, or the whole period if a joinpoint was not identified). Auto-correlation statistics are commonly used to examine spatial dependence in geographic data20. The spatial auto-correlation has two major features: (a) the global spatial auto-correlation, which indicates the overall relationship of all the research units in the area; and (b) the local indicators of spatial association (LISA), which indicate the influence of individual locations on the magnitude of the global statistic and the locations and types of clusters21. The spatial weights were created using the rook contiguity rule, and applied to describe the spatial relationships among provinces. The spatial distribution of annual percentage changes of notification rates from each province were determined by calculating the global Moran’s I and LISA. The following equation was used to calculate the global Moran’s I22:
The local Moran’s I was used to make the LISA cluster maps, and was computed as follows23:
where n is the number of provinces, xi and xj are the annual percentage changes of provinces i and j, is the average of the annual percentage changes of all provinces, and wij is the spatial weight matrix corresponding to the provinces pair i and j. The value of Moran’s I usually ranges from −1 to 1, with positive values representing a positive spatial correlation and negative values representing a negative spatial correlation.
IBM SPSS version 25 was used to compare notification rates in different regions and at different times. Joinpoint regression analysis was performed using the Joinpoint Regression Program version 4.8.0.1 (Statistical Research and Applications Branch, National Cancer Institute). ArcGIS version 10.2 (ESRI, Redlands, CA, USA) was used to perform the spatial auto-correlation analysis and to plot the maps.