The proportion of lineage 4 was high in the Sichuan-Chongqing region (29.6%) but low in North China (9.0%). Some studies have suggested that temperature and humidity can affect the incidence of TB and cause it to vary seasonally within the same area [27, 28]. However, lineage 4 of MTBC is widely distributed across the world, and the high-burden countries span different latitudes and climatic regions [2, 29]. Therefore, temperature and humidity may not be the important factors behind the significant differences in the distribution of lineage 4 in China. In addition, although modern human activities, facilitated by convenient transportation and deepening globalization, may disturb the geographical distribution of MTBC, its distribution still differs significantly among countries and continents [6, 29]. Therefore, modern human activities may also not be the main factor affecting the distribution of lineage 4. Because large-scale historical human activities can affect the spread of MTBC [30, 31], special historical events may have contributed to the particular distribution of lineage 4 in China.
Previous studies suggested that the early transmission of lineage 4 was related to the “Northern Route” migration of East Asians (about 15,000 to 18,000 years ago [32]), and that lineage 4 was introduced into China from Central Asia or Siberia and spread from north to south [33]. If the present distribution pattern of lineage 4 in China derived mainly from this route, the proportion of lineage 4 in North China should be higher than that in the Sichuan-Chongqing region and other areas of South China. However, the actual proportion of lineage 4 in China was higher in the south than in the north. This indicates that other factors have significantly influenced the formation of the lineage 4 population in China. Moreover, in the Sichuan-Chongqing region, the genetic diversity of lineage 4 was high, but the genetic differentiation among the main strains within this region was not significant [14]. This seems to imply that lineage 4 was introduced into the Sichuan-Chongqing region many times in history, but that the lineage 4 strains mainly circulating in this region today can be traced back to a limited number of introduction events. In addition, Liu et al. found that an important external introduction of lineage 4 into China occurred during A.D. 1150–1268; that the most recent common ancestors of the three main sub-lineages of lineage 4 circulating in South China (L4.2, L4.4 and L4.5) appeared in about A.D. 1208, 1268 and 1160, respectively; and that the MTBC population in China might have increased rapidly between A.D. 1300 and 1400, with the proportion of lineage 4 also increasing greatly [10]. Therefore, the lineage 4 that originated in this period and appeared in South China may be the main factor shaping the distribution of lineage 4 in China today.
The potential spread of lineage 4 to South China through the Maritime Silk Road
It is therefore necessary to study the path by which lineage 4 spread to China during A.D. 1150–1268. The geographic distribution of L4.5, a sub-lineage endemic to China that might have originated in South China [10, 29], was continuous. This pattern may be related to the expansion of the Mongol Empire [34]. In contrast, the regions with high proportions of L4.2 and L4.4 in Asia, Africa and Europe were discontinuous, which suggests that the geographical distribution of L4.2 and L4.4 might not have been caused primarily by the expansion of the Mongol Empire. In addition, the spread of L4.2 and L4.4 by a land route can largely be ruled out: had they spread overland, the countries along the route would have higher proportions of them, and the lineage 4 isolates from these countries would precede the lineage 4 isolates from China in evolutionary status. However, the proportions of L4.2 and L4.4 were low in Central Asia, Siberia and other regions through which a land route connecting Europe and China must pass. Furthermore, L4.2 and L4.4 isolates from some countries in the Middle East and West Asia were nested within the samples from China on the evolutionary tree [10], which means that the L4.2 and L4.4 isolates in these areas did not precede the Chinese samples in evolutionary status, and that L4.2 and L4.4 spread and diversified in China before being exported to these areas. Therefore, it is more likely that the lineage 4 introduced into China during A.D. 1150–1268 travelled from Europe to China by sea.
Around the 13th–14th centuries, the main maritime migration route was the Maritime Silk Road. Previous studies have shown that pathogens of infectious diseases could spread along the Maritime Silk Road in ancient times. For example, Yersinia pestis, which caused the most devastating plague in European history, probably arrived in Europe from China via the Maritime Silk Road [35, 36]. It is therefore feasible that lineage 4 reached the southeast coastal region of China from Europe by the same route. In addition, 31 main ports associated with the Maritime Silk Road were geographically related to regions with a high proportion of L4.2 and L4.4, which indicates that the geographical distribution of L4.2 and L4.4 in Asia, Africa and Europe may be related to the Maritime Silk Road. Moreover, the period of the external introduction of lineage 4 (A.D. 1150–1268) falls within the Southern Song Dynasty of China (A.D. 1127–1279 [37]), when foreign trade was highly developed. This dynasty relied mainly on ports such as Guangzhou and Quanzhou in the southeast coastal region for trade and population movement with Europe via the Maritime Silk Road [23, 38–40]. At that time, some foreigners who had arrived via the Maritime Silk Road lived in the southeast coastal region [41, 42]. This also provided suitable conditions for the introduction and local spread of lineage 4.
The potential spread of lineage 4 in South China caused by “Huguang Filling Sichuan”
The Sichuan-Chongqing region lies mainly in the Sichuan Basin, which in ancient times was geographically enclosed, with underdeveloped transport and limited population mobility. It was therefore not easy for MTBC to enter the Sichuan-Chongqing region through spontaneous person-to-person transmission. However, the proportion of lineage 4 was lower than in the Sichuan-Chongqing region in some provinces that are geographically closer to the southeast coastal region, the likely first site of the external introduction of lineage 4, and that allow easier population movement. Moreover, the most recent common ancestor of the largest strain complex of lineage 4 collected by Li et al. in the Sichuan-Chongqing region appeared during A.D. 1069–1498 [14]. This time range is close to the external introduction of lineage 4 identified by Liu et al. [9]. Therefore, the appearance in the Sichuan-Chongqing region of the lineage 4 that entered China in A.D. 1150–1268 was possibly not caused by simple population diffusion. Instead, a special population flow may have carried lineage 4 into the region. Historically, the migration event that occurred in the Sichuan-Chongqing region at the right time was “Huguang Filling Sichuan”.
Historical records document families sharing the same surname migrating from the Huguang region to the Sichuan-Chongqing region during “Huguang Filling Sichuan” [22, 43], and the geographical distribution of surname populations can reflect population migration [24, 25]. Therefore, the distribution of four surname populations (Zeng, Tang, Deng and Zhong) that, according to historical data, migrated from the Huguang region to the Sichuan-Chongqing region during “Huguang Filling Sichuan” [43] could represent the population migration of that event. The four surname populations were found to be distributed mainly in the Sichuan-Chongqing region, the Huguang region and the southeast coastal region, which is broadly consistent with the areas that had a high proportion of lineage 4 in China. The key distribution areas of the four surname populations all lay on the “Huguang Filling Sichuan” migration route. Moreover, there was a significant correlation between the proportion of the Zeng, Tang, Deng and Zhong surname populations and that of lineage 4 across the provinces included in the statistics (p < 0.05). This suggests that a large-scale population migration might have taken place among these areas, and that the entry into the Sichuan-Chongqing region of the lineage 4 that spread to China during A.D. 1150–1268 may be related to “Huguang Filling Sichuan” (Fig. 4). Moreover, over the more than 400 years of “Huguang Filling Sichuan” [22], lineage 4 was able to continue to circulate and spread in the above areas, resulting in the current distribution and high proportion of lineage 4 in South China, including the Sichuan-Chongqing region.
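A province-level correlation of this kind could be checked along the following lines. This is only a minimal sketch under stated assumptions: the text does not say which correlation statistic was used, so a Spearman rank correlation with a two-sided permutation test is assumed here, and the two input lists are made-up placeholder values, not data from this study. The function names are likewise illustrative.

```python
import random

def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so exact ties with the observed value are counted.
        if abs(spearman(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# HYPOTHETICAL placeholder values, one entry per province (not study data).
surname_pct = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]        # share of the four surnames (%)
lineage4_pct = [5.0, 8.0, 7.0, 15.0, 20.0, 30.0]    # share of lineage 4 isolates (%)

rho = spearman(surname_pct, lineage4_pct)
p_value = permutation_p_value(surname_pct, lineage4_pct)
print(f"Spearman rho = {rho:.3f}, permutation p = {p_value:.4f}")
```

A rank-based statistic is used in this sketch because provincial proportions are few in number and need not be normally distributed; a Pearson correlation would be an equally plausible reading of the text.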