Using data on 1008 HIV/AIDS patients in Kashgar from 2018 to 2022, this study described the epidemic distribution of HIV/PTB co-infection, analyzed its influencing factors, and developed and validated an XGBoost classification model. The co-infection rate in Kashgar was 33.6%, considerably higher than the 25% reported for new TB cases among HIV-positive children in the North Wollo zone of Ethiopia[16].
Analysis of the demographic characteristics of co-infected patients showed that HIV/AIDS patients with PTB were concentrated in the 30 to 60 year age range, that is, mainly young and middle-aged adults, and that males outnumbered females. This finding is similar to that of a study of tuberculosis patients in New Delhi, which found that men, HIV-TB co-infected patients, the elderly, and people with MDR-TB were at higher risk of death[17]. The male predominance may arise because adult men mostly work away from home and report more extramarital partners and a higher proportion of voluntary sex than women. In some areas with a high HIV burden, women do not know the HIV status of their male partners and are infected unknowingly. In addition, the rising proportion of men who have sex with men, particularly among the elderly, is contributing to an increase in HIV infection in that age group[18]. Co-infected patients were mainly farmers or in the "other" occupational category, with a generally low level of education and limited knowledge of infectious diseases. Farmers and migrant workers tend to hold more tolerant and open sexual attitudes, which may contribute to HIV infection to some extent. Co-infected patients were mainly distributed in Kashgar city, which has a dense population, a relatively developed economy, and a concentration of migrant workers[19]. Most co-infected patients were married. Because sexual contact is the main route of HIV transmission, married patients may be more likely to acquire HIV than unmarried, divorced, or widowed patients.
Patients in WHO clinical stage III or IV have weakened immunity, and patients with decreased CD4 cell counts are more susceptible to PTB and other infectious diseases[20]. In this study, however, stage II was a protective factor for co-infection relative to stage I. This differs slightly from the results of other studies, possibly because most patients in this study were in WHO clinical stage I, which may have biased the analysis. CD4 cell count is an important indicator of a patient's immune status and can also be used to monitor disease progression. A low CD4 count on the first test suggests that the patient's immunity is weak and that the patient is highly susceptible to co-infection with other infectious diseases. In this study, compared with CD4 ≥200 cells/mm³, CD4 <200 cells/mm³ was a risk factor for PTB infection in HIV/AIDS patients, which is consistent with the findings of other scholars[21]. White blood cell count is also an important indicator of the strength of the body's immunity and can help determine whether an infection is present; in HIV/AIDS patients, co-infection with other infectious diseases can cause an abnormally low or high white blood cell count. Related studies have shown that white blood cell count is a contributing factor to the development of other opportunistic infections in HIV/AIDS patients[22].
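To make the terms "risk factor" and "protective factor" concrete, the following minimal sketch computes a crude odds ratio with a Wald 95% confidence interval from a 2x2 table. The counts are hypothetical and are not taken from this study; the function name `odds_ratio_ci` is also an illustrative assumption. The study itself used logistic regression, which yields adjusted rather than crude odds ratios.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed, with outcome      b: exposed, without outcome
    c: unexposed, with outcome    d: unexposed, without outcome
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT study data): exposure = CD4 < 200 cells/mm3,
# outcome = PTB co-infection.
or_value, ci_low, ci_high = odds_ratio_ci(a=60, b=40, c=30, d=70)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An odds ratio whose interval lies entirely above 1 suggests a risk factor, while an interval entirely below 1 suggests a protective factor.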
In this study, a logistic regression model was constructed and compared with the XGBoost model. The AUCs of the two models differed significantly: the area under the ROC curve of the XGBoost model was significantly larger, indicating better overall performance. Comparison of the decision curves and calibration curves of the two models led to the same conclusion.
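As an illustration of what the AUC comparison measures, the sketch below computes the area under the ROC curve for two sets of predicted probabilities using the rank-based (Mann-Whitney) definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are hypothetical and do not come from this study, and the sketch does not include a significance test for the difference between the two AUCs.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes; scores: predicted probabilities.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and predicted probabilities (NOT study data).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores_model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.1]
scores_model_b = [0.6, 0.7, 0.3, 0.5, 0.4, 0.2, 0.45, 0.35]

auc_a = auc(labels, scores_model_a)
auc_b = auc(labels, scores_model_b)
print(f"AUC model A = {auc_a:.4f}, AUC model B = {auc_b:.4f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of about a thousand patients; a sort-based implementation would be preferable for larger data sets.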
Orwa James used logistic regression and regularized machine learning methods to predict PTB infection in HIV/AIDS patients. The results showed that logistic regression, Lasso, Ridge regression, and elastic net could all be used as effective methods for predicting tuberculosis in HIV patients, and that these techniques may be important for developing accurate and reliable prediction models[23]. Seboka Binyam Tariku et al. used machine learning algorithms to classify viral load and CD4 status in adults attending ART care. Their results showed that the XGBoost classifier was the best algorithm for viral load prediction in terms of sensitivity (97%), F1-score (96%), AUC (0.99), and accuracy (96%), followed by the RF model, which suggests that the XGBoost model has some application value[24]. Other studies have used machine learning models to predict the spread of HIV, with research factors similar to those in this study, such as patient age, marital status, type of residence, occupation, and route of HIV infection. Those results also showed that the XGBoost model was the best algorithm for predicting HIV status; it can help identify people who may need pre-exposure prophylaxis and supports HIV detection based on socio-behavioral factors[25].
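The performance measures quoted above (sensitivity, F1-score, accuracy) can all be derived from a confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from any of the cited studies, and the function name `classification_metrics` is an illustrative assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp: true/false positives; fn/tn: false/true negatives.
    """
    sensitivity = tp / (tp + fn)            # recall, true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (NOT from the cited studies).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```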
In summary, the XGBoost model has some practical value for predicting HIV transmission and associated comorbidities.
Limitation
This study analyzed only the available clinical and demographic parameters; lifestyle and socioeconomic factors were not included. Future studies should collect more detailed characteristics of the study subjects.