In this study, we leveraged data from a longitudinal observational study to train and test several machine learning algorithms and develop a predictive model of immune responses to SARS-CoV-2 mRNA primary and booster vaccination in PLWH. The specific aims were to forecast vaccine-elicited humoral responses in this vulnerable population from demographic and clinical variables that can be easily retrieved from electronic charts in clinical practice, and to simultaneously analyze the impact of these variables on antibody production over time.
We found that linear regression models performed suboptimally, whereas non-linear methods were significantly better at capturing the intricate relationships among variables. In particular, Random Forest regression was the best-performing algorithm for predicting vaccine-induced antibody responses.
Notably, the key clinical factors influencing vaccine humoral immunogenicity identified by the Random Forest model were previous SARS-CoV-2 infection, CD4 T-cell count, CD4/CD8 ratio, BMI, and the time between the primary vaccination cycle and the booster dose. Specifically, SARS-CoV-2 infection before vaccine administration appeared to positively influence vaccine-elicited antibody levels. By contrast, low CD4 T-cell counts, low CD4/CD8 ratios, and low BMI values were associated with reduced antibody responses to the vaccine. Lastly, a longer interval between the primary vaccination cycle and the booster dose was associated with higher antibody levels after booster administration.
Remarkably, all these associations are consistent with findings from other studies [3, 9, 10, 12, 23, 24], further supporting the validity of our model. Indeed, hybrid immunity, derived from the combination of natural infection and vaccination, has been shown to confer immune protection that is greater in both magnitude and durability than that provided by either vaccination or infection alone [25]. Furthermore, poor immune recovery despite ART has been specifically associated with reduced humoral and T-cell responses to SARS-CoV-2 vaccines in PLWH [3, 9–12]. Like people with obesity, underweight individuals develop weaker immune responses to SARS-CoV-2 vaccines [24], owing to a severe impairment of the immune system [26]. Lastly, several studies have demonstrated that an extended interval between SARS-CoV-2 mRNA vaccine doses results in stronger humoral responses [27–30], because the decline in antibody levels limits the Fc-mediated clearance of vaccine-encoded antigens, thus allowing de novo priming of B cells [30].
Additionally, our machine learning approach informs the choice of modeling strategies for studies aiming to predict outcomes that involve complex biological mechanisms. Indeed, we demonstrated that these associations are not linear, and thus more nuanced than previously believed, because the factors interact with one another in shaping vaccine-induced humoral responses.
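To illustrate why a linear model can miss interaction-driven associations, the following minimal sketch uses purely synthetic data, not the study cohort. It assumes two hypothetical standardized predictors whose product drives the outcome, and it compares an ordinary least-squares fit with a simple k-nearest-neighbors regressor. The k-nearest-neighbors model is only a dependency-free stand-in for a non-linear learner; it is not the Random Forest used in this study, and all function names and parameters are illustrative assumptions.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return lambda x: beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def fit_knn(X, y, k=10):
    """k-nearest-neighbors regression: average the outcome of the k closest training points."""
    def predict(x):
        order = sorted(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
        )
        return sum(y[i] for i in order[:k]) / k
    return predict


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def simulate(n, rng):
    """Synthetic data (assumption): the outcome depends only on the product of two predictors."""
    X = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    y = [2.0 * a * b + rng.gauss(0, 0.1) for a, b in X]
    return X, y


rng = random.Random(0)
X_train, y_train = simulate(400, rng)
X_test, y_test = simulate(200, rng)

linear = fit_linear(X_train, y_train)
knn = fit_knn(X_train, y_train)

r2_linear = r2(y_test, [linear(x) for x in X_test])
r2_knn = r2(y_test, [knn(x) for x in X_test])
print(f"linear R^2: {r2_linear:.2f}, kNN R^2: {r2_knn:.2f}")
```

In this toy setting, neither predictor has a marginal linear effect, so the linear fit explains almost none of the held-out variance, whereas the non-linear learner recovers most of it. Real clinical predictors are expected to combine additive and interaction effects, so the contrast would be less extreme.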
More interestingly, the model's dependence on the identified predictors decreased over time. This indicates that, while the aforementioned factors may play a critical role in dictating humoral immunogenicity to the primary vaccine cycle in PLWH, their importance wanes significantly over time, so that antibody responses to booster shots are uniform across the entire population regardless of demographic and clinical features.
It is important to highlight that the model presented herein was developed using data from individuals vaccinated with the Spikevax™ mRNA vaccine (Moderna). Expanding the investigation to different vaccine platforms and heterologous prime-boost combinations, along with data from other fragile populations and vaccine antigens, will strengthen these findings and provide valuable insights for the design of future vaccination strategies.