Demographics
The total sample consisted of 2559 patients. The patients came from Egypt (767, 29.98%), Pakistan (448, 17.47%), India (401, 15.68%), Syria (403, 15.75%), and ‘Other’ countries (540, 21.11%). Of the sample, 41.5% were male (1062 patients) and 58.5% were female (1497 patients). The largest age group was 40-49 years old (1371, 53.6%), followed by 50-59 years old (314, 12.3%), 18-29 years old (269, 10.5%), > 60 years old (255, 10.0%), and 30-39 years old (250, 9.8%). Most of the sample were non-smokers (2041, 79.7%), while 249 (9.7%) were occasional smokers and 269 (10.5%) were regular smokers. The mean body mass index (BMI) of the cohort was 25.43 (±5.06), and about a third of the patients (35.7%) were healthcare professionals. (Tables 1 and 2) (Figure 1)
Pre-Infection Medical History
The pre-existing conditions can be grouped into six categories: gastrointestinal, hematological, respiratory, neurological, allergies, and other conditions. The gastrointestinal conditions include acid reflux disease (200, 8.2%) and irritable bowel syndrome (176, 7.2%). The hematological conditions include anemia (243, 9.9%) and vitamin D deficiency (227, 9.3%). The respiratory conditions include asthma (156, 6.4%) and mold infections (312, 12.8%). The neurological conditions include tinnitus (198, 8.1%), vertigo/dizziness (384, 15.7%), migraine (203, 8.3%), and insomnia (241, 9.9%). The allergies include environmental (dust) allergies (312, 12.8%), food allergies (132, 5.4%), and allergies of unknown origin (94, 3.8%). The other conditions include vision problems (249, 10.2%), type 2 diabetes (168, 6.9%), hypertension (236, 9.7%), and a mental health diagnosis (235, 9.6%). (Table 3)
Vaccination, Hospitalization, and Treatments
Of the total sample, 304 patients (12.4%) were hospitalized, with a subset of 252 (10.3%) requiring oxygen support. Most patients (1853, 75.8%) received a COVID-19 vaccination. The distribution of vaccine types among the vaccinated individuals was as follows: CoronaVac (Sinovac) was administered to 525 patients (21.5%), COVID-19 Vaccine AstraZeneca (AZD1222) to 563 patients (23.0%), BBIBP-CorV (Sinopharm) to 242 patients (9.9%), Comirnaty (BNT162b2, Pfizer/BioNTech) to 257 patients (10.5%), and Moderna or other types to 246 patients (10.1%).
Regarding treatments, the use of common medications was reported as follows: paracetamol was used by 871 patients (35.6%), aspirin by 709 patients (29.0%), ibuprofen and naproxen each by 459 patients (18.8%), azithromycin by 903 patients (36.9%), and steroids by 496 patients (20.3%). Additionally, antioxidants were used by 192 patients (7.9%), type one antihistamines by 358 patients (14.6%), type two antihistamines by 232 patients (9.5%), and omega-3 supplements by 215 patients (8.8%). (Table 4)
Symptoms of COVID-19 Infection
The most frequent symptoms among the patients were loss of smell (46.8%), dry cough (40.1%), loss of taste (37.8%), headaches (37.2%), and sore throat (28.9%). Symptoms lasted for an average of 13.63 ± 17.50 days. The patients also reported depression (47.7%), chronic fatigue (6.5%), and infection after vaccination (24.2%). Other symptoms that affected more than 10% of the patients included migraine, tinnitus, reproductive and urinary symptoms, dizziness or vertigo, memory loss, brain fog, mood symptoms, temperature symptoms, cardiovascular symptoms, gastrointestinal symptoms, respiratory symptoms, and muscle and joint symptoms. (Table 5) (Figure 2)
Female Health
The study involved 1322 women who had COVID-19 infection. The majority of the women (80.2%) had normal/regular menstrual cycles during the infection, while only a small proportion (2.0%) were pregnant. Prolonged symptoms (> 2 weeks) were observed in 68.1% of the women. Chronic fatigue was reported by 7.6% of the women after the infection. (Table 6)
The results revealed that normal/regular menstrual cycles were significantly associated with prolonged symptoms (p<0.001, SMD 0.368). Pregnancy did not have any significant effect on the duration, severity, or frequency of symptoms, chronic fatigue, or depression. (Tables 7 and 8)
The multivariate analysis confirmed that normal/regular menstrual cycles were an independent risk factor for prolonged symptoms (OR 1.50, p=0.017). (Table 9)
Infection after Vaccination and Associated Symptoms
Of 2337 participants, 734 (31.4%) did not receive the vaccine, 1050 (44.9%) received the vaccine and avoided reinfection, and 553 (23.7%) received the vaccine but were reinfected. The groups differed significantly in the prevalence of depression (p=0.037): the unvaccinated group (50.3%) and the reinfection group (48.6%) had higher rates of depression than the non-reinfection group (44.4%). The groups also differed significantly in the proportion of participants who had symptoms lasting more than two weeks (p<0.001): the unvaccinated group (78.7%) had the highest proportion of participants with long-lasting symptoms, followed by the non-reinfection group (69.5%) and the reinfection group (59.1%). The groups differed significantly in the occurrence of chronic fatigue (p<0.001) as well: the reinfection group (9.4%) had the highest occurrence of chronic fatigue, followed by the unvaccinated group (7.4%) and the non-reinfection group (3.9%). (Tables 11 and 12)
The multivariable analysis revealed that the following symptoms were significantly associated with an increased risk of infection after vaccination: mood symptoms (increased anger) (OR 1.33, p=0.051), sleeping symptoms (insomnia) (OR 1.44, p=0.043), respiratory symptoms (pain or burning in the chest) (OR 1.58, p=0.009), muscle and joint symptoms (joint pain) (OR 1.46, p=0.005), gastrointestinal symptoms (abdominal pain) (OR 1.53, p=0.004), and chronic fatigue (OR 2.32, p<0.001). The symptoms that were significantly associated with a decreased risk of infection after vaccination in the multivariable analysis were mood symptoms (increased irritability) (OR 0.59, p=0.003), insomnia (waking up several times during the night) (OR 0.70, p=0.042), respiratory symptoms (sore throat) (OR 0.79, p=0.075), and muscle and joint symptoms (muscle aches) (OR 0.59, p<0.001). (Table 13)
Risk Factors for Long Infection Duration
The multivariate logistic model revealed that the factors significantly associated with prolonged symptoms (> 2 weeks) were vaccination (OR 1.58, p<0.001), migraine (OR 1.48, p=0.036), and naproxen use (OR 1.74, p=0.002), which increased the odds of having prolonged symptoms, and vertigo or dizziness before the infection (OR 0.68, p=0.002), hospitalization (OR 0.36, p<0.001), anemia (OR 0.62, p=0.003), hypertension (OR 0.62, p=0.003), lower BMI (OR 0.97, p=0.006), aspirin use (OR 0.61, p=0.001), azithromycin use (OR 0.71, p=0.001), and steroid use (OR 0.71, p=0.004), which decreased the odds of having prolonged symptoms. There was no significant association between the duration of symptoms and the use of paracetamol, ibuprofen, antioxidants, type two antihistamines, or omega-3, or the presence of irritable bowel syndrome (IBS). (Table 14)
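Because an odds ratio multiplies the odds rather than the probability, its effect on the probability of prolonged symptoms depends on the baseline. The following is a minimal sketch of that conversion; the function name `apply_odds_ratio` and the 70% baseline are illustrative assumptions and are not estimates from this study.

```python
def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Return the probability obtained by multiplying the baseline odds by an odds ratio."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    # Hypothetical baseline probability of prolonged symptoms (an assumption).
    baseline = 0.70
    # Odds ratios taken from the text above.
    for factor, odds_ratio in [("vaccination", 1.58), ("hospitalization", 0.36)]:
        probability = apply_odds_ratio(baseline, odds_ratio)
        print(f"{factor}: OR {odds_ratio:.2f} -> probability {probability:.3f}")
```

The same odds ratio produces a smaller absolute change in probability when the baseline is close to 0 or 1, which is why odds ratios should not be read as risk ratios.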
Symptoms Associated with Prolonged Infection Duration
The results indicated that runny nose (OR 1.25, p=0.044) was the only symptom that increased the odds of having prolonged symptoms (> 2 weeks), by 25%, while the other symptoms decreased the odds of having prolonged symptoms by 20% to 40%. These symptoms were brain fog (OR 0.66, p=0.008), anxiety (OR 0.76, p=0.046), anger (OR 0.77, p=0.036), loss of smell (OR 0.72, p=0.001), tachycardia (OR 0.75, p=0.032), loss of appetite (OR 0.73, p=0.020), shortness of breath (OR 0.73, p=0.010), dry cough (OR 0.80, p=0.027), and abdominal pain (OR 0.60, p<0.001). (Table 15)
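The percentage changes quoted for these odds ratios follow from (OR - 1) × 100. A minimal sketch of that arithmetic, using the odds ratios reported above, is given here; the function and variable names are illustrative only.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio to a percent change in odds, rounded to one decimal place."""
    return round((odds_ratio - 1.0) * 100.0, 1)


# Odds ratios as reported in the text for prolonged symptoms (> 2 weeks).
reported = {
    "runny nose": 1.25,
    "brain fog": 0.66,
    "anxiety": 0.76,
    "anger": 0.77,
    "loss of smell": 0.72,
    "tachycardia": 0.75,
    "loss of appetite": 0.73,
    "shortness of breath": 0.73,
    "dry cough": 0.80,
    "abdominal pain": 0.60,
}

if __name__ == "__main__":
    for symptom, odds_ratio in reported.items():
        print(f"{symptom}: {percent_change_in_odds(odds_ratio):+.1f}% change in odds")
```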
Chronic Fatigue Syndrome
The results indicated that the factors associated with a significantly higher risk of chronic fatigue syndrome were a mental health diagnosis before COVID-19 (OR 2.11, p=0.001), a vertigo or dizziness diagnosis before COVID-19 (OR 2.73, p<0.001), regular smoking (OR 2.39, p=0.005), hospitalization due to COVID-19 (OR 2.41, p<0.001), vitamin D deficiency (OR 2.74, p<0.001), asthma (OR 2.01, p=0.013), type 2 diabetes (OR 2.45, p=0.001), nightmares (OR 2.51, p=0.001), and paracetamol use (OR 1.49, p=0.029). The factors associated with a significantly lower risk of chronic fatigue syndrome were male sex (OR 0.64, p=0.040), good health status before COVID-19 (OR 0.64, p=0.023), and type one antihistamine use (OR 0.52, p=0.024). (Table 16)
Symptoms Associated with Chronic Fatigue
The results indicated that the symptoms associated with a significantly higher risk of chronic fatigue were brain fog symptoms such as poor attention or concentration (OR 1.52, p=0.045), increased mood symptoms such as anxiety (OR 1.62, p=0.028) and depression (OR 2.41, p<0.001), insomnia or difficulty falling asleep (OR 2.10, p<0.001), cardiovascular symptoms such as thumping or skipped beats (OR 1.93, p=0.002), gastrointestinal symptoms such as loss of appetite (OR 1.83, p=0.002), joint pain (OR 1.53, p=0.036), and muscle aches (OR 1.51, p=0.041). The only symptom associated with a significantly lower risk of chronic fatigue was an altered sense of smell (OR 0.48, p=0.019). There was no significant association between the other symptoms and chronic fatigue. (Table 17)
Depression
The factors that significantly increased the risk of depression in the multivariable analysis were: age group 40-49 (OR 1.43, p=0.032), being a healthcare professional (OR 1.24, p=0.034), having less than good health status before COVID-19 (OR 2.27, p<0.001), having a mental health diagnosis before COVID-19 (OR 3.90, p<0.001), having tinnitus before COVID-19 (OR 1.88, p=0.001), having vertigo or dizziness before COVID-19 (OR 2.54, p<0.001), smoking occasionally (OR 1.41, p=0.027), requiring oxygen support during COVID-19 hospitalization (OR 2.30, p<0.001), having anemia as a preexisting condition (OR 1.40, p=0.041), having type 2 diabetes as a preexisting condition (OR 1.95, 95% CI 1.32-2.90, p=0.001), having migraine as a preexisting condition (OR 1.55, p=0.016), having irritable bowel syndrome as a preexisting condition (OR 2.03, p<0.001), having insomnia as a preexisting condition (OR 1.74, p=0.001), having nightmares as a preexisting condition (OR 1.79, p=0.018), using aspirin (OR 1.70, p=0.001), and using omega-3 (OR 1.53, p=0.010). The factors that significantly decreased the risk of depression in the multivariable analysis were: male sex (OR 0.64, p<0.001), having more than good health status before COVID-19 (OR 0.68, p<0.001), having vision problems as a preexisting condition (OR 0.70, p=0.026), using paracetamol (OR 0.76, p=0.006), and using type two antihistamines (OR 0.50, p<0.001). (Table 18)
Symptoms Associated with Depression
The symptoms that significantly increased the risk of depression in the multivariable analysis were: migraine (OR 1.47, p=0.003), tinnitus (OR 1.47, p=0.039), dizziness or vertigo (OR 1.49, p=0.003), brain fog symptoms (poor attention or concentration) (OR 2.21, p<0.001), increased mood symptoms (depression) (OR 2.22, p<0.001), increased mood symptoms (anger) (OR 1.38, p=0.020), increased mood symptoms (difficulty controlling emotions) (OR 2.12, p<0.001), insomnia (difficulty falling asleep) (OR 1.50, p=0.001), insomnia (waking up several times during the night) (OR 2.65, p<0.001), cardiovascular symptoms (tachycardia) (OR 1.43, p=0.013), and gastrointestinal symptoms (abdominal pain) (OR 1.44, p=0.010). The symptoms that significantly decreased the risk of depression in the multivariable analysis were reproductive and urinary symptoms (OR 0.63, p=0.009) and respiratory symptoms (sore throat) (OR 0.80, p=0.042). (Table 19)
Encoder Bottleneck Layer Output
The AIC was 26190.1; as a relative measure, it is informative only in comparison with alternative models fitted to the same data. The R-squared was 0.35, indicating that the model explained 35% of the variance in the encoder bottleneck layer output. The adjusted R-squared was 0.34, close to the unadjusted value, indicating that the fit was not inflated by the number of predictors in the model.
The following factors had a positive relationship with the encoder bottleneck layer values: having tinnitus, vertigo, or dizziness before COVID-19 infection (coefficient = 33.73, p<0.001), being hospitalized due to COVID-19 infection (coefficient = 14.75, p<0.001), and having insomnia before COVID-19 infection (coefficient = 23.61, p<0.001).
The following factors had a negative relationship with the encoder bottleneck layer values: being male (coefficient = -5.18, 95% CI = -9.69 to -0.66, p=0.025), taking type one antihistamine medication (coefficient = -0.08, p=0.026), and having vitamin D deficiency before COVID-19 infection (coefficient = -21.69, p<0.001). (Table 25)
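Adjusted R-squared penalizes R-squared for the number of predictors, so the gap between the two reflects model size relative to sample size. The standard formula can be sketched as follows; the sample size and predictor count in the example are hypothetical, since the text does not report them for this model.

```python
def adjusted_r_squared(r_squared: float, n_observations: int, n_predictors: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    residual_df = n_observations - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / residual_df


if __name__ == "__main__":
    # Hypothetical values for illustration only (not taken from the study).
    n_observations = 2000
    n_predictors = 30
    value = adjusted_r_squared(0.35, n_observations, n_predictors)
    print(f"adjusted R-squared: {value:.3f}")
```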
Embeddings Output
The AIC was 4483.7, and the R-squared was 0.40, indicating that the model explained 40% of the variance in the SCARF embeddings.
The multivariable analysis showed that 5 out of 11 factors included in the model were significantly associated with the SCARF embeddings.
The factors with positive coefficients were tinnitus before COVID-19 (coefficient = 1.61, p<0.001), vertigo or dizziness before COVID-19 (coefficient = 0.22, p<0.001), and hospitalization (coefficient = 0.20, 95% CI = 0.12-0.28, p<0.001).
The factors with negative coefficients were type one antihistamine use (coefficient = -0.08, p=0.026) and increased irritability (mood symptoms) (coefficient = -0.59, p<0.001). (Table 26)
Prediction Models
The AutoML algorithm selected the GBM model as the best-performing model, which we then fine-tuned and tested for accuracy.
The GBM model for chronic fatigue had a high AUC of 0.87 but an accuracy of 0.7349, which was lower than the no-information rate (NIR) of 0.9353. The model had a high PPV of 0.9908, a moderate sensitivity of 0.7232, and a high specificity of 0.9032, but a low NPV of 0.1842 and a low kappa of 0.2224. The numbers of false positives and false negatives differed significantly.
The GBM model for depression had a high AUC of 0.82 and a high accuracy of 0.762, which was better than the NIR of 0.5198. The model had a high PPV of 0.7586, a high NPV of 0.7661, a high sensitivity of 0.7952, and a moderate specificity of 0.7261. The model had a moderate kappa of 0.5223, and the numbers of false positives and false negatives did not differ significantly.
The GBM model for symptom duration had a moderate AUC of 0.74 and a moderate accuracy of 0.6931, which was not significantly better than the NIR of 0.6827. The model had a low PPV of 0.5122, a high NPV of 0.8285, a moderate sensitivity of 0.6908, and a moderate specificity of 0.6942. The model had a moderate kappa of 0.3521, and the numbers of false positives and false negatives differed significantly. (Table 27)
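The metrics reported for each model all derive from a 2×2 confusion matrix, and comparing accuracy with the no-information rate (the share of the majority class) shows whether a model beats always predicting that class. A minimal sketch of these definitions follows; the class and function names are illustrative, and the counts in the example are hypothetical rather than the study's confusion matrices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


def summarize(cm: ConfusionMatrix) -> dict:
    """Compute standard classification metrics; assumes every row and column total is non-zero."""
    n = cm.tp + cm.fp + cm.fn + cm.tn
    actual_pos, actual_neg = cm.tp + cm.fn, cm.fp + cm.tn
    pred_pos, pred_neg = cm.tp + cm.fp, cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    expected = (actual_pos * pred_pos + actual_neg * pred_neg) / (n * n)
    return {
        "accuracy": accuracy,
        "nir": max(actual_pos, actual_neg) / n,  # no-information rate
        "sensitivity": cm.tp / actual_pos,
        "specificity": cm.tn / actual_neg,
        "ppv": cm.tp / pred_pos,
        "npv": cm.tn / pred_neg,
        "kappa": (accuracy - expected) / (1.0 - expected),
    }


if __name__ == "__main__":
    # Hypothetical, heavily imbalanced counts (an assumption for illustration).
    metrics = summarize(ConfusionMatrix(tp=720, fp=7, fn=280, tn=63))
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With a strongly imbalanced outcome, a model can combine a high PPV with a low NPV and an accuracy below the NIR, which is why kappa and the NIR comparison are reported alongside accuracy.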
Model Deployment
The best-performing models were then implemented in a Shiny app and deployed online at: [https://ahmedshaheen.shinyapps.io/shaheen-covid-19/]
An offline version of the application can also be downloaded from: [link]