At the time of the analysis, many records were found to be incomplete or missing, and the sample was therefore reduced to 9 857 observations.
As shown in Table 1, 81.8% of the ASs were men. Median age was 23 years (range 18-73; IQR 20.0-27.4). The most common age group was 18-24 years (62.80%).
Data were collected from Q3/16 onward. The number of arrivals at the center was highest in 2016 (peak in Q4/16, n=3 803, 38.58%) and gradually declined over the following years. Length of stay was available for 6 938 observations: the median was 11 days (IQR 4-28), and 77% of these individuals stayed for less than a month.
Most of the sample came from Africa, in particular from SSA (n=6 929, 70.31%). Non-African arrivals were from the Middle East (n=568, 5.76%) and East (n=457, 4.64%).
A total of 2 365 individuals (24%) arrived at the center with at least one disease, while 1 684 (17.09%) developed at least one during their stay.
On arrival, the most frequent diseases were skin (27.71%), generic (20.88%), digestive (14.73%) and respiratory (14.46%) diseases (Table 2), whereas during the stay diseases of class R (respiratory) increased considerably (25.70%) and class S (skin) decreased notably (12.29%) (Table 3). There were 232 hospitalizations, also classified using ICPC-2. The most common causes were generic causes (n=35, 15.09%), pregnancy issues (n=34, 14.66%) and respiratory diseases (n=28, 12.07%). Among women, 198 (11.06%) were pregnant on arrival.
Prevalence of scabies was high among African arrivals: 882 cases were detected, of which 39.46% were from HoA and 57.82% from SSA.
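As a small arithmetic aid, the reported shares can be converted back into approximate case counts. The sketch below assumes that both percentages refer to the 882 detected cases; the variable names and the "remainder" label are illustrative and do not come from the study.

```python
# Back-calculate approximate scabies case counts from the reported shares.
TOTAL_SCABIES = 882                      # cases reported in the text
shares = {"HoA": 39.46, "SSA": 57.82}    # percentages reported in the text

# Assumption: each share is a percentage of the 882 detected cases.
counts = {area: round(TOTAL_SCABIES * pct / 100) for area, pct in shares.items()}

# Cases not attributed to HoA or SSA (label chosen here for illustration).
counts["remainder"] = TOTAL_SCABIES - sum(counts.values())

for area, n in counts.items():
    print(f"{area}: {n} cases ({100 * n / TOTAL_SCABIES:.2f}%)")
```

Under this assumption, the two shares correspond to 348 and 510 cases, leaving 24 cases from the remaining areas.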
3.1 First Tree
Men model (Figure 1): The model yielded a segmentation into 5 leaves. The strongest predictor of DoA was Area of Origin, with a split separating HoA and SSA from all other areas. Quarter provided further splits within the subgroups. In the branch for the other areas (node 2), arrivals in the 2018 quarters had a 27.5% probability of arriving with a disease (node 4, n=102), while in the earlier periods the probability was lower (node 3, n=1 038, p=12.2%). In the HoA and SSA branch (node 5), a first split separated Q4/16, Q1/18 and Q2/18 from the other quarters (node 6, n=3 895, p=21.9%). In node 7, Area of Origin determined another split between HoA (node 8, n=543, p=46.2%) and SSA (node 9, n=2 471, p=27.2%).
Women model (Figure 2): The model yielded a segmentation into 3 leaves. The strongest predictor of DoA was Quarter: node 2 described the largest subgroup of women (n=978, p=17.7%), who arrived in Q3/16 and in 2017. For the other quarters (no observations from East), Area of Origin determined a second split between HoA (node 5, n=219, p=46.6%) and the remaining areas (n=593, p=26.6%).
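To make the reading of a tree concrete, the men model for DoA described in this section can be written as a set of nested rules. This is only a sketch: the node numbers and probabilities are those reported above, while the function name, the set names and the string labels for areas and quarters are illustrative. It also assumes that node 7 (the node split again by Area of Origin) is the Q4/16, Q1/18, Q2/18 branch.

```python
# Men model for disease on arrival (DoA), rewritten as nested rules.
# Node ids and probabilities are taken from the text; all names are illustrative.
HOA_SSA = {"HoA", "SSA"}
QUARTERS_2018 = {"Q1/18", "Q2/18"}
# Assumption: node 7 (split again by Area of Origin) holds these quarters.
NODE7_QUARTERS = {"Q4/16", "Q1/18", "Q2/18"}


def men_doa_leaf(area: str, quarter: str) -> tuple:
    """Return (terminal node id, reported probability of arriving with a disease)."""
    if area not in HOA_SSA:
        # Node 2: all areas other than HoA and SSA, split by Quarter.
        if quarter in QUARTERS_2018:
            return 4, 0.275
        return 3, 0.122
    # Node 5: HoA and SSA, split first by Quarter.
    if quarter not in NODE7_QUARTERS:
        return 6, 0.219
    # Node 7: split by Area of Origin.
    if area == "HoA":
        return 8, 0.462
    return 9, 0.272


if __name__ == "__main__":
    for area, quarter in [("HoA", "Q4/16"), ("SSA", "Q1/17"), ("Middle East", "Q1/18")]:
        node, p = men_doa_leaf(area, quarter)
        print(f"{area}, {quarter}: node {node}, p = {p:.1%}")
```

Each path from the root to a return statement corresponds to one of the 5 leaves, which is how the figures for the other models can be read as well.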
3.2 Second Tree
Men model (Figure 3): Quarter was the strongest predictor of the response variable. The model produced 4 terminal nodes. During Q3/16, Q3/17 and Q4/17 (node 7, n=463), the most frequent diseases were of type A (33.7%). In the branch for the other quarters (no observations from CSA), a further split was performed on Area of Origin: for arrivals from HoA, the most frequent category was S (40% in Q2/17 and Q4/16, node 6, n=145), particularly in Q1/17, Q1/18 and Q2/18 (node 5, n=40, S=75%). For the other areas (node 3, n=942), the distribution was A=19.9%, D=16.1%, Other=20.9%, S=28.1%.
Women model (Figure 4): Quarter was the only predictor of ICPC class on arrival, and the model performed a single split. For arrivals in Q3/16, Q2/17, Q3/17 and Q4/17 (node 3, n=133), the most frequent class of diseases was A (34.6%), followed by Other (27.1%). For the other quarters (node 2, n=237), the most frequent classes were S (33.8%) and Other (25.3%). No observations were available from East, Middle East and NA.
3.3 Third Tree
Men model (Figure 5): The model yielded a segmentation into 4 leaves. The strongest predictor was Quarter, with a split between 2017 and Q2/18 (node 2) and the remaining quarters. For the latter subgroup (node 7, n=4 656), the likelihood of developing a disease during the stay was very low (5.7%). Quarter provided a further split within the first subgroup: Q1/17 and Q2/17 (node 3) versus Q3/17, Q4/17 and Q2/18 (node 6, n=1 406, 22.8%). Node 3 was split again on Area of Origin, giving two terminal nodes: arrivals from C.-S. Africa, Horn of Africa and Northern Africa had a 17.8% probability of developing a disease (node 4, n=208), whereas for arrivals from East, Middle East and Sub-Saharan Africa the probability was greater (node 5, n=1 781, 36.4%).
Women model (Figure 6): The splitting procedure resulted in 5 leaves, with Quarter as the strongest predictor. The lowest probability (5.1%) was observed for the Q3/16 subgroup (node 6, n=408). In the remaining 2016 quarter, the likelihood was 36.1% for women aged 35 and over (node 9, n=36) and 11.9% for the other age groups (node 8, n=706). In 2017 and 2018, the likelihood was 50.9% for the Sub-Saharan area (node 4, n=491) and 30.1% for the other areas (node 3, n=153).
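The leaf sizes and probabilities of the two models in this section can be checked against the overall figure of 1 684 individuals who developed at least one disease during their stay. The sketch below assumes that each reported percentage is the observed proportion of cases in its terminal node; the dictionary and function names are illustrative.

```python
# Terminal nodes of the third tree: node id -> (n, probability of developing
# a disease during the stay), as reported in the text.
men_leaves = {7: (4656, 0.057), 6: (1406, 0.228), 4: (208, 0.178), 5: (1781, 0.364)}
women_leaves = {6: (408, 0.051), 9: (36, 0.361), 8: (706, 0.119),
                4: (491, 0.509), 3: (153, 0.301)}


def leaf_total(leaves):
    """Number of observations covered by the terminal nodes."""
    return sum(n for n, _ in leaves.values())


def expected_cases(leaves):
    """Cases implied by the leaves, assuming p is the observed proportion per leaf."""
    return sum(n * p for n, p in leaves.values())


total_n = leaf_total(men_leaves) + leaf_total(women_leaves)
total_cases = expected_cases(men_leaves) + expected_cases(women_leaves)

print(f"observations covered by the leaves: {total_n}")
print(f"cases implied by the leaves: {total_cases:.0f}")
print(f"implied overall prevalence: {total_cases / total_n:.1%}")
```

Under this assumption the leaves imply roughly 1 685 cases, in line with the 1 684 reported for the whole sample up to rounding of the node percentages. The leaves cover 9 845 observations, slightly fewer than the 9 857 in the full sample.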
3.4 Fourth Tree
Men model (Figure 7): The algorithm generated a tree with 4 terminal nodes, in which the strongest predictor was Area of Origin. The Horn of Africa was split from the other areas; in this subgroup (node 7, n=106) the probability of developing a type S disease was 32.1%, while the probabilities for the other classes were lower. Among the other areas, the subgroup that arrived during Q3/16, Q2/17, Q3/17, Q4/17 and Q1/18 (node 6, n=639) had A as the most frequent class (28.2%). For the other quarters, a further split was performed on Area of Origin: for arrivals from the Middle East (node 5, n=65), over half of the diseases developed during the stay were of type R (58.5%). For the remaining areas (node 4, n=432), the most frequent class was also R (32.2%), with all other classes around 17%.
Women model (Figure 8): In the last model, the only associated variable (Quarter) produced a single split, generating two terminal nodes. For Q3/16, Q4/17 and Q2/18 (node 3, n=54), the most frequent class was D (40.7%), followed by Other (24.1%), A (18.5%), R (11.1%) and S (5.6%). In node 2 (n=355), class Other had the highest probability (30.1%), followed by R (23.1%), A (17.7%), D (16.6%) and S (12.4%).
Tree results were consistent with the logistic models (supplementary material).