Data collection
To investigate the link between HLA alleles and leprosy susceptibility in Europeans, it was necessary to examine individuals who lived in medieval Europe, where the leprosy epidemic was estimated to affect a large proportion of the population [5, 10, 11, 37]. To this end, skeletal remains of over 350 individuals excavated from 16 Danish and 2 German archaeological sites (10th -19th century AD) (Table 1; Additional file 0: Skeletal_collections-literature) were analysed using a combination of molecular and osteological methods (see Material and Methods). Following the application of stringent selection criteria and inclusion of previously published data [38] (see below), 83 cases and 210 controls were used for the association study (Table 1).
Table 1
Number of examined individuals per archaeological site.
Site | Location | Dating (year AD) | N | N2 |
Skt. Jørgensgården† | Odense | 1270–1550 | 68 | 60# |
Skt. Knuds Plads | | 1086–1800 | 2 | 2 |
Albani Torv | | 1000–1540 | 8 | 7 |
Klosterbakken | | 1086–1800 | 1 | 1 |
Skt. Trinitatis/Drotten | Viborg | 1000–1529 | 13 | 12 |
Skt. Morten | | 1000–1529 | 4 | 4 |
Skt. Mathias | | 1000–1529 | 2 | 2 |
Skt. Mikkel | | 1000–1800 | 11 | 9 |
Gråbrødrekloster | | 1529–1812 | 2 | 2 |
Faldborg Kirkegård | | 1100-mid 16th c. | 3 | 3 |
Klosterkirke | Horsens | 1600–1800 | 50 | 49 |
Ole Worms Gade | | 1050–1536 | 24 | 21 |
Ødekirkegård | Sejet | 1250–1575 | 25 | 24 |
Tirup kirke | Tirup | 1150–1350 | 29 | 25 |
Gråbrødrekloster | Ribe | 1250–1536 | 15 | 14 |
Ribe Domkirke | | 900–1738 | 9 | 5 |
St. Jürgen | Lübeck | 1270–1550 | 30 | 30 |
Gut Melaten | Aachen | 1230–1550 | 56 | 23 |
Total | - | 900–1812 | 352 | 293# |
N – number of analysed individuals; N2 – number of individuals included in the case-control study; † - also known as St. Jørgen; # - includes individuals with HLA genotypes reported in a previous study by Pierini et al., 2020.
Metagenomic screening for M. leprae
Screening for the presence of M. leprae DNA revealed nine (16%) individuals from Gut Melaten and 28 (41.2%) individuals from the Skt. Jørgensgården cemetery to be positive for infection (Additional file 1: Tab. S1). M. leprae was not detected at any other site. Six M. leprae genomes (five from Gut Melaten and one from Skt. Jørgensgården) were successfully reconstructed with over 50% of the genome covered by at least one read (Table 2). The strains were subsequently subjected to phylogenetic analysis together with 38 previously published medieval and 139 modern M. leprae genomes (Fig. 1, Additional file 2: Fig. S1 and Additional file 3: Tab. S2). M. lepromatosis was used as an outgroup. All new genomes clustered within the diversity of phylogenetic branch 3. None of the Melaten strains clustered together, showing a considerable degree of diversity among them.
Table 2
Basic statistics of the sequence reads from six samples mapping against the M. leprae reference.
Site | Grave ID | N of reads# | Mean coverage | Coverage 1x [%] | Coverage 2x [%] | Coverage 3x [%] |
Gut Melaten | G19 | 152884 | 4.22x | 85.63 | 68.73 | 52.86 |
Gut Melaten | G119 | 111253 | 3x | 83.66 | 63.13 | 43.68 |
Gut Melaten | G43 | 376821 | 11.71x | 97.04 | 94.8 | 91.18 |
Gut Melaten | G91 | 103455 | 2.38x | 66.19 | 38.36 | 20.55 |
Gut Melaten | G35 | 68098 | 1.47x | 51.24 | 21.18 | 7.87 |
Skt. Jørgensgården | G33 | 232312 | 5.03x | 95.96 | 90.42 | 79.76 |
# The BAM files were filtered for a mapping quality of > 30.
Figure 1. A maximum-likelihood tree illustrating the phylogenetic position of the M. leprae strains. The strains from Gut Melaten and Skt. Jørgensgården (shown in red). Previously published medieval strains are marked in blue and modern M. leprae are shown in black. Circles at each node represent bootstrap support over 500 replications. Only branches with bootstrap support of at least 60 are marked. The tree includes 183 strains (139 modern and 44 medieval) (Additional file 3: Tab. S2). M. lepromatosis was used as an outgroup. Country codes can be found in the Additional file 4: Tab. S3. Dates are provided for ancient strains. The tree was visualized with iTOL [39].
Determination of cases and controls under strict selection criteria
All individuals were evaluated for kinship to avoid introducing bias stemming from close relatedness. Kinship was detected for four pairs of individuals (Additional file 5: Tab. S4) and therefore, one individual from each pair (n = 4) was excluded from further analysis. Based on molecular and osteological evidence of leprosy, 72 individuals were considered leprosy-affected “cases”, while 222 individuals were assigned to the “control” group (Additional file 1: Tab. S1).
After the HLA-targeted capture and HLA typing of the six classical loci (class I: HLA-A, B, C and class II: DRB1, DQB1 and DPB1), HLA profiles were successfully produced for 266 individuals (90.5%) (56 cases and 210 controls) (Additional file 1: Tab. S1). Additionally, HLA genotypes generated in a previous study [38] were incorporated, which resulted in a final dataset of 293 individuals (83 cases and 210 controls) (Table 1, Fig. 2 and Additional file 1: Tab. S1).
Figure 2. Map of archaeological sites from which skeletal remains included in the case-control study were excavated. Archaeological dating and the number of individuals are presented in brackets. The leprosy status of individuals at each site included in the association study is color-coded: red for cases and blue for controls. The location of all sites is shown in panel A. Panel B presents the names and locations of all Danish sites.
Evaluation of heterogeneity within the medieval sample and temporal genetic continuity in Europeans
Due to the inter-population variation in HLA allele frequencies [40], the sample was evaluated for genome-wide heterogeneity. Based on a set of informative SNPs, principal component analysis (PCA) was performed. The PCA showed that all individuals grouped together within the diversity of present-day Europeans with the majority of individuals clustering with the populations of western and northern Europe (Fig. 3A, Additional file 2: Fig. S2). When cases and controls were considered separately, the groups overlapped on the PCA plot along both axes (Additional file 2: Fig. S3). These results indicate that there is no substantial genome-wide diversity between the groups and a comparison of allelic frequencies in a case-control association study is warranted. Furthermore, population differentiation was measured with the fixation index (FST) of the HLA alleles. The FST between medieval cases and controls was low (< 0.004), suggesting that the allele pools of the two groups were highly similar. All medieval individuals pooled together exhibited very low levels of differentiation from present-day northern and central Europeans (Fig. 3B), which points to population continuity since the Middle Ages.
Figure 3. Analysis of differentiation between medieval and modern populations.
A - PCA of 185 medieval individuals, based on SNPs from the 2140k SNP panel, projected onto the first two principal components calculated from 66 present-day West-Eurasian populations.
B – PCoA of 168 medieval individuals based on the pairwise FST values for HLA genotypes. Variance accounted for each principal coordinate is shown within axis labels.
Association analysis – primary analysis
In comparison to studies conducted in present-day populations, our sample size was relatively small. Therefore, testing of all observed alleles would drastically reduce the power of detecting significant association signals (Additional file 6: Tab. S5). To overcome this limitation, a targeted approach was taken, i.e., 13 alleles previously linked with leprosy in modern non-European populations were selected for the association analysis (primary analysis) (Additional file 7: Tab. S6). The alleles were chosen based on a strong association (odds ratio (OR) of at least 2 or less than 0.5) reported in studies with a robust sample size of at least 200 cases and 200 controls. As a large proportion of the HLA calls in this study were made at the first-field resolution, which groups alleles of highly similar sequence and function, the association analysis was performed at this level.
Across the 13 selected alleles, the frequency of HLA-B*38 differed statistically significantly between cases (4.5%) and controls (0.3%) (Table 3). Furthermore, although not statistically significant, trends were observed for HLA-B*44 and DRB1*15 which were more frequent in cases in comparison to controls: 14.3% vs. 8.6% and 25.2% vs. 17.1%, respectively. Statistical power analysis indicated that this study is underpowered (achieved power < 0.5) to detect the observed frequency shifts for these alleles (Additional file 8: Tab. S7). Allele frequencies for the remaining 10 alleles were comparable between cases and controls.
Table 3
Results of the case-control association analysis between 13 selected HLA alleles and leprosy.
HLA allele | CASES (n = 83) | CONTROLS (n = 210) | p-value | pc | OR | CI low | CI up |
N | Fr | N | Fr |
A*29 | 2 | 0.016 | 5 | 0.014 | 1.000 | 1.000 | 1.158 | 0.109 | 7.184 |
B*07 | 20 | 0.179 | 54 | 0.150 | 0.459 | 1.000 | 1.235 | 0.665 | 2.227 |
B*14 | 0 | 0.000 | 1 | 0.003 | 1.000 | 1.000 | 0.000 | 0.000 | 125.461 |
B*38 | 5 | 0.045 | 1 | 0.003 | 0.003 | 0.044* | 16.701 | 1.841 | 794.261 |
B*44 | 16 | 0.143 | 31 | 0.086 | 0.102 | 1.000 | 1.772 | 0.866 | 3.506 |
B*49 | 0 | 0.000 | 2 | 0.006 | 1.000 | 1.000 | 0.000 | 0.000 | 17.200 |
C*02 | 5 | 0.041 | 17 | 0.048 | 1.000 | 1.000 | 0.842 | 0.238 | 2.448 |
C*05 | 9 | 0.074 | 23 | 0.065 | 0.834 | 1.000 | 1.139 | 0.450 | 2.648 |
C*08 | 0 | 0.000 | 6 | 0.017 | 0.346 | 1.000 | 0.000 | 0.000 | 2.447 |
C*15 | 5 | 0.041 | 12 | 0.034 | 0.778 | 1.000 | 1.210 | 0.327 | 3.788 |
DRB1*10 | 1 | 0.008 | 1 | 0.003 | 0.457 | 1.000 | 2.812 | 0.036 | 221.567 |
DRB1*15 | 31 | 0.252 | 59 | 0.171 | 0.062 | 0.802 | 1.631 | 0.959 | 2.742 |
DQB1*04 | 4 | 0.032 | 18 | 0.052 | 0.464 | 1.000 | 0.602 | 0.145 | 1.876 |
N – number of alleles; Fr – frequency; pc – p-value corrected for multiple testing using Bonferroni correction (number of tests = 13); * – p-value statistically significant after correction for multiple testing; CI low – confidence interval lower bound of OR; CI up – confidence interval upper bound of OR |
Association analysis – explorative analysis
To investigate the medieval HLA dataset for potentially novel associations, an explorative association study was performed for all the remaining alleles observed across the six classical HLA loci (HLA-A, B, C, DRB1, DQB1 and DPB1) (Additional file 9: Tab. S8). Nominally significant differences in allele frequencies were noted for HLA-A*23, DRB1*13 and DPB1*452 (Table 4). Whilst A*23 was more frequent among cases (5.5% vs. 0.3%, p = 0.0004, OR = 21.207), the frequency of DRB1*13 was over two times higher in controls relative to cases (16.8% vs. 8.1%, p = 0.017, OR = 0.439). The two instances of DPB1*452 were noted exclusively in the case group (3.9% vs. 0%, p = 0.039). In addition to the nominally significant associations, nonsignificant trends were observed for one class I and three class II alleles (Table 4). Similar to the trends noted in the primary analysis, the achieved statistical power was low (below or equal to 0.05) (Additional file 8: Tab. S7).
Table 4
Exploratory case-control association analysis: nominally significant results and non-significant trends.
HLA allele | CASES (n = 83) | CONTROLS (n = 210) | p-value | OR | CI low | CI up |
N | Fr | N | Fr |
A*23 | 7 | 0.055 | 1 | 0.003 | 0.0004* | 21.207 | 2.681 | 959.693 |
C*06 | 5 | 0.041 | 32 | 0.091 | 0.081 | 0.428 | 0.127 | 1.143 |
DRB1*03 | 10 | 0.081 | 51 | 0.148 | 0.063 | 0.511 | 0.223 | 1.062 |
DRB1*04 | 27 | 0.220 | 55 | 0.159 | 0.167 | 1.482 | 0.848 | 2.544 |
DRB1*13 | 10 | 0.081 | 58 | 0.168 | 0.017* | 0.439 | 0.193 | 0.904 |
DPB1*04 | 22 | 0.431 | 118 | 0.578 | 0.083 | 0.554 | 0.282 | 1.075 |
DPB1*452 | 2 | 0.039 | 0 | 0.000 | 0.039* | Inf | 0.758 | Inf |
N – number of alleles; Fr – frequency; * – p-value statistically significant; CI low – confidence interval lower bound of OR; CI up – confidence interval upper bound of OR; Inf – infinity due to the frequency of 0 in one of the groups |
Heterozygosity and allelic richness in cases and controls
To further explore differences between cases and controls, heterozygosity and allelic richness were calculated at first-field resolution and compared between the groups (Table 5). Heterozygosity was calculated as the proportion of heterozygous individuals within each group (cases or controls). Allelic richness was calculated using the rarefaction method that accounts for the differences in sample sizes. Interestingly, while the allelic richness values were similar in both groups, heterozygosity was consistently lower in controls relative to cases for all class I and II alleles, except for the HLA-DRB1 locus. The difference, however, was statistically significant only for HLA-A.
Table 5
Diversity parameters for each HLA loci at the first-field resolution.
Locus | Group | Sample size | Htzg | pc | N of alleles | Allelic Richness |
A | control | 176 | 0.76 | 0.038* | 17 | 13.8 |
| case | 56 | 0.93 | 13 | 13 |
B | control | 170 | 0.81 | 0.5 | 23 | 17.8 |
| case | 48 | 0.92 | 18 | 18 |
C | control | 167 | 0.73 | 1 | 16 | 12.5 |
| case | 53 | 0.81 | 11 | 11 |
DRB1 | control | 165 | 0.87 | 1 | 14 | 11.5 |
| case | 56 | 0.84 | 12 | 12 |
DQB1 | control | 166 | 0.69 | 1 | 5 | 5 |
| case | 58 | 0.76 | 5 | 5 |
DPB1 | control | 88 | 0.53 | 1 | 19 | 7.4 |
| case | 13 | 0.69 | 11 | 11 |
Htzg – heterozygosity; pc - p-value corrected for testing on six HLA loci; * – p-value statistically significant after correction for multiple testing (n = 6) |
Furthermore, zygosity analysis (Additional file 10: Tab. S9) showed a significant association between the heterozygous genotype of DRB1*13 (i.e., allele 1: DRB1*13 and allele 2: DRB1*X, where X is any other DRB1 allele) and absence of leprosy (p = 0.0097, OR = 3.03). On the contrary, the heterozygous genotypes of A*23 and B*38 were linked to leprosy presence (p = 0.001, OR = 0.07 and p = 0.0004, OR = 0.02).
Binding prediction of M. leprae proteins
The presentation of specific peptides on the cell surface, which is the major function of HLA molecules, is commonly assumed to be the causal factor for HLA-mediated resistance or susceptibility to diseases [41, 42].To investigate the peptide binding properties of the HLA alleles associated with leprosy in this study, a computational binding prediction was performed for all HLA alleles observed in the relevant loci (HLA-A, B and DRB1) and 46 antigenic M. leprae proteins (Additional file 11: Tab. S10, Fig. 4).
Figure 4. The number of peptides predicted to be bound by HLA-A (A-B), HLA-B (C-D) and HLA-DRB1 (E-F). Bound peptides were determined based on %RankEL scores either by using a strong binding threshold giving the strongly bound peptides or a weak binding threshold giving both strongly and weakly bound peptides.
Interestingly, A*23:01 is one of the HLA-A alleles that bind the lowest number of peptides (Fig. 4A and B). Within the HLA-B locus, B*38:01 binds the average number of peptides relative to other alleles (Fig. 4C and D). However, B*44 alleles are placed on the lower end of the spectrum. For the class II locus (Fig. 4E and F), DRB1*08:01, DRB1*11:01 and DRB1*13 alleles bind the lowest number of peptides. On the contrary, DRB1*09:01, DRB1*01:01 and DRB1*04 alleles are on the higher end of the spectrum.
Temporal shifts in HLA allele frequencies
Genetic continuity between medieval individuals and modern Germans and Danish (Fig. 3A and B, Additional file 2: Figures S2 and S3) allowed an investigation of potential shifts in HLA diversity over time. HLA allele frequencies were compared between the medieval controls and a large cohort of modern Germans (Allele Frequency Net Database (AFND): Germany DMKS (n = 3,456,066)) (Additional file 12: Tab. S11). The German population was used as the modern Danish cohort available in AFND included only 55 individuals [43]. Overall, the allele frequencies are similar between the two groups. Interestingly, the proportions of C*04, DRB1*11, DQB1*05 and DPB1*02 seem to have increased since the Middle Ages. On the contrary, C*03 and DQB1*06 are less common in present-day Germans relative to medieval individuals (Table 6). The shift is particularly pronounced for C*03, with a difference of over 10%.
Table 6
HLA alleles which exhibit a temporal shift (> 5%) in frequency.
| Frequency [%] |
Allele | Medieval controls | Modern Germans |
C*03 | 25 | 14.24 |
C*04 | 6.25 | 11.5 |
DRB1*11 | 4.93 | 11.96 |
DQB1*05 | 12.07 | 17.12 |
DQB1*06 | 33.05 | 25.7 |
DPB1*02 | 8.33 | 14.52 |