Variant differentiation by age range and allele and genotype frequencies
The sample comprised 819 individuals from the 12 communities of the urban area of Buenaventura; 214 were minors and 605 were adults (mean = 35 years, SD = 20 years). For some variants, detection was not possible in all 819 individuals; therefore, the totals vary slightly between variants. Some records also lacked information on age (labelled "SG"); the results for these individuals were included when calculating the general frequencies for the population. The population differentiation analysis of genotypes across age groups (8 months to 12 years, 13 to 26 years and 27 to 93 years) showed significant differences for the HbS/C variants (p = 0.0154), the β-Tal-29 variant (p < 0.0001) and the β-Tal-88 variant (p < 0.0001), but not for the G6PD and Duffy variants (p = 0.1171 and p = 0.2295, respectively). These age groups were taken into account in the different analyses, especially in the calculation of fit, selection coefficients and average deviation.
Haemoglobin S and C variants
For these variants, 814 individuals were successfully diagnosed. The most frequent genotype was AA (wild-type homozygous), with 730 individuals (89.7%), most represented in age group 1 (8 months to 12 years). It was followed by the heterozygous carrier genotypes (or resistance genotypes) AS and AC, with frequencies of 5.8% and 4.2%, respectively, mostly found in age groups 2 (13 to 26 years) and 3 (27 to 93 years). The least represented genotypes were the compound heterozygote SC (0.2%) and the homozygote SS (0.1%), the latter found only in age group 2; the homozygous CC genotype was not observed (0%). The allele frequencies were 94.7% for allele A, 3.1% for allele S and 2.2% for allele C (Table 2).
Table 2. Allelic and genotypic frequencies for the HbS and C variants for the total individuals and each age group.
| Age group / Genotype | AA | CC | SS | AC | AS | SC | Total |
|---|---|---|---|---|---|---|---|
| G1 | 144* | 0 | 0 | 3 | 5 | 0 | 152 |
| | 94.7%** | 0.0% | 0.0% | 2.0% | 3.3% | 0.0% | |
| G2 | 123 | 0 | 1 | 4 | 16 | 1 | 145 |
| | 84.8% | 0.0% | 0.7% | 2.8% | 11.0% | 0.7% | |
| G3 | 442 | 0 | 0 | 25 | 26 | 1 | 494 |
| | 89.5% | 0.0% | 0.0% | 5.1% | 5.3% | 0.2% | |
| SG | 21 | 0 | 0 | 2 | 0 | 0 | 23 |
| | 91.3% | 0.0% | 0.0% | 8.7% | 0.0% | 0.0% | |
| Total | 730 | 0 | 1 | 34 | 47 | 2 | 814 |
| | 89.7% | 0.0% | 0.1% | 4.2% | 5.8% | 0.2% | |
| Allelic frequencies | A = 94.7% | C = 2.2% | S = 3.1% | | | | |

*Absolute frequency; **Relative frequency or prevalence. G1 = 8 months to 12 years; G2 = 13 to 26 years; G3 = 27 to 93 years; SG = individuals without age records.
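The allele frequencies in Table 2 can be recovered from the genotype counts by allele counting: every individual carries two alleles, so a homozygote contributes two copies of one allele and a heterozygote one copy of each. The sketch below is a minimal illustration using the Total row of Table 2; the function name `allele_frequencies` and the dictionary layout are assumptions made for the example, not the authors' code.

```python
from collections import Counter


def allele_frequencies(genotype_counts):
    """Return allele frequencies (fractions) from counts of diploid genotypes.

    Keys are two-character genotype labels such as "AS";
    each character stands for one allele (illustrative convention).
    """
    alleles = Counter()
    for genotype, n in genotype_counts.items():
        for allele in genotype:
            alleles[allele] += n  # a homozygote adds n twice to the same allele
    total = sum(alleles.values())
    return {allele: count / total for allele, count in alleles.items()}


# Genotype counts from the Total row of Table 2 (814 individuals).
hbsc_counts = {"AA": 730, "CC": 0, "SS": 1, "AC": 34, "AS": 47, "SC": 2}
freqs = allele_frequencies(hbsc_counts)
for allele in "ACS":
    print(f"{allele} = {freqs[allele]:.1%}")
```

With these counts the sketch gives A = 94.7%, C = 2.2% and S = 3.1%, in agreement with the allelic frequencies reported in Table 2.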
Duffy variant
For this gene, 819 individuals were diagnosed. The most frequent genotype, found in 450 individuals (54.9%), was the homozygous null or negative Duffy genotype (FYBES*FYBES), which was most represented in age group 2. It was followed in frequency by the heterozygous genotypes carrying the null allele, i.e., FYA*FYBES (18.6%) and FYB*FYBES (14.9%), with the highest proportions in age group 1. The FYA*FYB genotype was present in 6.2% of the individuals, with most in age group 3. The least frequent genotypes were FYA*FYA and FYB*FYB, present in 2.9% and 2.4% of the sample, respectively, mostly evidenced in age group 3 (27 to 93 years). The FYBES allele frequency was 72.2%, and the FYA and FYB allele frequencies were 15.2% and 12.6%, respectively (Table 3).
Table 3. Allelic and genotypic frequencies for the Duffy variant for the total individuals and each age group.
| Age group / Genotype | FYBES*FYBES | FYA*FYA | FYB*FYB | FYA*FYB | FYA*FYBES | FYB*FYBES | Total |
|---|---|---|---|---|---|---|---|
| G1 | 76* | 3 | 3 | 7 | 35 | 30 | 154 |
| | 49.4%** | 1.9% | 1.9% | 4.5% | 22.7% | 19.5% | |
| G2 | 85 | 6 | 2 | 4 | 28 | 20 | 145 |
| | 58.6% | 4.1% | 1.4% | 2.8% | 19.3% | 13.8% | |
| G3 | 277 | 14 | 14 | 37 | 84 | 71 | 497 |
| | 55.7% | 2.8% | 2.8% | 7.4% | 16.9% | 14.3% | |
| SG | 12 | 1 | 1 | 3 | 5 | 1 | 23 |
| | 52.2% | 4.3% | 4.3% | 13.0% | 21.7% | 4.3% | |
| Total | 450 | 24 | 20 | 51 | 152 | 122 | 819 |
| | 54.9% | 2.9% | 2.4% | 6.2% | 18.6% | 14.9% | |
| Allelic frequencies | FYA = 15.2% | FYB = 12.6% | FYBES = 72.2% | | | | |

*Absolute frequency; **Relative frequency or prevalence. G1 = 8 months to 12 years; G2 = 13 to 26 years; G3 = 27 to 93 years; SG = individuals without age records.
β-thalassemia-29 and -88 variants
For these variants, 816 individuals were diagnosed. For the -29 variant, among the 816 individuals, 783 had the homozygous AA or normal haemoglobin genotype (96.0%), 32 had the heterozygous AG genotype (3.9%) (or resistance genotype), and one had the homozygous GG or double variant genotype (0.1%). The allele frequencies were 97.9% for A and 2.1% for G. The AA genotype was most represented in age group 3, with 98.8%; the AG genotype was most represented in age group 2, with 9.7%; and the GG genotype was most represented in age group 1, with 0.7% (Table 4).
Table 4. Allelic and genotypic frequencies for the β-thalassemia -29 variant for the total individuals and each age group
| Age group / Genotype | AA | AG | GG | Total |
|---|---|---|---|---|
| G1 | 139* | 12 | 1 | 152 |
| | 91.4%** | 7.9% | 0.7% | |
| G2 | 131 | 14 | 0 | 145 |
| | 90.3% | 9.7% | 0% | |
| G3 | 490 | 6 | 0 | 496 |
| | 98.8% | 1.2% | 0% | |
| SG | 23 | 0 | 0 | 23 |
| | 100% | 0% | 0% | |
| Total | 783 | 32 | 1 | 816 |
| | 95.9% | 3.9% | 0.1% | |
| Allelic frequencies | A = 97.9% | G = 2.1% | | |

*Absolute frequency; **Relative frequency or prevalence. G1 = 8 months to 12 years; G2 = 13 to 26 years; G3 = 27 to 93 years; SG = individuals without age records.
For the -88 variant, among the 816 individuals, 776 had the normal homozygous CC genotype (95.1%), 35 had the heterozygous CT genotype (4.3%), and 5 had the double variant or homozygous TT genotype (0.6%); the allele frequencies were 97.2% for C and 2.8% for T. The CC genotype was most represented in age group 3, with 98.6%, and the heterozygous CT and homozygous TT genotypes were most represented in age group 2, with 11.0% and 2.1%, respectively (Table 5).
Table 5. Allelic and genotypic frequencies for the β-thalassemia -88 variant for the total individuals and each age group
| Age group / Genotype | CC | CT | TT | Total |
|---|---|---|---|---|
| G1 | 139* | 12 | 1 | 152 |
| | 91.4%** | 7.9% | 0.7% | |
| G2 | 126 | 16 | 3 | 145 |
| | 86.9% | 11.0% | 2.1% | |
| G3 | 489 | 6 | 1 | 496 |
| | 98.6% | 1.2% | 0.2% | |
| SG | 22 | 1 | 0 | 23 |
| | 95.7% | 4.3% | 0.0% | |
| Total | 776 | 35 | 5 | 816 |
| | 95.1% | 4.3% | 0.6% | |
| Allelic frequencies | C = 97.2% | T = 2.8% | | |

*Absolute frequency; **Relative frequency or prevalence. G1 = 8 months to 12 years; G2 = 13 to 26 years; G3 = 27 to 93 years; SG = individuals without age records.
G6PD A+ and A- variants
The total sample was 817 individuals from the urban area of Buenaventura. The overall frequencies were 72.8% for the B allele, 16.2% for the African A+ allele, and 11.0% for the A- allele (pooling hemizygous men and heterozygous and homozygous women). Because G6PD is a sex-linked gene, the genotype frequencies obtained are reported by sex.
For the female population (n = 616), the following genotype frequencies were found: 54.5% for the BB genotype (wild-type genotype), most represented in age group 3, and 22.1%, 13.5% and 6.0% for the heterozygous BA+, BA- and A+A- genotypes, respectively (the last two being resistance genotypes), most represented in age groups 2 and 3. The homozygous genotypes A+A+, with 2.6%, and A-A-, with 1.3%, were most represented in age groups 3 and 1, respectively. Thus, allele frequencies of 72.3%, 16.6% and 11.0% were observed for alleles B, A+ and A-, respectively (Table 6).
Table 6. Allelic and genotypic frequencies for the G6PD variant in females for the total individuals and each age group
| Age group / Female genotype | BB | A+A+ | A-A- | BA+ | BA- | A+A- | Total |
|---|---|---|---|---|---|---|---|
| G1 | 48* | 0 | 2 | 20 | 12 | 3 | 85 |
| | 56.5%** | 0.0% | 2.4% | 23.5% | 14.1% | 3.5% | |
| G2 | 52 | 2 | 2 | 34 | 12 | 8 | 110 |
| | 47.3% | 1.8% | 1.8% | 30.9% | 10.9% | 7.3% | |
| G3 | 228 | 14 | 4 | 75 | 58 | 25 | 404 |
| | 56.4% | 3.5% | 1.0% | 18.6% | 14.4% | 6.2% | |
| SG | 8 | 0 | 0 | 7 | 1 | 1 | 17 |
| | 47.1% | 0.0% | 0.0% | 41.2% | 5.9% | 5.9% | |
| Total | 336 | 16 | 8 | 136 | 83 | 37 | 616 |
| | 54.5% | 2.6% | 1.3% | 22.1% | 13.5% | 6.0% | |
| Allelic frequencies | B = 72.3% | A+ = 16.6% | A- = 11.0% | | | | |

*Absolute frequency; **Relative frequency or prevalence. G1 = 8 months to 12 years; G2 = 13 to 26 years; G3 = 27 to 93 years; SG = individuals without age records.
For the 201 men analysed, the following genotype frequencies (and therefore allele frequencies, because men are hemizygous) were obtained: 74.1% for the B genotype (wild-type genotype), 14.9% for the A+ genotype and 10.9% for the A- genotype (resistant genotype). The B genotype was most represented in age group 3, and the A+ and A- genotypes in age group 2 (Table 7).
Table 7. Allelic and genotypic frequencies for the G6PD variant in males for the total individuals and each age group
| Age group / Male genotype | B | A+ | A- | Total |
|---|---|---|---|---|
| G1 | 51* | 9 | 9 | 69 |
| | 73.9%** | 13.0% | 13.0% | |
| G2 | 23 | 7 | 5 | 35 |
| | 65.7% | 20.0% | 14.3% | |
| G3 | 71 | 13 | 7 | 91 |
| | 78.0% | 14.3% | 7.7% | |
| SG | 4 | 1 | 1 | 6 |
| | 66.7% | 16.7% | 16.7% | |
| Total | 149 | 30 | 22 | 201 |
| | 74.1% | 14.9% | 10.9% | |

*Absolute frequency; **Relative frequency or prevalence. G1 = 8 months to 12 years; G2 = 13 to 26 years; G3 = 27 to 93 years; SG = individuals without age records.
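Because G6PD is sex-linked, allele counting differs by sex: each woman contributes two alleles, whereas each man contributes one, so male genotype frequencies are already allele frequencies. The following sketch illustrates this with the Total rows of Tables 6 and 7; the function names and the tuple-based genotype encoding are hypothetical choices made for the example, not part of the original analysis.

```python
def female_allele_frequencies(genotype_counts):
    """Allele frequencies for an X-linked locus in females (two alleles each).

    genotype_counts maps (allele1, allele2) tuples to numbers of women.
    """
    counts = {}
    for (a1, a2), n in genotype_counts.items():
        counts[a1] = counts.get(a1, 0) + n
        counts[a2] = counts.get(a2, 0) + n
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def male_allele_frequencies(hemizygous_counts):
    """Allele frequencies in males: one allele per man, so counts are used directly."""
    total = sum(hemizygous_counts.values())
    return {allele: c / total for allele, c in hemizygous_counts.items()}


# Total row of Table 6 (616 women).
females = {
    ("B", "B"): 336, ("A+", "A+"): 16, ("A-", "A-"): 8,
    ("B", "A+"): 136, ("B", "A-"): 83, ("A+", "A-"): 37,
}
# Total row of Table 7 (201 men).
males = {"B": 149, "A+": 30, "A-": 22}

female_freqs = female_allele_frequencies(females)
male_freqs = male_allele_frequencies(males)
for allele in ("B", "A+", "A-"):
    print(f"{allele}: women {female_freqs[allele]:.1%}, men {male_freqs[allele]:.1%}")
```

Run on these totals, the sketch gives values for women that agree with the allelic frequencies in Table 6, and values for men that agree with the relative frequencies in Table 7.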
Prevalence of variants by community
Table 8 details the genotype frequencies found for each variant in each community of the city of Buenaventura. For the HbS/C variants, the AA, AS and AC genotypes predominated in all the communities. The most prevalent genotype everywhere was wild-type AA, followed by the heterozygous genotypes related to resistance to malaria: heterozygous AS was more frequent in some communities (1, 3, 7, 8, 9, 10, 11 and 12), and heterozygous AC in others (2, 4, 5 and 6), with community 1 having the highest frequency of the AS genotype and community 5 the highest frequency of the AC genotype. For the Duffy variant, the most prevalent genotypes were FYBES*FYBES (resistance genotype) and the heterozygous FYA*FYBES and FYB*FYBES genotypes; in all the communities, the null homozygous FYBES*FYBES had the highest prevalence. For the β-thalassemia -29 and -88 variants, the most frequent genotypes in all the communities were wild-type homozygous AA and CC, respectively, followed by heterozygous AG and CT (resistance genotypes), except for communities 7, 8 and 9, which had no individuals with the heterozygous CT genotype for -88, and communities 8, 9 and 12, which had no individuals with the heterozygous AG genotype for -29. Finally, for the G6PD variant in women, the most prevalent genotype in all communities was the wild-type BB genotype, followed by the heterozygous BA+ and BA- genotypes (the latter related to resistance). For men, the most frequent genotype was the wild-type BY genotype, followed by the A+Y and A-Y genotypes, with the latter being related to resistance.
(Place table 8 here)
Independence test, multiple regression and prevalence
Fisher's independence tests were significant (p < 0.05) for "Community" and "Age group" (variables described in Table 1) with respect to all response variables, with two exceptions: the "Duffy variant" was related to "Community" but not to "Age group", and the "G6PD variant" was related to "Age group" but not to "Community" (Table 9). In the multinomial regression (ordinal for the response variable "Protection"), "Age group" influenced HbS/C, β-thalassemias -29 and -88, G6PD and "Protection", and "Community" influenced the Duffy variants, β-thalassemias -29 and -88, G6PD and "Protection" (Table 9). In both analyses, "Sex" was not significant for any response variable, except for the "G6PD variant" in Fisher's test of independence (an expected result because G6PD is sex-linked).
Table 9. Independence test and multinomial and ordinal regression for the study variables.
| | | Duffy variant | HbS/C variant | βTal. -29 variant | βTal. -88 variant | G6PD variant | Protection |
|---|---|---|---|---|---|---|---|
| Variables (Independence test) | Sex | 0.8734 | 0.9725 | 0.2629 | 0.0455 | 0.0005* | 0.5508 |
| | Age group | 0.2584 | 0.0165* | 0.0005* | 0.0005* | 0.0005* | 0.0111* |
| | Community | 0.0005* | 0.0001* | 0.0005* | 0.0005* | 0.1999 | 0.0005* |
| Factors (Regression) | Sex | 0.8843 | 0.8353 | 0.3884 | 0.1617 | 0.4693 | 0.2758 |
| | + Age group | 0.1957 | 0.0297* | <0.0000* | <0.0001* | 0.0112* | 0.0029* |
| | + Community | <0.0001* | 0.1224 | 0.0001* | <0.0001* | 0.0002* | <0.0001* |

*Significant p-values (p<0.05)
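To illustrate how independence between a genotype and a grouping variable can be assessed from a contingency table, the sketch below runs a Monte Carlo permutation test on the G1 to G3 rows of Table 2. This is not the procedure used in the study: Fisher's test is replaced here by a permutation test with Pearson's chi-square statistic because it is short to write with the standard library, the individuals without age records are left out by assumption, and the function names are hypothetical. Its p-value is therefore not expected to reproduce the values in Table 9.

```python
import random


def chi_square(table):
    """Pearson chi-square statistic of an r x c contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(table, n_perm=2000, seed=0):
    """Monte Carlo p-value: shuffle column labels among individuals, keeping margins fixed."""
    rng = random.Random(seed)
    # Expand the table into one (row label, column label) pair per individual.
    rows = [i for i, row in enumerate(table) for count in row for _ in range(count)]
    cols = [j for row in table for j, count in enumerate(row) for _ in range(count)]
    observed = chi_square(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cols)
        permuted = [[0] * len(table[0]) for _ in table]
        for i, j in zip(rows, cols):
            permuted[i][j] += 1
        if chi_square(permuted) >= observed - 1e-9:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Rows: G1, G2, G3; columns: AA, SS, AC, AS, SC (counts from Table 2, CC omitted because it is 0).
hbsc_by_age = [
    [144, 0, 3, 5, 0],
    [123, 1, 4, 16, 1],
    [442, 0, 25, 26, 1],
]
p_value = permutation_p_value(hbsc_by_age)
print(f"Permutation p-value for HbS/C genotype vs age group: {p_value:.4f}")
```

With 2000 permutations and the add-one correction, the smallest p-value this sketch can return is 1/2001, so very strong associations all appear at that floor.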
Considering the results of Fisher's independence tests and the regression analyses, a multiple comparison test was conducted for “categorized age” and “community” with respect to “protection”. For age, there were significant differences between the age group of 13 to 26 years and the age group of 8 months to 12 years and between the age group of 13 to 26 years and the age group of 27 to 93 years. In addition, communities 1, 3, 4 and 5 were significantly different from communities 6, 7, 8, 9, 10, 11 and 12 (p ≤ 0.0361).
To examine these results further, the prevalences of “categorized age” and “community” were established in relation to the variable “protection” (Fig. 2 and 3). The age group from 13 to 26 years had a higher percentage of individuals in the category “at least two” (33.1%) than did the other two age groups. This highest level of protection indicates that these individuals carry protective genotypes for two or more of the resistance variants investigated in this study. The frequency of “at least two” was second highest in the age group from 27 to 93 years, followed by the age group from 8 months to 12 years (Fig. 2). The “one” category (that is, carrying a resistance genotype for only one variant) was the most prevalent in all age groups, being higher in the age group from 27 to 93 years and the age group from 8 months to 12 years. Finally, the category “none” (not having any resistance genotype) was also most represented in the age groups of 8 months to 12 years and 27 to 93 years (Fig. 2).
Regarding the communities (Fig. 3), communities 1, 3, 4 and 5 had higher prevalences of individuals in the “at least two” category (from 25.6% to 38.3%), with less representation of this category in communities 2, 6, 7, 8, 9 and 10. The category “one” was also the most prevalent in all communities, with quite similar percentages among them, with communities 2, 4 and 9 having the highest percentages. Finally, individuals with no protective variant were most represented in communities 6 to 12 (30.6% to 44%), with very little representation in communities 1 to 5 (Fig. 3).
Regarding the “protection” variable, 276 individuals were included in the “none” category; that is, they did not carry a resistance genotype for any variant. The category “one” included 360 individuals, with the Duffy variant (genotype FYBES*FYBES) being the most represented, with 78.3%, followed by G6PD, with 10.8%, with the BA- and A+A- genotypes being more frequent, and HbS/C, with 6.4%, with the AS and AC genotypes being more frequent (Fig. 4). Among the 183 individuals included in the “at least two” category, 160 (87.4%) presented a combination of only two protection genotypes for two resistance variants, 21 (11.9%) presented a combination of three protection genotypes for three resistance variants, and only 1 (0.5%) presented a combination of four protection genotypes. No individual presented a combination of five protection genotypes from the five variants investigated (Table 1).
For "at least two" for the protection variable, the most prevalent double combination included the resistance genotypes of the Duffy + G6PD variants (53.8%), with the genotype combination FYBES*FYBES/BA- being the most frequent, followed by FYBES*FYBES/A+A-. The HbS/C + Duffy pair (25.6%) was the second most frequent, represented mostly by the AS/FYBES*FYBES genotype, followed by the AC/FYBES*FYBES combination. No individual had a combination of resistance genotypes of the HbS/C variants and β-Tal-29 (Fig. 5).
Regarding triple combinations (three resistance genotypes of three different variants), the HbS/C + Duffy + G6PD combination was the most prevalent (36.4%), with the genotypic combination AS/FYBES*FYBES/BA- being more frequent (Fig. 6). That combination was followed in frequency by the triple combination Duffy + β-Tal-29 + β-Tal-88 with 18.2% (FYBES*FYBES/AG/CT genotypes). There were no individuals with the HbS/C + β-Tal-29 + β-Tal-88, HbS/C + β-Tal-29 + G6PD or HbS/C + β-Tal-88 + G6PD combinations. Finally, only one individual had a combination of 4 resistance genotypes: HbS/C + Duffy + β-Tal-88 + G6PD (AS/FYBES*FYBES/CT/BA- genotypes).
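The “protection” categories and the variant combinations described above can be derived per individual by counting the variants for which a resistance genotype is present. The sketch below is a minimal illustration; the set of resistance genotypes is assembled from the genotypes named as resistance genotypes in the text and is an assumption (the study's own definition is in Table 1), and the function and variable names are hypothetical.

```python
# Resistance genotypes per variant, as named in the text (assumed, not the authors' definition).
RESISTANCE_GENOTYPES = {
    "HbS/C": {"AS", "AC"},
    "Duffy": {"FYBES*FYBES"},
    "β-Tal-29": {"AG"},
    "β-Tal-88": {"CT"},
    "G6PD": {"BA-", "A+A-", "A-"},  # heterozygous women and hemizygous men
}


def protective_variants(individual):
    """List the variants for which the individual carries a resistance genotype."""
    return [
        variant
        for variant, genotype in individual.items()
        if genotype in RESISTANCE_GENOTYPES.get(variant, set())
    ]


def protection_category(individual):
    """Map the number of protective variants to "none", "one" or "at least two"."""
    n = len(protective_variants(individual))
    if n == 0:
        return "none"
    if n == 1:
        return "one"
    return "at least two"


# Genotype profile matching the single four-variant combination reported in the text.
example = {
    "HbS/C": "AS",
    "Duffy": "FYBES*FYBES",
    "β-Tal-29": "AA",
    "β-Tal-88": "CT",
    "G6PD": "BA-",
}
print(protection_category(example))             # at least two
print(" + ".join(protective_variants(example)))  # HbS/C + Duffy + β-Tal-88 + G6PD
```

Tallying `protection_category` over all individuals would give the category counts, and tallying the joined labels from `protective_variants` would give the frequencies of each double, triple or quadruple combination.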
Comparison with other populations of Colombia
The genotype frequency distributions of the Buenaventura population in this study were compared with those reported for the same variants in other populations in Colombia. For the three alleles of HbS and HbC, the distribution differed significantly (p ≤ 0.0064) from those of the populations of Cali [33], Cartagena [17], Putumayo, Nariño, Guajira, San Andrés, Chocó and Valle [31]; however, it did not differ significantly (p ≥ 0.0935) from those of the populations of San Andrés [19], Cartagena [24] and Buenaventura [23]. For the Duffy variant, the distribution differed significantly (p < 0.0001) from those of the populations of Italy, Chocó [21], Tierra Alta, and the indigenous population of Buenaventura [25], but not from that of the population of Tumaco [25]. Finally, for the G6PD variant, the distribution differed significantly (p ≤ 0.0465) from those of the populations of Quibdó, Tierra Alta and Tumaco, but not from that of the population of Buenaventura [22].