In this study, PNPLA3 rs738409, PNPLA3 rs2281135, PNPLA3 rs2294918, and MBOAT7 rs641738 were observed at high frequencies, while TM6SF2 rs58542926 was observed at a low frequency in the NAFLD-HCC cohort. Further, multiple regression analysis showed that the PNPLA3 rs738409, PNPLA3 rs2281135, and PNPLA3 rs2294918 variants were significantly associated with tumor size, and that the PNPLA3 rs2294918 and MBOAT7 rs641738 variants were significantly associated with single-nodular HCC.
This study shows that the MAFs of the genotyped SNPs were comparable to the overall MAF values in other Asian populations reported in the 1000 Genomes project [15] (Table 4). Previous data from a follow-up study by Seko et al have shown that the PNPLA3 rs738409 G allele is significantly associated with the development of HCC in NAFLD patients [16]. Furthermore, the G allele was significantly more frequent in NAFLD-HCC patients than in patients with viral etiology. The same observation was reported by Uyema et al in NAFLD-HCC subjects with diabetes [17]. All these East Asian cohort studies on NAFLD-HCC have shown that the PNPLA3 rs738409 G allele is a significant genetic factor for disease onset and progression, consistent with what we observed in our cohort. Similar observations were made in Western populations for PNPLA3 rs738409 [18, 19]. TM6SF2 rs58542926 [20] and MBOAT7 rs641738 [7, 21, 22] have also been reported to contribute significantly to the etiology of NAFLD-HCC.
Table 4. Minor allele frequencies observed in the present study and in other Asian populations *
Variant | Present study | BEB | ITU | CHB | JPT | KHV | STU
PNPLA3 rs738409 | 0.40 | 0.24 | 0.22 | 0.38 | 0.42 | 0.31 | 0.25
PNPLA3 rs2281135 | 0.41 | 0.24 | 0.21 | 0.40 | 0.42 | 0.34 | 0.25
PNPLA3 rs2294918 | 0.77 | 0.80 | 0.77 | 0.86 | 0.91 | 0.76 | 0.79
TM6SF2 rs58542926 | 0.18 | 0.10 | 0.12 | 0.04 | 0.07 | 0.05 | 0.10
MBOAT7 rs641738 | 0.60 | 0.58 | 0.54 | 0.22 | 0.12 | 0.26 | 0.61
* Minor allele frequencies were collected from the 1000 Genomes project (Ensembl 1000 Genomes browser). BEB: Bangladesh population; ITU: Indian Telugu population in UK; CHB: Han Chinese population; JPT: Japanese population; KHV: Vietnam population; STU: Sri Lankan Tamil population in UK
Furthermore, Longo et al and Donati et al investigated the synergistic effect of the PNPLA3 rs738409, TM6SF2 rs58542926, and MBOAT7 rs641738 variants on NAFLD-HCC and found that the additive effect of either two [7] or three [20] of the variants can increase the burden towards HCC, as our results also show.
The high prevalence of the risk alleles in PNPLA3 rs738409, PNPLA3 rs2281135, PNPLA3 rs2294918, and MBOAT7 rs641738, and the low prevalence of TM6SF2 rs58542926, observed in our study may have a significant clinical impact. NAFLD is a disease that exposes a large cross-section of a population to its complications. However, only a small proportion of patients progress to developing complications. Thus, extensive population screening has been shown to be a failed strategy [23]. To overcome this, risk stratification and screening of high-risk groups are becoming the trend for the future [24]. However, the risk factors used for risk stratification are rapidly evolving with the emergence of new data.
In precision screening, patients are selected based on their risks and the screening tool is individualized. Combining genetic risk with conventional risk factors could give greater validity to precision screening in NAFLD-HCC. Such risk prediction can facilitate the early detection of high-risk individuals and the development of personalized health measures to diminish disease onset and progression. The clinical implications of such genetic discoveries are still evolving, but polygenic risk scores (PRS) are currently calculated to determine the total burden of risk alleles in an individual for clinical utility. A PRS is calculated as a weighted sum of the number of risk alleles and is used to prioritize individuals for preventive clinical actions [25].
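The weighted sum described above can be illustrated with a minimal sketch. The per-allele weights below are hypothetical placeholders chosen only for illustration; they are not estimates from this study or from any published score, and the function name is likewise an assumption.

```python
from typing import Dict

# HYPOTHETICAL per-allele weights (for example, log odds ratios).
# These values are placeholders for illustration, not study results.
HYPOTHETICAL_WEIGHTS: Dict[str, float] = {
    "PNPLA3 rs738409": 0.5,
    "TM6SF2 rs58542926": 0.4,
    "MBOAT7 rs641738": 0.1,
}


def polygenic_risk_score(
    risk_allele_counts: Dict[str, int],
    weights: Dict[str, float] = HYPOTHETICAL_WEIGHTS,
) -> float:
    """Return the weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    score = 0.0
    for variant, weight in weights.items():
        if variant not in risk_allele_counts:
            raise ValueError(f"missing genotype for {variant}")
        count = risk_allele_counts[variant]
        if count not in (0, 1, 2):
            raise ValueError(f"risk-allele count for {variant} must be 0, 1 or 2")
        score += weight * count
    return score


# Example individual: homozygous for the PNPLA3 risk allele,
# no TM6SF2 risk allele, heterozygous for MBOAT7.
example_counts = {
    "PNPLA3 rs738409": 2,
    "TM6SF2 rs58542926": 0,
    "MBOAT7 rs641738": 1,
}
example_score = polygenic_risk_score(example_counts)
```

In practice the weights would come from effect sizes estimated in an independent cohort, and the resulting score would be interpreted relative to its distribution in a reference population.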
TM6SF2 rs58542926, which has a relatively low MAF in our data and in many Asian populations (Table 4), may have a protective role against NAFLD-HCC. Rare variants with distinctive protective roles can be used as precise genetic targets in clinical practice [26]. If the TM6SF2 rs58542926 variant acts protectively against NAFLD-HCC, the gene could be a potential future therapeutic target. However, a wider understanding of this gene is needed before drawing such conclusions.
HCC is known to be an aggressive tumor [27]. About 70%-85% of patients present at an advanced or unresectable stage [28]. NASH-related HCC is known to present at an advanced stage with a poor prognosis [29]. Tumor size [27, 30, 31], vascular invasion, and nodularity [30, 32] are independent prognostic factors for HCC at diagnosis. There have been few attempts to identify associations between PNPLA3 variants and poor prognostic factors in HCC. The PNPLA3 rs738409 GG genotype was prominent (46.2%) in a Japanese cohort of HCC patients with non-viral etiology, the majority of whom had NAFLD as the metabolic disease, and these patients exhibited a tendency towards smaller tumors than those with the CC or CG genotypes [33]. Also, Valenti et al reported that the PNPLA3 rs738409 G allele was significantly associated with multiple HCC nodules at presentation in liver disease patients with alcoholic and non-alcoholic etiologies [34]. In our cohort, the PNPLA3 rs738409, PNPLA3 rs2281135, and PNPLA3 rs2294918 variants were significantly associated with tumor size, and the PNPLA3 rs2294918 and MBOAT7 rs641738 variants were significantly associated with single-nodular HCC, indicating a relationship between PNPLA3 variations and tumor aggressiveness.
In this study, the LD analysis showed a strong correlation between PNPLA3 rs738409 and PNPLA3 rs2294918. However, our genotyping was limited to a small region relative to the genome, and there were gaps between the genotyped SNPs. In addition, high-density genotype data from population samples are needed to map LD block structures reliably. Beyond that, mapping LD with regard to HCC would be of greater value in identifying the genetic basis of the disease pathogenesis.
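For readers unfamiliar with how pairwise LD is quantified, the following is a minimal sketch of the standard D, r² and |D'| statistics for two biallelic SNPs. It assumes phased haplotype counts are available; it is not the software or procedure used in this study, and the function name and example counts are hypothetical. Unphased genotype data would first require haplotype frequency estimation.

```python
from typing import Tuple


def pairwise_ld(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> Tuple[float, float, float]:
    """Return (D, r_squared, abs_D_prime) from phased haplotype counts.

    A/a are the alleles at the first SNP and B/b at the second SNP.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")

    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A
    p_B = (n_AB + n_aB) / total    # frequency of allele B

    # D: deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("LD is undefined when either SNP is monomorphic")
    r_squared = d * d / denom

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    abs_d_prime = abs(d) / d_max

    return d, r_squared, abs_d_prime


# Hypothetical counts: the two SNPs are perfectly correlated.
d_value, r2_value, d_prime_value = pairwise_ld(50, 0, 0, 50)
```

An r² close to 1 means one SNP almost fully predicts the other, which is why sparse genotyping with gaps between SNPs limits how reliably block boundaries can be drawn.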