Study participants
The study was conducted in a large shipyard in Shanghai and comprised several phases. It was initiated in June 2015, after which the employees underwent occupational health examinations annually. The most recent auditory and demographic data were collected from a total of 6,840 Chinese noise-exposed workers, primarily comprising grinders, welders, stampers, and cutters, all of whom are exposed to high levels of noise. A structured questionnaire was filled through face-to-face interviews to collect demographic features, occupational history, medical history, smoking and alcohol drinking habits, family history including genetic and drug-related hearing loss, the use of hearing protection devices, and exposure to other harmful chemicals. The exclusion criteria are shown in Fig. 1. This study was approved by the Institutional Ethics Review Board of the Shanghai Sixth People’s Hospital affiliated with Shanghai Jiao Tong University (Approval No: 2017 − 136) and was registered in the Chinese Clinical Trial Registry (registration number: ChiCTR-RPC-17012580). Potential consequences and benefits were explained, and written informed consent was obtained from each participant before this study.
Noise exposure estimation
Industrial noise was measured with an ASV5910-R digital recorder (Aihua Instruments; Hangzhou, China) across the work areas of different jobs according to the national standard of China[31]. The long-term equivalent (Leq) noise level was adopted as the primary exposure metric and measured three times at each spot. The mean Leq value in each spot was transformed into 8 h of continuous equivalent A-weighted sound pressure level (Leq8h, shown in Table 1). Cumulative noise exposure (CNE) was calculated using LAeq8h over the years of on-duty time (Leq-total, dBA·year): CNE = Leq8h + 10*lg(T), where T is the duration of employment in years[32].
Table 1
Characteristics of the total samples for machine learning modelling.
Variables
|
B-HL group (n = 1926)
|
W-HL group
(n = 3613)
|
P value
|
Age (yrs)
|
33.6 ± 7.4
|
42.3 ± 8.3
|
< 0.001
|
Sex, n (%)
Male
Female
|
1518(78.8)
408(21.2)
|
3169(87.7)
444(12.3)
|
< 0.001
|
Career length (yrs)
|
6.7 ± 4.4
|
8.2 ± 4.7
|
< 0.001
|
CNE (dB*year)
|
90.1 ± 6.4
|
93.7 ± 7.5
|
< 0.001
|
Leq8h dB(A)
|
85.1 ± 4.2
|
87.1 ± 5.4
|
< 0.001
|
Smoking, n (%)
Currently
Never
|
675(35.0)
1251(65.0)
|
1387 (38.4)
2226 (61.6)
|
0.014
|
Drinking n, (%)
Currently
Never
|
846(34.0)
1641(66.0)
|
1402(38.8)
2211(61.2)
|
< 0.001
|
PTA0.5−2 kHz
|
14.8 ± 4.6
|
20.9 ± 7.7
|
< 0.001
|
PTA3 − 6 kHz
|
15.4 ± 6. 2
|
39.4 ± 16.0
|
< 0.001
|
PTA10 − 12.5 kHz
|
19.5 ± 9.9
|
51.3 ± 17.9
|
< 0.001
|
PTA3 − 6 & 10−12 kHz
|
17.0 ± 5.4
|
44.2 ± 14.0
|
< 0.001
|
Data are presented as mean ± standard deviation or numerical (%). The p values are the result of the Mann–Whitney test. The B-HL group is the better hearing group, The W-HL group is the worse hearing group. CNE, cumulative noise exposure. PTA0.5−2 kHz represents the mean of 0.25, 0.5, 1, and 2 kHz for both ears. PTA3 − 6 kHz represents the mean of 3, 4, and 6 kHz on both ears. PTA10 − 12.5 kHz represents the mean of 10 and 12.5 kHz on both ears. PTA3 − 6 & 10−12 kHz represent the mean of 3, 4, 6 kHz and 10, 12.5 kHz on both ears. |
Audiometric evaluation and hearing impairment definition
Audiological evaluations, including tympanometry and pure-tone audiometry (PTA, range from 0.25-16 kHz) were performed on-site by qualified medical assistants in a multifunctional screening vehicle equipped with five soundproof chambers. The tests were performed at least 12 h after the participant’s last shift in the noise-exposed job. First, otoscopic inspection and tympanometry were performed on each participant to establish the normal function of their external and middle ears. Subsequently, audiometry tests were carried out to determine hearing thresholds. Tympanograms were measured using a TympStar tympanometer (Grason-Stadler; Eden Prairie, MN, USA). The passing criterion was a type A tympanogram (peak between − 100 and + 100 daPa). Pure-tone air-conduction thresholds were measured from each of the two ears separately using Type 1066 manual audiometers (Natus Hearing & Balance; Taastrup, Denmark) coupled with TDH-39 headphones (Telephonics; Farmingdale, NY, USA) for conventional frequencies (0.25-8 kHz), and Sennheiser (Wedemark, Germany) HDA-300 headphones for the EHFs (10, 12.5, and 16 kHz). All thresholds were calculated in decibel hearing level (dB HL) and audiometers were calibrated annually according to the ISO 389-5-2006 standard. If a participant did not respond to the maximum intensity output of the audiometer for the EHFs, which were 90, 80, and 60 dB HL for 10, 12.5, and 16 kHz, respectively, the threshold was marked as 95, 85, and 65 dB HL, respectively. The bilateral average PTA across 3, 4, 6,10 and 12.5 kHz (or PTA3 − 6 & 10−12.5, for short) was chosen to quantify the degree of NIHL. The threshold above 12.5 kHz was not used because of a significant ceiling effect. The participants were classified into two groups according to the average PTA3 − 6 & 10−12.5 level: <25 dB HL as the better hearing level (B-HL) group and ≥ 25 dB HL as the worse hearing level (W-HL) group.
Machine learning for individual susceptibility assessment
Predictive modelling and performance evaluation
Different ML algorithms have different advantages. The following four supervised algorithms were used for the classification of hearing impairment: adaptive boosting (AdaBoost)[33], multi-layer perceptron (MLP)[34], random forest (RF)[35] and support vector machine (SVM)[36]. Age, sex, CNE, smoking, and alcohol drinking status were used as inputs for predictive modeling of PTA3 − 6 & 10−12.5 dichotomy. To train and validate the models, 10-fold cross-validation was adopted. In short, the entire dataset was randomly divided into 10 datasets using the caret package in R programming language v 3.6.1; nine of them for modeling, and the remaining one for validation. This step was repeated for 10 runs and the parameters of each algorithm were adjusted to ensure that the model had the best classification performance, which was estimated by two indexes: area under the receiver operator characteristic (ROC) curve (AUC) and prediction accuracy. The reported accuracy and AUC are the average over the 10 cycles. The algorithms were implemented using randomForest, adabag, monmlp, and e1071 libraries. When building the AdaBoost and RF models, default parameters were utilized. For the MLP model, we used five nodes in the first hidden layer and 15 ensembles to fit, and set the cut-off value of 0.5 in prediction probability for dichotomous classification. Regarding the SVM algorithm, hyper-parameters including gamma and cost were initially determined by 10-fold cross-validation, and the best of which was applied to train the classifier.
Individual susceptibility assessment and extreme individual selection
Individual susceptibility was assessed based on whether an individual was correctly classified by comparing the predicted label with the actual label. We consider the few participants who were misclassified with abnormal phenotypes to be either highly susceptible or resistant to NIHL. For example, those who were predicted to be in the B-HL group but actually had severe hearing loss were regarded as susceptible to NIHL, and conversely, those who were predicted to be the W-HL group but actually had better hearing were regarded as resistant to NIHL. To avoid errors of the model itself, the selection procedures for extreme individuals were strictly applied to the subsamples selected from the two misclassified groups by all four models for the next step of exome sequencing.
Identification and replication of risk variants
The procedure for genomic DNA preparation and exome sequencing analysis is described in the supplementary material. To identify the most likely pathogenic mutations, functional variants were filtered as follows: (1) considering the limited sample size and false negative signals, the SNPs with p values of Fisher’s exact test for genetic association analysis < 0.05 or marginal significance (0.05 < P < 0.10), were selected; (2) the analysis was restricted to non-synonymous (missense), stop-gain/loss (nonsense), and splicing because changes in amino acids may affect biological functions; (3) minor allele frequencies of the mutation less than 0.1 in one of the 1000 genomic data (1000g_all), gnomAD data (gnomAD_ALL and gnomAD_EAS), and ExAC public database: and (4) mutation loci within candidate genes which have been shown to be involved in several crucial pathways including oxidative stress, potassium ion circulation, heat shock protein, notch signaling, apoptosis signaling, and monogenic gene of hereditary hearing loss[6, 12, 37] (see supplementary Table S2 for gene list). From the filtered results, a truly pathogenic rare mutation can be obtained by removing the diversity locus between individuals. Additional two groups of noise-exposed participants were selected from the total sample for replication: 1,077 individuals with an average hearing threshold < 25 dB HL but as high as possible in terms of age and noise exposure dose were classified as the low-risk group, and 1,031 individuals with an average hearing threshold ≥ 25 dB HL but as low as possible in terms of age and noise exposure dose were classified as high-risk group. The demographic characteristics of the two groups are summarized in Supplementary Table S1. Candidate SNPs were genotyped using the ligation detection reaction and SNaPshot assay. Ten percent of the samples were randomly selected and genotyped repeatedly for quality control, and the concordance was > 99.9%.
Statistics
Continuous data are presented as mean ± standard deviation (SD) and were compared using the Mann–Whitney test between groups given their skewed distribution. Categorical data are expressed as number (%) and were compared using Pearson’s χ2 test. The Hardy–Weinberg equilibrium test was performed before association analysis. The allelic frequencies between the NIHL-susceptible and NIHL-resistant groups were compared using Fisher’s exact test, and logistic regression was used to compare the difference in genotype distributions between the two independent validation groups under an additive model (AA = 0, Aa = 1, aa = 2; a is the minor allele), and odds ratios (ORs) with 95% confidence intervals (CIs) are presented. Statistical analyses were performed using SPSS 24.0 (IBM, Armonk, NY, USA) or PLINK v1.9[38]. Combined ORs from two stages were calculated using a Comprehensive Meta-Analysis (Biostat, Englewood, NJ, USA) with a fixed- or random-effect model after testing for heterogeneity. Differences were considered significant when p < 0.05.