The present study aimed to investigate the impact of hearing loss on adaptation to noise in speech recognition. We found that, in aided conditions, the SRT for words in competition with SSN worsens with increasing hearing loss, for both natural and vocoded words. We also found that adaptation to noise decreases as hearing loss increases, again for both natural and vocoded words. These relationships remained when the effect of age on PTA was partialled out, which indicates that the loss of adaptation is related to hearing loss per se rather than to ageing.
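The age adjustment mentioned above can be expressed as a partial correlation: correlate the residuals of PTA and of the adaptation measure after regressing each on age. A minimal Python sketch with synthetic data (variable names and all numeric values are illustrative, not the study's):

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after partialling out z:
    correlate the residuals of x~z and y~z (all 1-D arrays)."""
    Z = np.column_stack([np.ones_like(z), z])  # design matrix with intercept
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]  # residuals of x given z
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]  # residuals of y given z
    return np.corrcoef(rx, ry)[0, 1]

# Synthetic, illustrative data only (not the study's values):
rng = np.random.default_rng(0)
age = rng.uniform(20.0, 80.0, 40)
pta = 0.5 * age + rng.normal(0.0, 8.0, 40)              # PTA worsens with age
adaptation = 4.0 - 0.05 * pta + rng.normal(0.0, 1.0, 40)  # adaptation shrinks with PTA

r = partial_corr(pta, adaptation, age)  # PTA-adaptation link with age removed
```

The residual method is algebraically equivalent to the textbook partial-correlation formula, so either route gives the same coefficient.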
The loss of adaptation to noise adds to the loss of speech information
Previous studies have shown that hearing loss impairs the encoding of and thus the access to speech TFS (Hopkins et al., 2008; Kale and Heinz, 2010; Lorenzi et al., 2006; Moore, 2008, 2011) and envelope information (Füllgrabe et al., 2003; Kale and Heinz, 2010; Ozmeral et al., 2018). It also degrades the resolution of the long-term speech spectrum in the auditory system because of poor frequency selectivity (Başkent, 2006; Henry et al., 2005; Leek and Summers, 1996; Noordhoek et al., 2000; Stelmachowicz et al., 1985; Turner et al., 1999). In addition, hearing loss at frequencies higher than those commonly measured in the audiogram (8-20 kHz) contributes to impaired speech recognition in broadband noise (Zadeh et al., 2019). Here, we demonstrate that hearing loss not only impairs access to speech acoustic cues, but also impairs the ability of listeners to adapt to the noise.
Figure 4 illustrates the contribution of various factors to the loss of intelligibility for the present HI listeners with the largest hearing losses (supplementary Fig. S1 shows similar data for participants with smaller losses). Panel A shows the SRTs for the six NH listeners with the best PTA thresholds (from 0 to 3 dB HL; mean = 1.7 dB HL) and for the six HI listeners with the worst PTA thresholds (from 64 to 83 dB HL; mean = 70.0 dB HL). NH listeners showed better SRTs for natural than for vocoded speech, and even better SRTs when they had the opportunity to adapt to the noise. In contrast, HI listeners showed worse SRTs than NH listeners (p<0.05) and virtually constant SRTs across conditions (p≥0.293). The results can be interpreted as follows. First, because the vocoder preserved only the speech envelope and spectral information below 8.5 kHz (the highest cutoff frequency of the filter bank in the vocoder), the worse SRTs for vocoded words for HI than for NH listeners (VocNoPrec in Fig. 4A,B) reveal an impaired ability of HI listeners to use envelope and/or spectral information below 8.5 kHz. This impairment explains 52% of the total SRT loss (Fig. 4C). Second, because the recognition of natural speech depends on all the same cues as the recognition of vocoded speech plus TFS (Moore, 2011) and high-frequency (>8.5 kHz) information (Zadeh et al., 2019), the difference between natural and vocoded words reveals the ability to use TFS and high-frequency spectral information. Because NH but not HI listeners showed better SRTs for natural than for vocoded words (VocNoPrec vs. NatNoPrec in Fig. 4A), our results suggest that NH but not HI listeners benefit from adding TFS or high-frequency spectral information to speech. The lack of access to TFS and high-frequency spectral speech information explains 38% of the total SRT loss (Fig. 4C). Lastly, the improvement in SRT when adding a precursor shows the benefit of adaptation to noise. NH but not HI listeners showed adaptation (Fig. 4A), and the loss of adaptation explains the remaining 10% of the total SRT loss (Fig. 4C). Altogether, the present results show that HI listeners have difficulty recognizing audible speech in noise not only because they are less able to encode and use speech acoustic cues but also because they are less able to adapt to the noise background.
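The partition just described is simple arithmetic on condition-mean SRTs, and can be sketched as follows. The condition labels follow the logic of Fig. 4, but the numbers below are invented for illustration and do not reproduce the study's 52/38/10% split:

```python
# Hypothetical mean SRTs in dB SNR (lower = better); values are made up.
nh = {"VocNoPrec": -4.0, "NatNoPrec": -7.0, "NatPrec": -8.0}  # normal hearing
hi = {"VocNoPrec": 4.0, "NatNoPrec": 4.0, "NatPrec": 4.0}     # hearing impaired

total_loss = hi["NatPrec"] - nh["NatPrec"]  # overall SRT elevation for HI

# Component 1: impaired use of envelope/spectral cues below 8.5 kHz
env_spec = hi["VocNoPrec"] - nh["VocNoPrec"]
# Component 2: lost benefit from TFS / high-frequency information
tfs_hf = (nh["VocNoPrec"] - nh["NatNoPrec"]) - (hi["VocNoPrec"] - hi["NatNoPrec"])
# Component 3: lost benefit from adaptation to the noise precursor
adapt = (nh["NatNoPrec"] - nh["NatPrec"]) - (hi["NatNoPrec"] - hi["NatPrec"])

shares = {name: 100.0 * value / total_loss
          for name, value in [("envelope/spectral", env_spec),
                              ("TFS/high-freq", tfs_hf),
                              ("adaptation", adapt)]}
```

By construction the three components sum exactly to the total SRT loss, so the percentage shares always sum to 100 whatever the condition means are.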
On the reduced adaptation to noise for HI listeners
We have found that hearing loss impairs adaptation to noise (Figs. 3 and 4). Multiple mechanisms could underlie this result (Marrufo-Pérez and Lopez-Poveda, 2022; Willmore and King, 2023). Among them may be smaller MOC reflex (MOCR)-mediated adjustments of the cochlear gain in the damaged cochlea. MOC fibers are reflexively activated by sound with a time course of 277±62 ms (Backus and Guinan, 2006). MOC efferents terminate on OHCs, and their activation inhibits the cochlear amplifier gain, linearizing BM responses and reducing compression (Murugasu and Russell, 1996). Jennings et al. (2018a) reasoned that HI listeners do not show adaptation in AM detection because they have more linear BM responses and hence less scope for MOCR-mediated BM linearization. Smaller adaptation to noise in speech recognition might likewise occur if reduced MOCR-mediated BM linearization produces less enhancement of the speech envelope at the output of the BM in HI listeners. Some studies, however, have shown that adaptation to noise in speech recognition or AM detection can occur without MOCR effects (Marrufo-Pérez et al., 2018a, 2019).
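The envelope-enhancement argument can be illustrated with a toy broken-stick BM input/output function (all parameter values below are invented for illustration, not fitted to data): compressive growth shrinks the dB contrast between envelope peaks and troughs, whereas a gain-reduced, linearized response preserves it.

```python
import numpy as np

def bm_output_db(level_db, gain_db=40.0, knee_db=30.0, ratio=0.25):
    """Toy broken-stick BM input/output in dB: linear growth with gain_db
    below knee_db, compressive growth (slope = ratio) above it."""
    return np.where(level_db <= knee_db,
                    level_db + gain_db,
                    knee_db + gain_db + ratio * (level_db - knee_db))

# Hypothetical speech-envelope trough and peak levels (dB SPL), both above knee.
env_in = np.array([50.0, 70.0])

compressed = bm_output_db(env_in)  # full cochlear gain, compressive
# Crude stand-in for MOCR action: reduced gain and a fully linear slope.
linearized = bm_output_db(env_in, gain_db=20.0, ratio=1.0)

contrast_compressed = compressed[1] - compressed[0]  # peak-trough contrast (dB)
contrast_linear = linearized[1] - linearized[0]
```

With these numbers the 20-dB input contrast survives intact through the linearized response but shrinks to 5 dB through the compressive one, which is the sense in which MOCR-mediated linearization could "enhance" the speech envelope.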
Another proposed mechanism for adaptation to noise is a shift of the dynamic range of auditory neurons toward the most common level in the noise preceding the word (Ainsworth and Meyer, 1994; Marrufo-Pérez et al., 2018a, 2019, 2020; Marrufo-Pérez and Lopez-Poveda, 2022). When auditory neurons are presented with a noise whose level varies little around a most common value, they shift their dynamic ranges toward that level, so long as it is above the neuron's threshold. This adaptation increases a neuron's sensitivity to changes in sound level (Dean et al., 2005; Wen et al., 2009). The improvement in sensitivity to level changes, however, is smaller when the variance in the stimulus level is large (Rabinowitz et al., 2011). Because audiometric hearing loss is often associated with the loss or dysfunction of inner hair cells (IHCs) and OHCs (Liberman and Dodds, 1984; Wu et al., 2020), it is possible that reduced peripheral auditory compression due to OHC loss makes the IHC receptor-potential representation of the noise more fluctuating than normal. This could leave neurons without a prevailing level to adapt to, and thus result in less adaptation to noise for HI than for NH listeners (Marrufo-Pérez et al., 2020). It remains to be shown, however, whether the inherent fluctuations of a steady noise at the output of a linear BM response are large enough to impair dynamic range adaptation to sound-level statistics.
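The dynamic-range shift can be sketched with a generic sigmoidal rate-level function (parameters invented for illustration): moving the function's midpoint toward the prevailing noise level maximizes the neuron's sensitivity, i.e., the local slope in spikes/s per dB, at that level.

```python
import numpy as np

def rate(level_db, midpoint_db, max_rate=200.0, slope=0.3):
    """Toy sigmoidal rate-level function for an auditory neuron (spikes/s)."""
    return max_rate / (1.0 + np.exp(-slope * (level_db - midpoint_db)))

def sensitivity(level_db, midpoint_db, d=0.5):
    """Local slope around level_db (spikes/s per dB), by central difference."""
    return (rate(level_db + d, midpoint_db)
            - rate(level_db - d, midpoint_db)) / (2.0 * d)

noise_level = 65.0  # hypothetical most common level in the precursor (dB SPL)

s_before = sensitivity(noise_level, midpoint_db=30.0)        # unadapted
s_after = sensitivity(noise_level, midpoint_db=noise_level)  # midpoint shifted
```

Before adaptation the neuron is saturated at the noise level and its slope there is nearly zero; after shifting its midpoint onto the prevailing level, the slope approaches the sigmoid's maximum (max_rate x slope / 4 = 15 spikes/s per dB with these parameters).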
A third potential mechanism is related to the disruption of the cues that listeners use to segregate the speech and noise stimuli. Jennings et al. (2018b) measured detection thresholds for short pure tones presented 2 or 197 ms after the onset of a 400-ms narrowband noise masker with either a flattened or an inherently fluctuating temporal envelope. They found that, when the probe was delayed in the noise, detection thresholds improved for tones presented in the flattened noise but worsened for tones presented in the fluctuating noise. Jennings et al. (2018b) reasoned that listeners relied on a temporal envelope-based cue to detect the probe and that the fluctuations of the preceding noise disrupted this cue. The amplitude fluctuations of a steady noise can also impair speech recognition when the noise is presented simultaneously with the speech, presumably because it is hard to distinguish the noise fluctuations from the amplitude fluctuations that convey speech information (Stone et al., 2011, 2012). It is uncertain, however, whether precursor fluctuations can hinder speech recognition. If they did, the more linear BM responses of HI listeners would enhance the precursor noise fluctuations, making it harder for HI listeners to distinguish noise from speech fluctuations and resulting in smaller adaptation for HI than for NH listeners. This does not seem to be the case here, however: although SRTs were sometimes worse with the precursor, the worsening occurred for both NH and HI listeners and therefore does not appear to be related to hearing loss (points below zero in Fig. 3C, D).
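For readers wishing to generate stimuli of this kind, a common way to flatten the inherent envelope of a narrowband noise is to divide the noise by its Hilbert envelope. The sketch below (all parameters arbitrary; this is a generic construction, not necessarily the exact procedure of Jennings et al., 2018b) shows that a single division sharply reduces the envelope's coefficient of variation:

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

fs = 16000                     # sampling rate (Hz)
rng = np.random.default_rng(1)

# Narrowband noise: 1 s of Gaussian noise band-passed around 1 kHz.
sos = butter(4, [900.0, 1100.0], btype="bandpass", fs=fs, output="sos")
noise = sosfiltfilt(sos, rng.standard_normal(fs))

env = np.abs(hilbert(noise))                 # inherent temporal envelope
flattened = noise / np.maximum(env, 1e-12)   # divide out the envelope

env_flat = np.abs(hilbert(flattened))
cv_inherent = env.std() / env.mean()         # envelope fluctuation, original
cv_flat = env_flat.std() / env_flat.mean()   # envelope fluctuation, flattened
```

Note that dividing by the envelope broadens the spectrum slightly; fully spectrally matched "low-noise noise" stimuli typically iterate the filter-and-divide steps, which is omitted here for brevity.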
Implications
The present study shows that adaptation to noise in speech recognition decreases with increasing hearing loss, even when the potential confounding effect of age is factored out. This finding is relevant in at least two ways. First, as explained earlier, HI listeners find it harder to recognize audible speech in noisy settings than NH listeners (Başkent et al., 2006; Duquesnoy and Plomp, 1983; Hopkins et al., 2008; Lopez-Poveda et al., 2014; Moore et al., 1999; Summers et al., 2013). Impaired access to speech acoustic cues (envelope, TFS, spectrum) has been shown to contribute to this impaired intelligibility. The present study reveals that the impact of hearing loss on speech recognition can be underestimated if adaptation is disregarded (Fig. 4). Second, research on the factors that affect speech-in-noise intelligibility (for both NH and HI listeners) is often conducted without regard to adaptation or, more generally, to temporal effects. Indeed, the speech-to-noise onset delay varies widely across studies (e.g., 500 ms in Johannesen et al., 2016; 3 s in Souza et al., 2019), is sometimes not even reported (Saunders and Forsline, 2006; Tognola et al., 2019; Wu et al., 2021a, 2021b), and is rarely justified. The present study suggests that the relevance of some factors may differ depending on when the speech is presented relative to the noise onset. For instance, cochlear mechanical dysfunction does not predict intelligibility in speech maskers for HI listeners (e.g., Lopez-Poveda et al., 2017), but it could predict the SRT impairment related to a loss of adaptation to noise.