Functional near-infrared spectroscopy (fNIRS) is an emerging neuroimaging technique for studying functional hearing in the infant population because it is baby-friendly, low-cost, silent, portable, and less susceptible to motion and electrical artefacts (1, 2). fNIRS measures local changes in the concentration of oxy- (HbO) and deoxyhaemoglobin (HbR) due to neurovascular coupling. Most fNIRS studies of speech perception in infants have reported a canonical-shaped response morphology, where the HbO increases from the baseline, the HbR decreases from the baseline and both return to baseline within a predictable period post-stimulus onset (3–5). However, some studies have reported an inverted response shape (6–8), and a review has suggested the differences arise from the variation in experiment design and stimulus complexity (9). The variable morphology of fNIRS responses in infants limits the use of the existing inference framework, which is based on adult data. The framework assumes that the canonical-shaped response to a stimulus is static throughout the experiment (10), an assumption that we found to be violated in our data from sleeping infants. In our pilot study, we observed that the speech-evoked responses were neither canonical nor inverted, and they did not return to baseline within the relatively short experimental epoch as predicted from a canonical response. We therefore extended the inter-block silence interval to more than 22 s, which allowed us to investigate this unexpected morphology further.
In this study, we describe the morphology of fNIRS responses recorded from 16 naturally sleeping infants (infant shown in Fig. 1A, recording montage shown in Fig. 1B). Each participant was presented with a 5.4 s speech stimulus block, consisting of 12 concatenated repetitions of a 450 ms long "ba" speech token; the stimulus block was repeated 20 times, with blocks separated by a silence period randomised between 22.0 and 32.0 s (Fig. 1C). We first investigated the morphology of the fNIRS responses averaged over every five sequential trials, to see how the morphology might change over the course of the 20 epochs. After discovering systematic changes in morphology with experiment duration, we argue that the response morphology seen must be due to a sum of two independent and simultaneous responses evoked by the auditory stimulus. Lastly, we show, using a modelling approach, that the data are consistent with this proposal. We hypothesise that the two independent responses reflect independent mechanisms: an obligatory auditory response to auditory stimulation and a response related to activation of the arousal system during sleep. The findings of this study not only change the inference framework for fNIRS responses in sleeping infants but also provide insight into the coalescence of brain processes in response to speech stimuli during sleep.
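The stimulus timing described above can be sketched in a few lines of Python. This is only an illustration of the block design, not the presentation software used in the study: the function name `block_onsets`, the choice of a uniform distribution for the silence period, the fixed seed, and starting the first block at time zero are all assumptions.

```python
import random

TOKEN_S = 0.450            # duration of one "ba" token (from the text)
TOKENS_PER_BLOCK = 12      # concatenated repetitions per block (from the text)
N_BLOCKS = 20              # number of stimulus blocks (from the text)
SILENCE_RANGE_S = (22.0, 32.0)  # silence between blocks (from the text)


def block_onsets(seed=0):
    """Return (block duration, list of block onset times), all in seconds.

    Assumption: silence durations are drawn uniformly from SILENCE_RANGE_S
    and the first block starts at t = 0.
    """
    rng = random.Random(seed)
    block_s = TOKEN_S * TOKENS_PER_BLOCK   # 12 x 0.45 s = 5.4 s
    onsets = []
    t = 0.0
    for _ in range(N_BLOCKS):
        onsets.append(t)
        # Next block starts after this block plus a random silence period.
        t += block_s + rng.uniform(*SILENCE_RANGE_S)
    return block_s, onsets


block_s, onsets = block_onsets()
```

Each onset-to-onset interval is therefore between 27.4 s and 37.4 s, which is what leaves room to observe a response that outlasts the 5.4 s stimulus.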
Results
The morphology of the fNIRS response in sleeping infants changes during the experiment.
Figure 2 shows the grand average HbO response within each region of interest (ROI) averaged across every five trials. It can be seen that, in each ROI, the morphology of the five-trial averages changed over the course of the session, a result not reported in other studies, where generally only a session average is reported and shorter epochs are used. We observed a positive peak around 5.0–6.0 s, as expected from a standard canonical response, followed by a wide negative trough peaking around 10.0–15.0 s, well after the stimulus offset at 5.4 s. However, the negative trough is dominant only in the first five trials (black line) and adapts over subsequent trials.
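As a minimal sketch of the five-trial averaging behind this kind of plot, the function below groups consecutive epochs and averages them sample by sample. The name `average_in_blocks` and the data layout (one list of samples per trial, all the same length) are assumptions for illustration; the actual preprocessing is described in the Methods.

```python
def average_in_blocks(trials, block_size=5):
    """Average consecutive groups of trials sample by sample.

    trials: list of equal-length lists, one waveform per trial, in
            presentation order (hypothetical layout).
    Returns one averaged waveform per group of `block_size` trials.
    """
    averaged = []
    for start in range(0, len(trials), block_size):
        group = trials[start:start + block_size]
        # zip(*group) walks through the group one time sample at a time.
        averaged.append([sum(samples) / len(group) for samples in zip(*group)])
    return averaged
```

With 20 trials and the default block size this yields four waveforms (Trials 1–5, 6–10, 11–15 and 16–20), so a change in morphology across the session is not hidden inside a single session average.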
The canonical-like positive HbO peak can be seen in almost all channels (Extended Data Fig. 1), suggesting that, in sleeping infants, speech stimuli evoke a positive response not only in the temporal ROIs but also in the pre-frontal ROIs, consistent with the findings of previous studies (4, 5).
Often, a negative trough after a positive peak has been modelled as an undershoot (11) as the HbO returns to baseline: that is, it forms part of the same response to the stimulus as the positive peak. In such a model, the negative trough is assumed to be a passive recovery from the expansion of the blood capillary following the positive peak, as suggested by the balloon model (12) in the BOLD response of functional magnetic resonance imaging (fMRI). However, the pattern observed in our data is inconsistent with this explanation. If the negative trough is caused by an overshoot of recovery from capillary expansion, the magnitude of the trough would be correlated with the size of the positive peak. In contrast, the negative trough is initially much larger than the positive peak, has a longer latency than feasible for an overshoot, and reduces in size rapidly over trials independently of the size of the positive peak. The data are more consistent with the notion that two independent responses occur simultaneously: one being the positive canonical response to the stimulus, and the other being a long-latency negative response that rapidly adapts over trials. Evidence to support the notion of two simultaneous responses is provided by the fact that the latency of the positive peak in Trials 1–5 is shorter than in the later trials, which can be explained by the positive peak being partially cancelled by its summation with the large negative response as they overlap in time. In the next section, we show how a model with two overlapping independent responses can explain our observed overall response morphology and how it changes over the experiment duration.
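The cancellation argument can be checked numerically with a small sketch. It sums a positive and a later, wider negative Gaussian and locates the peak of the sum. The latencies and widths are borrowed from the fit reported in the modelling section; treating the reported width as a full width at half maximum, the unit positive amplitude, and the function names are assumptions made only for this illustration.

```python
import math

# Assumption: the reported "width" is a full width at half maximum (FWHM).
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian(t, latency, fwhm):
    """Unit-amplitude Gaussian centred on `latency` (seconds)."""
    sigma = fwhm * FWHM_TO_SIGMA
    return math.exp(-0.5 * ((t - latency) / sigma) ** 2)


def summed_response(t, neg_amp, pos_amp=1.0):
    """Positive response minus a later, wider negative response."""
    return pos_amp * gaussian(t, 5.8, 4.6) - neg_amp * gaussian(t, 15.6, 11.6)


def positive_peak_latency(neg_amp, step=0.01, t_max=30.0):
    """Time (s) at which the summed response is largest, on a regular grid."""
    times = [i * step for i in range(round(t_max / step) + 1)]
    return max(times, key=lambda t: summed_response(t, neg_amp))


late_trials = positive_peak_latency(neg_amp=0.0)   # negative response adapted away
early_trials = positive_peak_latency(neg_amp=2.0)  # negative response dominant
```

Under these assumptions, `early_trials` is smaller than `late_trials`: the rising flank of the negative response pulls down the later part of the positive peak, so the apparent peak moves earlier even though the underlying positive response has not changed.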
Model of the data as the summed effect of two simultaneous independent responses.
First, we determined the grand average response for each trial from Trial 1 to Trial 20. Since the patterns of responses were similar in all ROIs, we averaged across all recording channels for the purpose of the modelling only. We fitted these grand average single-trial responses with the sum of two separate Gaussian functions, representing two independent responses that sum to create the observed overall response morphology. The model included five parameters per function: the latency and width of the Gaussian, and three parameters that allowed the amplitude of the Gaussian to decay exponentially over trials. The model was fitted to the data with an optimization technique; the process is detailed in the Methods section.
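A model of this kind can be sketched as follows. The exact parameterisation is given in the Methods, so everything specific here is an assumption: the amplitude is taken to follow a_inf + a_delta * exp(-(trial - 1) / tau), the width is treated as a Gaussian standard deviation, and the names `amplitude`, `model` and `sum_squared_error` are hypothetical.

```python
import math


def amplitude(trial, a_inf, a_delta, tau):
    """Assumed three-parameter decay: starts at a_inf + a_delta on Trial 1
    and tends towards a_inf as trials accumulate."""
    return a_inf + a_delta * math.exp(-(trial - 1) / tau)


def gaussian(t, latency, width):
    """Unit-amplitude Gaussian; `width` is treated as a standard deviation."""
    return math.exp(-0.5 * ((t - latency) / width) ** 2)


def model(t, trial, pos, neg):
    """Sum of two independent responses at time t (s) on a given trial.

    pos, neg: dicts with keys latency, width, a_inf, a_delta, tau
              (five parameters per function, ten in total).
    """
    total = 0.0
    for p in (pos, neg):
        amp = amplitude(trial, p["a_inf"], p["a_delta"], p["tau"])
        total += amp * gaussian(t, p["latency"], p["width"])
    return total


def sum_squared_error(data, times, pos, neg):
    """Cost for an optimiser; data[k][i] is the response on trial k + 1
    at times[i]."""
    return sum(
        (data[k][i] - model(t, k + 1, pos, neg)) ** 2
        for k in range(len(data))
        for i, t in enumerate(times)
    )
```

Any general-purpose minimiser could then search the ten parameters for the lowest `sum_squared_error`. A negative response corresponds to negative amplitude values in this parameterisation, and a constant-amplitude response corresponds to `a_delta = 0`.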
Figure 3A shows the fitted model’s positive response had an estimated peak latency of 5.8 s from stimulus onset with a width of 4.6 s, and the negative response had an estimated peak latency of 15.6 s from stimulus onset (10.2 s after stimulus offset) with a width of 11.6 s. Both modelled responses are displayed with unity peak amplitude for visual purposes. Figure 3B shows the best-fit model (red line) along with the raw data (blue line). The fitted model parameters are illustrated in Fig. 3C-3E. The modelled positive response had no change in amplitude across trials, whereas the modelled negative response had twice the amplitude of the positive response at Trial 1, and a rapid reduction of peak amplitude from Trial 1 to Trial 10. The model result shows that the data are consistent with the hypothesis that two separate response mechanisms are simultaneously evoked by a speech stimulus in sleeping infants.
Discussion
Two simultaneous speech-evoked brain responses in sleeping infants.
The modelling shows that our observed overall responses are consistent with the hypothesized sum of two simultaneous and independent responses: a positive HbO response that peaked at 5.8 s from stimulus onset and maintained a constant amplitude with repeated stimulus presentation; and a wide negative HbO response that peaked at 15.6 s from stimulus onset and rapidly adapted in amplitude across trials.
The modelled positive response is consistent with an obligatory auditory response mechanism. This response, with a typical canonical shape, has been consistently reported in fNIRS studies investigating the haemodynamic response to sound, no matter whether the participants were tested while asleep (4, 5) or awake (3, 13). A similar response shape has also been reported in an fMRI study of speech perception in awake and asleep infants (14). These pieces of evidence show that an infant's brain is activated by sound even during sleep, reflecting the need to stay alert to the environment.
The independent negative response illustrated in Fig. 3A is unusual because the response peaked at 10.2 s after the stimulus offset and it did not return fully to baseline 20 s post-stimulus offset. This characteristic suggests that the mechanism evoking this reduction in neural activity remains active long after the stimulus ends. Another unique characteristic of this negative response is that it has a greater magnitude than the positive response in the initial trials but adapts strongly to repeated stimulus presentation. This pattern suggests that the brain reacts to salient or novel stimuli during sleep, stays active after the stimulus offset to process the relevance of the information, and reduces its response to the same stimulus after repeated presentations, when the sound is no longer novel (i.e. the negative response adapts or habituates). The habituation occurs in sleep once the sensory stimuli have become familiar and there is no need to evoke arousal or awakening.
The above descriptions are consistent with the role and the response characteristics of the reticular activating system (RAS). The RAS integrates sensory information and has a role in regulating the sleep-arousal-wake cycle (15). The RAS is also responsible for the fight-or-flight response (16); it reacts to salient or novel stimuli and controls habituation (17). Bangash, Xie, Skatrud, Reichmuth, Barczi and Morgan (18) measured the cerebrovascular response at the middle cerebral artery when a group of sleeping adults were aroused with pure tone stimuli at increasing intensity levels. They found a long cerebrovascular response with a peak latency of about 9 s post-stimulus offset, and the response did not return to baseline until about 30 s post-stimulus offset. This finding indicates that the RAS, which regulates arousal during sleep, has a long activation period, similar to the characteristic of the negative response seen in our study. Another important characteristic of the RAS that is consistent with our data is that the arousal response in the RAS habituates following the repetition of the same stimulus (19, 20). Direct and repeated electrical stimulation of the brainstem in an animal study has shown that the arousal response habituates within five to ten trials in rats and twenty trials in cats (19). In sleeping infants, habituation of arousal responses was observed when they were evoked by repeated tactile stimulation (21, 22). In short, the RAS has a prolonged activation period and adapts to repeated stimulation, the same characteristics we observed in the negative response in this study.
Alternative mechanisms for the negative haemodynamic response.
In fMRI studies, three potential mechanisms have been suggested to explain a negative haemodynamic response: an active suppression of neural activity in the area under observation, a reversed baseline state, or a reduction of blood supply to non-activated areas due to the reallocation of blood resources to the activated areas, also termed "blood stealing".
Active suppression, or neural inhibitory activity, in the observed area has been demonstrated by a study that showed a coupled decrease in the BOLD response of fMRI and the local field potential at an electrode tip inserted into area V1 (occipital lobe) of monkeys (23). In humans, a concurrent fMRI and EEG study has reported that regions that showed a large negative BOLD response also showed large inhibitory cortical potentials (24). The precise control of the RAS on the arousal response remains unclear. However, the diffuse characteristic of the negative response in these data (it being seen across all ROIs) could be a result of the non-specific, widespread projection of neurons from the RAS together with other neuromodulatory systems to the cortex (25, 26). Some studies provided an alternative argument that active suppression is a mechanism to counteract the arousal effects (27, 28).
A reduction of activity in the default mode network (DMN) is another possible mechanism of the negative response observed in our study. The DMN is active in infants during sleep (29). Research has shown that a reduction of activity in the DMN is important for processing relevant information (30). McKiernan, Kaufman, Kucera-Thompson and Binder (31) reported that the magnitude of BOLD response deactivation increases with task difficulty. The authors suggested that the negative BOLD response is a suspension of activity in the DMN to reallocate resources for information processing.
We have no evidence to suggest that blood stealing is the mechanism for the negative response in this study. There is no evidence, at least in our study, of an increase in activity in the neighbouring regions that would draw blood away from the recording site.
In fNIRS studies, negative responses have been reported during a breath-holding task (32, 33). Even though we did not measure the breathing rate, we think it is unlikely the infant would hold their breath when the stimuli were presented because they were asleep and not actively paying attention to them. Furthermore, there are no significant changes to the BOLD response if a breath is held for only 3 to 5 s (34). Additionally, the modelled negative response begins immediately after stimulus onset, which is too early to be an effect of breath holding. The study of the arousal response in sleeping adults mentioned earlier also reported that the breathing rate increases when the participant is aroused (18).
With the evidence presented above, we hypothesize that auditory stimulation simultaneously evokes two independent responses in sleeping infants. The positive HbO response is hypothesized to be an obligatory response and the negative HbO response is hypothesized to be a process related to arousal by sensory stimulation during sleep. The interpretation of stimulus-evoked fNIRS responses in sleeping infants should not be restricted to the positive peak response post-stimulus onset. A better model to describe the haemodynamic response function of infants would incorporate the negative response and the differing adaptation of the two fNIRS responses with repeated stimulus presentation.