Decoding Predicted Musical Notes from Omitted Stimulus Potentials: Comparison of Familiar and Unfamiliar Melodies

doi:10.21203/rs.3.rs-3888249/v1

This work is licensed under a CC BY 4.0 License

Electrophysiological studies have investigated predictive processing in music by examining event-related potentials (ERPs) elicited by the violation of musical expectations. While several studies have reported that the predictability of stimuli can modulate the amplitude of ERPs, it is unclear how specific the representation of the expected note is. The present study addressed this issue by recording omitted stimulus potentials (OSPs), which avoids confounding top-down predictive processing with bottom-up sensory processing. Decoding of the omitted content was attempted using a support vector machine, a type of supervised machine learning. ERP responses to the omission of four target notes (E, F, A, and C) at the same position in familiar and unfamiliar melodies were recorded from 24 participants. The results showed that the omission N1 and the omission mismatch negativity were larger in the familiar melody condition than in the unfamiliar melody condition. The decoding accuracy of the four omitted notes was significantly higher in the familiar melody condition than in the unfamiliar melody condition. These results suggest that the OSPs contain discriminable predictive information, and that the higher the predictability, the more specific the generated representation of the expected note.

Biological sciences/Neuroscience

Biological sciences/Psychology

Behavioral^1–4 and neuroscientific^5–8 studies have shown that the human brain predicts incoming sounds when listening to music^9–11. In particular, previous studies have provided empirical evidence of expectation in the dimensions of tonality^2,6,12 and meter^13,14. However, it is unclear whether the prediction can be specific (“the note”) or whether it is only vague (“some notes”).

According to the predictive coding framework, the size of the prediction error (i.e., the difference between the predicted input and the actual input) has been used in previous research as an indicator of predictability¹⁵. In the auditory domain, event-related potential (ERP) components, such as the N1 and the mismatch negativity (MMN^16–19), have been used as prediction error signals. For example, the MMN is calculated as the difference in ERP amplitude between unexpected and expected tones^20–22. Several studies have shown that the MMN amplitude for deviant notes in melodies is smaller when uncertainty is higher^23,24. However, interpretation of the ERP amplitude elicited by deviant tones is difficult because the ERP amplitude changes not only with the degree of violation but also with the physical parameters of the expected and unexpected tones²⁵. Instead of presenting unexpected tones, the present study used unexpected omissions and aimed to investigate whether the predictability of notes was reflected in ERPs elicited by an omission in a specific musical context.

By recording ERP responses to the omission of a sound, it is possible to avoid confounding sensory-evoked potentials with prediction error signals. For example, neural omission responses (hereafter, omitted stimulus potentials: OSPs), such as omission N1 (oN1)^26–29 and omission MMN (oMMN)^30–32, have been observed when a sound that is predictable in timing and content is unexpectedly omitted. They are considered a prediction error signal. The oN1 has been reported as a neural response to the omission of auditory stimuli generated by the participant’s button press or the omission of a sound that is usually presented with a visual stimulus^26–29. The oMMN is calculated as the difference in ERP response between the unexpected omission and the expected tone^30,32 or omission³¹. In the case of omission, since no external stimuli are presented and bottom-up input is absent, the omission response is considered a pure reflection of top-down predictive information²⁷. Furthermore, previous studies have shown that the amplitudes of oN1 and oMMN are larger when an omission occurs in a context where the content of the sound is predictable than when it is unpredictable (for oN1^27,29,33 and oMMN³⁰). Therefore, OSPs would be a better indicator of the predictive process in music than ERPs elicited by deviant tones.

The present study investigated whether the predictability of the specific content of an upcoming sound affects the omission-related neural response during passive music listening. Familiar melodies of well-known Japanese songs and newly created unfamiliar melodies were used for predictable and unpredictable conditions, respectively. In both types of melodies, four types of notes (E, F, A, and C) were presented or omitted with equal probability at the same position in each melody. If the OSPs are modulated by specific predictive information about the content of the note, the amplitudes of oN1 and oMMN will be larger in the familiar melody condition than in the unfamiliar melody condition, reflecting the predictability of the note.

To provide further evidence that the OSPs contain specific information about the content of the predicted but omitted note, the present study attempted to decode the identity of the omitted note from them. Support vector machine (SVM), a type of supervised machine learning, was used for decoding, because Trammel et al.³⁴ demonstrated that SVM performed better than other machine learning methods, such as linear discriminant analysis and random forest, in decoding ERPs elicited by related or unrelated words. Using the SVM, Bae and Luck³⁵ decoded 16 directions of stimuli from ERP responses recorded from participants who performed a direction judgment task for a visual stimulus (teardrop shape). Furthermore, Salehzadeh et al.³⁶ decoded 12 categories of finger-numerical configurations (i.e., positioning fingers in relation to numerical concepts or counting) from ERP responses. In line with these studies, the present study attempted to decode the four pitch categories of omitted notes from OSPs. If the predictive information about the identity of the note was contained in the OSP, the decoding accuracy would be higher in the familiar melody condition than in the unfamiliar melody condition.

Figure 1 shows the OSP waveforms and the accuracy of decoding the identity of the omitted notes. The oN1 amplitude was significantly larger in the familiar melody condition (M = − 1.95 µV, SD = 1.17 µV) than in the unfamiliar melody condition (M = − 1.37 µV, SD = 0.98 µV), t(24) = − 3.67, p = .001, dz = − 0.73, BF₁₀ = 29.41. The ERP amplitude in the MMN interval was significantly negative in the familiar melody condition (M = − 1.87 µV, SD = 1.19 µV), t(24) = − 7.84, p < .001, dz = − 1.57, BF₁₀ = 315656.44, but not in the unfamiliar melody condition (M = − 0.13 µV, SD = 0.67 µV), t(24) = − 0.95, p = .352, dz = − 0.19, BF₁₀ = 0.32. The oMMN amplitude was significantly larger in the familiar melody condition than in the unfamiliar melody condition, t(24) = − 6.98, p < .001, dz = − 1.40, BF₁₀ = 50471.61.

Decoding accuracy was higher in the familiar melody condition than in the unfamiliar melody condition. The cluster-based permutation test revealed a significant difference in the decoding accuracy between the familiar and unfamiliar melody conditions in the 58–83 ms interval, t_sum = 87.11, p < .001. The mean accuracy of this interval was above chance level in the familiar melody condition (M = 30.2%, SD = 8.2), t(23) = 3.10, p = .005, dz = 0.63, BF₁₀ = 8.67, but not in the unfamiliar melody condition (M = 24.0%, SD = 7.0), t(23) = − 0.72, p = .479, dz = − 0.15, BF₁₀ = 0.27. The results of the correlation analysis between OSP amplitudes, decoding accuracy, and participants’ musical abilities are reported in the supplementary information.
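As a reading aid, the reported effect sizes can be checked against the reported t values. For a paired or one-sample t-test, Cohen's d_z and t are related by t = d_z × √n. The following minimal sketch applies this relation to three of the tests reported above, assuming n = 25 for the ERP amplitudes and n = 24 for the decoding analysis. The function name `t_from_dz` is illustrative and not part of the original analysis; small discrepancies are expected because the reported values are rounded.

```python
from math import sqrt

def t_from_dz(dz, n):
    """t statistic implied by Cohen's d_z for a paired or one-sample t-test."""
    return dz * sqrt(n)

# (label, reported d_z, sample size, reported t), copied from the Results text
reported = [
    ("oN1, familiar vs. unfamiliar", -0.73, 25, -3.67),
    ("oMMN, familiar vs. unfamiliar", -1.40, 25, -6.98),
    ("decoding vs. chance, familiar", 0.63, 24, 3.10),
]

for label, dz, n, t_reported in reported:
    implied = t_from_dz(dz, n)
    print(f"{label}: implied t = {implied:.2f}, reported t = {t_reported:.2f}")
```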

Figure 1. Omitted stimulus potentials and decoding accuracy

Note. The upper left panel shows the ERP waveforms (means of the four frontal-central electrodes: Fz, FC1, FC2, and Cz) with 95% confidence intervals (CIs) during the omission of notes in the familiar and unfamiliar melody conditions. The scalp topographies and amplitudes of the oN1 response (99–119 ms) are also shown. The upper right panel shows the difference waveforms (i.e., unexpected omission − expected omission) in the familiar and unfamiliar melody conditions. The scalp topographies and amplitudes of the oMMN response (90–130 ms) are also shown. The lower left panel shows the four omitted notes. The lower right panel shows the decoding accuracy with 95% CI in the whole period and the mean accuracy in the 58–83 ms interval. Large dots and error bars in the raincloud plots indicate the mean ERP amplitudes and accuracy across participants and their 95% CIs.

The present study investigated the specificity of the predictive information generated according to the predictability of omitted notes in the musical context. The predictability of the notes was manipulated by melody familiarity, and sensory-evoked responses were eliminated by recording OSPs elicited by omitted target notes. Consistent with the predictive coding framework, unexpected omissions in the familiar melody condition elicited larger oN1 and oMMN responses than those in the unfamiliar melody condition. These results suggest that the predictability of the omitted notes was reflected in the OSPs during passive listening. Moreover, the decoding accuracy of the omitted notes was significantly higher in the familiar melody condition than in the unfamiliar melody condition. Thus, the present study suggests that the more predictable the notes, the more specific the predictive representation.

Previous studies have reported that a larger oN1 response occurs in predictable contexts than in unpredictable contexts^27,29,33. This larger neural response is consistent with the concept of precision-weighted prediction error^37–41. Precision is the inverse of the variance of a (probabilistic) distribution and reflects certainty about a variable such as sensory input^39,40. The higher the precision (i.e., high predictability and low uncertainty), the higher the sensitivity and the higher the gain of sensory input^37,38. In the present study, the uncertainty of familiar melodies was lower than that of unfamiliar melodies because the memory representation of the melody facilitates the generation of the prediction. Thus, the occurrence of the unexpected omission may be more salient in the familiar melody than in the unfamiliar melody, where the occurrence of the note is ambiguous, and this would result in a larger oN1 response in the familiar melody condition.

The oN1 was elicited even in the unfamiliar melody condition. This may be because the expectation that any melody would continue was violated by the omission. Dercksen et al.³² reported that the oN1 occurred when the timing of a stimulus was predictable but its content was not. In the present study, consistent with their findings, omissions in the unfamiliar melody condition elicited an oN1 when the omission occurred in a continuous melodic context. This temporal prediction should be inherent in musical materials. A sense of beat and rhythm may facilitate temporal prediction and reduce latency jitter, which can otherwise prevent the stable recording of early OSPs⁴¹. Thus, by using an ecologically valid stimulus with clear timing information, the present study shows that the oN1 occurs even in a situation of “I don’t know what the upcoming stimulus is, but some stimulus is coming in this time sequence.”

Like the oN1 response, the oMMN response was larger in the familiar melody condition than in the unfamiliar melody condition. Several studies have calculated the oMMN by subtracting the ERP response in the tone from that in the omission^30,32. However, processing an omission without sensory input and processing a tone with sensory input are qualitatively different, so interpreting the difference between tone and omission is difficult. In the present study, similar to Prete et al.³¹, the manifestation of prediction error was extracted as the difference between expected omissions (i.e., pauses in the melody) and unexpected omissions. Even with the improvement of the subtraction method, the results of the present study were consistent with those of Bendixen et al.³⁰, who demonstrated that the oMMN response to omitted syllables was larger for predictable words than for unpredictable words. These findings suggest that the oMMN reflects predictability in an ecologically valid auditory context.

The present results may reflect different functions of the oN1 and the oMMN. The auditory N1 is considered a transient neural response of the cortical system that monitors changes in auditory input⁴³. Thus, the current oN1 may reflect the detection of changes in sensory input, such as an omission in a continuous melody. The oMMN in the present study was extracted as the difference between the unexpected omission (deviant) and the expected omission (standard), as described by Prete et al.³¹. This subtraction is similar to the method used in previous studies using a self-stimulation task, in which a stimulus expected after the participant pressed the button was unexpectedly omitted. The oN1 was extracted by subtracting the ERP waveforms in a no-sound motor control condition from those in the omission condition^26,27,33. Thus, the oN1 in the previous studies and the oMMN in the present study may be a similar type of OSP. In the unfamiliar melody condition, the oN1 was observed, while the oMMN was seldom observed. This may be because the timing of the pauses (i.e., the expected omissions) was as unpredictable as the unexpected omissions in the unfamiliar melody condition. Therefore, in the present study, the oN1 seems to mostly reflect the prediction of timing, while the oMMN seems to mostly reflect the prediction of content.

The accuracy of decoding the omitted note identity was higher in the familiar melody condition than in the unfamiliar melody condition. The significant differences were found in an early latency range (58–83 ms). While the amplitudes of oN1 and oMMN reflect the predictability of melody notes, they do not directly reflect the specificity and clarity of the prediction of note identity based on familiarity. The SVM decoding of the omitted notes from OSPs allows for a more direct examination of the prediction of note identity than examining ERP amplitudes. The results support the notion that the OSPs reflect predictive signals containing specific information about the upcoming stimulus⁴⁴. SanMiguel et al.²⁸ suggested that prediction induced a sufficient sensory template for the expected sound, at least up to the oN1 latency range (56–112 ms). Bendixen et al.⁴⁵ also suggested that the brain is set up to process the expected tone by default and only interrupts processing when an omission is detected. The fact that the latency range with high decoding accuracy (i.e., 58–83 ms) differed from the latency range of oN1 and oMMN may indicate that the predictive representation was strongly retained before the omission was detected and the prediction error was elicited. These results suggest that when the predictability of the musical context is high, ERP responses during omissions contain information about more specific pitch expectations, at least immediately after the onset of the omission.

The decoding results should be interpreted carefully. First, the time intervals of the clusters identified by the cluster-based permutation test do not necessarily indicate the onset and offset points of the effects⁴⁶. Further research is needed to determine whether the familiarity effect on decoding accuracy is observed only before the elicitation of prediction errors or whether it also occurs in the oN1 and oMMN time windows. Second, decoding accuracy was above chance even for the unfamiliar melodies, especially after 100 ms. Although the melodies were unfamiliar, repeated exposure during the experiment might have led to the melodies being learned, creating dynamic expectations. Nonetheless, the observed familiarity effect may have arisen because familiarity had a stronger influence on predictions than such within-experiment learning.

In conclusion, the present study demonstrated that unexpected omissions in the familiar melody condition elicited a larger oN1 and oMMN than unexpected omissions in the unfamiliar melody condition. These findings suggest that the oN1 and oMMN reflect the predictability of the pitch in melody based on the melody’s familiarity. Moreover, the SVM successfully classified the identity of omitted notes, and the decoding accuracy was higher in the familiar melody condition than in the unfamiliar melody condition. These results provide evidence that the ERP during the omission contains distinguishable predictive information, and that the higher the predictability, the more specific the representation of the expected note it contains.

Ethics

The protocol of the present study was approved by the Behavioral Research Ethics Committee of the Osaka University School of Human Sciences, Japan (HB023-075) in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants. Participants received a cash voucher of 2,500 Japanese yen as an honorarium.

Participants

A sample size of 21 was predetermined to ensure the detection of a medium effect size (d_z = 0.66) for the difference in the oN1 amplitude between high and low predictability conditions. This effect size was calculated using data from SanMiguel et al.²⁷ and reported in Dercksen et al.³³; the calculation assumed power 1 − β = .80 and error rate α = .05 and was performed using G*Power⁴⁷. To allow for data exclusions, 25 participants were recruited. In the end, data from all 25 participants (16 women and 9 men, 19–35 years old, M = 22.0 years) were used for the analysis of the oN1 and oMMN. The decoding analysis was done with the data from 24 participants because one participant did not have a sufficient number of artifact-free trials for each note (fewer than 80%, or 50 trials); this participant was excluded to avoid the risk of overfitting. Twenty-three participants were right-handed, and two were left-handed (FLANDERS handedness questionnaire⁴⁸). None reported having hearing impairments or a history of neurological disease. All participants confirmed that they knew the four familiar melodies used in the experiment. The participants’ musical ability was evaluated using the Japanese Gold-MSI questionnaire⁴⁹ (the original is Müllensiefen et al.⁵⁰), which evaluates General Sophistication (M = 62.5, SD = 18.0) as well as subscales of Active Engagement (M = 30.0, SD = 8.3), Perceptual Abilities (M = 38.7, SD = 10.4), Musical Training (M = 20.9, SD = 10.2), Emotions (M = 29.1, SD = 7.3), and Singing Abilities (M = 24.3, SD = 8.7). Four participants self-reported having absolute pitch.
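For orientation, the order of magnitude of this sample size can be reproduced with a large-sample approximation, n ≈ ((z₁₋α/₂ + z₁₋β) / d_z)². The sketch below is not the G*Power computation used in the study: G*Power works with the noncentral t distribution, whereas this approximation uses normal quantiles and therefore comes out slightly below the reported 21. The function name `approx_paired_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def approx_paired_n(dz, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-tailed paired t-test.

    This ignores the heavier tails of the t distribution, so it tends to
    underestimate the exact (noncentral t) result for small samples.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)           # quantile for the target power
    return ceil(((z_alpha + z_beta) / dz) ** 2)

n_approx = approx_paired_n(0.66)
print(n_approx)  # 19, compared with 21 from the exact calculation in the text
```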

Stimuli and procedure

The stimuli used in this study are available at https://osf.io/4q7x6/. A sample stimulus is shown in Fig. 2. The familiar melodies were four famous Japanese songs used as teaching materials for music education in Japan. Table 1 shows the profile of each melody. Two melodies were in C major, and the other two were in F major and B♭ major. All melodies were played with a piano timbre and were in 4/4 time. The duration of each melody was 9.6–19.2 s (200 bpm). The quarter notes (300 ms) of E, F, A, and C following the quarter note of G were omitted with a probability of 50%. These four notes were called target notes, and their positions were called target positions. The unfamiliar melodies were created by shuffling the pause and note positions for each of the four familiar melodies, while keeping the target note positions the same as in the familiar melodies. All melodies were presented with an interstimulus interval of 600 ms. Although the familiar and unfamiliar melodies were different, both contained the same notes and the same target positions. Note that the same G note was presented before the omission in both types of melodies. Thus, the difference in omission responses between the familiar and unfamiliar melodies reflects the predictability of the melodies based on familiarity rather than the late ERP components elicited by the note preceding the omission.
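The timing values above follow directly from the tempo. As a small worked example, the sketch below converts 200 bpm into the quarter-note duration and counts the quarter-note beats in the shortest and longest melodies. The function names are illustrative, and the conversion to bars simply assumes four quarter-note beats per bar, as implied by the 4/4 time signature.

```python
def quarter_note_ms(bpm):
    """Duration of one quarter-note beat in milliseconds."""
    return 60_000 / bpm

def beats_in(duration_s, bpm):
    """Number of quarter-note beats that fit in duration_s seconds."""
    return round(duration_s * bpm / 60)

BPM = 200  # tempo reported in the text

print(quarter_note_ms(BPM))          # 300.0 ms per quarter note
for duration_s in (9.6, 19.2):       # shortest and longest melody durations
    beats = beats_in(duration_s, BPM)
    print(duration_s, beats, beats // 4)  # seconds, beats, bars of 4/4
```

Under these assumptions the 9.6 s melody spans 32 beats (8 bars) and the 19.2 s melodies span 64 beats (16 bars), and the 600 ms interstimulus interval corresponds to two beats.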

Prior to the EEG recording, the participants completed the FLANDERS questionnaire⁴⁸ and the Japanese Gold-MSI⁴⁹. The EEG recording consisted of four familiar melody blocks and four unfamiliar melody blocks. The order of the eight blocks was randomized. Four melodies were randomly presented 20 times each, resulting in a total of 60–70 presentations for each combination of G – E/F/A/C in the four blocks of familiar melodies and in the four blocks of unfamiliar melodies. Thus, the tone and omission conditions each consisted of 250 trials for each melody type (i.e., 60 + 60 + 60 + 70 = 250 trials). Participants were asked to ignore the melodies while watching a silent movie. Including the online questionnaire session, electrode preparation, and short breaks between blocks, the entire experiment took approximately 2.5 hours.
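The trial counts can be derived from the number of target positions per note listed in Table 1. The sketch below is a minimal illustration that assumes the 50% omission probability yields an exact half-and-half split between tone and omission trials; the variable and function names are hypothetical.

```python
# Target positions per note in one pass through the four melodies ("Sum" column of Table 1)
TARGET_POSITIONS = {"E": 6, "F": 6, "A": 6, "C": 7}
REPETITIONS = 20      # each melody was presented 20 times per melody type
OMISSION_PROB = 0.5   # assumption: exactly half of the targets were omitted

def omission_trials(positions, repetitions, p_omit):
    """Expected number of omission trials per note for one melody type."""
    return {note: round(n * repetitions * p_omit) for note, n in positions.items()}

per_note = omission_trials(TARGET_POSITIONS, REPETITIONS, OMISSION_PROB)
total = sum(per_note.values())
print(per_note)  # {'E': 60, 'F': 60, 'A': 60, 'C': 70}
print(total)     # 250
```

The same counts apply to the tone trials, which is why each of the two conditions contains 250 trials per melody type.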

Figure 2. Samples of familiar and unfamiliar melodies

Note. The upper part shows the familiar melody “Harugakita,” and the lower part shows its unfamiliar version. As indicated by the gray arrows, the unfamiliar version was created by shuffling the positions of pauses and notes of the corresponding familiar melody while keeping the positions of the target notes the same as in the original familiar melody. The tone and omission conditions (50% each) were manipulated at the positions of the orange notes in the familiar melody and the blue notes in the unfamiliar melody. In both types of melodies, the quarter notes E, F, A, and C are omitted after the quarter note G, which is colored green.

Table 1

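*Attributes of each familiar melody and the numbers of notes in the target positions*

| Song title | Momiji | Harugakita | Harunoogawa | Yuuyakekoyake | Sum |
|---|---|---|---|---|---|
| Note identity: E | | | 6 | | 6 |
| Note identity: F | 1 | | | 5 | 6 |
| Note identity: A | 2 | | 4 | | 6 |
| Note identity: C | 2 | 2 | 3 | | 7 |
| Duration (s) | 19.2 | 9.6 | 19.2 | 19.2 | 67.2 |
| Key | F major | C major | C major | B♭ major | |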

EEG recording

EEG data were recorded using QuickAmp (Brain Products, Germany) with Ag/AgCl electrodes. Thirty-four scalp electrodes were placed according to the 10–20 system (Fp1/2, F3/4, F7/8, Fz, FC1/2, FC5/6, FT9/10, C3/4, T7/8, Cz, CP1/2, CP5/6, TP9/10, P3/4, P7/8, Pz, O1/2, Iz, PO9/10). Additional electrodes were placed on the left and right mastoids, the left and right outer canthi of the eyes, and above and below the right eye. The data were referenced offline to the nose-tip electrode. The sampling rate was 1,000 Hz. The online filter was DC–200 Hz. Electrode impedances were kept below 10 kΩ.

EEG data reduction

EEG data were analyzed using EEGLAB (Delorme & Makeig⁵¹; Version 2023.1) on MATLAB R2022b (The MathWorks Inc., Natick, MA). First, a digital filter of 0.5–25 Hz was applied to the data (Dercksen et al.³³; SanMiguel et al.²⁸). Ocular artifact correction based on independent component analysis was then applied. Epochs of 500 ms (200 ms before and 300 ms after the target position) were averaged after removing trials with voltages exceeding ± 80 µV at any channel. Baseline correction was applied by subtracting the mean amplitude of the 200 ms prestimulus period from each point of the waveform. For statistical analysis, the frontocentral electrodes (Fz, Cz, FC1, FC2) were clustered, and the mean ERP waveform of these electrodes was calculated.
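The epoch-level steps described above (artifact rejection at ± 80 µV, baseline correction over the 200 ms prestimulus period, and averaging) can be sketched as follows. This is an illustrative pure-Python outline, not the MATLAB pipeline used in the study. It assumes each epoch is a list of channels, each channel a list of voltages in µV sampled at 1,000 Hz and starting 200 ms before the target position, and it assumes that rejection is applied before baseline correction, an order the text does not specify. All function names are hypothetical.

```python
from statistics import fmean

SFREQ_HZ = 1000    # sampling rate reported in the text
BASELINE_MS = 200  # prestimulus baseline reported in the text
REJECT_UV = 80.0   # rejection threshold reported in the text

def is_clean(epoch, threshold_uv=REJECT_UV):
    """True if no sample on any channel exceeds +/- threshold_uv."""
    return all(abs(v) <= threshold_uv for channel in epoch for v in channel)

def baseline_correct(epoch, n_baseline):
    """Subtract each channel's mean over its first n_baseline samples."""
    corrected = []
    for channel in epoch:
        baseline = fmean(channel[:n_baseline])
        corrected.append([v - baseline for v in channel])
    return corrected

def average_epochs(epochs, sfreq_hz=SFREQ_HZ, baseline_ms=BASELINE_MS):
    """Return (ERP, number of epochs kept); the ERP is channels x samples."""
    n_baseline = baseline_ms * sfreq_hz // 1000
    # Assumption: reject on the uncorrected epoch, then baseline-correct
    kept = [baseline_correct(e, n_baseline) for e in epochs if is_clean(e)]
    if not kept:
        raise ValueError("no artifact-free epochs")
    n_channels = len(kept[0])
    erp = [
        [fmean(samples) for samples in zip(*(e[ch] for e in kept))]
        for ch in range(n_channels)
    ]
    return erp, len(kept)
```

Returning the number of retained epochs alongside the ERP makes it easy to report how many trials contributed to each average.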

The grand mean waveforms of the omissions in the familiar and unfamiliar melody conditions were averaged (averaged grand mean waveforms). Then, the peak of oN1 (110 ms) was detected in the interval of 50–110 ms, and the interval ± 10 ms (i.e., 100–120 ms) from the peak was defined as the oN1 interval. The 50–110 ms interval was determined on the basis of previous studies (Dercksen et al.³³: 42–92 ms; van Laarhoven et al.²⁹: 45–80 ms). On average, 240 (126–250) and 246 (222–250) epochs were used to calculate the oN1 amplitudes of familiar and unfamiliar melody conditions, respectively.

To extract the oMMN, the difference waveform was calculated by subtracting the ERP during a pause (i.e., expected omission) from the ERP during an unexpected omission in the familiar and unfamiliar melody conditions. Notes before the pause varied in both the familiar and unfamiliar melodies. This subtraction method is valid because the pause can be considered a highly predictable omission. Prete et al.³¹ used a similar method of subtracting the expected omission response from the unexpected omission response. In the present study, 343 (180–360) and 351 (307–360) epochs were used as expected omissions (pauses) for familiar and unfamiliar melody conditions, respectively. The grand mean difference waveforms of the familiar and unfamiliar melody conditions were averaged (i.e., the average of grand mean difference waveforms of familiar and unfamiliar melody conditions). Then, the peak of oMMN (110 ms) was detected in the interval of 100–200 ms, and the interval ± 20 ms (i.e., 90–130 ms) from the peak was defined as the oMMN interval. Thus, the mean oMMN amplitude of the 90–130 ms interval was calculated separately for familiar and unfamiliar melody conditions.
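The window-selection procedure used for both components (collapse the two conditions, find the negative peak in a search interval, then take a symmetric window around it) can be sketched as below. The waveforms here are synthetic and serve only to make the example runnable; the function names are hypothetical, and defining the peak as the most negative sample in the search interval is an assumption about how the peak was detected.

```python
from math import exp
from statistics import fmean

def collapse(wave_a, wave_b):
    """Point-by-point average of two grand-mean waveforms."""
    return [(a + b) / 2 for a, b in zip(wave_a, wave_b)]

def negative_peak_window(wave, times_ms, search_ms, half_width_ms):
    """Locate the most negative sample in search_ms; return (peak, window)."""
    lo, hi = search_ms
    candidates = [(v, t) for v, t in zip(wave, times_ms) if lo <= t <= hi]
    _, peak_ms = min(candidates)
    return peak_ms, (peak_ms - half_width_ms, peak_ms + half_width_ms)

def mean_amplitude(wave, times_ms, window_ms):
    """Mean voltage inside window_ms (inclusive bounds)."""
    lo, hi = window_ms
    return fmean([v for v, t in zip(wave, times_ms) if lo <= t <= hi])

# Synthetic difference waveforms (in µV), for illustration only
times_ms = list(range(-200, 301))
familiar = [-2.0 * exp(-(((t - 110) / 25) ** 2)) for t in times_ms]
unfamiliar = [-0.5 * exp(-(((t - 110) / 25) ** 2)) for t in times_ms]

# oMMN-style settings from the text: search 100-200 ms, window of +/- 20 ms
peak_ms, window = negative_peak_window(
    collapse(familiar, unfamiliar), times_ms, (100, 200), 20
)
print(peak_ms, window)
print(mean_amplitude(familiar, times_ms, window),
      mean_amplitude(unfamiliar, times_ms, window))
```

Selecting the window on the collapsed waveform and then measuring each condition separately keeps the window choice independent of the condition difference being tested.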

Decoding

Decoding was performed using the ERPLAB Toolbox (Lopez-Calderon & Luck⁵²; Version 10.02). The classification method was One-vs-Rest. The decoding method used in this study was similar to that of Bae and Luck³⁵, who performed a participant-based approach using the SVM. The SVM was run separately on familiar and unfamiliar omission ERP waveforms for each participant at each time point. Voltages from 34 scalp electrodes were used as feature values. Threefold cross-validation was conducted at each time point to assess the generalizability of the model. In the threefold cross-validation, all trials of each note were randomly divided into three blocks. Two of the three blocks were used for training, and the remaining block was used for testing the classifier to calculate decoding accuracy. This process was repeated three times until all three blocks were used as the test block. The averaged decoding accuracy over the three test datasets was then calculated. For each time point, threefold cross-validation was repeated 20 times (iterations), and the averaged decoding accuracy was calculated. Decoding was performed over the full epoch, from − 200 to 300 ms relative to the onset of the omission.
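The cross-validation scheme described above can be outlined in a few lines. The sketch below is not the ERPLAB implementation: to stay self-contained it substitutes a simple nearest-centroid classifier for the SVM, and it classifies single trials directly. It only illustrates the splitting logic for one time point (three random blocks per note, each block used once as the test set, repeated over 20 iterations). All function names are hypothetical.

```python
import random
from statistics import fmean

def centroid(rows):
    """Mean feature vector of a list of trials."""
    return [fmean(col) for col in zip(*rows)]

def predict(centroids, x):
    """Label of the nearest class centroid (stand-in for the SVM)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def threefold_accuracy(trials_by_note, rng, n_folds=3):
    """One round of n-fold cross-validation at a single time point."""
    folds = {}
    for note, trials in trials_by_note.items():
        shuffled = trials[:]
        rng.shuffle(shuffled)  # random assignment of trials to blocks
        folds[note] = [shuffled[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for test_idx in range(n_folds):
        centroids = {
            note: centroid([x for i, block in enumerate(blocks)
                            if i != test_idx for x in block])
            for note, blocks in folds.items()
        }
        hits = [predict(centroids, x) == note
                for note, blocks in folds.items() for x in blocks[test_idx]]
        accuracies.append(sum(hits) / len(hits))
    return fmean(accuracies)

def decode_time_point(trials_by_note, n_iterations=20, seed=0):
    """Average accuracy over repeated cross-validation with new random splits."""
    rng = random.Random(seed)
    return fmean([threefold_accuracy(trials_by_note, rng)
                  for _ in range(n_iterations)])
```

With four note categories, chance level is 25%. In the study, each feature vector would hold the voltages of the 34 scalp electrodes at one time point, and the function would be applied at every time point for each participant and melody type.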

Statistical analysis

Statistical analyses were performed using JASP 0.17.1⁵³. To examine the difference in oN1 amplitude between familiar and unfamiliar melody conditions, a two-tailed paired t-test was conducted on the mean ERP amplitude of the oN1 interval. A Bayesian paired t-test was then performed to assess the evidence for the absence (effect size δ = 0, null hypothesis) or presence (effect size δ > 0, alternative hypothesis) of the difference. To examine the difference in oMMN amplitude between familiar and unfamiliar melody conditions, a two-tailed paired t-test was conducted on the mean ERP amplitude of the oMMN interval. A Bayesian paired t-test was also performed on the oMMN amplitude. The difference in decoding accuracy of the full ERP range (− 200–300 ms) between the familiar and unfamiliar melody conditions was tested using the cluster-based permutation test^54,55. The number of iterations was 10,000. For frequentist hypothesis testing, the significance levels were set at α = .05. For Bayesian hypothesis testing, the Cauchy distribution with a scale parameter r of 0.707 was used as the prior distribution for δ in the t-test. According to the classification scheme of Schönbrodt and Wagenmakers⁵⁶, a Bayes factor (BF₀₁) greater than 3 was considered moderate evidence for the null hypothesis. The stimulus materials and the data necessary to replicate the statistical results are available at https://osf.io/4q7x6/.
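The logic of a cluster-based permutation test on paired time courses can be sketched as follows. This is a generic sign-flipping, cluster-mass variant written for illustration. It is an assumption that it resembles the procedure used here, since the text does not describe how clusters were formed or how the permutations were generated. The cluster-forming threshold `t_crit` must be supplied by the caller (for example, about 2.069 for a two-tailed α = .05 with 23 degrees of freedom), and all function names are hypothetical.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def t_values(diffs):
    """Pointwise one-sample t on per-participant condition differences."""
    n = len(diffs)
    out = []
    for col in zip(*diffs):
        sd = stdev(col)
        out.append(fmean(col) / (sd / sqrt(n)) if sd > 0 else 0.0)
    return out

def cluster_masses(tvals, t_crit):
    """Sum t over runs of adjacent same-sign supra-threshold time points."""
    masses, run, sign = [], 0.0, 0
    for t in tvals:
        s = (t > t_crit) - (t < -t_crit)
        if s != 0 and s == sign:
            run += t
        else:
            if sign != 0:
                masses.append(run)
            run, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        masses.append(run)
    return masses

def cluster_permutation_test(diffs, t_crit, n_perm=10_000, seed=0):
    """Return [(cluster mass, p)] using a max-cluster-mass null distribution."""
    rng = random.Random(seed)
    observed = cluster_masses(t_values(diffs), t_crit)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            s = rng.choice((-1, 1))  # randomly swap the two conditions
            flipped.append([v * s for v in row])
        masses = cluster_masses(t_values(flipped), t_crit)
        null_max.append(max((abs(m) for m in masses), default=0.0))
    return [(m, sum(x >= abs(m) for x in null_max) / n_perm) for m in observed]
```

Here `diffs` holds one row per participant with the familiar-minus-unfamiliar decoding accuracy at each time point. Because significance is assessed at the level of whole clusters, the temporal extent of a significant cluster should not be read as the precise onset and offset of the effect.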

Acknowledgements:

This work was supported by JSPS KAKENHI JP22KJ2199.

Competing interests:

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authorship contributions:

Conceptualization, K.I., T.I., and H.N.; Methodology, K.I., T.I., and H.N.; Analysis, K.I. and T.I.; Investigation, K.I., T.I., and H.N.; Resources, K.I. and H.N.; Writing – Original Draft, K.I.; Writing – Review and Editing, K.I., T.I., and H.N.; Visualization, K.I. and T.I.; Supervision, H.N.; Project Administration, K.I.

Data and material availability:

The sound materials used and datasets analyzed for the present paper are available at https://osf.io/4q7x6/.

Bigand, E., Poulin, B., Tillmann, B., Madurell, F., & D’Adamo, D. A. Sensory versus cognitive components in harmonic priming. J. Exp. Psychol. Hum. Percept. Perform. 29, 159–171. https://doi.org/10.1037/0096-1523.29.1.159 (2003).
Marmel, F., Tillmann, B., & Dowling, W. J. Tonal expectations influence pitch perception. Percept. Psychophys.70, 841–852. https://doi.org/10.3758/PP.70.5.841 (2008).
Sears, D. R. W., Pearce, M. T., Spitzer, J., Caplin, W. E., & McAdams, S. Expectations for tonal cadences: Sensory and cognitive priming effects. Q. J. Exp. Psychol. (Hov). 72, 1422–1438. https://doi.org/10.1177/1747021818814472 (2019).
Wall, L., Lieck, R., Neuwirth, M., & Rohrmeier, M. The Impact of Voice Leading and Harmony on Musical Expectancy. Sci. Rep.10, 5933. https://doi.org/10.1038/s41598-020-61645-4 (2020).
Janata, P. ERP measures assay the degree of expectancy violation of harmonic contexts in music. J. Cogn. Neurosci.7, 153–164. https://doi.org/10.1162/jocn.1995.7.2.153 (1995).
Koelsch, S., Gunter, T., Friederici, A. D., & Schröger, E. Brain indices of music processing: “Nonmusicians” are musical. J. Cogn. Neurosci.12, 520–541. https://doi.org/10.1162/089892900562183 (2000).
Patel, A. D., Gibson, E., Ratner, J., Besson, M., & Holcomb, P. J. Processing syntactic relations in language and music: An event-related potential study. J. Cogn. Neurosci.10, 717–733. https://doi.org/10.1162/089892998563121　(1998).
Seger, C. A. et al. Corticostriatal contributions to musical expectancy perception. J. Cogn. Neurosci.25, 1062–1077. https://doi.org/10.1162/jocn (2013).
Koelsch, S., Vuust, P., & Friston, K. Predictive Processes and the Peculiar Case of Music. Trends Cogn. Sci.23, 63–77. https://doi.org/10.1016/j.tics.2018.10.006 (2019).
Rohrmeier, M. A., & Koelsch, S. Predictive information processing in music cognition. A critical review. Int. J. Psychophysiol.83, 164–175. https://doi.org/10.1016/j.ijpsycho.2011.12.010 (2012).
Vuust, P., Heggli, O. A., Friston, K. J., & Kringelbach, M. L. Music in the brain. Nat. Rev. Neurosci.23, 287–305. https://doi.org/10.1038/s41583-022-00578-5 (2022).
Bharucha, J., & Krumhansl, C. L. The representation of harmonic structure in music: Hierarchies of stability as a function of context. Cognition13, 63–102. https://doi.org/10.1016/0010-0277(83)90003-3 (1983).
Vuust, P., Ostergaard, L., Pallesen, K. J., Bailey, C., & Roepstorff, A. Predictive coding of music - Brain responses to rhythmic incongruity. Cortex45, 80–92. https://doi.org/10.1016/j.cortex.2008.05.014 (2009).
Vuust, P., & Witek, M. A. G. Rhythmic complexity and predictive coding: A novel approach to modeling rhythm and meter perception in music. Front. Psychol.5, 1111. https://doi.org/10.3389/fpsyg.2014.01111 (2014).
Friston, K., & Kiebel, S. Predictive coding under the free-energy principle. Philos. Trans. R. Soc. Lond., B, Biol. Sci.364, 1211–1221. https://doi.org/10.1098/rstb.2008.0300 (2009).
Garrido, M. I., Kilner, J. M., Stephan, K. E., & Friston, K. J. The mismatch negativity: A review of underlying mechanisms. Clin. Neurophysiol.120, 453–463. https://doi.org/10.1016/j.clinph.2008.11.029 (2009).
Todd, J., & Robinson, J. The use of conditional inference to reduce prediction error-A mismatch negativity (MMN) study. Neuropsychologia48, 3009–3018. https://doi.org/10.1016/j.neuropsychologia.2010.06.009 (2010).
Wacongne, C. et al. Evidence for a hierarchy of predictions and prediction errors in human cortex. Proc. Natl. Acad. Sci. U S A.108, 20754–20759. https://doi.org/10.1073/pnas.1117807108 (2011).
Winkler, I., & Czigler, I. Evidence from auditory and visual event-related potential (ERP) studies of deviance detection (MMN and vMMN) linking predictive coding theories and perceptual object representations. Int. J. Psychophysiol.83, 132–143. https://doi.org/10.1016/j.ijpsycho.2011.10.001 (2012).
Näätänen, R., Paavilainen, P., Rinne, T., & Alho, K. The mismatch negativity (MMN) in basic research of central auditory processing: A review. Clin. Neurophysiol.118, 2544–2590. https://doi.org/10.1016/j.clinph.2007.04.026 (2007).
Näätänen, R., Jacobsen, T., & Winkler, I. Memory-based or afferent processes in mismatch negativity (MMN): A review of the evidence. Psychophysiology42, 25–32. https://doi.org/10.1111/j.1469-8986.2005.00256.x (2005).
Picton, T. W., Alain, C., Otten, L., Ritter, W., & Achim, A. Mismatch negativity: different water in the same river. Audiol. Neurootol.5, 111–139. https://doi.org/10.1159/isbn.978-3-318-00601-8 (2000).
Mencke, I. et al. Prediction under uncertainty: Dissociating sensory from cognitive expectations in highly uncertain musical contexts. Brain Res.1773, 147664. https://doi.org/10.1016/j.brainres.2021.147664 (2021).
Quiroga-Martinez, D. R. et al. Musical prediction error responses similarly reduced by predictive uncertainty in musicians and non-musicians. Eur. J. Neurosci.51, 2250–2269. https://doi.org/10.1111/ejn.14667 (2020).
Kujala, T., Tervaniemi, M., & Schröger, E. The mismatch negativity in cognitive and clinical neuroscience: Theoretical and methodological considerations. Biol. Psychol.74, 1–19. https://doi.org/10.1016/j.biopsycho.2006.06.001 (2007).
Ishida, T., & Nittono, H. Effects of sensory modality and task relevance on omitted stimulus potentials. Exp. Brain Res.242, 47–57. https://doi.org/10.1007/s00221-023-06726-2 (2023).
SanMiguel, I., Saupe, K., & Schröger, E. I know what is missing here: Electrophysiological prediction error signals elicited by omissions of predicted “what” but not “when.” Front. Hum. Neurosci.7, 407. https://doi.org/10.3389/fnhum.2013.00407 (2013).
SanMiguel, I., Widmann, A., Bendixen, A., Trujillo-Barreto, N., & Schröger, E. Hearing silences: Human auditory processing relies on preactivation of sound-specific brain activity patterns. J. Neurosci.33, 8633–8639. https://doi.org/10.1523/JNEUROSCI.5821-12.2013 (2013).
van Laarhoven, T., Stekelenburg, J. J., & Vroomen, J. Temporal and identity prediction in visual-auditory events: Electrophysiological evidence from stimulus omissions. Brain Res.1661, 79–87. https://doi.org/10.1016/j.brainres.2017.02.014 (2017).
Bendixen, A., Scharinger, M., Strauß, A., & Obleser, J. Prediction in the service of comprehension: Modulated early brain responses to omitted speech segments. Cortex53, 9–26. https://doi.org/10.1016/j.cortex.2014.01.001 (2014).
Prete, D. A., Heikoop, D., McGillivray, J. E., Reilly, J. P., & Trainor, L. J. The sound of silence: Predictive error responses to unexpected sound omission in adults. Eur. J. Neurosci.55, 1972–1985. https://doi.org/10.1111/ejn.15660 (2022).
Salisbury, D. F. Finding the missing stimulus mismatch negativity (MMN): Emitted MMN to violations of an auditory gestalt. Psychophysiology49, 544–548. https://doi.org/10.1111/j.1469-8986.2011.01336.x (2012).
Dercksen, T. T., Widmann, A., Schröger, E., & Wetzel, N. Omission related brain responses reflect specific and unspecific action-effect couplings. NeuroImage215, 116840. https://doi.org/10.1016/j.neuroimage.2020.116840 (2020).
Trammel, T., Khodayari, N., Luck, S. J., Traxler, M. J., & Swaab, T. Y. Decoding semantic relatedness and prediction from EEG: A classification method comparison. NeuroImage277, 120268. https://doi.org/10.1016/j.neuroimage.2023.120268 (2023).
Bae, G. Y., & Luck, S. J. Dissociable decoding of spatial attention and working memory from EEG oscillations and sustained potentials. J. Neurosci.38, 409–422. https://doi.org/10.1523/JNEUROSCI.2860-17.2017 (2018).
Salehzadeh, R., Rivera, B., Man, K., Jalili, N., & Soylu, F. EEG decoding of finger numeral configurations with machine learning. J. Numer. Cogn.9, 206–221. https://doi.org/10.5964/jnc.10441 (2023).
Arnal, L. H., & Giraud, A. L. Cortical oscillations and sensory predictions. Trends Cogn. Sci.16, 390–398. https://doi.org/10.1016/j.tics.2012.05.003 (2012).
Barascud, N., Pearce, M. T., Griffiths, T. D., Friston, K. J., & Chait, M. Brain responses in humans reveal ideal observer-like sensitivity to complex acoustic patterns. Proc. Natl. Acad. Sci. U S A.113, 616–625. https://doi.org/10.1073/pnas.1508523113 (2016).
Feldman, H., & Friston, K. J. Attention, uncertainty, and free-energy. Front. Hum. Neurosci.4, 215. https://doi.org/10.3389/fnhum.2010.00215 (2010).
Friston, K. The free-energy principle: a rough guide to the brain? Trends Cogn. Sci.13, 293–301. https://doi.org/10.1016/j.tics.2009.04.005 (2009).
Hsu, Y. F., & Hämäläinen, J. A. Both contextual regularity and selective attention affect the reduction of precision-weighted prediction errors but in distinct manners. Psychophysiology58, e13753. https://doi.org/10.1111/psyp.13753 (2021).
Jongsma, M. L. A., Quiroga, R. Q., & Van Rijn, C. M. Rhythmic training decreases latency-jitter of omission evoked potentials (OEPs) in humans. Neurosci. Lett.355, 189–192. https://doi.org/10.1016/j.neulet.2003.10.070 (2004).
Näätänen, R., & Picton, T. The N1 Wave of the Human Electric and Magnetic Response to Sound: A Review and an Analysis of the Component Structure. Psychophysiology24, 375–425. https://doi.org/10.1111/j.1469-8986.1987.tb00311.x (1987).
Bendixen, A., SanMiguel, I., & Schröger, E. Early electrophysiological indicators for predictive processing in audition: A review. Int. J. Psychophysiol.83, 120–131. https://doi.org/10.1016/j.ijpsycho.2011.08.003 (2012).
Bendixen, A., Schröger, E., & Winkler, I. I heard that coming: Event-related potential evidence for stimulus-driven prediction in the auditory system. J. Neurosci.29, 8447–8451. https://doi.org/10.1523/JNEUROSCI.1493-09.2009 (2009).
Sassenhagen, J., & Draschkow, D. Cluster-based permutation tests of MEG/EEG data do not establish significance of effect latency or location. Psychophysiology56, e13335. https://doi.org/10.1111/psyp.13335 (2019).
Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods39, 175–191. https://doi.org/10.3758/bf03193146 (2007).
Okubo, M., Suzuki, H., & Nicholls, M. E. R. A Japanese version of the FLANDERS handedness questionnaire. Shinrigaku Kenkyu85, 474–481. https://doi.org/10.4992/jjpsy.85.13235 (2014).
Sadakata, M. et al. The Japanese translation of the Gold-MSI: Adaptation and validation of the self-report questionnaire of musical sophistication. Musicae Sci.27, 798–810. https://doi.org/10.1177/10298649221110089 (2023).
Müllensiefen, D., Gingras, B., Musil, J., & Stewart, L. The musicality of non-musicians: An index for assessing musical sophistication in the general population. PLOS ONE9, e89642. https://doi.org/10.1371/journal.pone.0089642 (2014).
Delorme, A., & Makeig, S. EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods134, 9–21. https://doi.org/10.1016/j.jneumeth.2003.10.009 (2004).
Lopez-Calderon, J., & Luck, S. J. ERPLAB: An open-source toolbox for the analysis of event-related potentials. Front. Hum. Neurosci.8, 213. https://doi.org/10.3389/fnhum.2014.00213 (2014).
JASP Team. JASP (Version 0.17.1) [Computer software]. https://jasp-stats.org/faq/how-do-i-cite-jasp/ (2023).
Bae, G. Y., & Luck, S. J. Appropriate correction for multiple comparisons in decoding of ERP data: A re-analysis of Bae & Luck (2018). BioRxiv 672741. https://doi.org/10.1101/672741 (2019).
Maris, E., & Oostenveld, R. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods164, 177–190. https://doi.org/10.1016/j.jneumeth.2007.03.024 (2007).
Schönbrodt, F. D., & Wagenmakers, E. J. Bayes factor design analysis: Planning for compelling evidence. Psychon. Bull. Rev.25, 128–142. https://doi.org/10.3758/s13423-017-1230-y (2018).

No competing interests reported.

2024SciRepDecodeOmissionSupplementary.pdf


Editorial decision: Revision requested
20 Mar, 2024
Reviews received at journal
06 Feb, 2024
Reviewers agreed at journal
05 Feb, 2024
Reviewers invited by journal
03 Feb, 2024
Editor assigned by journal
25 Jan, 2024
Editor invited by journal
25 Jan, 2024
Submission checks completed at journal
25 Jan, 2024
First submitted to journal
22 Jan, 2024
