The present study investigated how the specificity of predictive information varies with the predictability of omitted notes in a musical context. The predictability of the notes was manipulated through melody familiarity, and sensory-evoked responses were eliminated by recording OSPs elicited by omitted target notes. Consistent with the predictive coding framework, unexpected omissions in the familiar melody condition elicited larger oN1 and oMMN responses than those in the unfamiliar melody condition. These results suggest that the predictability of the omitted notes was reflected in the OSPs during passive listening. Moreover, the decoding accuracy of the omitted notes was significantly higher in the familiar melody condition than in the unfamiliar melody condition. Thus, the present study suggests that the more predictable the notes, the more specific the predictive representation.
Previous studies have reported that a larger oN1 response occurs in predictable contexts than in unpredictable contexts27,29,33. This larger neural response is consistent with the concept of precision-weighted prediction error37–41. Precision is the inverse of the variance of a (probabilistic) distribution and reflects certainty about a variable such as sensory input39,40. The higher the precision (i.e., high predictability and low uncertainty), the higher the sensitivity and the higher the gain of sensory input37,38. In the present study, the uncertainty of familiar melodies was lower than that of unfamiliar melodies because the memory representation of the melody facilitates the generation of the prediction. Thus, the occurrence of the unexpected omission may be more salient in the familiar melody than in the unfamiliar melody, where the occurrence of the note is ambiguous, and this would result in a larger oN1 response in the familiar melody condition.
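To make the precision-weighting argument concrete, the following Python sketch treats the prediction as a Gaussian and scales the prediction error by its precision. The Gaussian form, the function names, and the numerical values are illustrative assumptions, not quantities estimated in the present study.

```python
import numpy as np

def precision(sd):
    """Precision is the inverse of the variance of the predictive distribution."""
    return 1.0 / np.square(sd)

def precision_weighted_error(observed, predicted, sd):
    """Prediction error scaled by the precision of the prediction."""
    return precision(sd) * (observed - predicted)

# Hypothetical values: a note of unit intensity is predicted, but silence (0) arrives.
# Familiar melody: narrow predictive distribution (low uncertainty).
# Unfamiliar melody: broad predictive distribution (high uncertainty).
familiar_error = precision_weighted_error(observed=0.0, predicted=1.0, sd=0.5)
unfamiliar_error = precision_weighted_error(observed=0.0, predicted=1.0, sd=2.0)

print(abs(familiar_error), abs(unfamiliar_error))
```

Under these assumptions, the same omission yields a larger weighted error when the prediction is more precise, which is the direction of the oN1 difference between the familiar and unfamiliar melody conditions.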
The oN1 was elicited even in the unfamiliar melody condition. This may be because the expectation that any melody would continue was violated by the omission. Dercksen et al.32 reported that the oN1 occurred when the timing of a stimulus was predictable but its content was not. In the present study, consistent with their findings, omissions in the unfamiliar melody condition elicited an oN1 when the omission occurred in a continuous melodic context. This temporal prediction should be inherent in musical materials. A sense of beat and rhythm may facilitate temporal prediction and reduce latency jitter, which can otherwise prevent the stable recording of early OSPs41. Thus, by using an ecologically valid stimulus with clear timing information, the present study shows that the oN1 occurs even in the “I don’t know what the upcoming stimulus is, but some stimulus is coming in this time sequence” situation.
Like the oN1 response, the oMMN response was larger in the familiar melody condition than in the unfamiliar melody condition. Several studies have calculated the oMMN by subtracting the ERP response to the tone from the response to the omission30,32. However, processing an omission without sensory input and processing a tone with sensory input are qualitatively different, so the difference between tone and omission is difficult to interpret. In the present study, similar to Prete et al.31, the manifestation of prediction error was extracted as the difference between expected omissions (i.e., pauses in the melody) and unexpected omissions. Even with this modified subtraction method, the results of the present study were consistent with those of Bendixen et al.30, who demonstrated that the oMMN response to omitted syllables was larger for predictable words than for unpredictable words. These findings suggest that the oMMN reflects predictability in an ecologically valid auditory context.
The present results may reflect different functions of the oN1 and the oMMN. The auditory N1 is considered a transient neural response of the cortical system that monitors changes in auditory input43. Thus, the current oN1 may reflect the detection of changes in sensory input, such as an omission in a continuous melody. The oMMN in the present study was extracted as the difference between the unexpected omission (deviant) and the expected omission (standard), as described by Prete et al.31. This subtraction is similar to the method used in previous studies using a self-stimulation task, in which a stimulus expected after the participant pressed the button was unexpectedly omitted. The oN1 was extracted by subtracting the ERP waveforms in a no-sound motor control condition from those in the omission condition26,27,33. Thus, the oN1 in the previous studies and the oMMN in the present study may be a similar type of OSP. In the unfamiliar melody condition, the oN1 was observed, while the oMMN was seldom observed. This may be because the timing of the pauses (i.e., the expected omissions) was as unpredictable as the unexpected omissions in the unfamiliar melody condition. Therefore, in the present study, the oN1 seems to mostly reflect the prediction of timing, while the oMMN seems to mostly reflect the prediction of content.
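The subtraction logic described above can be sketched in Python on synthetic data. The sampling rate, trial count, noise level, and the shape and latency of the simulated deflection are all hypothetical choices made for illustration; they are not the recording parameters or results of the present study.

```python
import numpy as np

def difference_wave(deviant_epochs, standard_epochs):
    """Average each condition over trials, then subtract standard from deviant."""
    return deviant_epochs.mean(axis=0) - standard_epochs.mean(axis=0)

rng = np.random.default_rng(0)
times = np.arange(-0.1, 0.4, 0.002)  # seconds from omission onset (assumed 500 Hz)
n_trials = 100

# Hypothetical negative deflection peaking 150 ms after an unexpected omission.
deflection = -2.0 * np.exp(-0.5 * ((times - 0.15) / 0.02) ** 2)

# Single-channel epochs with shape (n_trials, n_times); noise is unit-variance.
unexpected = deflection + rng.normal(0.0, 1.0, (n_trials, times.size))
expected = rng.normal(0.0, 1.0, (n_trials, times.size))

# oMMN-style contrast: unexpected omission (deviant) minus expected omission (standard).
ommn = difference_wave(unexpected, expected)
peak_latency = times[np.argmin(ommn)]
print(f"peak latency: {peak_latency * 1000:.0f} ms")
```

The same function would express the contrast used in the self-stimulation studies if the omission epochs were passed as the first argument and the no-sound motor control epochs as the second.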
The accuracy of decoding the omitted note identity was higher in the familiar melody condition than in the unfamiliar melody condition. Significant differences were found in an early latency range (58–83 ms). While the amplitudes of the oN1 and oMMN reflect the predictability of melody notes, they do not directly reflect the specificity and clarity of the prediction of note identity based on familiarity. The SVM decoding of the omitted notes from OSPs allows for a more direct examination of the predicted note identity than examining ERP amplitudes. The results support the notion that the OSPs reflect predictive signals containing specific information about the upcoming stimulus44. SanMiguel et al.28 suggested that prediction induced a sufficient sensory template for the expected sound, at least up to the oN1 latency range (56–112 ms). Bendixen et al.45 also suggested that the brain is set up to process the expected tone by default and only interrupts processing when an omission is detected. The fact that the latency range with high decoding accuracy (i.e., 58–83 ms) differed from the latency range of the oN1 and oMMN may indicate that the predictive representation was strongly retained before the omission was detected and the prediction error was elicited. These results suggest that when the predictability of the musical context is high, ERP responses during omissions contain information about more specific pitch expectations, at least immediately after the onset of the omission.
The decoding results should be interpreted carefully. First, the time intervals of the clusters identified by the cluster-based permutation test do not necessarily indicate the onset and offset points of the effects46. Further research is needed to determine whether the familiarity effect on decoding accuracy is observed only before the elicitation of prediction errors or whether it also occurs in the oN1 and oMMN time windows. Second, decoding accuracy was above chance even for the unfamiliar melodies, especially after 100 ms. Although the melodies were unfamiliar, repeated exposure during the experiment might have led to the melodies being learned, creating dynamic expectations. Nonetheless, the observed familiarity effect may have arisen because prior familiarity influenced predictions more strongly than learning within the experiment did.
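The idea of time-resolved decoding discussed above can be illustrated with a minimal Python sketch: at each time point, a classifier is trained on the channel pattern of some trials and tested on held-out trials. To keep the example runnable with NumPy alone, a nearest-centroid classifier stands in for the SVM used in the study; the two note classes, channel count, fold count, and simulated signal window are hypothetical.

```python
import numpy as np

def decode_over_time(X, y, n_folds=5, seed=0):
    """Cross-validated nearest-centroid accuracy at each time point.

    X: array of shape (n_trials, n_channels, n_times); y: class label per trial.
    """
    rng = np.random.default_rng(seed)
    n_trials, _, n_times = X.shape
    folds = np.array_split(rng.permutation(n_trials), n_folds)
    classes = np.unique(y)
    accuracy = np.zeros(n_times)
    for t in range(n_times):
        Xt = X[:, :, t]  # channel pattern of every trial at this time point
        correct = 0
        for test_idx in folds:
            train_mask = np.ones(n_trials, dtype=bool)
            train_mask[test_idx] = False
            X_train, y_train = Xt[train_mask], y[train_mask]
            # One centroid per class, estimated from training trials only.
            centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
            dists = np.linalg.norm(Xt[test_idx][:, None, :] - centroids[None, :, :], axis=2)
            predicted = classes[dists.argmin(axis=1)]
            correct += np.sum(predicted == y[test_idx])
        accuracy[t] = correct / n_trials
    return accuracy

# Synthetic data: two hypothetical note classes that differ only in samples 5-9.
rng = np.random.default_rng(1)
n_per_class, n_channels, n_times = 40, 8, 30
y = np.repeat([0, 1], n_per_class)
X = rng.normal(0.0, 1.0, (2 * n_per_class, n_channels, n_times))
pattern = np.linspace(-1.0, 1.0, n_channels)
signal_window = slice(5, 10)
X[y == 0, :, signal_window] += 1.5 * pattern[:, None]
X[y == 1, :, signal_window] -= 1.5 * pattern[:, None]

accuracy = decode_over_time(X, y)
print(np.round(accuracy, 2))
```

In this simulation, accuracy should rise well above the 0.5 chance level only inside the window where the classes differ. Comparing such accuracy time courses between conditions, with an appropriate correction for multiple comparisons, is the kind of analysis to which the caveats above apply.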
In conclusion, the present study demonstrated that unexpected omissions in the familiar melody condition elicited a larger oN1 and oMMN than unexpected omissions in the unfamiliar melody condition. These findings suggest that the oN1 and oMMN reflect the predictability of the pitch in a melody based on the melody’s familiarity. Moreover, the SVM successfully classified the identity of omitted notes, and the decoding accuracy was higher in the familiar melody condition than in the unfamiliar melody condition. These results provide evidence that the ERP during the omission contains distinguishable predictive information and that the higher the predictability, the more specific the representation of the expected note.