Ethics statement
All participants gave oral and written informed consent in accordance with procedures approved by the ethics committee of the School of Psychology, Shaanxi Normal University (Approval No. HR 2021-05-002). The protocols adhered to the Declaration of Helsinki.
Participants
Forty-eight healthy young adults were recruited; three were excluded for excessive EEG artifacts. The final sample of 45 right-handed (via self-report) young adults (mean age = 18.04, SD = 0.80; 35 females) had an average of 12.3 years of education, and all had normal or corrected-to-normal vision and hearing. The final sample size surpassed that of similar work using EEG to investigate implicit character recognition during the oddball paradigm (Shtyrov et al., 2013; Wang et al., 2013; Wei et al., 2018; Hu et al., 2020) and is comparable to other EEG studies using RSA to explore the representation dynamics of language processing (Hubbard et al., 2021; Hauptman et al., 2022). Participants were recruited from the undergraduate and postgraduate student population at Shaanxi Normal University and were paid 60 RMB for their participation. All participants reported no speech or hearing problems and had no history of neurological or psychiatric abnormalities.
An additional group of 33 paid healthy college students (19 female, mean age = 21.48 years, SD = 2.33 years) was recruited to rate the orthographic consistency, phonological consistency, and semantic consistency (transparency) of each selected Chinese character. For semantic consistency, for example, a single question measured semantic transparency: “To what extent do you think the radical "X" can represent the meaning of the Chinese character "Y"?” For the character “洋”, the question was “To what extent do you think the radical "氵" can represent the meaning of the Chinese character "洋"?” Each dimension of consistency was rated in a similar way on a seven-point scale, with 1 = totally inconsistent and 7 = totally consistent.
Materials
Three sets of Chinese phonograms were selected, each containing four characters (one row per set in Fig. 1A), for a total of 12 characters. In Chinese, a character is generally made up of a semantic radical and a phonetic radical; such characters are known as “phonograms”. For example, the character “牲” consists of a semantic radical “牜” and a phonetic radical “生”. Each phonogram set in the present study has a fixed phonetic radical and four different semantic radicals, which enables us to quantify orthographic consistency features based on the phonetic radical (Fig. 1A). In the second row in Fig. 1A, for example, the phonetic radical in all characters is “生”, while the semantic radicals are “牜”, “⺮”, “月” and “忄”, respectively.
Semantic radicals are generally on the left side of characters and phonetic radicals on the right, an arrangement that holds for the majority (63%) of phonograms (Myers, 2019). Semantic radicals usually provide information about the meaning of a character, while phonetic radicals provide phonological clues. For example, in the Chinese character “牲”, the semantic radical “牜” is on the left side, while the phonetic radical “生” is on the right. The meaning of “牲” is “domestic animals”, which can easily be inferred from the semantic radical “牜” (“cattle”). Phonologically, the pronunciation of “牲” is “sheng”, which is also highly consistent with the sound of the phonetic radical “生” (sheng). Characters like “牲” are consistent characters. However, some characters (i.e., inconsistent characters) do not follow the orthographic (positional), phonological and/or semantic rules (e.g., “笙”, “性”, “胜”). Accordingly, the 12 selected characters were divided into three consistent and nine inconsistent characters. The nine inconsistent characters were further divided into three categories: inconsistent orthographic (IOr), inconsistent phonology (IPh) and inconsistent semantics (ISe) characters. Each category has three characters that come from three different sets. In other words, all four categories (i.e., 1 consistent and 3 inconsistent categories) have the same number of characters and the same phonetic radicals (see Fig. 1A). The inconsistent categories differ from the consistent category with regard to orthography, phonology and semantics, respectively.
Specifically, IOr characters differ from the consistent category in the structure of the character. In consistent characters, the phonetic radical appears on the right side, whereas in IOr characters it occupies a less common position. For example, the phonetic radical “生” sits on the right side of “牲”, but at the bottom of the IOr character “笙”. Among all characters in which “艮”, “生” and “羊” act as phonetic radicals, the probabilities that these radicals appear on the right side are 72.22%, 50.00% and 72.73%, respectively. In contrast, the radical positions used in the IOr characters “痕”, “笙” and “痒” occur in only 5.56%, 25% and 18.18% of these characters, respectively (Supplementary Table 1). In the IPh and ISe categories, the phonetic radical is on the right side, as in consistent characters.
Similarly, IPh characters differ from consistent characters (CC) with regard to phonological consistency. The pronunciations of CC are highly consistent with those of the corresponding phonetic radicals, while the phonological consistency of IPh characters is low. Phonological consistency is defined as the proportion of a specific pronunciation among all characters that share the same phonetic radical (see Borleffs et al., 2017; Siegelman et al., 2020). High consistency means that the pronunciation of a character is the dominant pronunciation among all Chinese characters that use the specific phonetic radical. In contrast, the pronunciation of an IPh character is a rare one among the characters that share its phonetic radical. For instance, the pronunciation of “生” and of the consistent character “牲” is /sheng/, which is the same as that of most characters that contain “生” (e.g., “笙”, “胜”). However, the pronunciation of the IPh character “性” is /xing/, not /sheng/. The phonological consistency is 0.39–0.92 for the three CC and 0–0.28 for the three IPh characters (see Supplementary Table 1 for details). In addition, ISe and IOr characters have the same pronunciation as the consistent characters (Supplementary Table 1).
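The consistency measure defined above is a simple proportion, which can be sketched as follows. The four-character family used here is a toy illustration built only from the characters named in the text; the actual families and values are those in Supplementary Table 1, and whether the target character itself is counted in the family is an assumption of this sketch.

```python
def phonological_consistency(pronunciation, family):
    """Proportion of characters in a phonetic-radical family that share `pronunciation`.

    `family` maps each character containing the radical to its pronunciation.
    """
    matches = sum(1 for p in family.values() if p == pronunciation)
    return matches / len(family)


# Toy family for the phonetic radical "生" (illustrative only, not the real count).
toy_family = {"牲": "sheng", "笙": "sheng", "胜": "sheng", "性": "xing"}

consistency_sheng = phonological_consistency("sheng", toy_family)  # dominant sound
consistency_xing = phonological_consistency("xing", toy_family)    # rare sound
```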
Finally, the ISe characters differ from the consistent ones with regard to the transparency of the semantic radical. The transparency of the semantic radical is high in CC and low in ISe characters. Transparency is defined as the connection between the meaning of the semantic radical and the meaning of the corresponding character (Shu et al., 2003). That is, the semantic radicals of CC reflect the meaning of the corresponding characters, whereas the meanings of the ISe characters cannot be inferred from their semantic radicals. For example, the semantic radical “牜” (cattle) is related to the meaning of “牲” (livestock). In contrast, the meaning of the ISe character “胜” (victory) is very different from that of its semantic radical “月” (moon). Like CC, characters in the IPh and IOr categories are high in the transparency of the semantic radical.
Procedure
The experimental procedure consisted of three oddball blocks and one equal probability block. Across the oddball blocks (420 trials each), the three categories of inconsistent characters (three specific characters per category) served in turn as the deviant stimuli (dev; probability of occurrence p = 0.25, with presentations divided equally among the three specific characters), while the consistent characters (three specific characters) served as the standard stimuli (std; p = 0.75) (Fig. 1D). In the equal probability block (480 trials), each of the three inconsistent categories (named equiprobable stimuli) and the consistent category occurred with the same probability (p = 0.25) (Fig. 1C). The equal probability block was always run first, followed by the oddball blocks in randomized order; within each block, the trial order was randomized. On each trial, the stimulus was presented for 200 ms, followed by a gray image lasting a random duration of 500–600 ms (Fig. 1B). On some trials, the color of the character changed from white to red (target; p = 0.1); targets could appear on the standard stimuli of the oddball blocks and on all stimuli of the equal probability block. Throughout the experiment, the task was to ignore the attributes of the characters and to press a button with the right thumb as quickly and accurately as possible whenever a red character (target stimulus) was presented (similar tasks have been widely used in previous studies examining implicit character processing, e.g., Zhao et al., 2019; Xue et al., 2019). In the oddball blocks, deviant stimuli never appeared twice in a row, and a target appeared only after one standard trial. Participants sat comfortably in an armchair at a distance of 60 cm from the screen and were given a break after each block.
The character images were presented with E-Prime software within the central visual field (visual angle: horizontally = 2.5°; vertically = 3.8°).
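One way to build an oddball sequence in which deviants never occur twice in a row is to shuffle the standards and then drop each deviant into a distinct gap between them. The sketch below illustrates this idea only; it is not the E-Prime implementation used in the study. The standard labels "CC1" to "CC3" are hypothetical placeholders, the equal split of standards across characters is an assumption, and the colour-change targets are omitted.

```python
import random


def make_oddball_block(standards, deviants, n_trials=420, p_dev=0.25, seed=None):
    """Return a list of (kind, character) trials with no two deviants adjacent."""
    rng = random.Random(seed)
    n_dev = round(n_trials * p_dev)
    n_std = n_trials - n_dev
    # Divide the presentations equally among the specific characters.
    std_items = [standards[i % len(standards)] for i in range(n_std)]
    dev_items = [deviants[i % len(deviants)] for i in range(n_dev)]
    rng.shuffle(std_items)
    rng.shuffle(dev_items)
    # Gap g lies just before standard g; gap n_std lies after the last standard.
    # Picking distinct gaps guarantees at least one standard between deviants.
    gaps = set(rng.sample(range(n_std + 1), n_dev))
    sequence = []
    for gap in range(n_std + 1):
        if gap in gaps:
            sequence.append(("dev", dev_items.pop()))
        if gap < n_std:
            sequence.append(("std", std_items[gap]))
    return sequence


# Example: the block in which the IOr characters serve as deviants.
block = make_oddball_block(["CC1", "CC2", "CC3"], ["痕", "笙", "痒"], seed=1)
```

With the default settings this yields 105 deviants (35 per character) among 315 standards.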
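For readers who want the physical stimulus size, the visual angles can be converted using the standard relation size = 2 · d · tan(θ/2) and the 60 cm viewing distance given in the procedure. The snippet is a convenience sketch; the resulting sizes are derived here and are not reported in the original text.

```python
import math


def visual_angle_to_size(angle_deg, distance_cm):
    """Physical extent (cm) subtending `angle_deg` at a viewing distance of `distance_cm`."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


width_cm = visual_angle_to_size(2.5, 60)   # roughly 2.6 cm
height_cm = visual_angle_to_size(3.8, 60)  # roughly 4.0 cm
```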
Behavioral analysis
A participant’s response was counted as a hit if the button was pressed within 700 ms after the character color changed; otherwise, the response was counted as a false alarm (FA). Hit and FA rates in the color change detection task were analyzed to evaluate how well participants engaged with this task, which was unrelated to the character attributes of interest.
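The scoring rule can be sketched as below, assuming hypothetical lists of target onset times and button-press times in milliseconds. Only the counts and the hit rate are returned, because the denominator used for the FA rate is not specified in the text.

```python
def score_responses(target_onsets_ms, press_times_ms, window_ms=700):
    """Classify each button press as a hit or a false alarm.

    A press is a hit if it falls less than `window_ms` after a target onset
    that has not already been responded to; any other press is a false alarm.
    """
    detected = set()
    hits = false_alarms = 0
    for press in press_times_ms:
        match = next(
            (t for t in target_onsets_ms
             if t not in detected and 0 <= press - t < window_ms),
            None,
        )
        if match is None:
            false_alarms += 1
        else:
            detected.add(match)
            hits += 1
    hit_rate = hits / len(target_onsets_ms)
    return hits, false_alarms, hit_rate


# Toy example: three targets, one press outside any response window.
hits, false_alarms, hit_rate = score_responses([1000, 5000, 9000], [1400, 3000, 9650])
```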
EEG recording and preprocessing
Electroencephalography (EEG) signals were recorded with a 64-channel amplifier (ANT Neuro EEGO, Inc.) and an electrode cap arranged according to the international 10–10 system. The online reference electrode during data collection was CPz. The EEG data were digitized at a sampling rate of 1000 Hz, and impedances were kept below 10 kΩ during the experiment.
Offline preprocessing, artifact removal, and data quality assessment were carried out with the Harvard Automated Processing Pipeline for EEG (HAPPE) in MATLAB (Gabard-Durnam et al., 2018). A spatially distributed subset of channels providing whole-head coverage was processed (excluding the EOG, M1 and M2 channels). HAPPE’s artifact removal steps included bad channel rejection, removal of 50 Hz electrical noise through CleanLine’s multi-taper approach (Mullen, 2012), and participant artifact rejection (e.g., eye blinks, movement) through wavelet-enhanced ICA with automated component rejection via EEGLAB and the Multiple Artifact Rejection Algorithm (Winkler et al., 2015). After artifact rejection, any channels removed during bad channel rejection were repopulated through spherical interpolation to reduce spatial bias in re-referencing. After filtering with a 0.1–40 Hz digital Butterworth bandpass filter with a 12 dB/oct roll-off, the EEG data were re-referenced to the average reference and mean-detrended. Epochs were created from −300 ms pre-stimulus to 700 ms post-stimulus for each trial and baseline corrected using the first 100 ms. Any epochs with retained artifacts were rejected using amplitude criteria (±100 µV), as in prior research (Zhao et al., 2019).
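The last two steps (baseline correction and amplitude-based epoch rejection) are simple enough to sketch in plain Python. This illustrates the logic only and is not the HAPPE code; it assumes that "the first 100 ms" means the first 100 samples of each epoch at 1000 Hz and that the end point of the epoch is excluded.

```python
SAMPLING_RATE_HZ = 1000
EPOCH_START_MS, EPOCH_END_MS = -300, 700

# Samples per epoch (end point excluded, an assumption) and per baseline window.
n_samples = (EPOCH_END_MS - EPOCH_START_MS) * SAMPLING_RATE_HZ // 1000
n_baseline = 100 * SAMPLING_RATE_HZ // 1000


def baseline_correct(channel_epoch, n_baseline):
    """Subtract the mean of the first `n_baseline` samples from one channel's epoch."""
    baseline = sum(channel_epoch[:n_baseline]) / n_baseline
    return [v - baseline for v in channel_epoch]


def is_clean(epoch_channels, limit_uv=100.0):
    """True if no sample in any channel exceeds +/- `limit_uv` microvolts."""
    return all(abs(v) <= limit_uv for channel in epoch_channels for v in channel)


def clean_epochs(epochs, n_baseline, limit_uv=100.0):
    """Baseline-correct every channel of every epoch, then drop epochs with artifacts."""
    corrected = [[baseline_correct(ch, n_baseline) for ch in epoch] for epoch in epochs]
    return [epoch for epoch in corrected if is_clean(epoch, limit_uv)]
```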
Representational Similarity Analysis
To track the representations of individual characters across time, we used representational similarity analysis (RSA; Kriegeskorte et al., 2008). First, we created neural representational dissimilarity matrices (RDMs) for each time point in the EEG epochs (10 ms resolution), reflecting the pairwise dissimilarity of the characters’ brain representations. Second, we modeled the organization of the neural RDMs using Spearman rank correlation coefficients (Giari et al., 2020; Li et al., 2022), which allowed us to track when representations are explained by the characters’ lexical (frequency) or sub-lexical (orthographic, phonological and semantic) statistical information.
Neural RDMs
At each time point from 100 ms before stimulus onset to 600 ms after stimulus onset, we correlated the EEG activity between trial pairs (for the nine different inconsistent characters), separately for the oddball condition (pooling the data of the three oddball blocks) and the equal probability condition. This yields a distance value (1 − Pearson correlation) that indicates the dissimilarity between character pairs according to brain activity. By repeating this procedure for each pair of characters, we constructed a 9 × 9 neural RDM (Fig. 2A). Individual trials were used as input to the RDM calculation. To calculate the time-point by time-point neural RDMs, the vector for the 61 scalp electrodes was concatenated with those of the five preceding and the five succeeding time points, as implemented in CoSMoMVPA (Oosterhof and Connolly 2016). This resulted in a vector of 671 features (61 electrodes × 11 time points) reflecting brain activity spanning 10 ms.
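The construction of one neural RDM can be sketched as follows. This is a simplified illustration, not the CoSMoMVPA pipeline: it assumes one activity pattern per character (how individual trials are combined is not detailed here), and `data[channel][sample]` is a hypothetical layout.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def window_features(data, t, half_width=5):
    """Concatenate all channels over samples t-half_width .. t+half_width.

    With 61 channels and half_width=5 this gives 61 * 11 = 671 features.
    """
    return [channel[s]
            for s in range(t - half_width, t + half_width + 1)
            for channel in data]


def neural_rdm(patterns):
    """Dissimilarity matrix with entries 1 - Pearson r between pattern pairs."""
    n = len(patterns)
    return [[0.0 if i == j else 1.0 - pearson(patterns[i], patterns[j])
             for j in range(n)]
            for i in range(n)]
```

Calling `neural_rdm` with nine 671-feature patterns, one per inconsistent character, yields the 9 × 9 matrix for one time point; stepping `t` in increments of 10 samples gives the 10 ms resolution.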
Model RDMs
We designed two series of model RDMs to explore and validate the representation of different statistical information in the EEG data (Fig. 2B). The first series consists of RDMs constructed from the current experimental design. These will be referred to below as predictor RDMs and include the orthographic RDM, phonological RDM, semantic RDM and radical-control RDM (based on the phonetic radical category of the material itself). The predictor RDMs are 9 × 9 binary RDMs, in which 1 corresponds to a between-category comparison (e.g., consistent vs. inconsistent for the orthographic consistency feature) and 0 corresponds to a within-category comparison (e.g., consistent vs. consistent). In addition, we constructed a frequency RDM according to the word (character) frequency (based on the data from the LCSMCS; Sun et al., 1997).
To supplement and validate the results obtained from the predictor RDMs, we constructed another series of RDMs from the ratings collected before the formal experiment (from another group of subjects), which resulted in three 9 × 9 rating RDMs corresponding to the orthographic consistency, phonological consistency, and semantic consistency dimensions of our stimuli. Specifically, we calculated the pairwise Euclidean distance between the rating scores of the characters in each dimension.
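A binary predictor RDM of this kind can be generated from one label per character. In the sketch below the ordering of the nine inconsistent characters (three IOr, then three IPh, then three ISe, each ordered by phonetic radical) is an assumption made for illustration.

```python
def binary_rdm(labels):
    """1 where two items carry different labels, 0 where the labels match."""
    return [[int(a != b) for b in labels] for a in labels]


# Assumed ordering of the nine inconsistent characters.
categories = ["IOr"] * 3 + ["IPh"] * 3 + ["ISe"] * 3
radicals = ["艮", "生", "羊"] * 3

# Orthographic predictor: IOr characters are orthographically inconsistent,
# IPh and ISe characters are orthographically consistent.
orthographic_rdm = binary_rdm([c == "IOr" for c in categories])
phonological_rdm = binary_rdm([c == "IPh" for c in categories])
semantic_rdm = binary_rdm([c == "ISe" for c in categories])

# Radical-control predictor: 1 where the phonetic radicals differ.
radical_rdm = binary_rdm(radicals)
```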
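A rating RDM can be sketched in the same style. The rating values below are invented for illustration, and representing each character by a single mean rating (so that the Euclidean distance reduces to an absolute difference) is an assumption.

```python
import math


def rating_rdm(ratings):
    """Pairwise Euclidean distance between rating vectors (one vector per character)."""
    return [[math.dist(a, b) for b in ratings] for a in ratings]


# Hypothetical mean ratings on the seven-point scale, one per character.
toy_ratings = [(6.5,), (2.0,), (5.0,)]
toy_rdm = rating_rdm(toy_ratings)
```

Passing a longer vector per character (for example, one entry per rater) works without changes.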
Representational similarity analysis
The lower off-diagonal elements of each matrix were extracted as vectors to calculate the Spearman rank correlations between each model and the EEG data. Since some models were correlated, partialling out the other models allowed us to separate the contributions of these models from each other (Giordano et al., 2013; Cichy and Pantazis, 2017). To compare lexical and sub-lexical statistical information models explicitly, the lexical (frequency) RDM was partialled out when computing the partial correlation between the neural RDM and each sub-lexical (orthographic, phonological and semantic) RDM, and vice versa. We calculated the partial correlation coefficients at each time point for each subject. These partial correlation coefficients served as an indicator of the time course of the different statistical information dimensions in the EEG data.
In addition, to detect differences in the representation strength of the statistical information between the oddball condition and the equal probability condition, we calculated, for each subject, the difference between the partial correlation coefficients of the two conditions (oddball minus equal probability).
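A partial Spearman correlation with a single control model can be sketched as below. It uses the standard first-order formula r_xy.z = (r_xy − r_xz · r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)) applied to ranks. Controlling for exactly one model at a time is an assumption of this sketch; with several control models a regression-based approach would be needed.

```python
import math


def lower_triangle(rdm):
    """Vector of the entries below the diagonal (36 values for a 9 x 9 RDM)."""
    return [rdm[i][j] for i in range(1, len(rdm)) for j in range(i)]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(neural, model, control):
    """Spearman correlation of `neural` and `model` with `control` partialled out."""
    rx, ry, rz = ranks(neural), ranks(model), ranks(control)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For example, `partial_spearman(lower_triangle(neural), lower_triangle(orthographic), lower_triangle(frequency))` would give the orthographic effect at one time point with frequency controlled, where the three RDM names are placeholders.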
Statistical inference
We used a non-parametric statistical approach for all RSA results, which does not depend on assumptions about the data distributions (Nichols and Holmes, 2002). Using the maximum cluster size method, significant temporal clusters were defined as adjacent time points that all exceeded a statistical cutoff (cluster-inducing threshold). This cutoff was determined through a sign permutation test according to the distribution of t-values from 10000 permutations of the measured correlation values. The 95th percentile of the t-value distribution was used as the cluster-inducing threshold at each time point (equivalent to p < 0.05, one-sided). To identify significant clusters, we determined the 95th percentile of the maximum cluster sizes across all permutations (equivalent to p < 0.05, one-sided). This approach yielded the temporal clusters in which the correlations were significant.
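The procedure can be sketched in plain Python as follows. This is an illustrative reading of the description above, not the code used for the reported results: `data[subject][time]` is a hypothetical layout for the per-subject correlation time courses, the percentile uses a nearest-rank rule, and clusters are kept when they are strictly larger than the size cutoff.

```python
import math
import random


def t_values(data):
    """One-sample t-value against zero at each time point; data[subject][time]."""
    n = len(data)
    result = []
    for t in range(len(data[0])):
        column = [row[t] for row in data]
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / (n - 1)
        result.append(mean / math.sqrt(variance / n))
    return result


def runs(mask):
    """(start, end) index pairs for each run of consecutive True values."""
    found, start = [], None
    for i, flag in enumerate(list(mask) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            found.append((start, i - 1))
            start = None
    return found


def percentile(values, q):
    """Nearest-rank percentile, q in (0, 1]."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(q * len(ordered)) - 1)]


def cluster_test(data, n_perm=10000, q=0.95, seed=0):
    """Return (start, end) time indices of clusters larger than the null cutoff."""
    rng = random.Random(seed)
    n_time = len(data[0])
    perm_t = []
    for _ in range(n_perm):
        # Sign permutation: flip the sign of each subject's whole time course.
        signs = [rng.choice((-1, 1)) for _ in data]
        perm_t.append(t_values([[s * v for v in row] for s, row in zip(signs, data)]))
    # Cluster-inducing threshold per time point (one-sided).
    thresholds = [percentile([p[t] for p in perm_t], q) for t in range(n_time)]

    def clusters(tv):
        return runs(tv[t] > thresholds[t] for t in range(n_time))

    # Null distribution of the maximum cluster size.
    max_sizes = [max((e - s + 1 for s, e in clusters(p)), default=0) for p in perm_t]
    size_cutoff = percentile(max_sizes, q)
    return [(s, e) for s, e in clusters(t_values(data)) if e - s + 1 > size_cutoff]
```

Pure Python is slow at 10000 permutations; a vectorized implementation would be preferable for real data.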
Visual mismatch negativity (vMMN) analysis
To verify the existence of prediction error responses, we examined vMMN activity using a data-driven approach. The differential waveforms for each inconsistent category were obtained by subtracting the ERPs of the corresponding equiprobable stimuli from the ERPs of the corresponding deviant stimuli. This method allows the ERPs evoked by the deviant of the oddball sequence to be compared with the ERPs evoked by physically identical stimuli from a sequence without any particular frequent (standard) stimulus (Stefanics et al., 2014).
The equal probability control was proposed to deal with repetition effects due to refractoriness, which are assumed to be present in the deviant-minus-standard activity obtained in classical oddball paradigms (Schröger and Wolff, 1996; Jacobsen and Schröger, 2001). Activity considered as “genuine” vMMN (i.e., vMMN without stimulus-specific refractoriness effects superimposed) emerges when the oddball deviant evokes a larger negativity than the control stimuli (Stefanics et al., 2014). Next, a cluster-based permutation test was used to search for “genuine” differential activity between the ERPs of deviant and equiprobable stimuli (Maris and Oostenveld, 2007). We conducted this analysis with the FieldTrip toolbox (Oostenveld et al., 2011) in MATLAB. We computed grand averages of the differential waveforms across two regions of interest (ROIs) corresponding to the left (P7, PO7, O1) and right (P8, PO8, O2) posterior occipito-temporal electrodes (selected based on previous studies, e.g., Hu et al., 2020; Kovarski et al., 2021). For each ROI, clusters were formed from two or more neighboring time points (within 0–600 ms) whose t values (obtained by two-tailed t-tests) exceeded the cluster-forming threshold (p = 0.025). The number of permutations was set to 10000, and the corrected significance level was set to 0.05. That is, when the cluster-level error probability of a cluster was less than 0.05, the corresponding period was considered to show a significant effect (i.e., effective vMMN activity was identified). For each inconsistent category, we report the temporal range of the significant negative clusters and their mass (the sum of t values in a cluster).
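The ROI difference wave and the cluster mass can be sketched as follows. The difference is computed here as deviant minus equiprobable control, so that a vMMN shows up as a negative deflection; this sign convention, the dictionary layout `erp[channel][time]`, and the critical value `t_crit` passed in by the caller are assumptions of the sketch. The permutation step itself was run in FieldTrip and is not reproduced.

```python
LEFT_ROI = ("P7", "PO7", "O1")
RIGHT_ROI = ("P8", "PO8", "O2")


def roi_difference_wave(deviant_erp, control_erp, roi):
    """Deviant-minus-control waveform averaged over the electrodes in `roi`."""
    n_time = len(deviant_erp[roi[0]])
    return [
        sum(deviant_erp[ch][t] - control_erp[ch][t] for ch in roi) / len(roi)
        for t in range(n_time)
    ]


def negative_clusters(t_vals, t_crit, min_len=2):
    """Find runs of at least `min_len` time points with t < -t_crit.

    Returns (start, end, mass) tuples, where mass is the sum of t-values
    in the cluster. `t_crit` must be positive.
    """
    clusters, start = [], None
    for i, t in enumerate(list(t_vals) + [0.0]):
        below = t < -t_crit
        if below and start is None:
            start = i
        elif not below and start is not None:
            if i - start >= min_len:
                clusters.append((start, i - 1, sum(t_vals[start:i])))
            start = None
    return clusters
```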