MOL quickly improved temporal order memory performance
Twenty-nine college students (10 males) participated in this experiment (Fig. 1A). On Day 1, subjects were tested for baseline performance, and then given a 2-hour video lecture on the MOL strategy, including presentation of a world map with 10 landmarks (see Methods) (Fig. 1B). Participants then practiced the MOL in three consecutive sessions (Day 2 to Day 4) before the scanning session (Day 5). During the baseline test, the two later practice sessions, and the fMRI scanning, subjects performed two encoding-retrieval runs, whereas on Day 2 they completed only one run. In each run, subjects studied a list of 30 words, and after a 6-min delay (filled with a working memory task), they recalled the temporal order of these words (Fig. 1C). To reduce interference and practice effects, new words were used in each run throughout the experiment.
Behavioral analysis showed that memory performance improved dramatically across days (F (3.43, 95.97) = 60.04, P < 0.0001, η2 = 0.50) (Fig. S1A), demonstrating that MOL can quickly improve temporal order memory. During the fMRI scanning, the overall accuracy was 49.9% (Fig. 2A). Post-hoc t-tests (Tukey HSD) revealed that although the accuracy during scanning was slightly lower than during the last practice session (61.9%, t (28) = -4.04, P = 0.003, Cohen’s d = 0.75), probably due to the restricted learning environment and noise in the scanner, it was significantly higher than during baseline (19.5%, t (28) = 8.41, P < 0.001, Cohen’s d = 1.56) and the first practice session (28.5%, t (28) = 6.12, P < 0.001, Cohen’s d = 1.14).
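The effect sizes reported above are numerically consistent with the common paired-samples convention d = |t| / sqrt(n), with n = 29 subjects. The text does not state which formula was used, so the following is only a sketch under that assumption; the function name is hypothetical.

```python
import math

def cohens_d_from_paired_t(t_value: float, n_subjects: int) -> float:
    """Effect size for a paired (within-subject) t-test, assuming d = |t| / sqrt(n)."""
    return abs(t_value) / math.sqrt(n_subjects)

# t values taken from the post-hoc comparisons reported above (df = 28, so n = 29).
comparisons = {
    "scan vs. last practice": -4.04,
    "scan vs. baseline": 8.41,
    "scan vs. first practice": 6.12,
}
for label, t in comparisons.items():
    print(label, round(cohens_d_from_paired_t(t, 29), 2))
# prints 0.75, 1.56 and 1.14, matching the reported Cohen's d values
```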
We further examined the behavioral pattern to ensure that subjects were actually using MOL. First, if they were relying on the locations to encode the temporal order of the studied items, we would predict recency and primacy effects based on the 10 locations. In contrast, if subjects were encoding the items serially without the MOL, we would predict that the recency and primacy effects would be based on the overall list of 30 items. Second, if the subjects used MOL and three words were encoded for each location, we would expect their accuracy to decrease from the first set of 10 words to the second and third sets because of an increasing load at each location (i.e., the fan effect) 23. Finally, we would also expect significant within-location swap errors (i.e., confusion of items that were 10 or 20 positions apart). The fan effect and within-location swap errors should not occur if subjects did not use MOL.
To test these hypotheses, we conducted ANOVAs on the behavioral performance during the fMRI scanning as a function of run (subjects finished two runs of 30 words), set of words (1 to 3), and serial position (Initial: locations 1 to 4; Middle: locations 5 to 7; Final: locations 8 to 10). Since we did not find a significant main effect of run (F (1, 28) = 2.14, P = 0.155) or interactions with run (Fs < 3.12, Ps > 0.05), the two runs were combined for the following analyses (Fig. 2B). Consistent with our hypotheses, a two-way ANOVA with set and serial position as repeated measures revealed significant main effects of set (F (1.99, 55.61) = 24.82, P < 0.0001, η2 = 0.14) (Fig. 2C) and serial position (F (1.91, 53.43) = 7.84, P = 0.001, η2 = 0.05) (Fig. 2D), but importantly, no set by serial position interaction (F (3.23, 90.42) = 0.30, P = 0.842). Planned post-hoc t-tests indicated that memory performance decreased from set 1 to set 3 (ts > 2.74, Ps < 0.05, Cohen’s ds > 0.53), showing a clear fan effect (Fig. 2C). In addition, there were significant recency (final vs. middle, t (28) = 2.82, P = 0.004, Cohen’s d = 0.52) and primacy effects (initial vs. middle, t (28) = 3.81, P < 0.001, Cohen’s d = 0.71) (Fig. 2D). By examining the pattern of errors, we found that subjects were most likely to confuse items between adjacent locations (237 trials, 13.62% of total trials), followed by within-location swap errors (123 trials, 7.07% of total trials) (Fig. 2E). These results strongly suggest that subjects indeed relied on the 10 locations to encode the words.
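To make the error categories concrete, the sketch below maps each of the 30 list positions onto a location (1 to 10) and a set (1 to 3), and labels a response by comparing the true and recalled positions. The function names are hypothetical. The definition of an adjacent-location error used here (locations differing by exactly 1, regardless of set, with no wrap-around between locations 10 and 1) is an assumption, since the exact scoring rule is not spelled out in this section.

```python
N_LOCATIONS = 10  # locations on the mental route; 3 words are placed at each

def location_and_set(position: int) -> tuple[int, int]:
    """Map a 1-based list position (1..30) to (location 1..10, set 1..3)."""
    return (position - 1) % N_LOCATIONS + 1, (position - 1) // N_LOCATIONS + 1

def classify_response(true_pos: int, recalled_pos: int) -> str:
    """Label a response as correct, a within-location swap, an adjacent-location error, or other."""
    if true_pos == recalled_pos:
        return "correct"
    true_loc, _ = location_and_set(true_pos)
    recalled_loc, _ = location_and_set(recalled_pos)
    if true_loc == recalled_loc:
        # Same location, different set: positions are 10 or 20 apart.
        return "within-location swap"
    if abs(true_loc - recalled_loc) == 1:
        # Assumed definition: neighbouring locations on the route.
        return "adjacent-location error"
    return "other error"

print(classify_response(3, 13))  # within-location swap
print(classify_response(3, 4))   # adjacent-location error
```

Under serial encoding of a single 30-item list, responses that are 10 or 20 positions off have no special status, which is why an excess of within-location swaps is diagnostic of MOL use.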
A different pattern of results was found for the behavioral data during the baseline test. Specifically, we found a significant set by serial position interaction (F (3.27, 91.47) = 11.11, P < 0.0001, η2 = 0.11), indicating only a primacy effect in set 1 (initial vs. middle/final, ts > 5.21, Ps < 0.001), but not in sets 2 and 3 (ts < 1.78, Ps > 0.05) (Fig. S2C). No significant within-location swap error was found (Fig. S2D). These results suggest that before practice the 30 words were encoded into a single list, rather than based on the 10 locations. Across the baseline and three training sessions, the location-based pattern gradually emerged (Fig. S2 to S5), showing increased ratios of within-location swap errors to all errors (Fig. S1B). Together, our behavioral data suggest that subjects effectively learnt to use the MOL to encode the word order, which not only improved the overall performance but also changed the behavioral patterns.
Hippocampal contributions to temporal order memory
The above behavioral evidence suggests that subjects indeed used the MOL strategy to aid memory encoding. We then turned to the fMRI data to examine how employment of this strategy affects activity levels and stimulus-specific representations in hippocampal subfields. The hippocampus and surrounding medial temporal lobe areas were segmented into 5 regions, including CA1, CA23DG, anterior lateral entorhinal cortex (alEC), posterior medial entorhinal cortex (pmEC), and parahippocampal cortex (PHC) (Fig. 3A; Methods). Univariate analyses revealed marginally significant subsequent memory effects (SME) in CA23DG (t (28) = 2.35, P = 0.026, corrected P = 0.074, Cohen’s d = 0.44) and PHC (t (28) = 2.29, P = 0.030, corrected P = 0.074, Cohen’s d = 0.43), with subsequently remembered items showing greater activity than subsequently forgotten items (Fig. 3B). Whole-brain analysis revealed a significant SME in the left parahippocampal gyrus, the left frontal medial cortex, and the left orbital frontal cortex (FWE-corrected for multiple comparisons), consistent with previous observations 24 (Fig. S6A; Table S1).
During retrieval, we found that the CA1 (t (28) = 2.50, P = 0.018, corrected P = 0.046, Cohen’s d = 0.46) and CA23DG (t (28) = 2.84, P = 0.008, corrected P = 0.042, Cohen’s d = 0.53) were more active during remembered than forgotten items (Fig. 3C). Whole brain analysis revealed several additional brain regions for this contrast, including the bilateral supramarginal gyrus (SMG), bilateral superior parietal lobule (SPL), bilateral lateral occipital cortex (LOC), and bilateral frontal pole (FP) (Fig. S6B; Table S2).
Hippocampal representations of structured location sequences
Having shown the involvement of hippocampus in temporal order memory, we further examined whether specific hippocampal representations supported temporal order memory. In the MOL strategy, the spatial location was not presented to the participants during either memory encoding or retrieval. However, its well-learnt structured sequence of locations (i.e., mental route) could be reactivated and linked with the to-be-learnt words during encoding, and again reinstated during retrieval. To probe the neurocognitive representation of this sequential location structure, we examined how hippocampal representational similarity was modulated by location identity, the distance between locations, and the boundaries of the sequence.
Hippocampal spatial pattern separation
To examine the representation of spatial locations, we compared the pattern similarity of items sharing the same location across two runs (i.e., same-location pairs) with those encoding adjacent but different locations (i.e., near-distance pairs, ordinal distance = 1) (Fig. 4A). This analysis could be performed during both encoding and retrieval (according to the temporal distance during encoding). Notably, this cross-run pattern similarity should not be affected by intrinsic autocorrelations of the BOLD signal. We found that the PHC showed greater pattern similarity for same-location pairs than near-distance pairs during encoding (t (28) = 3.42, P = 0.002, corrected P = 0.010, Cohen’s d = 0.64), while a reverse pattern was found during retrieval (t (24) = -2.61, P = 0.015, corrected P = 0.039, Cohen’s d = 0.52; four subjects were excluded from this analysis due to fewer than 10 trials in any condition, see Table S3) (Fig. 4B). The same effect of pattern separation during retrieval was found in CA1 (t (24) = 2.81, P = 0.010, corrected P = 0.039, Cohen’s d = 0.56), but not in CA23DG (t (24) = 0.372, P = 0.713) (Fig. 4C), and there was no significant region (CA1 vs. CA23DG) by location identity interaction (F (1,24) = 2.011, P = 0.169). Whole-brain searchlight analysis did not reveal any representation of location identity elsewhere in the brain.
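The cross-run comparison described above can be illustrated with a small sketch. It assumes that each item is represented by a voxel pattern (a list of floats), that similarity is a Fisher z-transformed Pearson correlation, and that pair labels depend only on the ordinal distance between locations. These choices, and all function names, are illustrative assumptions rather than the pipeline described in Methods.

```python
import math
from itertools import product

def pearson(x, y):
    """Pearson correlation between two equal-length patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def location_of(position, n_locations=10):
    """Location (1..10) at which the item at a 1-based list position was encoded."""
    return (position - 1) % n_locations + 1

def cross_run_similarity(run1, run2):
    """Mean Fisher-z similarity of same-location and near-distance pairs across runs.

    run1, run2: dicts mapping list position -> voxel pattern. Only pairs made of
    one item from each run are used, so within-run autocorrelation cannot contribute.
    """
    sims = {"same": [], "near": []}
    for (p1, v1), (p2, v2) in product(run1.items(), run2.items()):
        distance = abs(location_of(p1) - location_of(p2))
        if distance == 0:
            sims["same"].append(math.atanh(pearson(v1, v2)))
        elif distance == 1:
            sims["near"].append(math.atanh(pearson(v1, v2)))
    return {label: sum(v) / len(v) for label, v in sims.items() if v}

# Toy patterns for illustration only (not real data).
run1 = {1: [1.0, 2.0, 3.0, 5.0]}
run2 = {1: [1.0, 2.0, 3.0, 4.0], 2: [4.0, 1.0, 3.0, 2.0], 5: [2.0, 2.0, 1.0, 3.0]}
result = cross_run_similarity(run1, run2)
```

A positive same-minus-near difference corresponds to the encoding effect reported for the PHC, whereas a negative difference corresponds to the pattern separation effect reported during retrieval.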
Given the similar pattern separation effect in CA1 and PHC, we further examined the relationship between representational patterns in CA1 and PHC. Representational connectivity analysis (see Methods; Fig. S7) revealed that the CA1 pattern was positively correlated with the PHC pattern during both encoding (r (28) = 0.036, t (28) = 3.77, P < 0.001) and retrieval (r (24) = 0.076, t (24) = 4.00, P < 0.001). Interestingly, the CA1 encoding pattern was marginally negatively correlated with the CA1 retrieval pattern (r (24) = -0.030, t (24) = -1.81, P = 0.083). These results suggest that the CA1 representation of spatial location could be flexibly modulated via pattern separation processes, which could then modulate the PHC representations (even though the direction of the effects of course cannot be inferred).
The spatial pattern separation could help to differentiate the temporal order of items encoded at the same location during memory retrieval. If the CA1 and PHC pattern separation indeed aided temporal order judgment, we would predict the degree of pattern separation (near-distance minus same-location pairs pattern similarity) in these regions to be negatively correlated with the numbers of within-loci swaps. Robust regression revealed a marginally significant effect in the PHC (t (27) = -1.98, P = 0.058).
Hippocampal sequential pattern separation
Second, we examined how neural pattern similarity was modulated by the distance between two locations in the well-trained sequence 9, 10. Note that distance here refers to the ordinal distance between two locations (ranging from 1 to 9) rather than their Euclidean or geodesic distance in the real world. We compared the pattern similarity of words that were encoded at near distance (i.e., ordinal distance = 1), middle distance (i.e., 2 ≤ ordinal distance ≤ 3) and far distance (i.e., ordinal distance > 3) (Fig. 4A). We found that the CA23DG region showed a main effect of location distance during encoding (F (1.80,50.49) = 5.40, P = 0.009, corrected P = 0.047, η2 = 0.03). Post-hoc t-tests (Tukey HSD) showed higher pattern similarity for far-distance pairs than near- (t (28) = 2.93, P = 0.018, Cohen’s d = 0.54) and middle-distance pairs (t (28) = 3.30, P = 0.007, Cohen’s d = 0.61), but no difference between near- and middle-distance pairs (t (28) = 0.07, P = 0.998) (Fig. 4D). No significant location distance effect was found in CA1 (F (1.92, 53.88) = 0.718, P = 0.487) or in other MTL subfields (Ps > .29). The region (CA1 vs. CA23DG) by location distance interaction was marginally significant (F (1.64, 45.84) = 3.20, P = 0.059). A reverse pattern was found in neocortical regions during encoding, with greater neural pattern similarity for near-distance pairs than far-distance pairs in the right occipital pole (OP), bilateral LOC, left SMG, and left FP (Fig. S8; Table S4). No significant effect of location distance was found during retrieval.
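The three distance conditions can be written as a simple binning rule on ordinal distance. The sketch below (function name hypothetical) also counts how many of the 45 possible location pairs fall into each bin, which shows that the bins are unequal in size. How pairs were actually sampled or balanced is described in Methods and is not assumed here.

```python
from collections import Counter
from itertools import combinations

def distance_bin(loc_a: int, loc_b: int) -> str:
    """Bin the ordinal distance between two locations (1..10) as in the text."""
    d = abs(loc_a - loc_b)
    if d == 0:
        return "same"
    if d == 1:
        return "near"
    if d <= 3:
        return "middle"
    return "far"

# Count location pairs per bin over the 10 locations of the route.
counts = Counter(distance_bin(a, b) for a, b in combinations(range(1, 11), 2))
print(dict(counts))  # {'near': 9, 'middle': 15, 'far': 21}
```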
Hippocampal sequence boundary effects during encoding
In addition to location identity and distance, the location structure also contains sequence boundaries. Unlike previous studies where the boundary was introduced by background context or different sequences 11, 14, the boundary in the current study was introduced by the repetition of location sequence. That is, when the 11th word was encoded, subjects would return to the first location, which would break the sequence contiguity. As a result, for a given temporal distance (e.g., 2), we can construct both within-boundary pairs (e.g., location 8 - location 10) and cross-boundary pairs (e.g., location 9 - location 1).
To compare within vs. cross-boundary pairs of matching distances, we analyzed temporal distances of 4 to 7 ordinal positions during encoding (Fig. 5A; Table S5). Following a previous study 14, we predicted that the similarity of neural representations would be higher for within-boundary pairs than cross-boundary pairs. Indeed, this was found in both CA1 (t (28) = 3.70, P < 0.001, corrected P = 0.002, Cohen’s d = 0.69) and CA23DG (t (28) = 4.37, P < 0.001, corrected P < 0.001, Cohen’s d = 0.81) (Fig. 5B). This effect was specific to the hippocampus and did not occur in EC or PHC, or in any other brain region in the whole-brain analysis.
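As an illustration of how within- and cross-boundary pairs can be matched on temporal distance, the sketch below enumerates all item pairs in one 30-word list whose encoding positions are 4 to 7 apart, and labels each pair by whether both items belong to the same pass through the 10 locations. It assumes pairs are formed within a single run; the function names are hypothetical, and the pair counts actually analyzed are those given in Table S5.

```python
def boundary_condition(pos_a: int, pos_b: int, set_size: int = 10) -> str:
    """'within' if both 1-based positions fall in the same pass through the route."""
    same_pass = (pos_a - 1) // set_size == (pos_b - 1) // set_size
    return "within" if same_pass else "cross"

def pairs_by_condition(distances=range(4, 8), n_items: int = 30):
    """Enumerate pairs at the given temporal distances, split by boundary condition."""
    pairs = {"within": [], "cross": []}
    for d in distances:
        for first in range(1, n_items - d + 1):
            second = first + d
            pairs[boundary_condition(first, second)].append((first, second))
    return pairs

pairs = pairs_by_condition()
print(len(pairs["within"]), len(pairs["cross"]))  # 54 44
```

Restricting the comparison to distances of 4 to 7 keeps a reasonable number of pairs in both conditions: shorter distances yield few cross-boundary pairs, and longer distances yield few within-boundary pairs.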
Hippocampal temporal context reinstatement during retrieval
The above analyses reveal that hippocampal representational patterns are modified according to the well-trained structured location sequence, exhibiting spatial and temporal pattern separation to aid temporal order memory. In the following analysis, we further examined whether representations in hippocampal subfields also support another type of temporal order, i.e., the episodic-like temporal context that was formed through one-shot learning and should be specific to a given event sequence. Due to the autocorrelation of fMRI BOLD signal, we could not directly compare the representational similarity of temporally adjacent pairs with more distant pairs. Instead, we examined the reinstatement of temporal context during retrieval (Fig. 6). In particular, for a given temporal distance during retrieval (TDr, ranging from 1 to 29), we grouped the pairs, according to their temporal distance during encoding (TDe), into Short (TDe ≤ 3) and Long (4 ≤ TDe ≤ 6) conditions (Fig. 6A). This grouping was motivated by the small number of trials of each individual distance and by previous findings that effects of temporal contexts decayed quickly beyond 3 items 12. We restricted the analysis to correct trials and TDr ≤ 20, as there were very few pairs for TDr > 20 (Table S6). We predicted that the brain regions containing representations of temporal context should show higher pattern similarity for short-distance pairs than long-distance pairs. Consistent with this prediction, we found this pattern in CA1 (t (23) = 3.17, P = 0.004, corrected P = 0.021, Cohen’s d = 0.65; four subjects were excluded due to fewer than 10 trials in any condition, and one subject was excluded as an outlier, i.e., 2.5 SDs above the mean, see Table S7) (Fig. 6B). No temporal context reinstatement was found in any other brain region.
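The pair grouping used for this analysis can be sketched as follows. The input format (a list giving, for each retrieval trial in order, the encoding position of the tested item, plus a parallel list of correctness flags) and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def tde_label(tde: int):
    """Short (TDe <= 3) or Long (4 <= TDe <= 6); other distances are not analyzed."""
    if 1 <= tde <= 3:
        return "short"
    if 4 <= tde <= 6:
        return "long"
    return None

def group_retrieval_pairs(encoding_positions, correct, max_tdr: int = 20):
    """Group pairs of correct retrieval trials by TDr, then by Short/Long TDe.

    encoding_positions[k]: encoding position of the item tested on retrieval trial k.
    correct[k]: whether the temporal order judgment on trial k was correct.
    """
    groups = defaultdict(lambda: {"short": [], "long": []})
    for a, b in combinations(range(len(encoding_positions)), 2):
        tdr = b - a  # temporal distance during retrieval
        if tdr > max_tdr or not (correct[a] and correct[b]):
            continue
        label = tde_label(abs(encoding_positions[a] - encoding_positions[b]))
        if label is not None:
            groups[tdr][label].append((a, b))
    return dict(groups)

# Toy example: five retrieval trials, the fourth answered incorrectly.
groups = group_retrieval_pairs([5, 1, 8, 2, 30], [True, True, True, False, True])
print(groups)  # {1: {'short': [], 'long': [(0, 1)]}, 2: {'short': [(0, 2)], 'long': []}}
```

Comparing Short and Long pairs within the same TDr keeps the distance between retrieval trials matched, so that a similarity difference cannot be attributed to BOLD autocorrelation at retrieval.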
Control analysis: the hippocampus did not represent word semantics
In all the above analyses, we considered only the representations of the event structure and the temporal context, but not the representations of the words, as previous studies did not reveal strong item representations in the hippocampus. To further examine this issue, we conducted latent semantic analysis to generate the semantic similarity matrix of the words, using a well-trained Chinese word embedding model, Directional Skip-Gram 27 (see Methods). Correlating semantic similarity with the neural representational similarity (Fig. S9A) did not reveal significant semantic representation in any hippocampal subfield (Ps > 0.082) (Fig. S9B). In addition, all the above results remained unchanged after controlling for semantic similarity. Interestingly, we found significant semantic representations during encoding in the vmPFC (r (28) = 0.017, P = 0.031), and a marginally significant effect in the SPL (r (28) = 0.020, P = 0.068).