Participants
Sixty-five healthy community-dwelling older adults who had participated in our previous RCT (32 and 33 participants in INT and CONT, respectively) [4] were recruited for the present study. The psycho-demographic data of these participants are detailed in our previous report [4]. A total of 61 participants (31 in INT and 30 in CONT) took part in MRI scanning. MRI data were not collected from four RCT participants: two participants (one in INT and one in CONT) had claustrophobia, one participant in CONT had a heart pacemaker, and one participant in CONT declined to undergo MRI scanning. All participants were right-handed native Japanese speakers. We confirmed that there were no significant differences between the two groups in age (t [df = 59] = 1.01, p = 0.32), sex (χ² [df = 1, n = 61] = 0.15, p = 0.70), or educational level (χ² [df = 1, n = 61] = 0.13, p = 0.71); educational level was binarized at a cutoff of 13 years to categorize participants according to whether they had attended university or college (see Table 1).
Intervention program
The procedures used in the PICMOR intervention and control programs are described in detail below. Both programs were based on group conversation. The 65 people recruited from the Silver Human Resources Center were divided into 16 groups (eight each in INT and CONT), each with four members, except for one CONT group, which had five members. The participants were required to take part in the group conversations once a week for 12 weeks. One of the major differences between the two programs was whether they were designed to train executive functions.
In the group conversation offered by PICMOR, a robot acted as the chair, leading the conversation and prompting one of the four members to speak for 1 min about an event they had experienced in their daily life. The topic was a predetermined subject that changed every week. During this period, the other three members of the group had to listen attentively so that they could ask questions during the discussion period. The 1 min talking period was then repeated without a break, with the speaker talking about another event related to the topic (i.e., each participant was assigned a total of 2 min to talk). Following this, there was a 2 min discussion period for each event, during which the speaker was required to answer questions raised by the other three members of the group. During the discussion periods, the robot automatically encouraged and stopped the participants' utterances to balance the amount of talking time allocated to each participant. For example, when the robot detected that one participant had spent less time talking than the others, it directly prompted that participant to comment or ask a question. After the 2 min discussion periods, another member was assigned as the speaker. This procedure (i.e., the 1 min talking periods, followed by the 2 min discussion periods) was repeated for all members. There were two major reasons for using a robot rather than a human as the chair. First, the robot allowed us to require participants to speak during the predetermined time and to stop when their allocated time was over, while excluding uncontrollable personal factors that can arise with a human chair, such as hesitation. Time management by the robot made it possible to give each member the same predetermined time (i.e., 1 min) to talk about an event. Second, it would be challenging for a human chairperson to prompt and stop conversations in real time based on each person's talking time.
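The balancing rule described above can be illustrated with a minimal sketch. It is not the PICMOR implementation: the function names, the tolerance parameter, and the rule of prompting whoever has the least accumulated talking time are assumptions made for illustration. The timing helper follows one reading of the schedule, in which each of the four members covers two events, each with a 1 min talk and a 2 min discussion.

```python
def participant_to_prompt(talk_seconds, tolerance=0.0):
    """Return the participant with the least talking time, or None if balanced.

    talk_seconds: dict mapping a participant label to seconds spoken so far.
    tolerance: hypothetical slack (in seconds) below which no prompt is issued.
    """
    quietest = min(talk_seconds, key=talk_seconds.get)
    most_talkative = max(talk_seconds, key=talk_seconds.get)
    if talk_seconds[most_talkative] - talk_seconds[quietest] <= tolerance:
        return None
    return quietest


def session_minutes(members=4, events=2, talk_min=1, discussion_min=2):
    """Scheduled minutes per session under the reading stated in the lead-in."""
    return members * events * (talk_min + discussion_min)


# Made-up talking times (seconds) for the three non-speaker members.
discussion_so_far = {"A": 40.0, "B": 12.0, "C": 35.0}
print(participant_to_prompt(discussion_so_far))  # B
print(session_minutes())  # 24
```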
In contrast, in the group conversations offered by the control program, the members were required to talk freely without any robotic facilitation or predetermined theme, similar to how they would converse in their daily life. As shown in Table 1, the variance of the amount of time spent talking in group conversations during the intervention period was smaller in INT than in CONT. In an F-test comparing the variances of the two groups, the null hypothesis (i.e., the true ratio of variances is 1) was rejected (ratio of variances = 17.79, p < 0.01). This suggests that our experimental manipulation, in which robotic moderation balanced the amount of talking time for each participant in INT, was successful. We hypothesized that repeated training in group conversations in PICMOR, compared with the control program, would exercise executive functions such as flexibility, planning, working memory, and response inhibition, given that the participants have to speak within a limited time (i.e., 1 min), flexibly ask and answer questions, intentionally store and manipulate information to ask questions, and refrain from interrupting other members in a group conversation. Executive control and verbal abilities can be measured using the verbal fluency task [16], in which participants are required to produce as many words as possible beginning with a specific letter (e.g., /ka/ in Japanese) [5]. The number of correct unique words generated in 1 min is often used as a measure of performance. To perform this task successfully, participants must flexibly retrieve appropriate words from their long-term memory (semantic memory) in accordance with the task rules, which most likely involves accessing their mental lexicon [16-18]. Successful task performance also requires them to keep previously produced words in working memory to avoid repetition and to suppress inappropriate words or task-irrelevant thoughts, which may involve executive control processes [16-18].
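As an illustration of the variance comparison reported above, the sketch below computes the F statistic (the ratio of two sample variances) and its degrees of freedom. The talking-time values are made up for the example and are not the study data, and the function name is hypothetical. A p-value would then be obtained from the F distribution with these degrees of freedom.

```python
from statistics import variance


def variance_ratio(sample_num, sample_den):
    """Return (F, df_numerator, df_denominator) for two independent samples.

    F is the ratio of unbiased sample variances; under the null hypothesis
    that the true ratio of variances is 1, it follows an F distribution.
    """
    f_stat = variance(sample_num) / variance(sample_den)
    return f_stat, len(sample_num) - 1, len(sample_den) - 1


# Made-up talking times (arbitrary units), NOT the values from Table 1.
cont_talk = [5.0, 15.0, 8.0, 12.0]   # more variable group in the numerator
int_talk = [10.0, 11.0, 9.0, 10.0]   # less variable group in the denominator

f_stat, df_num, df_den = variance_ratio(cont_talk, int_talk)
print(f_stat, df_num, df_den)
```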
Given that the characteristics of the training demands in PICMOR involve exercising executive functions, it is reasonable to predict that the ability to produce words within a limited time could be enhanced in INT compared with CONT between the pre- and post-intervention periods. Consistent with this idea, a significantly larger improvement in the PVFT score was observed in INT than in CONT in our previous RCT study [4]. The beneficial intervention effect on verbal fluency could not be explained by the amount of talking time in group conversations during the intervention period, given that the talking time was significantly shorter in INT than in CONT (t [df = 59] = 4.21, p < 0.01, Cohen's d = 1.08; see Table 1).
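The reported effect size can be cross-checked against the reported t statistic. The sketch below assumes that Cohen's d was computed with the pooled standard deviation and that the t statistic comes from a pooled-variance two-sample test, which is consistent with df = 59 for groups of 31 and 30 but is not stated explicitly. Under that assumption, d = t × sqrt(1/n1 + 1/n2). The function name is hypothetical.

```python
import math


def cohens_d_from_t(t_value, n1, n2):
    """Cohen's d (pooled SD) recovered from a pooled-variance two-sample t."""
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)


# Reported values: t = 4.21 with 31 (INT) and 30 (CONT) participants.
d = cohens_d_from_t(4.21, 31, 30)
print(round(d, 2))  # 1.08
```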
Data acquisition
All MRI data were acquired with a Philips Achieva 3.0 MRI scanner, located in the Advanced Imaging Center Yaesu Clinic, Tokyo. Data were collected only after the intervention. During the MRI scanning, participants were equipped with a set of earplugs and headphones to reduce the effects of scanner noise and a belt with foam pads around their head to minimize head motion. There was no significant difference in the time period from the last day of the intervention period to the day of MRI data acquisition between INT (mean ± SD = 9.67 ± 0.76 weeks) and CONT (mean ± SD = 9.68 ± 0.56 weeks) (t [df = 59] = 0.05, p = 0.96).
First, three directional T1-weighted anatomical planes were scanned to localize the subsequent anatomical and functional images. Subsequently, anatomical structures were scanned using a high-resolution T1-weighted image (repetition time [TR] = 6.41 ms, echo time [TE] = 3.00 ms, field of view [FOV] = 24.0 cm × 24.0 cm, matrix size = 256 × 256, slice thickness/gap = 1.2/0 mm, 170 sagittal slices). Finally, resting-state functional images were scanned using a pulse sequence of gradient-echo echo-planar imaging, which is sensitive to blood oxygenation level-dependent (BOLD) contrasts (TR = 3000 ms, TE = 30 ms, flip angle = 80 degrees, FOV = 24.0 cm × 24.0 cm, matrix size = 80 × 80, slice thickness/gap = 4.0/0 mm, 35 horizontal slices). All participants were instructed to remain awake with their eyes open and think of nothing during the entire rsfMRI scan (10 min). The rsfMRI run began with dummy scans that were discarded from further analyses.
Data analysis
All MRI data were analyzed using the CONN functional connectivity toolbox v.17.f (www.nitrc.org/projects/conn) [19] for Statistical Parametric Mapping 12 (SPM 12) (www.fil.ion.ucl.ac.uk/spm/), implemented in MATLAB. Resting-state functional images were preprocessed using the default pipeline in CONN. The images were realigned and corrected for slice timing. After outlier detection, the functional and structural images were segmented and normalized to the Montreal Neurological Institute (MNI) space with a voxel size of 2 × 2 × 2 mm³. The normalized functional images were spatially smoothed with a Gaussian kernel of 8 mm full-width at half-maximum. Subsequently, temporal correction was performed using the component-based noise correction method [20]. In this step, five principal components were extracted from the white matter and cerebrospinal fluid regions. Along with six bulk motion parameters, the first-order derivative of each motion parameter, and the scrubbing parameter, the five principal components were regressed out from the signal of interest. The scrubbing parameter was obtained with the Artifact Detection Tools in CONN, which detect outliers based on the variance of overall movement. A band-pass filter with a frequency window of 0.008-0.09 Hz and detrending were then applied to the data.
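To make the filtering step concrete, the following sketch applies an ideal band-pass filter (0.008-0.09 Hz) to a single time course by zeroing discrete Fourier coefficients outside the band. It is an illustration only: whether CONN filters in exactly this way is not stated here, the function name is hypothetical, and the 200-volume example assumes a 10 min run at TR = 3 s without accounting for the discarded dummy scans.

```python
import cmath
import math


def bandpass_dft(signal, tr, low=0.008, high=0.09):
    """Ideal band-pass filter using a plain discrete Fourier transform.

    signal: list of samples; tr: sampling interval in seconds.
    Coefficients whose frequency lies outside [low, high] Hz are set to zero.
    """
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq_hz = min(k, n - k) / (n * tr)  # bins k and n-k share a frequency
        if not (low <= freq_hz <= high):
            spectrum[k] = 0j
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


tr = 3.0          # seconds, as in the acquisition parameters
n_volumes = 200   # assumption: 10 min / 3 s, dummy scans ignored
# Synthetic time course: constant offset + in-band 0.05 Hz + out-of-band 0.15 Hz.
signal = [
    1.0
    + math.sin(2 * math.pi * 0.05 * t * tr)
    + 0.5 * math.sin(2 * math.pi * 0.15 * t * tr)
    for t in range(n_volumes)
]
filtered = bandpass_dft(signal, tr)  # retains only the 0.05 Hz component
```

With TR = 3 s, the highest representable frequency is 1/(2 × 3) ≈ 0.167 Hz, so the 0.09 Hz upper cutoff lies well inside the sampled range.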
In the present study, we used seed-to-voxel analyses of the rsfMRI data. Seed regions were defined as spheres with a 5 mm radius centered at (−50, 12, 24), (−48, 28, 14), (−52, 12, 0), (−42, 8, 36), (−54, 2, 46), and (−44, 18, 6) in MNI coordinates, located in the left inferior and middle frontal gyri, based on a previous neuroimaging meta-analysis demonstrating that these regions consistently show significant activation during PVFTs [11]. The anatomical mask of these seed regions was created using MarsBaR (www.marsbar.sourceforge.net). In the individual-level analysis, the mean BOLD time course was extracted from each seed region, and correlation coefficients were calculated with the BOLD time course of each voxel throughout the whole brain. The coefficients were converted to normally distributed scores using Fisher's transformation. This procedure yielded individual rsFC maps for each seed region. In the group-level analysis, the rsFC maps obtained in the first-level analysis for INT and CONT were compared using two-sample t-tests. The models included age, sex, and educational level as covariates. In this second-level analysis, the threshold at the cluster level was corrected for whole-brain multiple comparisons (false discovery rate [FDR]; p < 0.05). Given that we selected six seed regions for analysis, we also applied a more stringent significance threshold in the two-sample t-tests. In this case, the height threshold was divided by the number of seed regions (FDR, p < 0.05/6).
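The individual-level computation described above can be sketched in a few lines. This is a simplified illustration on toy time courses rather than the CONN implementation; the function names are hypothetical, and only the seed coordinates, the 5 mm radius, and the division of the threshold by six are taken from the text.

```python
import math

# Seed centres in MNI coordinates (mm) and radius, as listed in the text.
SEED_CENTRES = [(-50, 12, 24), (-48, 28, 14), (-52, 12, 0),
                (-42, 8, 36), (-54, 2, 46), (-44, 18, 6)]
SEED_RADIUS_MM = 5.0
ALPHA_PER_SEED = 0.05 / len(SEED_CENTRES)  # threshold divided by six seeds


def in_seed(voxel_mm, centre_mm, radius_mm=SEED_RADIUS_MM):
    """True if a voxel centre lies inside the spherical seed."""
    return math.dist(voxel_mm, centre_mm) <= radius_mm


def mean_time_course(voxel_series):
    """Average several voxel time courses into one seed time course."""
    return [sum(values) / len(values) for values in zip(*voxel_series)]


def pearson_r(x, y):
    """Pearson correlation between two equally long time courses."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def seed_to_voxel_map(seed_series, brain_series):
    """Fisher z-transformed correlation of the seed with every voxel.

    atanh is undefined for |r| = 1, so perfectly correlated toy data
    would raise an error.
    """
    return [math.atanh(pearson_r(seed_series, v)) for v in brain_series]


# Toy data: two voxels inside a seed and two voxels elsewhere in the brain.
seed_series = mean_time_course([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
z_map = seed_to_voxel_map(seed_series, [[2, 1, 4, 3, 5], [5, 3, 4, 1, 2]])
print([round(z, 3) for z in z_map])
```

The resulting z maps (one per seed and participant) are what a second-level model would compare between groups.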
We also performed regression analyses of the rsFCs using the raw PVFT scores collected from all participants before and after the intervention. In this analysis, we explored, at the whole-brain level, regions whose rsFC with the seed regions correlated significantly with the change in individual PVFT scores between the pre- and post-intervention periods (i.e., post- minus pre-intervention), which was entered as an independent variable. This analysis enabled us to identify regions that modulated the increase in the score by interacting with the seed regions in the left inferior and middle frontal gyri. Participants' age, sex, and educational level were also included as covariates in the analysis. The threshold at the cluster level was corrected for whole-brain multiple comparisons (FDR, p < 0.05).