Participants
This study was reviewed and approved by the ethics committee at the University of Florida, which adhered to the Declaration of Helsinki. All participants signed an informed consent form prior to participating in the study and were debriefed and compensated at the end of the experiment. Fifteen young, right-handed, healthy adults (10 females, mean age = 22.7 years, SD = 4.0) were selected to take part in this study based on the following exclusion criteria: presence of a history of neurological and psychiatric diseases, of any motor-system complication, the current use of any medication, and presence of any MRI-incompatible object in the body. An additional twenty-one healthy participants were recruited for resting-state acquisition (12 females, mean age = 31.4 years, SD = 7.1). These participants fulfilled the inclusion criteria previously described and the data acquisition was approved by the ethics committee at the University of Montréal, which also adheres to the Declaration of Helsinki. Due to poor fMRI image quality, resting-state data from two participants from this sample were not included in analyses, and data from another participant was also excluded due to missing physiological noise measurements. Thus, the resting-state sample included in the final analyses consisted of 18 healthy participants (mean age 32.39 (SD = 13.19), F/M = 11/7).
Motor control task
Participants who performed a motor control task in the scanner were asked to produce isometric hand grip force with their right hand using an MR-compatible fiber-optic force transducer. Prior to the MRI scan, participants were first trained on the task, and the grip maximum voluntary contraction (MVC) was measured. Specifically, they were asked to produce unilateral grip force using their right hand by opposing their thumb, index, and middle fingers against the force transducer (Archer et al., 2018, 2016; Burciu et al., 2016). Two runs were completed in a block-design paradigm. Each run consisted of 9 blocks of 15 sec force control, followed by 25 sec rest periods (6 min and 25 sec in total). Throughout the run, 2 bars were displayed on a video monitor that participants saw through a mirror mounted on the head coil: a target bar and a force bar. A change in color of the force bar was used to cue participants to either push or release the force sensor. Green was a go signal for producing and sustaining force, while red indicated a rest period. The target bar was set at 10, 20, or 30% of MVC for each participant (3 blocks at each level, pseudo-randomly ordered), while the force bar moved in the vertical plane according to the force output. During the go signal, participants were required to adjust their grip force levels to match the force and target bars. Force was measured using custom-designed load cells (Vaillancourt et al., 2003) and a Micron Optics fiber optic interrogator.
MRI data acquisition
Motor control task dataset. A 3T Siemens Prisma scanner equipped with a 64-channel head/neck coil was used for imaging subjects. We acquired functional magnetic resonance imaging (fMRI) data, covering the whole brain and cervical spinal cord (first cervical, C1, to first thoracic, T1, vertebrae), using blood oxygenation level dependent (BOLD) contrast via a simultaneous multi-slice (SMS) echo-planar imaging (EPI) sequence. Participants laid on the scanner table in a supine position with their head and neck fully supported using foam pads to minimize their motion. We used the neck and brachial plexus SatPad™ sets (SatPad, clinical imaging solutions) around the neck to minimize inhomogeneity artifacts and enhance spinal cord functional image quality. Based on previous work in the lab, we recommend their use in fMRI studies that target the spinal cord. Participants were placed in the scanner such that the mid-chin level was in the scanner’s isocenter, corresponding to C2-C3 vertebral level on the axial plane. The following parameters were used for EPI acquisitions: 72 axial slices angled orthogonal to the cord at C5 vertebral level, in-plane resolution: 1.6x1.6 mm2, matrix size: 120x120, 50% phase oversampling, slice thickness: 4 mm (no gap), TR: 1.85 s, 210 volumes, TE: 27 ms, multiband acceleration factor: 3, GRAPPA factor: 2, 0.875 partial Fourier encoding, FA: 70˚, PE direction: A-P. Additionally, 2 saturation pulses were applied in a V-shaped configuration to minimize ghosting and inflow artifacts related to blood flow in the major cervical vessels.
A 3D-MPRAGE sequence was used to acquire T1-weighted anatomical images covering the entire head as well as neck down to the T3 vertebral level using the following parameters: field of view (FOV): 192×240×320 mm3, sagittal slices; TR: 2300 ms, TE: 3.41 ms, and resolution: 1×1×1 mm3. In addition, a multi-echo data image combination (MEDIC) sequence was used to acquire a T2*-weighted (T2w) anatomical image using the following parameters: axial slices = 36, slice thickness = 4 mm (no gap), in-plane resolution: 0.75×0.75 mm2, matrix size: 256×256, TR: 1280 ms, TE: 18, 20, 21, and 22 ms; FA: 20˚, and GRAPPA acceleration factor: 2. Although the orientation and the size of slices were matched between the T2w and EPI images, the T2w image had fewer slices, covering only the cervical cord to keep the acquisition time down to 12 min. Importantly, the position of the T2w image FOV in the dorsocaudal orientation was set such that the position of the most caudal slice matches that of the EPI image.
Resting-state dataset. Data were also acquired using a 3-Tesla MRI Scanner (Magnetom-Prisma, Siemens, Erlangen, Germany) equipped with a 64-channel head (1–7 elements active) and neck coil (1–2 elements active). T2-weighted images were collected in the transverse orientation with the following parameters: TR = 33ms; TE = 14ms; FOV = 224 mm x 258 mm; flip angle = 5o; in-plane voxel resolution = 0.35 x 0.35 mm2, slice thickness = 2 mm. A total of 112 slices were acquired, spanning from the top of the cerebellum to the upper thoracic region (approximately at T1), thereby encompassing the entirety of the cervical spinal cord. Resting-state functional images were acquired using an echo-planar imaging (EPI) gradient echo sequence covering the brain and cervical spinal (TR/TE = 1550/23 ms, axial FOV = 120 x 120 mm2, flip angle = 70°, in-plane voxel resolution = 1.6 x 1.6 mm2; slice thickness = 4 mm (no gap); iPAT acceleration factor PE = 2, iPAT acceleration factor slice = 3). A total of 69 axial slices were acquired covering the top of the head to approximately T1 vertebra. Only the slices containing the cervical spinal cord were included in further analyses. One resting state-run (i.e., no explicit task) was acquired in total with an acquisition time of 10 minutes and 35 seconds each (230 volumes). Participants were instructed to relax watch the video and minimize swallowing and motion. Physiological recordings were acquired using a pulse sensor and a respiration belt (Siemens Physiology Monitoring Unit).
Spinal cord analysis pipeline
We utilized bash programming, the FMRIB Software Library (FSL) (Smith et al., 2004), Spinal Cord Toolbox (SCT version 6.0) (De Leener et al., 2017), and in-house MATLAB programs to process the collected data. The full spinal cord processing pipeline will be publicly available at www.vahdatlab.org. The main steps of the spinal cord processing pipeline are shown in Fig. 1, including preprocessing, registration to the spinal cord template (PAM50 from SCT toolbox), and regression analysis, as described below.
Preprocessing
Following removing of the first few volumes to reach a steady state condition (in our dataset, no volume was removed due to the inclusion of dummy scans during the acquisition), motion correction was performed on the EPI data in two steps (Fig. 1). First, a 3d linear rigid-body transformation was performed using mcflirt tool (FSL) on the entire brain and spinal cord FOV. Next, the EPI was divided into the spinal cord and brain FOVs. Slice-wise (2d) non-linear motion correction was then performed in a relatively large (23 voxels diameter) cylindrical mask of the spinal cord using the sct_create_mask (SCT). The performance of various motion correction methods (3d linear alone, slice-wise nonlinear alone, or both) was calculated and compared using two indices: i) DVARS that measures changes in BOLD signal intensity from one volume to the next (Eq. 1, (Afyouni and Nichols, 2018), and ii) tSNR (temporal signal-to-noise-ratio) which is the ratio of the mean by the standard deviation of the BOLD signal timeseries (\({Y}_{t}\)) in each voxel (Eq. 2), n being the number of acquired volumes.
Equation 1\(\text{D}\text{V}\text{A}\text{R}\text{S}=\sqrt{\frac{1}{n}\sum _{t=1}^{n}{({Y}_{t}-{Y}_{t-1})}^{2}}\)
Equation 2\(\text{t}\text{S}\text{N}\text{R}= mean\left({Y}_{t}\right)/std\left({Y}_{t}\right)\)
Following motion correction, gradient-echo field map correction was performed using FUGUE tool (FSL) to correct for geometrical distortion due to magnetic field inhomogeneities (Fig. 1). These artifacts are usually larger in the spinal cord compared to the brain due to the proximity to the thorax, vertebrae, and disks, and can cause a shift in the phase-encoding direction, which can be compensated at this step. Next, spatial smoothing was performed inside a mask of the spinal cord in a slice-wise fashion using 3dBlurToFWHM command from AFNI (Cox, 1996). A kernel size of 3.2 mm (2 times the voxel size) was used for smoothing, although a 2.4 mm kernel size resulted in very similar task-based activation maps.
Registration to template
The spinal cord registration pipeline is displayed in Fig. 2. Prior to registration, a spinal cord mask was generated for each of the T2w and T1w images using a deep learning segmentation algorithm as implemented in sct_deepseg_sc (SCT) with automatic centerline detection using the support vector machine and convolutional neural network options (Gros et al., 2019). Following visual inspection, manual corrections were carried out to address any minor inaccuracies in the spinal cord mask. Next, a centerline was drawn manually on the T2w and EPI images. To be consistent across participants, in the T2w image, the centerline in the dorsoventral direction was defined at a voxel location just posterior to the gray commissure, which was clearly visible in the acquired T2w image. For the EPI image, since the gray/white matter distinction is less clear, we used the calculated tSNR map to define the centerline. As shown in Fig. 3, the cord exhibits noticeably higher tSNR values compared to the surroundings, including the cerebrospinal fluid (CSF), blood vessels, and connective tissues. Hence, we identified the centerline at the midpoint of the area with the highest tSNR (yellow area in Fig. 3; when tSNR map is thresholded at 40, color-coded in red-yellow).
Registration to the template (PAM50) was performed in three steps (Fig. 2): 1) EPI image registration to the T2w space using slice-wise centerline images alignment, 2) T2w image registration to the T1w space through the cord segmentation images using sct_register_multimodal (SCT), and 3) T1w image registration to PAM50 through cord segmentation using sct_register_to_template (SCT). In detail, for step 1, EPI and T2w images were registered by performing slice-wise 2d in-plane translations (in the dorsoventral and mediolateral directions) to align their centerlines using flirt command (FSL) with the nearest-neighbor interpolation. For step 2, we manually identified the C1-T1 disc labels in both T1w and T2w images, and used their calculated cord segmentations images to compute a slice-wise rigid body transformation using sct_register_multimodal command (any mismatch in the rostrocaudal direction was first corrected by aligning T1w and T2w disk label locations). For step 3, the cord segmentation and disc labels in the T1w image were employed for registration to PAM50 through sct_register_to_template command, using the default options (including straightening, vertebral alignment, and non-linear slice-wise registration). Note that only step 1 was applied to the 4d EPI images. The steps 2 and 3 warping transformations were concatenated using sct_concat_transfo, and were later applied to the summary statistics images (regression coefficients and their variance images) using sct_apply_transfo command.
Subsequently, we compared the performance of the proposed registration method with the gold-standard spinal cord registration approach which only utilizes the T1W image (Landelle et al., 2023; Vahdat et al., 2015). This approach was implemented by calculating the transformations from the EPI to T1W image using sct_register_multimodal, and from T1W image to PAM50 as calculated above. The two transformations are then concatenated to register the EPI image directly to PAM50. To evaluate the registration performance, the binary cord segmentation mask in the EPI space was registered to PAM50 space using each method, and the result was compared to the binary template cord mask. Two indices were defined by calculating: i) the number of voxels in the template mask that are missed in the registered mask, normalized by the total number of template cord voxels, and ii) the overlap between the registered and the template cord masks, normalized by the total number of voxels in both masks.
Regression analysis
We performed participant-level general linear model (GLM) analyses on the preprocessed EPI data in the T2w space in two steps. First, a GLM was constructed to model and remove the effects of three sets of confounds regressors, including: i) the mean CSF signal averaged in a surround mask generated by dilating the mask of spinal cord (i.e., GM + WM), ii) 6 motion correction parameters (x, y, and z translations and rotations) derived from the linear motion correction step, and 2 slice-wise motion correction regressors generated by the non-linear motion correction step on the spinal cord FOV, and iii) 5 regressors, identified as confound, extracted from independent component analysis (ICA). The fastica algorithm (Hyvarinen, 1999) was used to decompose the preprocessed spinal cord EPI data (before spatial smoothing) within a large mask (including both the spinal cord and the surround mask) into 30 components (Vahdat et al., 2020). We calculated 3 indicators to systematically sort components based on their likelihood of being task-related, as opposed to those related to physiological/scanner noise: 1) spatial indicator: the ratio of significantly active (p < 0.05) voxels within the spinal cord mask to that within the surround mask, 2) frequency indicator: the ratio of the power of the component’s time-series frequency response around the task central frequency (in our block-design paradigm: 1 / (task period + rest period) = 1/40 s = 0.025 Hz; 0.01 < f < 0.1 Hz may be used for resting-state data) to that outside the task-related frequency range with some margins (f < 0.025/1.5 and f > 0.025*1.5 Hz), and 3) temporal indicator: the Pearson correlation between the task timing signal (boxcar function in our task) convolved with a canonical hemodynamic response function (HRF) and the time-series of each component. A confound index was defined by multiplying the 3 indicators; the lower the confound index, the higher the likelihood that component represents a confound. The 5 components with the least confound index were selected, and their time series were included in the first GLM as confound.
Next the residual (clean) image from the first GLM was high pass filtered at 1/60 Hz. The timing of the grip force block design task was modeled as a regressor of interest, while motion outliers were modeled as confounds. The latter were calculated using fsl_motion_outliers command on the non-preprocessed spinal cord EPI data (Fig. 1). The summary statistics images (Cope and VarCope images in FSL) related to the regressor of interest were then registered to PAM50 space using the concatenated warping transformation calculated in the registration step.
We generated a set of individual-specific masks to specify low tSNR voxels for each participant as follows. For each EPI run, a tSNR threshold was calculated using Maximum a Posteriori (MAP) estimation of two Gaussian distributions on the calculated tSNR histogram data inside a large spinal mask (including both the cord and the surround binary masks; Fig. 3). A Gaussian mixture distribution model using two components was fitted to the tSNR histogram using MATLAB fitgmdist function. The threshold was defined as the MAP estimate of the optimal boundary, which is the intersection of the two Gaussian distributions (Fig. 3). Low tSNR voxels (tSNR < threshold) were identified for every EPI run and participant. A subject-specific low tSNR mask was then defined as the intersection of low tSNR voxels from all EPI runs for each participant (a total of 2 runs in our data). Note that the two Gaussian distributions represent, hypothetically, the spinal cord and the CSF/surround voxels, with high tSNR voxels expected to be located inside the spinal cord mask as shown in Fig. 3. Hence, the goal of this step is to calculate an optimal threshold for differentiating spinal cord and CSF/surround voxels and identify those voxels inside the spinal cord mask that, inconsistently, possess low tSNR due to severe distortion and signal loss in the EPI spinal data.
Finally, a group-level GLM model was constructed to average individual-level maps using a mixed-effect (FLAME 1 + 2) model as implemented in FSL. The individual-specific low tSNR masks modeling the low tSNR voxels in each participant were entered as confound (Fig. 1). First we generated a group-level spinal cord mask that contains voxels, in which the majority of participants showed an acceptable tSNR (i.e., tSNR > threshold). This condition ensures that for every voxel in the group-level mask, high quality data from at least half of the participants were included for averaging, and, at the same time, limits the number of confound regressors to half the degrees of freedom in the group-level GLM. Hence, a total of up to n = (number of participants / 2) 4d voxel-wise regressors were included to account for and remove the effect of low tSNR voxels in every participants from group-level averaging. This procedure enabled us to only incorporate voxels with an acceptable tSNR in the group-level averaging, without the need to remove these voxels from the group-level mask/analysis. All group-level statistical maps were then corrected for multiple comparisons using Gaussian random field (GRF) theory, Z > 2.7, and cluster-level threshold p < 0.05.
Brain fMRI processing
Brain image preprocessing was performed with the FSL software package, using the same pipeline as described previously (Vahdat et al., 2020, 2015). One participant was removed from brain analysis due to partial coverage of brain FOV. Preprocessing included the following steps: skull stripping (FSL, BET), motion correction (FSL, MCFLIRT), high-pass temporal filtering (1/60 Hz), GRE field map correction, spatial smoothing using an isotropic Gaussian kernel of 3.2 mm full width at half maximum. BBR method (FSL) was used to register the EPI data to the T1w anatomical image, followed by a nonlinear registration (FNIRT, FSL) from the T1w image to the MNI template. These two transformations were concatenated to register the summary statistics images from the participant level analysis in the EPI space to the MNI template. Participant-level GLM was constructed to identify brain areas activated during motor control task using a boxcar function modeling our block-design paradigm. The 6 motion correction parameters (x, y, and z translations and rotations) derived from the linear motion correction step were included as confound in this analysis. Furthermore, volumes detected as motion outliers (described earlier) were included as confound in the GLM. Next, a group-level GLM model was constructed to average participant-level maps using a mixed-effect (FLAME 1 + 2) model as implemented in FSL. All group-level statistical maps were then corrected for multiple comparisons using Gaussian random field (GRF) theory, Z > 2.7 (voxel-wise uncorrected threshold), and a cluster-wise corrected threshold at p < 0.05.
Functional connectivity analysis
We performed region of interest (ROI)-based functional connectivity analysis to examine the communication between the spinal cord and brain BOLD time-series. We defined 11 a-priori selected ROIs in the brain shown to be part of the sensorimotor network using the MNI brain atlas and brainstem atlas (Cauzzo et al., 2022; Singh et al., 2021; Vahdat et al., 2020). The brain ROIs included: left precentral gyrus, left postcentral gyrus, supplementary motor area, left thalamus, right cerebellum lobule I-IV, right cerebellum lobule V-VI, left red nucleus, bilateral pontine reticular formation, bilateral medullary reticular formation. We also defined 2 ROIs in the spinal cord (right ventral horn C5-C8, right dorsal horn C5-C8 segmental levels). To restrict our analysis to subregions somatotopically related to the forelimb region, each ROI only included the most active voxels during handgrip force production by selecting voxels with the highest Z-score from the individual-level GLM analysis (top 10, 20, 30, or 40th percentile voxels within each ROI). We then calculated the average clean BOLD signal (the residual image from the first GLM) within each ROI at each percentile threshold. For each participant and run, the partial correlation (Marrelec et al., 2006; Vahdat et al., 2014) was calculated between each pair of the brain and spinal cord ROI’s time-series (22 pairs), by accounting for and removing the effects of task presentation (task timing convolved with HRF). The average partial correlation across 4 percentile thresholds were calculated and reported as the functional connectivity (FC) for each ROI pair. For each participant, FC values between the two runs were averaged and Fisher transformation was applied to meet normal distribution assumption. Finally, t-statistics were performed to identify significant functional connections during the motor task between the brain and spial cord ROIs across participants (corrected for multiple comparisons using Bonferroni correction, p < 0.05).
Resting-state analysis
Denoising pipelines. To investigate the impact of different sources of noise we compared the performance of different denoising methods. For each methods, the fMRI time-series was regressed with various variables of no-interest (confounds) using FSL's fMRI Expert Analysis Tool (FEAT). No filtering was applied at this step. All methods incorporated 8 motion-related regressors, 6 obtained from the linear motion correction step, and 2 slice-wise regressors from the SCT motion correction step, as described earlier. Next, we extracted the mean signal of the CSF in a surround mask generated by dilating the mask of the spinal cord. For each participant, the physiological noise modelling was calculated using the Tapas Physio toolbox (Kasper et al., 2017). We used the RETROspective Image CORrection (RETROICOR) procedure (Glover et al., 2000) to generate noise regressors from peripheral physiological recordings (heart rate and respiration). Specifically, we modeled, four cardiac, four respiratory, harmonics, and four multiplicative terms for the interactions between respiratory and cardiac noise (32 regressors in total, similar to (Brooks et al., 2008; Kaptan et al., 2023; Kong et al., 2012). To identify non-neural fluctuations, we extracted the first five principal components of the unfiltered and unsmoothed CSF signal (in the surround mask) in the participant’s native space using the CompCor approach (Behzadi et al., 2007). Additional regressors, identified as confound, were extracted from ICA in a dilated mask of the spinal cord as described in the Regression analysis section (3.89 ± 1.66 components).
Finally, a specific set of regressors of no-interest was accounted for in separate GLMs, corresponding to different denoising models as follow:
-
“MeanCSF” (baseline parameters, mean CSF)
-
“ICA” (baseline parameters, mean CSF, ICA components)
-
“CompCor” (baseline parameters, mean CSF, 5 CompCor components)
-
“Retroicor” (baseline parameters, mean CSF, 32 Retroicor components)
-
“ICA + CompCor” (baseline parameters, mean CSF, ICA and 5 CompCor components)
Variance of the model. For each method, we calculated the voxel-wise explained variance of the model (R2) as a performance metric to provide deeper insights into the impact of noise removal from various sources using the following equation:
Equation 3\({\text{R}}^{2}=[var\left({Y}_{t}\right) - var\left({\epsilon }_{t}\right)]/var\left({Y}_{t}\right)\)
Where \(var\left({Y}_{t}\right)\) and \(var\left({\epsilon }_{t}\right)\) denote respectively the variances of the fMRI times-series and the residual signal (res4d.nii in FSL) at each voxel, after applying the regression model associated with each method. Next, the average R2 was calculated inside the surround mask (i.e., the CSF mask), as well as the spinal cord mask. The ratio between the R2 of CSF and spinal cord (R2CSF / R2cord) is reported to evaluate the model's performance. A good model for removing the physiological noise would account for more variance in the CSF, while for less variance related to neural activation in the spinal cord. Hence, the larger the R2CSF / R2cord ratio, the greater the capacity of the model in explaining the BOLD signal variability inside the CSF compared to the spinal cord, and the better the model in removing the physiological noise.
Being aware of the limitations of the R2 as a quality check metric, such as its failure to account for each model's degree of freedom, we also calculated the adjusted variance of the model (AdjR2) as follows:
Equation 4\({\text{A}\text{d}\text{j}\text{R}}^{2}=1- \frac{(1-{\text{R}}^{2})(N-1)}{N-P-1}\)
Where N is the number of fMRI image time points, and P corresponds to the number of regressors of no-interests (confounds) included in each model. Unlike the R2, which measures the proportion of variance explained by the model, AdjR2 takes into account the degree of freedom of the model. This implies that with an increase in the number of regressors (resulting in the degree of freedom increase), the AdjR2 value will increase only if the additional regressors enhance the overall model performance. The AdjR2 was extracted in the CSF mask. t-statistics were performed to compare the R2 and AjdR2 between each model, and differences were considered significant with a p-value < 0.005, which was adjusted for multiple comparisons using Bonferroni correction (0.05 /10 = 0.005).
Resting-state functional connectivity. To assess the impact of denoising methods on functional connectivity measures, we conducted ROI-based analyses on the residual (cleaned) images. We defined 4 ROIs in the spinal cord (right ventral horn C5-C8, right dorsal horn C5-C8, left ventral horn C5-C8, left dorsal horn C5-C8) and 1 ROI in the surround mask cord (CSF C5-C8). We extracted and band-pass filtered (0.01–0.17 Hz)(Barry et al., 2018a) the time series within each ROI and calculated the average signal. Next, for each participant, we computed the partial correlation between each pair of spinal cord ROIs while the CSF average signal was used as a covariate. The Fisher-transformed correlation coefficients were employed to establish functional connectivity between each ROI pair, specifically between ventral horns (Ventro-Ventral FC), dorsal horns (Dorso-Dorsal DC), ventro-dorsal right horns (Right FC) and ventro-dorsal left horns (Left FC). For each FC, t-statistics were performed to compare the functional connectivity between each model (corrected for multiple comparisons using Bonferroni correction 0.05 /10 = 0.005).