We often switch between tasks in our daily lives. For instance, we may switch between responding to emails and fielding phone calls from colleagues. Although the tasks may be relatively simple, laboratory studies reveal that switching between tasks results in significant increases in response time (RT) and error rate (ER)—termed switch costs (for reviews, see Kiesel et al., 2010; Koch, Poljac, Müller, & Kiesel, 2018; Monsell, 2003).
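In operational terms, the switch cost is typically computed as the difference in mean performance between task-switch and task-repeat trials; for RTs, for example,

\[
\text{switch cost} = \overline{RT}_{\text{switch}} - \overline{RT}_{\text{repeat}},
\]

with an analogous difference computed for ERs.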
Modality pairing effects and crosstalk. Task switches frequently involve different stimulus and response modalities. That is, we might switch between a task that requires responding manually to visual stimuli and a task that requires responding vocally to auditory stimuli. Notably, the modalities of the stimuli and responses have been shown to dramatically affect the magnitude of switch costs. Stephan and Koch (2010), for instance, observed smaller switch costs when participants switched between visual-manual and auditory-vocal tasks than when participants switched between visual-vocal and auditory-manual tasks, even though single-task RTs were similar across all stimulus-response (S-R) pairings (see also, Fintor, Stephan, & Koch, 2018, 2019; Friedgen, Koch, & Stephan, 2021, 2022; Schaeffner, Koch, & Philipp, 2018; Stephan & Koch, 2011, 2016). Analogous findings have been observed in mixing costs (Schacherer & Hazeltine, 2019) and dual-task costs (Göthe et al., 2016; Hazeltine et al., 2006; Schacherer & Hazeltine, 2020; Stelzel et al., 2006).
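Expressed quantitatively, the modality pairing effect in these studies is the difference in switch costs between the two types of task pairings:

\[
\text{modality pairing effect} = \text{switch cost}_{\text{visual-vocal/auditory-manual}} - \text{switch cost}_{\text{visual-manual/auditory-vocal}}.
\]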
These modality pairing effects may arise from the relationship between the modality of the stimuli and the modality of the action effects, the sensory consequences that follow responses (Stephan & Koch, 2010, 2011; Schacherer & Hazeltine, 2020). For example, vocal responses typically produce auditory action effects (e.g., the sound of speech) and manual responses typically produce action effects with a spatial component (e.g., tactile and proprioceptive feedback to the effector alongside corresponding visual changes in the environment). Accordingly, in visual-manual and auditory-vocal tasks, similar stimulus and effect modalities exist within tasks (visual-manual: visuospatial information; auditory-vocal: sound information), whereas in visual-vocal and auditory-manual tasks, similar stimulus and effect modalities exist across tasks (both tasks contain visuospatial and sound information).
Because switch costs are larger when there is modality overlap across tasks, it has been proposed that modality pairing effects emerge from crosstalk. Crosstalk refers to interference from similar or overlapping codes (Logan & Gordon, 2001; Navon & Miller, 1987) and is hypothesized to be reduced when switching between visual-manual and auditory-vocal tasks because similar modality-specific information is restricted to within tasks. That is, with visual-manual and auditory-vocal tasks, the modality of a stimulus (e.g., auditory) is the same as the modality of the response-related action effect (e.g., auditory feedback from a vocal response) within a task. In contrast, when switching between visual-vocal and auditory-manual tasks, the modality of a stimulus is the same as the modality of the response-related action effect in the other task, which increases the degree of crosstalk. Yet, although several studies propose crosstalk as the source of modality pairing effects, the exact mechanism by which this crosstalk operates is unclear. At present, two accounts have been put forth: stimulus-effect priming (hereafter, SE priming) and central crosstalk. The goal of the present study was to adjudicate between these two accounts to gain a mechanistic understanding of modality pairing effects in task-switching.
According to the SE priming account (e.g., Fintor, Stephan, & Koch, 2018; Stephan, Josten, Friedgen, & Koch, 2021; Stephan & Koch, 2016), perceiving a stimulus and anticipating a response-related effect in modalities that are the same or compatible (e.g., visual stimulus, visuospatial effect from a manual response) may facilitate performance because they prime each other at the level of the shared feature (here: visuospatial). In contrast, when the modalities of these codes differ within a task, each code primes the activation of the similar code in the competing task. For example, when coordinating visual-vocal and auditory-manual tasks, participants may anticipate the auditory action effect from the vocal response (in the visual-vocal task), which then primes processing of the auditory stimulus in the opposing auditory-manual task, increasing the magnitude of crosstalk (for a similar idea in dual-tasking, see Wirth, Koch, & Kunde, 2020).
Two non-mutually exclusive explanations for these priming effects have been proposed to account for modality pairing effects: effector-set priming and stimulus-uptake facilitation (Wirth et al., 2020). According to the effector-set priming explanation, the presentation of a stimulus in a particular modality primes the compatible (response-)effect set. That is, a visual stimulus activates the processing of visuospatial effects (typically from a manual response), and an auditory stimulus activates processing of auditory effects (typically from a vocal response). Support for this proposal comes from studies demonstrating that perceiving visual and auditory stimuli engages premotor areas involved in hand movements and articulation, respectively, suggesting that the perception of stimulus events is tightly coupled to the activation of modality-compatible actions (Schubotz & von Cramon, 2002; see also, Schubotz, 2007).
Alternatively, according to the stimulus-uptake facilitation explanation, when the modality of the anticipated effect is predictable, as it is when the response modalities for the two tasks overlap (e.g., when both tasks involve manual responses), processing of (response-)effect-compatible stimuli may be facilitated because it is known at the start of each trial which effector system must be activated. Thus, anticipating visual action effects facilitates processing of visual stimuli and anticipating auditory action effects facilitates processing of auditory stimuli. Evidence for this proposal stems from the observation that preparing for certain types of actions facilitates processing of the corresponding stimulus features. For instance, preparing grasping actions facilitates processing of visual size, whereas preparing reaching actions facilitates processing of visual location (Fagioli, Hommel, & Schubotz, 2007). This suggests that action planning can prime perceptual sensitivity to stimulus events that are directly related to the intended action.
In the context of modality pairing effects in task-switching, the two priming explanations hold either that the perception of a stimulus biases the activation of the modality-compatible action effect (effector-set priming) or that the anticipation of an action effect biases the activation of the modality-compatible stimulus (stimulus-uptake facilitation). Either way, this biasing of code activation for the subsequent trial increases switch costs when the biased codes belong to competing task sets (i.e., increased crosstalk).
The central crosstalk account (e.g., Hazeltine et al., 2006; Schacherer & Hazeltine, 2020, 2021, 2023), on the other hand, is inspired by theories holding that all task-relevant features—e.g., stimuli, responses, and action effects—are integrated into the representations engaged by central operations—i.e., the cognitive mechanisms linking perception and action (Frings et al., 2020; Hommel, 2004; Hommel et al., 2001; Prinz, 1990; Schumacher & Hazeltine, 2016). When coordinating tasks that contain similar codes, such as auditory-manual and visual-vocal tasks (both of which contain visuospatial and sound information), the central operations activated by the stimulus (e.g., auditory) in one task interfere with the central operations activated by the response-related action effect (e.g., auditory effects from a vocal response) in the other task, thereby increasing the degree of cross-task interactions. In contrast, when similar stimulus and effect modalities exist within but not between tasks, similar codes are easily mapped to one another, reducing the crosstalk between central codes.
Action effects and response selection. Both the SE priming and central crosstalk accounts rely on the idea that events following the production of a response—i.e., action effects—are included in representations used by response selection processes (e.g., James, 1890; Hommel, 2004; Hommel et al., 2001). That is, selecting and executing an action involves the anticipation of its corresponding sensory effects. For instance, pianists select their actions (key presses) based on the anticipated effects of those actions (musical notes).
Empirical support for the proposition that effect representations are anticipated and retrieved during response selection comes from research using response-effect compatibility procedures (for review, see Pfister, 2019). In these tasks, responses are followed by manipulated (i.e., experimentally induced) action effects. That is, researchers present an additional perceptual event that consistently follows a response (e.g., a light that consistently follows a key press). Often these manipulated effects overlap with responses on some feature dimension, such as spatial position (e.g., Hommel, 1993), sensory intensity (e.g., Kunde, 2001), or conceptual identity (e.g., Koch & Kunde, 2002). When effects overlap on some feature dimension with the responses that produce them, RTs are faster than when they do not. For example, spoken word responses followed by semantically compatible effects (e.g., spoken word “green” response, spoken word “green” effect) are faster and less error-prone than when followed by semantically incompatible effects (e.g., spoken word “green” response, spoken word “red” effect; Koch & Kunde, 2002). Thus, even though these manipulated effects are not presented until after response production, they appear to cause interference when they are incompatible with features of the response, suggesting that action effects are involved in the selection and generation of actions.
Researchers can also manipulate the compatibility between stimuli and manipulated effects. In a study assessing stimulus-effect modality compatibility and dual-task costs (i.e., the costs associated with performing two tasks concurrently versus each task in isolation), we (Schacherer & Hazeltine, 2020) varied the compatibility between stimulus and manipulated effect modalities (while holding response modality constant). When stimulus and effect codes were compatible within tasks (e.g., visual stimulus, visual effect), dual-task costs were reduced relative to when they were incompatible within tasks (e.g., visual stimulus, auditory effect). These findings persisted across different stimulus-response pairings (e.g., when both tasks required manual responses), suggesting that modality pairing effects in dual-tasking reflect the compatibility between stimulus and effect modalities, independent of the underlying motor codes (i.e., responses).
We interpreted the greater costs in modality-incompatible stimulus-effect pairings as reflecting increased crosstalk between the central codes for the two tasks. Because visual stimuli and visuospatial effects share a modality (vision) and auditory stimuli and auditory effects share a modality (audition), central operations could easily map the similar stimulus and action effect codes to each other within individual tasks. In contrast, when visual stimuli were mapped to auditory effects and auditory stimuli to visual effects, the codes crossed tasks and their near-simultaneous activation increased the degree of cross-task interactions. In other words, when the central codes for the two tasks could be kept separate, dual-task costs were reduced (Halvorson & Hazeltine, 2015, 2019).
Alternatively, the findings from Schacherer and Hazeltine (2020) may instead reflect SE priming. For example, Wirth and colleagues (2020) proposed that modality pairing effects in dual-tasking emerge because stimuli and action effects prime one another at the level of the shared modality (e.g., visuospatial, auditory). When these codes are compatible within a task, this reduces the degree of cross-task interference, allowing the selection of the second task to occur earlier in time, reducing dual-task costs. According to this account, dual-task costs in Schacherer and Hazeltine (2020) were reduced for modality-compatible tasks because anticipating a manipulated visual effect primed the activation of the visual stimulus and anticipating a manipulated auditory effect primed the activation of the auditory stimulus (stimulus-uptake facilitation). The SE priming account also allows for the possibility that the stimulus facilitates the activation of the modality-compatible manipulated effect (effector-set priming), which, in turn, speeds response selection. In contrast, dual-task costs are increased for modality-incompatible tasks because the anticipated effect primes the activation of the stimulus (or vice versa) in the opposing task, increasing the degree of crosstalk.
Design of the present study. Although both the priming and central crosstalk accounts make straightforward predictions concerning modality pairing effects in task-switching, adjudicating between these two accounts has proven difficult, largely because no previous study has systematically manipulated the compatibility between stimuli and action effects in a task-switching paradigm. To address this, we adopted the methods used by Schacherer and Hazeltine (2020), in which the compatibility between the modality of the stimulus and the modality of the manipulated action effect is varied. That is, in some task pairings, stimuli and effects were modality-compatible (e.g., visual stimulus, visual effect), whereas in other task pairings, stimuli and effects were modality-incompatible (e.g., visual stimulus, auditory effect). Because we were interested in assessing the effects of stimulus-effect modality compatibility rather than stimulus-response or response-effect modality compatibility, we used tasks that required only manual responses. Thus, the response modality was held constant across tasks, conditions, and experiments, and there was no bias for stimulus-response or response-effect priming in any task, as only one response modality was active in working memory (see Fintor, Stephan, & Koch, 2018).
In all experiments, participants were instructed to produce the action effect assigned to the presented stimulus. For example, in one version of the visual task, participants were instructed to produce a visual leaf effect in response to a visual leaf stimulus, whereas in another condition, participants were instructed to produce a spoken word “leaf” effect in response to that same visual leaf stimulus. Across tasks and experiments, we manipulated the modality compatibility between stimuli and effects (e.g., Schacherer & Hazeltine, 2020).
To examine the contribution of specific stimulus-effect pairings to performance, we examined switch costs for individual tasks, rather than averaging across the two tasks, as is typical in studies assessing modality pairing effects in task-switching (e.g., Schacherer & Hazeltine, 2019; Stephan & Koch, 2010, 2011). By analyzing switch costs for individual tasks, we could identify task-specific priming effects on performance. If switch costs are larger for one task than for the other, this would suggest that the stimulus incorrectly primed the activation of the corresponding effect (or vice versa) in the opposing task. If, however, switch costs are similar across the two tasks, the effects of SE priming would appear minimal.
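To illustrate this task-level analysis, the following is a minimal sketch in Python; the trial-level table and its column names (task, transition, rt) are hypothetical and not taken from the present materials:

```python
import pandas as pd

# Hypothetical trial-level data; column names and values are illustrative only.
trials = pd.DataFrame({
    "task":       ["visual", "visual", "auditory", "auditory", "visual", "auditory"],
    "transition": ["repeat", "switch", "repeat", "switch", "switch", "repeat"],
    "rt":         [612, 745, 598, 802, 760, 605],  # response times in ms
})

# Mean RT for each task x transition cell.
mean_rt = trials.groupby(["task", "transition"])["rt"].mean().unstack("transition")

# Task-specific switch cost: mean switch RT minus mean repeat RT,
# computed per task rather than averaged across the two tasks.
switch_cost = mean_rt["switch"] - mean_rt["repeat"]
print(switch_cost)
```

In this framing, markedly asymmetric costs across the two tasks would be consistent with SE priming, whereas similar costs for the two tasks would point toward central crosstalk.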
To evaluate the effects of SE priming on modality pairing effects in task-switching, we conducted three experiments. In Experiment 1, we aimed to determine whether modality pairing effects in task-switching are observed when manipulating the compatibility between stimuli and manipulated action effects, as has been previously observed in dual-tasking (Schacherer & Hazeltine, 2020). To preview the results, we observed smaller switch costs when stimuli and effects were modality-compatible (visual stimulus-visual effect, auditory stimulus-auditory effect) compared to when they were modality-incompatible (visual stimulus-auditory effect, auditory stimulus-visual effect). Experiments 2 and 3 were designed to test whether these modality pairing effects reflect SE priming or central crosstalk. If modality pairing effects reflect SE priming, we would expect to observe asymmetrical costs for the individual tasks, depending on the degree of overlap across tasks. If, however, modality pairing effects instead reflect crosstalk between central codes, costs should be similar across the two tasks, because the central codes for the two tasks include overlapping stimulus and/or effect representations.