The effectiveness of behavioural interventions: data collection, extraction and organization.
We performed an exhaustive review and meta-analysis of the literature on behavioural interventions in urban transport mode choice (for details, see Methods). Our strategy for collecting a diverse body of work is twofold. First, we conducted a review of reviews of behavioural interventions, identifying and coding 37 literature reviews. Using these reviews, we identified sources (studies, reports, etc.) from which to extract information about the interventions and their effectiveness in changing transport behaviour. From these studies and reviews, we extracted 400 cases of transport behaviour change interventions. Second, we conducted a systematic review of recent literature (see Methods), targeting original studies published from 2015 onwards, so as to cover the period since the last comprehensive review in our database. From this second review, we identified and coded 41 original studies published after 2015.
As explained in Fig. 1, our data extraction strategy focuses on collecting and arranging three types of information: (i) evidence about the effectiveness of behavioural interventions on transport behaviour, which includes information about intervention types and travel choices; (ii) contextual information, such as intervention details (where the intervention took place, for how long, etc.); and (iii) information for evidence evaluation, mostly related to study characteristics. Following previous literature, we classify interventions by the type of behavioural strategy or instrument used (Information, Advice & goals, Promotional programs & activities, Incentives, Soft infrastructure improvements). Supplementary table A.3 briefly describes the components of behavioural interventions reported in the literature.
Studies of behavioural interventions in transport report a multitude of travel behaviour indicators that can be used to evaluate the effectiveness of interventions. We restrict our evaluation to four types of transport modes (car use, public transport use, biking and active travel/walking) and three outcome indicators (percentage growth in mode use, change in mode split, change in individual's transport behaviour[1]) (Supplementary table A.5a). While average growth in mode use is a useful indicator, it is by no means the most important way to evaluate change in transport behaviour, because travel mode choices are interconnected: growth in the use of one mode does not necessarily mean that its use relative to other modes (mode shift) changes in the same way. For this reason, we collect separate estimates of mode shift. Here we distinguish three classifications, taking into consideration the outcome variable, the sampling strategy and the scope of the intervention (more details in Methods and supplementary files): individual-level estimates, which apply only to the individuals sampled; site-specific estimates, which are based on data collected at the site of the intervention and apply only to transport behaviour at that site; and area-wide estimates (as explained in Supplementary table A.5b). This distinction is necessary because calculating mode split relies on information gathered at the collective level, for instance from employees or school children at the site where the intervention took place (site-specific estimate) or from a more general population of transport users in the area (area-wide estimate).
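The difference between percentage growth in mode use and mode shift can be illustrated with a small calculation. The sketch below uses hypothetical trip counts (not data from our sample) and assumes that mode share is computed from trip counts; the function names are illustrative only.

```python
def mode_share(trips):
    """Return each mode's share of all trips, in percent."""
    total = sum(trips.values())
    return {mode: 100.0 * count / total for mode, count in trips.items()}


def percentage_growth(before, after):
    """Relative change in the use of a single mode, in percent."""
    return 100.0 * (after - before) / before


# Hypothetical trip counts at one site before and after an intervention.
trips_before = {"car": 700, "public_transport": 200, "bike": 40, "walk": 60}
trips_after = {"car": 680, "public_transport": 205, "bike": 50, "walk": 65}

# Growth looks only at the mode itself.
bike_growth = percentage_growth(trips_before["bike"], trips_after["bike"])

# Mode shift compares the mode's share of all trips, in percentage points.
share_before = mode_share(trips_before)
share_after = mode_share(trips_after)
bike_shift = share_after["bike"] - share_before["bike"]

print(f"bike growth: {bike_growth:.1f}%")               # 25.0%
print(f"bike mode shift: {bike_shift:.1f} pct. points")  # 1.0
```

In this hypothetical case a 25% growth in bike trips corresponds to a mode shift of only one percentage point, which is why the two indicators are reported separately.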
Overall, we collect more than 1000 estimates related to different aspects of travel behaviour change (578 estimates for percentage growth in different modes, 660 mode shift estimates). In 44% of the cases, we can trace the estimate back to the original source, i.e., the paper or report that gives details on the research design, the specifics of the treatments applied and the findings. The vast majority of these estimates are from academic studies. A further 10-20% of the estimates are from studies where some details are missing, mostly in terms of research design, rendering a proper critical analysis difficult or impossible. Most of these estimates originate from public or privately conducted reports, which in many cases describe the intervention itself in greater detail. Around 30% of the estimates come from meta-analyses, systematic reviews or reports where significant details on the key elements are missing and there is no access to the original source to verify the reported results. In terms of study design, more than half of the estimates (55%) come from studies with some form of before & after design. Out of these, at least 20% indicate that they include some form of control group, while the remaining 30% either do not include a control group or do not mention it. Additionally, 10% employ a quasi-experimental design. Nearly 15% of the estimates employ an experimental design. For 10% of the estimates, the study design is not mentioned or clarified.
Our final sample represents research conducted in more than 100 different cities across 37 different countries. However, most of our estimates come from the UK, Germany, the Netherlands and the US. Only 8% of the estimates in our sample are from Asia (or countries outside the EU, North America & Australia), and even then they originate mainly from studies in two countries (Japan and China).
Behavioural interventions are quite effective at promoting the growth of sustainable transport modes. However, evidence from area-wide estimates shows that modal shift is harder to achieve.
Fig. 2 (a) shows that the average effect size for percentage growth in car-use is estimated to be around -10% (mean Mcar = -10.6 ± 13.1; n = 175). Similarly, for alternate travel modes, the reported effect size is quite large for average percentage growth in mode use, ranging from a 35% increase in bike mode use (mean Mbike = 35.6 ± 84.4; n = 112) to a 10% increase in walking mode (mean Mactive = 11.6 ± 19.8; n = 147). The effect size for walking is especially sensitive to the model used for estimation. Restricting our sample to estimates extracted from original studies and reports does not have any substantial impact on effect size calculations as a whole.
A major complication in interpreting these results is that many of the studies do not report baseline travel mode use. This makes it hard to assess the actual change in behaviour relevant to climate change mitigation potential. To get a better assessment, we restrict our sample to estimates for which the baseline travel mode share is available (Supplementary section B.1). This restriction reduces the sample size for growth estimates substantially. The estimates for car-use and walk mode from this reduced sample are similar to those from the full sample, whereas the average growth estimates for public transport mode and bike mode are reduced. Furthermore, this analysis indicates that for bicycle use, higher growth estimates are from studies where the baseline bicycle use is low (less than 10% bike mode share, with the highest estimates coming from studies with baseline bike share lower than 5%), while the few estimates from studies with high baseline bike share do not indicate high growth potential. This has two implications. First, even small changes in travel behaviour are magnified when the baseline mode share is low, because the same absolute change translates into a large growth estimate. Second, it can also mean that behavioural interventions are more important in an environment with low bicycle uptake, as they can capture the low-hanging fruit of individuals who are relatively easy to persuade, motivated or inclined to change.
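The magnification at low baseline shares follows directly from the arithmetic. As a minimal sketch, assuming the total number of trips stays constant so that a change in mode share maps one-to-one to a change in mode use, the same one-percentage-point gain yields very different growth figures; the baseline values are hypothetical.

```python
def growth_from_shift(baseline_share, shift_pp):
    """Percentage growth in mode use implied by a change in mode share.

    Assumes total travel volume is unchanged, so growth in use equals the
    relative change in share. Both arguments are in percent / percentage points.
    """
    return 100.0 * shift_pp / baseline_share


# The same absolute gain of one percentage point in bike mode share ...
low_baseline_growth = growth_from_shift(baseline_share=2.0, shift_pp=1.0)
high_baseline_growth = growth_from_shift(baseline_share=20.0, shift_pp=1.0)

# ... reads as 50% growth at a 2% baseline but only 5% growth at a 20% baseline.
print(low_baseline_growth, high_baseline_growth)
```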
Fig. 2 (b) & (c) reveal that the average effect sizes for mode shift are substantially lower than the average percentage growth estimates across different transport modes. Furthermore, for sustainable transport modes other than bikes, the average effect size for mode shift is lower for area-wide studies than for site-specific studies. For car use, studies that report area-wide mode shift (-4.7; 95% confidence interval (CI) = (2.022, 8.22)) find a lower average effect size than those that report mode split at the site-specific level (-7.08; 95% CI = (-7.11, -5.30)). However, even though the car-shift estimates are lower than the car growth estimates, they are still substantial, especially when we consider area-wide estimates. On the other hand, mode shift estimates for public transport are much lower, both for area-wide (1.02; 95% CI = (-0.31, 2.2)) and for site-specific estimates (3.07; 95% CI = (2.3, 5.1)). For biking, the average change in mode split at the area level is higher than the site-specific estimates. Typically, mode split estimates range between 1% and 3%. Our main results are robust to influential study analysis (see supplementary appendix). However, the results are likely to be subject to publication bias, which has been reported earlier in this literature, and to incomplete reporting of all the different changes in transport behaviour (Supplementary file section B.3).
Meta-regression analysis reveals that baseline transport mode share is a key variable in explaining the heterogeneity in intervention effectiveness.
Mode shift estimates for all travel modes are affected by the baseline mode share. Table 1 reports the meta-regression model used to explain the heterogeneity in estimates across studies. Higher baseline usage of a transport mode is associated with a larger effect of behavioural interventions for all transport modes except biking. Baseline car mode share (car_base) has a statistically significant effect on car-use modal shift: studies that start with higher baseline car use also report a larger reduction in car mode share. Baseline active travel mode use has a similar positive effect on the modal shift estimates reported for active travel. The same is true for baseline public transport mode use (PT_base) and modal shift in public transport use, although the effect size is much smaller than for car and active travel modes. In contrast, the coefficient for baseline bike mode use (Bike_base) is negative (β = -1.53 ± 0.01; p-value < 0.05), indicating that a higher initial bike mode share leads to lower estimates of bike mode shift after the intervention.
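To make the idea of a meta-regression on baseline mode share concrete, the following sketch fits an inverse-variance weighted regression of effect size on a single moderator. It is an illustration only: the data are hypothetical, the function name is ours, and the model is a simple fixed-effect fit with one moderator, whereas the model in Table 1 includes several moderators and may be specified differently.

```python
def weighted_linear_fit(x, y, se):
    """Inverse-variance weighted least squares for y = intercept + slope * x.

    x  : moderator values (e.g. baseline mode share, in percent)
    y  : effect sizes (e.g. mode shift, in percentage points)
    se : standard errors of the effect sizes; weights are 1 / se**2
    """
    w = [1.0 / s ** 2 for s in se]
    w_sum = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical study-level data: baseline car share and car-use mode shift.
baseline_car_share = [40.0, 55.0, 60.0, 75.0, 85.0]
car_mode_shift = [-2.0, -4.5, -5.0, -8.0, -9.5]
standard_error = [1.0, 1.5, 0.8, 2.0, 1.2]

intercept, slope = weighted_linear_fit(
    baseline_car_share, car_mode_shift, standard_error
)
# A negative slope means a higher baseline goes with a larger reduction.
print(f"intercept = {intercept:.2f}, slope = {slope:.3f}")
```

Weighting by the inverse of the squared standard error gives more precise estimates more influence on the fitted slope; a random- or mixed-effects specification would additionally add a between-study variance term to each weight.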
Our analysis indicates that, compared to other intervention settings, interventions delivered at a community scale are associated with a higher impact on mode share across different transport modes. Similarly, interventions conducted in the workplace setting also tend to have a higher impact on mode share, except for active travel mode. In both cases, the coefficients are not statistically significant. In terms of geographical location, we find that estimates from the UK typically tend to be higher for all four transport modes than estimates from the EU. Similarly, estimates originating from the US tend to be higher. However, the coefficients, especially for car-use mode shift, are very small. Here again, most of the coefficients are not statistically significant at conventional levels. In summary, we do not find any systematic differences in the effectiveness of interventions based on intervention settings or locations across all transport modes.
Concerning study characteristics, there are a few noteworthy differences. First, we do not find any systematic differences between different sources of estimates, meaning that estimates for which we can find the original source do not differ significantly from estimates that are based only on information from other review studies. There is great variance in study design in our sample, and in some cases it helps explain the variation in estimates. In general, we find that, as expected, more robust study designs (purely in terms of empirical strategy), especially those that have a control group (experiments or controlled before/after designs), are associated with lower estimates than studies relying purely on a before/after design. For instance, the coefficients for CBA and RCT are negative and statistically significant for PT and car travel modes. While the RCT coefficients for other modes are not statistically significant, they are negative as expected.
| Variable | Car | Public Transport | Bike | Active |
|---|---|---|---|---|
| Intercept | 325.66 (308.19) | 3.54 (3.64) | 869.00 (677.73) | -197.66 (313.76) |
| Region (base = EU) | | | | |
| Region.Asia | 6.65 (12.27) | -6.63 (7.68) | 15.28 (19.14) | |
| Region.Australia | -1.60 (3.17) | 3.69 (3.16) | -1.14 (7.92) | 0.22 (2.72) |
| Region.UK | -5.90** (2.85) | 11.24 (10.77) | -3.97 (12.35) | 5.92 (3.79) |
| Region.US | -0.06 (3.63) | -1.67 (2.34) | -27.33*** (5.58) | 2.27 (2.48) |
| scope_sum (base = ind) | | | | |
| scope_sum.school-uni | 3.97 (3.14) | -2.34 (3.84) | 2.45 (8.41) | 1.39 (4.21) |
| scope_sum.Workplace | -0.46 (2.68) | 0.68 (4.31) | 6.46 (6.07) | -0.05 (3.85) |
| scope_sum.res-community | -1.94 (6.36) | 5.70 (5.51) | 24.69 (15.14) | 2.96 (5.53) |
| scope_sum.area-wide | 2.22 (2.96) | -2.45 (4.46) | -1.44 (6.45) | 2.29 (3.72) |
| scope_sum.NI | 3.47 (3.28) | | | |
| Study_design_code (base = BA)[2] | | | | |
| Study_design_code_2.CBA | 3.01* (1.83) | 3.57 (3.98) | 0.24 (7.43) | -2.93 (3.48) |
| Study_design_code_2.CS | -8.58 (5.60) | 2.57 (2.58) | 8.30 (12.83) | 7.82 (8.25) |
| Study_design_code_2.CCS | -4.42 (9.20) | -6.04 (7.53) | -22.01 (23.61) | -6.07 (5.93) |
| Study_design_code_2.Quasi-experiment | 6.32 (6.30) | -13.09 (12.43) | -4.96 (15.92) | 2.08 (5.90) |
| Study_design_code_2.RCT | 0.45 (4.51) | -7.56* (4.42) | -28.62 (17.52) | -3.95 (3.44) |
| Study_design_code_2.Experiment | -5.05 (17.10) | | | |
| Study_design_code_2.post intervention | 6.46 (5.53) | -4.11 (3.81) | | -2.77 (3.71) |
| Study_design_code_2.unclear | -1.91 (2.15) | -0.26 (4.95) | | -10.43*** (3.58) |
| Study_design_code_2.natural experiment | | | -5.50 (16.82) | |
| Study_design_code_2.other | | | -5.54 (15.15) | -8.04 (6.61) |
| Year | -0.16 (0.15) | | -0.43 (0.34) | 0.10 (0.16) |
| factor(info.type).2 | 1.80 (3.77) | -13.79 (11.02) | -10.74 (12.85) | -2.17 (4.11) |
| factor(info.type).2.5 | | | | -6.89 (7.15) |
| factor(info.type).3 | 2.04 (3.72) | -5.47 (4.94) | -8.31 (10.29) | 3.62 (4.64) |
| car_base | -0.14*** (0.00) | | | |
| Bike_base | | | -1.53*** (0.01) | |
| PT_base | | 0.01** (0.00) | | |
| active_mode_base | | | | 0.09** (0.04) |
| No. of Effects | 274 | 78 | 78 | 93 |
Table 1 | Results from the meta-regression model. The dependent variable differs by column: (i) car-use mode shift, (ii) public transport mode shift, (iii) bike mode shift, and (iv) active travel mode shift.
Interventions with an incentive component are effective at increasing the use of public transport and reducing car use
Our analysis shows that most of the interventions are a combination of different behavioural interventions and/or incentives (identified and defined by us based on theory; complementary files). Out of 515 cases of mode shift estimates where details of the intervention are available, we find that information provision is the most commonly reported intervention component, followed by (peer, expert) advice (or goal-setting) and soft-infrastructure improvements. Incentives and promotional activities are employed the least. In terms of combinations, information provision is most often accompanied by advice and soft infrastructure improvements (Fig. 3b). At the same time, it is the component most often implemented alone (in 20-25% of all the cases where an information provision intervention is reported) (Fig. 3a). All other intervention components are typically accompanied by at least one or two more intervention components. The most extreme example of this tendency is the soft infrastructure component, which is accompanied by other intervention components in 95% of the cases.
Given the tendency to combine different intervention components, it is challenging to extract the individual impact of each intervention component. Fig. 3 (c) shows the average effect size for interventions in which a given component is present. Starting with car use reduction, we find that interventions with incentives have the highest impact (12.4; 95% confidence interval (CI) = (9.8, 15.02)). These are followed by interventions with a (peer or expert) advice and goal-setting component. Soft-infrastructure improvements and promotional activities tend to have a similar impact to each other. Lastly, out of all the intervention components considered, interventions with different types of information provision tend to have the lowest reported impact on car-use mode shift. For public transport mode shift, there is also stronger evidence for the effectiveness of incentive interventions. The effect size for interventions with an incentive component (6.1; 95% CI = (4.1, 8.1)) is significantly higher than for other intervention components. Other than incentives, interventions with soft infrastructure and information provision seem to have high effectiveness as well.
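The "component present" averages can be sketched as follows. This is a simplified illustration with hypothetical records and a normal-approximation confidence interval (mean ± 1.96 standard errors); the intervals reported here may have been computed differently. Because each estimate is counted under every component it contains, the group averages overlap and do not isolate the effect of a single component.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI (needs at least two values)."""
    mean = statistics.fmean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width


def effect_by_component(records):
    """Average effect size over all estimates in which a component is present.

    records: iterable of (set_of_components, effect_size) pairs.
    An estimate from a combined intervention counts once per component.
    """
    grouped = {}
    for components, effect in records:
        for component in components:
            grouped.setdefault(component, []).append(effect)
    return {c: mean_with_ci(v) for c, v in grouped.items() if len(v) >= 2}


# Hypothetical mode shift estimates (percentage points of car-use reduction).
records = [
    ({"information"}, 3.0),
    ({"information", "advice"}, 6.0),
    ({"information", "incentive"}, 11.0),
    ({"incentive", "soft_infrastructure"}, 13.0),
    ({"advice", "soft_infrastructure"}, 8.0),
]

for component, (mean, low, high) in sorted(effect_by_component(records).items()):
    print(f"{component}: {mean:.1f} (95% CI {low:.1f}, {high:.1f})")
```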
The results for biking are inconclusive: we do not find any major differences in effectiveness between different interventions. For active travel mode, we find that interventions with soft infrastructure improvements increase active travel more than other intervention components. On the other hand, interventions with incentives show a negative impact on active travel modes. Since most of the incentive treatments in our sample are directed at increasing public transport use, the negative impact on active travel mode may be a side-effect of their effectiveness in increasing public transport use.
Critical Appraisal
Some limitations should be considered when interpreting the data and designing future studies. First, most of the studies do not provide the full information needed to properly assess the quality of the evidence or the details of the intervention. This extends to the method of evaluation, especially when it comes to estimates from reports. Even when narrowing down to studies where the original source is available, we can find all the information needed for a critical appraisal of the results of behavioural interventions on transport behaviour in only 30-40% of the cases. Often information is missing on how and when the intervention was implemented, what exactly the intervention entailed (the different components incorporated) and how control groups were selected. A key detail missing in most studies is the parallel developments in transport infrastructure. The supplementary file (section B.3) provides a detailed summary of the information available for all the studies considered and the results based on the information availability.
Additionally, researchers take substantially different approaches to measuring transport behaviour. As indicated earlier, many different outcome variables can be chosen to assess the effectiveness of behavioural interventions. However, this is not the only issue. Even for the same outcome variable, there are many ways in which estimates from one study may not be comparable to those from another. For instance, the growth in mode use can be based on the number of trips, the number of users or the total distance travelled. Furthermore, the duration, methodology and precision with which these indicators are measured may differ. Some studies may use a single question, others may use multiple-day diaries or even objective measures of travel distance. This makes comparing outcomes across studies a very challenging task. There needs to be a concerted effort among travel behaviour researchers to develop standards for outcome reporting in travel behaviour studies.
Our critical appraisal suggests that there are only a handful of well-conducted studies with appropriately chosen research designs necessary to establish the causal relationships between variables of interest. Only about 30% of the studies employ an experimental or quasi-experimental design with control groups. Moreover, the number of studies that conduct a proper follow-up to establish the stability of behavioural changes is very low. Even when studies do conduct a follow-up, there is not enough information on what other changes may have happened (such as infrastructure or price changes) in both intervention and control groups.
There is an active debate among transportation researchers on the right application of behavioural interventions. For instance, a sizeable number of interventions in our database are based on the idea of using behavioural interventions as a wedge to change travel behaviour during periods of interruption, such as new transport infrastructure provision (ref. 20) or moving to new housing (ref. 21). However, given the small number of these studies and the lack of detailed information, it is difficult to say whether interventions with such a background are more effective than interventions that do not make these explicit considerations.
Lastly, we find that the vast majority of the studies in our review of reviews relate to high-income countries and cities. Most studies were conducted in the UK (28%), followed by Europe (22%), Australia (17%) and North America (17%), meaning that there is very little representation from Asia (8%) and, even worse, none from Latin America or Africa. The lack of a greater focus on transport issues in low and middle-sized cities, coupled with limited information on conditional effects, undermines the generalizability and transferability of the review findings. This lack of research in low- and middle-income countries with rapidly urbanizing populations and growing car use, in cities that are highly vulnerable to climate change, is extremely concerning.
[1] We exclude change in individual's transport behaviour from our analysis here.
[2] BA = Before/After design, CBA = Before/After with control group design, Experiment = Randomized control trials, Post-intervention = only post-intervention design, Other = other forms of study design, Unclear = study design is not clear.