Study design and population
This study used longitudinal data from the ‘Physical Activity through Sustainable Transport Approaches’ (PASTA) project 64,75. The analytical framework of PASTA distinguished hierarchical levels for various factors (i.e. city, individual, and trips), and four main domains that influence mobility behavior, namely factors relating to transport mode choice and use, socio-demographic factors, socio-geographical factors, and socio-psychological factors 64,76. Seven European cities (Antwerp, Barcelona, London, Orebro, Rome, Vienna, and Zurich) were selected to represent a range of urban environments in terms of size, built environment, transport provision, modal split and ambition to increase levels of active travel 48. To ensure sufficiently large sample sizes for different transport modes, users of less common transport modes such as cycling were oversampled 48. Participants were recruited opportunistically on a rolling basis, following standardized guidance for all cities supplemented by some city-specific approaches. A comprehensive user engagement strategy was applied to minimize attrition over the two-year timeframe. Further details on the recruitment strategy are given elsewhere 77.
A total of 10,722 participants entered the study on a rolling basis between November 2014 and November 2016 by completing a baseline questionnaire (BLQ). Participants provided detailed information on general travel behavior, daily travel activity, geolocations (home, work, education), vehicle ownership (private motorized, bicycle, etc.), public transport accessibility and socio-demographic characteristics. Follow-up questionnaires were distributed every two weeks: every third of these follow-up questionnaires also included a one-day travel diary, henceforth labelled a ‘long follow-up’ (long FUQ) 64. All valid travel diaries were included in the analyses, implying that some participants provided multiple diary data at different time points. Using longitudinal data aimed to improve measurement of ‘typical’ travel behavior 78. Participants had to be 18 years of age (16 years in Zurich) or older, and had to give informed consent at registration. Data handling and ethical considerations regarding confidentiality and privacy of the information collected were reported in the study protocol 64. Table S2 in the Supplementary Information provides an excerpt of the PASTA BLQ, including travel diary data.
Exposure: transport mode choice and use
The primary exposure variables were daily trip frequencies obtained from the travel diaries, for each of the main modes: walking; cycling; e-biking; motorcycle or moped; public transport; and car or van. The most common metric used by local and national administrations across the world is mode share (or split) by trip frequency, not by distance 40,47; hence the results of the primary exposure analysis may be used to estimate lifecycle CO2 emissions directly from trip mode share data. Due to low counts of e-biking and motorcycle trips, e-biking was merged with cycling, with indirect emissions derived from observed bike/e-bike shares (see also footnote of Table 1). Motorcycle was likewise merged with car, because reported CO2 emission rates for motorcycles are comparable to those for cars on a per passenger-km basis 79. Participants provided information on each trip made on the previous day, including start time, location of origin, transport mode, trip purpose, location of destination, end time and duration (Table S2). The diary was based on the established KONTIV-Design 80,81, with some adaptations for online use. A total of 5623 participants provided a valid travel diary in either the BLQ or the long FUQ; of those, 3836 participants completed valid baseline surveys and travel diaries. In the travel diary, trip purpose, duration and location were self-reported. Total trip duration was also derived as the difference between start and end time, while trip distance was obtained retrospectively by feeding origin and destination coordinates to the Google Maps Application Programming Interface (API), which returned the fastest route per mode between origin and destination.
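As an illustration of how the primary exposure could be tabulated for one diary day, the following minimal sketch merges e-biking into cycling and motorcycle or moped into car, counts trips per merged mode, and derives trip duration from start and end times. The trip records, the tuple layout and the mode label strings are hypothetical and are not taken from the PASTA data dictionary; the sketch also assumes that each trip starts and ends on the same day.

```python
from collections import Counter
from datetime import datetime

# Mode merging as described in the text: e-biking -> cycling,
# motorcycle or moped -> car. The label strings are illustrative assumptions.
MERGED_MODE = {
    "walking": "walking",
    "cycling": "cycling",
    "e-biking": "cycling",
    "motorcycle or moped": "car",
    "public transport": "public transport",
    "car or van": "car",
}


def trip_duration_minutes(start, end):
    """Duration as end time minus start time (both 'HH:MM', same day assumed)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60.0


def daily_trip_frequencies(trips):
    """Count trips per merged mode for one diary day."""
    return Counter(MERGED_MODE[mode] for mode, _start, _end in trips)


# Hypothetical diary day: (mode, start time, end time)
diary = [
    ("car or van", "08:00", "08:25"),
    ("e-biking", "12:10", "12:22"),
    ("cycling", "13:00", "13:15"),
    ("motorcycle or moped", "17:30", "17:50"),
]

frequencies = daily_trip_frequencies(diary)
durations = [trip_duration_minutes(s, e) for _mode, s, e in diary]
```

A `Counter` returns zero for modes that were not used on the diary day, which matches the idea of a daily trip frequency per mode.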
Three secondary exposure variables were developed to explore differences between groups of individuals. First, participants were categorized as using a ‘main mode’ of travel, defined as the mode accounting for the greatest daily distance (levels: walking, cycling, car, public transport). Further categorizations based on cycling frequency included a dichotomous variable of ‘cycling’ on the diary day (yes/no) as well as a trichotomous variable characterizing participants as ‘frequent cyclist’ (three or more times a day), ‘occasional cyclist’ (once or twice a day), or ‘non-cyclist’ (none). Table 1 shows sample sizes and mean (SD) values of the primary outcome variable for each group.
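The three secondary exposure variables can be sketched as simple classification rules. The example below is a minimal illustration and assumes that trips arrive as (mode, distance in km) pairs with modes already merged; the function names, the example distances and the handling of ties (the first mode encountered wins) are assumptions, not details reported for the study.

```python
from collections import defaultdict


def main_mode(trips):
    """Mode with the greatest total daily distance; trips = [(mode, km), ...].

    Assumes at least one trip. Ties go to the mode seen first (an assumption).
    """
    totals = defaultdict(float)
    for mode, km in trips:
        totals[mode] += km
    return max(totals, key=totals.get)


def cycled_on_diary_day(trips):
    """Dichotomous variable: any cycling trip on the diary day."""
    return any(mode == "cycling" for mode, _km in trips)


def cyclist_category(n_cycling_trips):
    """Trichotomous variable based on the number of cycling trips per day."""
    if n_cycling_trips >= 3:
        return "frequent cyclist"
    if n_cycling_trips >= 1:
        return "occasional cyclist"
    return "non-cyclist"


# Hypothetical diary day
day = [("car", 6.0), ("cycling", 3.5), ("cycling", 3.5), ("walking", 0.8)]

n_cycling = sum(1 for mode, _km in day if mode == "cycling")
day_main_mode = main_mode(day)
day_cycled = cycled_on_diary_day(day)
day_category = cyclist_category(n_cycling)
```

In this hypothetical day the two cycling trips together cover more distance than the single car trip, so the main mode is cycling even though the longest individual trip was by car.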
Outcome variables: carbon dioxide emissions
The primary outcome of interest was daily lifecycle CO2 emissions (mass of carbon dioxide in grams or kilograms per day) attributable to passenger travel. Lifecycle CO2 emissions categories considered were operational emissions, energy supply emissions and vehicle production emissions. First, operational emissions were derived for each trip based on trip distance (computed from travel diary data), ‘hot’ carbon emissions factors, emissions from ‘cold starts’ (for cars only) and vehicle occupancy rates (passengers/vehicle) that varied by trip purpose. The method for cars and vans considered mean trip speeds (derived from the travel diaries), location-specific vehicle fleet compositions (taking into account the types of vehicle operating in the vehicle fleets during the study period) and the effect of ‘real world driving’ (adding 22% to carbon emissions derived from ‘real world’ test data based on BEIS 79 and ICCT 82) to calculate the so-called ‘hot’ emissions of CO2 per car-km. For motorcycle, bus and rail, fuel type shares and occupancy rates were based on BEIS 79. Buses were mainly powered by diesel powertrains; motorcycles were 100% gasoline; and urban rail was assumed to be all electric. For cars, ‘cold start’ excess emissions were added to ‘hot’ emissions based on the vehicle fleet composition, ambient temperatures (see Table S13 in the Supplementary Information) and trip distances observed in each city: across the seven cities, cold start emissions averaged 126 (SD 42) gCO2 per car trip, with the trip share of a car operating with a ‘cold’ engine averaging 13 (SD 8) percent. Cold start emissions were higher than average in Orebro and Zurich, and lower in Barcelona. Second, carbon emissions from energy supply considered upstream emissions from the extraction, production, generation and distribution of energy supply, with values taken from international databases for fossil fuel emissions 83–85 and emissions from electricity generation and supply 86.
Third, vehicle lifecycle emissions considered emissions from the manufacture of vehicles, with aggregate carbon values per vehicle type (cars, motorcycles, bikes and public transport vehicles) derived assuming typical lifetime mileages, mass body weights, material composition and material-specific emissions and energy use factors. The main functional relationships and data are provided in the Supplementary Information. The derived emissions rates are shown in the Supplementary Information for each city, disaggregated by emissions category and transport mode. Total daily emissions were calculated as the sum of emissions for each trip, mode and purpose (e.g. the sum of 4 trips on a given day = trip 1: home to work by car, trip 2: work to shop by bike, trip 3: shop to work by bike; and trip 4: work to home by car). Secondary outcomes of interest were total lifecycle CO2 emissions for four aggregated journey purposes: (1) work or education/school trips; (2) business trips; (3) social or recreational trips; and (4) shopping, personal business, escort or ‘other’ trips.
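The summation of the three lifecycle categories over a diary day can be sketched as follows, using the four-trip example given above. This is a minimal illustration and not the study's emissions model: all emission factors and trip distances are hypothetical placeholders (the cold start value merely reuses the cross-city average quoted in the text), and dividing per-vehicle emissions by occupancy to obtain a per-passenger figure is an assumption about how occupancy enters the calculation.

```python
from dataclasses import dataclass


@dataclass
class ModeFactors:
    hot_g_per_vkm: float          # 'hot' operational gCO2 per vehicle-km
    cold_start_g_per_trip: float  # cold start excess, gCO2 per trip (cars only)
    supply_g_per_vkm: float       # upstream energy supply gCO2 per vehicle-km
    production_g_per_vkm: float   # vehicle manufacture gCO2 per vehicle-km


# Hypothetical placeholder factors, NOT the values derived in the study.
FACTORS = {
    "car": ModeFactors(180.0, 126.0, 40.0, 45.0),
    "bike": ModeFactors(0.0, 0.0, 0.0, 5.0),
}


def trip_emissions_g(mode, distance_km, occupancy=1.0):
    """Lifecycle gCO2 per passenger for one trip (sketch under stated assumptions)."""
    f = FACTORS[mode]
    operational = f.hot_g_per_vkm * distance_km + f.cold_start_g_per_trip
    supply = f.supply_g_per_vkm * distance_km
    production = f.production_g_per_vkm * distance_km
    # Assumption: per-vehicle emissions are shared equally among occupants.
    return (operational + supply + production) / occupancy


# The example day from the text, with hypothetical distances in km.
day = [
    ("car", 5.0),   # trip 1: home to work by car
    ("bike", 2.0),  # trip 2: work to shop by bike
    ("bike", 2.0),  # trip 3: shop to work by bike
    ("car", 5.0),   # trip 4: work to home by car
]

daily_total_g = sum(trip_emissions_g(mode, km) for mode, km in day)
```

With these placeholder factors each car trip contributes 1451 g and each bike trip 10 g, giving a daily total of 2922 gCO2; the numbers only demonstrate the arithmetic and carry no empirical meaning.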
Covariates
A number of covariates were hypothesized to confound the association between carbon emissions and transport mode choice and use (e.g. 37,49,55). Demographic and socio-economic covariates considered in the analyses were age, sex, employment status, household income, educational level, and household composition (e.g. single occupancy, or having children or not). Vehicle ownership covariates considered were car accessibility, having a valid driving license, and bicycle accessibility. Health covariates considered were self-rated health status and Body Mass Index (BMI), which have been shown to influence motorized travel and transport CO2 emissions 11. The perceived walking times to the nearest bus stop, tram stop or railway station were included as public transport accessibility measures. All of the covariates were self-reported. BMI was derived from self-reported weight and height as weight (kg) / height (m)^2 87.
Statistical analysis
In a first step, bivariate analyses were performed to assess the association between transport-related CO2 emissions, the exposure variables, and the potential covariates. Only covariates with p-value < 0.1 were included in the linear mixed-effects models. In a second step, differences in CO2 emissions between the different transport mode users were identified by using mixed-effects linear regression models with city as a random effect (to take account of correlation among responses from the same city). The analysis used multiple data points for each individual, obtained on different weekdays; therefore, respondents and weekdays were also included as random effects. Because CO2 emissions were heavily skewed towards the right (see also Fig. 1), we applied the transformation ‘ln([x/mean(x)] + 0.01)’ (adding 0.01 to avoid turning zeros into missing values) in the comparative analysis. This improved our regression diagnostics, with residuals closer to a normal distribution and their variance less heteroscedastic. Note that a log transformation changes the focus from absolute to relative or percentage change; therefore, any regression coefficient β is a mean difference of the log outcome comparing adjacent units of a predictor. Because such a coefficient is difficult to interpret directly, we exponentiated it (e^β) and interpreted the resulting value as a geometric mean difference 88. Three regression models were fitted: (0) unadjusted (exposure only); (1) adjusted by socio-demographic covariates: sex, age, education level, employment status, household income, household composition; and (2) adjusted by all covariates from model 1 and additionally other covariates of interest (those found to be statistically significant in previous literature described earlier): holding a valid driving license, access to a car or van, bicycle ownership, self-rated health, BMI, walking-time accessibility to the nearest bus stop, and walking-time accessibility to the nearest train station.
Age was included as a continuous variable as a proxy for time. The same set of models was fitted for each of the four journey purposes.
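The outcome transformation and the back-transformation of a coefficient can be illustrated with a few lines of code. The sketch below applies ln([x/mean(x)] + 0.01) to a list of hypothetical daily emissions and converts a coefficient β on the log scale into a percentage change via e^β; the function names and the example values are assumptions, and the mixed-effects model itself is not reproduced here.

```python
import math


def transform(emissions):
    """Apply ln(x / mean(x) + 0.01); the offset keeps zero emissions finite."""
    mean_x = sum(emissions) / len(emissions)
    return [math.log(x / mean_x + 0.01) for x in emissions]


def percent_change(beta):
    """Relative change implied by a coefficient on the log scale, via exp(beta)."""
    return (math.exp(beta) - 1.0) * 100.0


# Hypothetical daily lifecycle emissions in gCO2, including a zero-emission day.
daily_emissions_g = [0.0, 500.0, 1500.0, 6000.0]

transformed = transform(daily_emissions_g)

# A coefficient of ln(0.5) corresponds to a halving of the outcome.
halving = percent_change(math.log(0.5))
```

The zero-emission day maps to ln(0.01) rather than to an undefined value, which is the purpose of the 0.01 offset described above.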
Potential interactions by sex, employment status, income, car access, BMI and city were investigated with Type II Wald chi-square tests in the fully adjusted models. We observed significant interactions for some transport modes (e.g. use of all modes and car access; public transport use and gender; car use and income); therefore, the sensitivity of all models to different levels of the above factors was tested. We also tested the models’ sensitivity to a number of other factors: age (‘<35 years’), working status (‘working’), car access (‘not having access to a car’), body weight (‘being overweight’), household income (‘high income’) and city (Table 2). Participants were also ranked according to their CO2 emissions (all travel and by trip purpose) and then split into ten emissions deciles. Chi-square tests were performed on selected covariates to profile the ‘bottom’ and ‘top’ deciles. Possible mediation of the effect of transport mode use on CO2 emissions was assessed for three potential mediators: total daily distance travelled, BMI and self-rated health 89,90. Only observations without missing data were included. R statistical software v3.6.1 was used for all analyses.
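The ranking of participants into emissions deciles can be sketched as below. This is a minimal rank-based illustration with hypothetical values; how ties and group sizes that are not multiples of ten were handled in the study is not stated, so the choices here (ties keep their input order, groups are formed from ranks) are assumptions.

```python
def assign_deciles(values):
    """Return a decile (1 = lowest, 10 = highest) for each value, by rank.

    Ties keep their input order because sorted() is stable (an assumption
    about tie handling, not a documented rule).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    deciles = [0] * n
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // n + 1
    return deciles


# Hypothetical daily emissions (gCO2) for 20 participants, highest first.
emissions_g = [1900.0 - 100.0 * i for i in range(20)]

deciles = assign_deciles(emissions_g)
top_decile = [i for i, d in enumerate(deciles) if d == 10]
bottom_decile = [i for i, d in enumerate(deciles) if d == 1]
```

The `top_decile` and `bottom_decile` index lists identify the participants whose covariates would then be compared, for example with chi-square tests as described above.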