Methods
The data were a sample of video-observed individuals recorded by municipality-operated public security cameras in the Netherlands (data and materials are available at osf.io/j7guw). The Dutch Attorney General, Ministry of Public Affairs, granted us permission to use the recordings for scientific purposes.
The Ethics Committee for Legal and Criminological Research at the Vrije Universiteit Amsterdam approved the study.
We obtained access to more than 60,000 hours of footage from 63 cameras located in Amsterdam and Rotterdam. For Study 1, we selected recordings from a single camera in Amsterdam (to minimize between-context heterogeneity). This camera provided high-quality footage and captured a pedestrianized street that allowed for continuous observation of pedestrians. We included 60 hours of footage from five days (three Thursdays, one Saturday, one Sunday), recorded during daytime hours between May and the beginning of June 2020.
Coding procedure. Two trained research assistants coded the data following a codebook developed for the study. The interrater reliability of the codebook was evaluated by independently double coding 44 individuals and 25 contexts. All included variables had a Gwet’s AC1/AC2 score [21] larger than .8, indicating good interrater agreement (each score is noted in the Measures section below). The coding began by randomly selecting 51 30-minute segments from the 60 hours of included footage. Where possible, we then observed, for each segment, seven persons with a mask and (to construct a relatively balanced sample) seven persons without a mask. In total, we sampled 383 persons (176 with and 207 without a mask), each observed for an average of 25 seconds (SD = 7.4), for a total of 158 observation minutes. This satisfied an a priori power analysis indicating that 339 cases would suffice to detect a small effect (f² = 0.05) with a power of 90% and a conservative alpha of .005 [22]. The small effect size assumed in the power analysis was set at what we considered a lower threshold of practical significance [23]. Note that we coded more than the required number of observations to have a buffer for missing data.
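The arithmetic behind the power analysis can be roughly checked with a short sketch. This is not the authors' computation, which the text does not describe. It is a normal approximation resting on two assumptions that are not in the original: that the effect of interest is tested with a single numerator degree of freedom (one coefficient), and that the noncentrality parameter is f² × N. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n, f2=0.05, alpha=0.005):
    """Approximate power of a two-sided, 1-df test (normal approximation).

    Assumption: noncentrality lambda = f2 * n, so the test statistic is
    roughly normal with mean sqrt(lambda) under the alternative.
    """
    z = NormalDist()
    shift = sqrt(f2 * n)                 # sqrt of the noncentrality parameter
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def min_sample_size(target=0.90, f2=0.05, alpha=0.005):
    """Smallest n whose approximate power reaches the target."""
    n = 10
    while approx_power(n, f2, alpha) < target:
        n += 1
    return n


required_n = min_sample_size()      # approximation-based requirement
power_at_339 = approx_power(339)    # power at the reported requirement
power_at_383 = approx_power(383)    # power at the achieved sample size

print(required_n, round(power_at_339, 3), round(power_at_383, 3))
```

Under these assumptions the approximate requirement falls within a few cases of the reported 339. An exact calculation based on the noncentral F distribution, which also depends on the number of predictors, would be expected to differ slightly.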
Measures. The dependent variable was a binary variable indicating whether the observed individual was within a 1.5-meter radius of a stranger (AC1 = .92), 1.5 meters being the official Dutch threshold for social distancing. Whether the other person was a stranger or an affiliate was inferred from whether the two arrived at the scene together and walked in each other’s company [24]. To code interpersonal distance, we used the exact dimensions of the street tiles as a ‘ruler.’ As an alternative ‘high-risk’ version of the dependent variable, we also measured social distancing with a 0.5-meter cutpoint (AC1 = .89).
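The tile-based distance coding can be illustrated with a minimal sketch. The tile dimensions below are placeholders, because the text does not report the actual tile size. Treating the thresholds as inclusive is also an assumption, and the function names are hypothetical.

```python
from math import hypot

# Hypothetical tile dimensions in meters; the real values are not given in the text.
TILE_WIDTH_M = 0.30
TILE_LENGTH_M = 0.30


def distance_in_meters(tiles_across, tiles_along):
    """Convert a tile-count offset between two persons into meters."""
    return hypot(tiles_across * TILE_WIDTH_M, tiles_along * TILE_LENGTH_M)


def code_distancing(tiles_across, tiles_along):
    """Return both binary codings: the 1.5 m and the 'high-risk' 0.5 m cutpoint."""
    d = distance_in_meters(tiles_across, tiles_along)
    return {
        "distance_m": d,
        "within_1_5m": d <= 1.5,   # assumption: threshold treated as inclusive
        "within_0_5m": d <= 0.5,
    }


# A stranger two tiles across and three tiles along from the observed person.
example = code_distancing(2, 3)
print(example)
```

With these placeholder dimensions, a stranger two tiles across and three tiles along is a little over one meter away, so the person is coded as within the 1.5-meter threshold but not within the 0.5-meter cutpoint.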
The independent variable was a binary measure of whether the person wore a face mask (AC1 = 1.0). Face masks included respirators (e.g., N95), surgical masks, and cloth masks; persons wearing face shields or improvised face coverings (e.g., bandanas, scarves) were excluded. We also excluded persons whose mask covered neither the nose nor the mouth (e.g., hanging under the chin) or who changed the mask’s placement (i.e., moving it between facial areas, or putting it on or taking it off). Finally, we included three control variables in the observational analysis: a visual assessment of the person’s age (AC2 = .90) and gender (AC1 = .96), and a measure of crowding, captured as a count of the persons moving through each segment (AC2 = 1.0).
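For readers unfamiliar with the agreement coefficient reported for the binary measures, the following sketch computes Gwet's AC1 for two raters on nominal codes. It uses the standard two-rater formula, with chance agreement based on the average marginal proportions. The ratings in the example are made up for illustration and are not the study's data. The weighted AC2 variant used for the ordered variables is not covered.

```python
def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 for two raters assigning nominal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    categories = sorted(set(rater_a) | set(rater_b))
    q = len(categories)

    # Observed agreement: share of items coded identically.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    if q < 2:
        return 1.0  # only one category was ever used, so agreement is complete

    # Chance agreement: average of pi_k * (1 - pi_k) over categories, scaled
    # by q - 1, where pi_k is the mean marginal proportion for category k.
    p_chance = 0.0
    for k in categories:
        pi_k = (rater_a.count(k) + rater_b.count(k)) / (2 * n)
        p_chance += pi_k * (1 - pi_k)
    p_chance /= q - 1

    return (p_obs - p_chance) / (1 - p_chance)


# Made-up codes (1 = within 1.5 m of a stranger, 0 = not); one disagreement.
coder_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
coder_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
ac1 = gwet_ac1(coder_1, coder_2)
print(round(ac1, 3))
```

In this toy example the raters agree on nine of ten items and the coefficient comes out at about .82, just above the .8 benchmark used in the coding procedure.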