Using a causal random forest, a machine learning technique, we were able to detect heterogeneity in the effect of the COVID-19 pandemic on young consumers’ online purchases. Table 3 presents the CATEs of the attributes selected as important for identifying this heterogeneity. The variables are ranked by how frequently each attribute was selected for splitting the data when building the trees (Athey et al., 2019). The variables selected as important suggest that young consumers with these attributes tended to shift more swiftly to online shopping during the pandemic. Fig. 1 also shows the heterogeneous effects with their associated 95% confidence intervals.
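The ranking rule described above, ordering attributes by how often the forest splits on them, can be illustrated with a minimal sketch. It assumes a depth-weighted split frequency in which splits near the root count more; the decay exponent, the depth cap, the toy forest, and the attribute names are all hypothetical choices for illustration and are not taken from the estimation reported in Table 3.

```python
def split_frequency_importance(trees, n_features, max_depth=4, decay=2.0):
    """Depth-weighted split-frequency importance (illustrative sketch).

    trees: list of trees, each a list of (feature_index, depth) pairs,
           where depth 1 is the root split.
    Returns one importance score per feature; scores sum to 1 when at
    least one split is observed.
    """
    # counts[d - 1][j] = number of splits on feature j at depth d
    counts = [[0] * n_features for _ in range(max_depth)]
    for tree in trees:
        for feature, depth in tree:
            if 1 <= depth <= max_depth:
                counts[depth - 1][feature] += 1

    importance = [0.0] * n_features
    weight_sum = 0.0
    for depth, row in enumerate(counts, start=1):
        total = sum(row)
        if total == 0:
            continue  # no splits recorded at this depth
        weight = depth ** -decay  # shallower splits get larger weight
        weight_sum += weight
        for j, count in enumerate(row):
            importance[j] += weight * count / total

    if weight_sum == 0:
        return importance
    return [value / weight_sum for value in importance]


# Hypothetical attribute names and a made-up three-tree forest.
names = ["household_income", "household_size", "num_televisions"]
toy_forest = [
    [(0, 1), (1, 2), (2, 2)],
    [(0, 1), (0, 2), (1, 2)],
    [(1, 1), (0, 2), (2, 2)],
]
scores = split_frequency_importance(toy_forest, n_features=len(names))
ranking = sorted(range(len(names)), key=lambda j: scores[j], reverse=True)
for j in ranking:
    print(f"{names[j]}: {scores[j]:.3f}")
```

In this toy forest the first attribute is chosen for two of the three root splits, so it ranks first. A ranking of this kind indicates which attributes drive the splits, not the sign or size of the corresponding CATE.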
The GRF method identified specific demographic characteristics associated with increased online shopping during COVID-19. Household income and size contributed to explaining the heterogeneous effects of the pandemic in inducing young consumers to shop online. Both the below-median and above-median income groups showed positive effects of the COVID-19 pandemic and turned to online shopping more during lockdown. However, the effect of the pandemic on the move toward online shopping was greater for young consumers in the below-median income group than for those in the above-median income group. Duffy et al. (2022) stated that for the general consumer, those of lower income in their sample who were also experiencing food insecurity transitioned to online grocery shopping at a rate of 36%. The authors explained that this lower income group was also more likely to benefit from the Supplemental Nutrition Assistance Program (SNAP), as the United States Department of Agriculture started allowing online grocery outlets to accept SNAP during the COVID-19 pandemic. Our findings further show that young consumers from large households were more likely to shop online during the pandemic, which is consistent with the findings of Duffy et al. (2022), who found that those with children and those experiencing food insecurity were more likely to shop online for groceries during the pandemic period.
Our findings also indicated that the number of televisions that young households owned affected the likelihood of purchasing groceries online. Young households with no television purchased more groceries online than those with one or more televisions. It is likely that members of young households without televisions use a computer or a laptop as their main media device, which makes them more familiar with the process of e-commerce purchasing. In addition, shopping websites are optimized for computer browsers and screens along with keyboard and mouse control (Wagner et al., 2017), so those who use a computer or a laptop as their primary device may find online shopping on that device more convenient.
Other demographic characteristics identified as being associated with purchasing groceries online during COVID-19 include gender in combination with employment status. It is interesting to observe that employed females responded to COVID-19 with the highest rates of online purchasing, followed by employed males. Unemployed females responded negatively with a drop in online purchasing, while both unemployed and employed males responded positively. Unemployed males showed smaller effects on online purchasing than employed males. Kock et al. (2020) found that women and Generation Z, defined as those born after 1997, presented higher motivation to shop online compared to older, male groups. Our findings contribute to the literature by confirming that the tendency to make online purchases differs not only across gender but also across employment status.
Racial disparities were also observed in the tendency to purchase online in response to COVID-19. We found that young black and Asian consumers were more likely to increase online shopping compared to those who are white. Relatedly, Sze et al. (2020) found that certain racial groups, such as blacks and Asians, are at a higher risk of being infected by COVID-19 compared to whites. Martin et al. (2020) offered explanations in their study, noting that these minorities are more likely to live in households of larger sizes, thereby reducing the effectiveness of lockdown and social distancing for those individuals. Hawkins (2020) explains that these minority groups are also more likely to be employed in sectors that involve close proximity to others and therefore a higher risk of exposure to COVID-19. Thus, young consumers of these minority groups, if aware of the risks associated with both occupation and larger household size, may be inclined to purchase more online and avoid face-to-face contact to reduce the risk of infection.
It is interesting to observe how the GRF method allows the detection of heterogeneity in the effects of COVID-19 across the types of products that are more likely to be purchased online. The estimated treatment effect, that is, the effect of COVID-19 in enhancing the consumption of a specific product type, is higher for frozen foods. Chenarides et al. (2021) surveyed consumers in the U.S. and found that due to store service closures and lockdown, consumers purchased more than normal during the service times. When store visits are constrained, foods that are not perishable or that can be stored for a long time are preferred. The empirical evidence of Chang et al. (2021) also revealed that frozen foods were among the food items most in demand during COVID-19, based on data from Taiwan.
Consistent with research on how coupons and deal-flags induce consumers to make online purchases (Ren et al., 2021), we found that more online purchases occurred during COVID-19 for items with deal-flags or discounts than for regular-priced items. Our results provide evidence that young consumers, a specific age stratum of the population, were sensitive to price deals when purchasing items online during COVID-19.
4.1 Difference-in-difference estimators for further robustness
The generalized random forest (GRF) offers researchers the advantage of handling data with many covariates by reducing dimensionality without requiring a pre-specified model structure. To check the robustness of the results produced with the random forest algorithm, we estimated the treatment effects using the difference-in-difference approach. Table 4 presents the marginal effect estimators generated by interacting the treatment with the attributes selected as important in GRF. Overall, the coefficients from the logistic regression show signs that are consistent with the CATEs estimated using GRF. During the pandemic, young adults with below-median incomes who own no televisions and/or are from large households were likely to make more online grocery purchases compared to those with above-median incomes who own more than one television and/or are from small households. Employed females also had a higher likelihood of purchasing groceries online compared to others. The negative signs associated with employed males indicate that employed males were less likely to purchase groceries online compared to others, including both females and unemployed males. Price deal flags showed a negative sign, which is inconsistent with the CATEs estimated in GRF; however, this result is not statistically significant. The race dummies also showed consistent signs: in general, young black and Asian consumers were more likely to purchase groceries online compared to other races, while white young adults were less likely to do so during the pandemic. Although discrepancies exist in the magnitudes of the effects, the signs are generally consistent, and the statistical significance of the variables indicates that GRF is indeed effective in selecting the important variables when data are high dimensional. In addition, the estimation using GRF was able to effectively identify the heterogeneous effects of COVID-19.
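To make the interaction logic concrete, the sketch below computes a difference-in-difference contrast on the log-odds scale for a single binary attribute. In a saturated logistic regression with a pandemic-period dummy, an attribute dummy, and their interaction, the interaction coefficient equals this contrast. The sketch omits the additional covariates that the regression behind Table 4 would include, and the online purchase shares used here are made-up illustrative values, not estimates from this study.

```python
import math


def log_odds(p):
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def did_log_odds(share):
    """Difference-in-difference contrast on the log-odds scale.

    share[(group, period)] is the share of purchases made online, where
    group is 1 for consumers with the attribute and 0 otherwise, and
    period is 1 for the pandemic period and 0 for the period before it.
    """
    change_with = log_odds(share[(1, 1)]) - log_odds(share[(1, 0)])
    change_without = log_odds(share[(0, 1)]) - log_odds(share[(0, 0)])
    return change_with - change_without


# Made-up shares for illustration only (hypothetical attribute).
toy_share = {
    (0, 0): 0.10,  # without attribute, before the pandemic
    (0, 1): 0.15,  # without attribute, during the pandemic
    (1, 0): 0.10,  # with attribute, before the pandemic
    (1, 1): 0.20,  # with attribute, during the pandemic
}
interaction = did_log_odds(toy_share)
print(f"interaction contrast (log-odds): {interaction:.4f}")
```

A positive contrast means that the online share rose more, in log-odds terms, for consumers with the attribute than for those without it. This corresponds to a positive sign on the interaction term and is the quantity whose sign is compared with the GRF-based CATEs.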