The analyses of risk factors within a geographical area, or spatial risks, indicate that several factors, alone or in combination, conferred a higher risk of an area having inhabitants infected with COVID-19. These factors include a high proportion of overcrowding, especially when occurring concurrently with an economic standard beneath the national median, a combination of a high proportion of men and many motor vehicles, or the presence of health and social care workers, regardless of educational requirements. When adding the dimension of time, i.e., conducting spatio-temporal analyses, the largest effect on a DeSO area’s risk was time itself, most conspicuously during the second and third waves of the COVID-19 pandemic, when interactive effects with having a non-Scandinavian background or high rates of gainfully employed persons produced the largest risks. Overcrowding interacted foremost with time, and mainly during the first wave. Otherwise, overcrowding contributed the most to risks in DeSO zones marked by a low economic standard, low education levels, or a high proportion of men, vehicles, health care workers or gainfully employed persons. Throughout, it was clear that in the three largest cities, spatial risk factors co-existed in certain boroughs and DeSO zones. Our results thus suggest the following chain of events: people with lower education levels and those not of Scandinavian descent who, further, could not work from home tended to be exposed to a higher degree. Within this group, individuals with lower levels of education and/or a non-Scandinavian background tended to a greater extent to live in poorer neighbourhoods, contributing to a higher incidence of secondary infections (in their DeSO zones) due to more exposure sources and less ability to protect themselves.
Several joint factors may increase disease rates and the risk of severe forms of infection during a pandemic. Overcrowded households have been identified as important contributors to being infected with COVID-19, but the factors that determine transmission into, and within, households are still largely unknown, as are interactive effects from aspects of relative deprivation. Within the home, physical closeness, the sharing of several surfaces, and the lack of possibility to isolate persons who display symptoms are the most probable causes. Since overcrowded households are related to poverty and poorer health (21), both baseline health and possibly housing conditions (poor ventilation or exposure to mould or rot) could further contribute to increased risks of severe illness (7). Determining factors outside the household include a high frequency of social contacts, especially if involving potentially infected persons, e.g., working in health care, commuting to work by public transport, or social activities, of which only the last can be considered truly voluntary. Risks from other high-exposure occupations may also be exacerbated in vulnerable zones, e.g., taxi drivers who, at the beginning of the pandemic, picked up infected passengers from the airports and then returned to their neighbourhoods and homes, thereby introducing infection from outside the DeSO zone or even from outside Sweden.
Our results emphasize that the predictive value of overcrowding is largest in combination with other risk factors. Two Swedish register studies have, for example, noted that occupations with many social contacts, such as taxi driving, were not necessarily associated with increased mortality in older cohabitants (> 67 years) (22, 23), not even in adjusted models. This indicates that certain risk factors mainly operate in the presence of other risk factors, presumably overcrowding, low education or having a foreign background, which may relate to differences in living conditions and access to adequate information on disease protection. One may also think of university students or persons living in high-rent cities, who often share accommodation to be able to live in central locations but are not otherwise exposed to other risk dimensions, such as working in a risk occupation. This could explain why Stockholm, where the average rent per m² is the highest in Sweden, displays overcrowding in almost all DeSO zones.
Aside from overcrowding and higher participation in jobs that cannot be performed from home, there may be other aspects that increase exposure risks for persons with a non-Scandinavian background. When COVID-19 was declared a full-scale pandemic in the spring of 2020, awareness of the disease was high throughout society, but several international studies showed that crucial information about protection and risks of infection did not reach everyone to the same extent. In an interview study conducted by our project group among non-Scandinavian-born workers in high-risk occupations (24), the lack of access to and understanding of health information seemingly played a part in having less ability to protect oneself and others. This relates to several aspects. An initial root cause was the lack of information in languages other than Swedish, but also less knowledge of reliable sources from which to obtain such information, or less trust in official outlets. Studies on media consumption conducted in different boroughs in Gothenburg showed that persons in geographical areas with a low educational level and a high proportion of foreign-born persons reported a higher usage of social media, foreign news media or social networks as the main information source, especially people who lacked proficiency in the Swedish language (25). This difficulty in accessing important societal information, based on limited language skills or limited trust in authorities in the new country, is sometimes referred to as “the knowledge gap” (26). In this context, a surprising finding was that the combination of being non-Scandinavian and owning a motor vehicle was a primary risk predictor, since car ownership has consistently been considered a protective factor (5, 6). We found no studies on increased infection risks from owning a car, and can only speculate, based on verbal information, that it implies increased infection exposure by facilitating transportation for people in the car owner's social circle.
Due to a higher prevalence of pre-existing poor health in areas with low socioeconomic status, there are not only disparities in infection, but also larger risks of severe COVID-19 and post-COVID in such areas. According to the stages of disease theory, as proposed by Clouston (27), new diseases transition through phases marked by distinct patterns of mortality inequality, which emerge following the development of new information and mitigation strategies. More advantaged communities, such as those with less household overcrowding, can better implement resources that curb the spread and lethality of COVID-19. A widespread disease will therefore hit more disadvantaged groups the hardest both at the initial outbreak of a pandemic and in the long-term aftermath, increasing already prevalent societal disparities in health.
Methodological considerations
Regarding modelling considerations, a question which emerged was whether it is the general overcrowding or socioeconomic situation in a person’s DeSO that affects their risk of testing positive for COVID-19, the person’s household structure, or a combination of the two. For the ICU cases and the deaths, our interest lay perhaps not so much in overcrowding as such, but rather in the degree to which general socioeconomic vulnerability was a risk enhancer. To reveal such relationships, one could step away from interpreting the exponential term in the proposed Poisson regression setup as an individual’s risk and instead consider, e.g., a logistic regression model where the response variable is the indicator of whether a given individual has tested positive for COVID-19, has been admitted to the ICU or has died from COVID-19. As covariates one would then include the spatial covariates which correspond to each individual, as well as different person-level covariates, most notably the dichotomised variable indicating whether a given individual lives in an overcrowded household or not. Here too, one could use an elastic-net regularised regression approach. Finally, the proposed approach was chosen to obtain an easily interpretable model in combination with variable/model selection and collinearity adjustment. This implies that we assume that all dependencies present can be ascribed to the underlying covariate structure. However, since we are dealing with the transmission of an infectious disease, the underlying DeSO covariates likely cannot explain the complete dynamics of the spread of the disease. Consequently, a classical spatio-temporal modelling approach, based on models incorporating dependencies in the temporally evolving (spatial) multivariate response variable (a discrete random field), would be one way forward. Still, successfully incorporating variable selection techniques into such models remains a challenge.
One way to handle this could be to model the residuals obtained here by means of a spatial(-temporal) random field model, in order to shed light on the inherent dependencies that reflect the spatial(-temporal) spread of the disease.
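For concreteness, the individual-level logistic alternative discussed in this section could be written as follows. This is only a sketch, and all notation is introduced here for illustration: y_i is the binary outcome for individual i (positive test, ICU admission or death), x_{d(i)} collects the spatial covariates of the DeSO d(i) in which individual i lives, z_i collects person-level covariates including the overcrowding indicator, and \lambda \ge 0 and \alpha \in [0, 1] are the elastic-net tuning parameters in a commonly used parameterisation.

```latex
\[
  \eta_i = \beta_0 + x_{d(i)}^{\top}\beta + z_i^{\top}\gamma ,
  \qquad
  \Pr(y_i = 1) = \frac{1}{1 + e^{-\eta_i}} ,
\]
\[
  (\hat\beta_0, \hat\beta, \hat\gamma)
  = \arg\min_{\beta_0, \beta, \gamma}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[ y_i \eta_i - \log\!\left(1 + e^{\eta_i}\right) \right]
      + \lambda \left( \alpha \lVert \theta \rVert_1
      + \frac{1 - \alpha}{2} \lVert \theta \rVert_2^2 \right)
    \right\},
  \qquad \theta = (\beta, \gamma).
\]
```

Setting \alpha = 1 gives the lasso penalty, which performs variable selection, while \alpha = 0 gives the ridge penalty, which mainly stabilises estimates under collinearity; intermediate values combine the two.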
Strengths and limitations
One of the larger limitations of our study is the lack of testing in the early stages of the pandemic (spring-summer 2020), during which tests were almost exclusively conducted among health care workers or in severe cases of COVID-19 that needed hospitalization. Those included in the early phase of the analyses might consequently constitute a sub-cohort of particularly vulnerable persons, due to underreporting in the first wave of the pandemic. Another methodological limitation is that data on socioeconomic variables are based on information from 2018, and we cannot know to what extent the living conditions of individual persons and within DeSO zones changed in the years leading up to 2022. Furthermore, organizations for undocumented immigrants in Sweden have estimated that about 10 000–35 000 undocumented immigrants are residing in Sweden, most likely concentrated in the larger cities, Stockholm, Gothenburg and Malmö, where there are more possibilities to work and to live anonymously. While some are known to live in dwellings provided by illegal contractors, e.g., in the construction industry, many rent rooms or live with relatives, which can be assumed to be more common in immigrant-dense areas. It is, therefore, a limitation that our analyses are based on official statistics, which likely underestimate overcrowding in high-risk DeSO zones.
While most studies on overcrowding and infection risks are based on cohort data for a selection of a country’s inhabitants, our study included all registered residents in Sweden, with associated spatial data on decisive socioeconomic factors such as housing, income, education, occupation, origin and vehicle ownership. We also have access to all positive PCR tests in Sweden and the dates of the test results. Additionally, unlike many similar studies, we have information on marital status/civil partnership and on the age of all household members, and can thereby calculate overcrowding using a definition similar to the Eurostat standard. The deviation in age boundaries between Eurostat's definition and our current definition is related to the format of Statistics Sweden's data, and we consider this difference to have rather small implications for the results.
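To illustrate what such a calculation involves, the following minimal Python sketch applies the room requirement as we understand the Eurostat definition: one room for the household, one per couple, one per single adult aged 18 or older, one per pair of single persons of the same sex aged 12 to 17, one for each remaining single person in that age group, and one per pair of children under 12. The class and function names are hypothetical, the handling of an unpaired child under 12 is an assumption, and the age boundaries are Eurostat's, not the modified boundaries used in this study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    sex: str                 # any label, e.g. "F" or "M"
    in_couple: bool = False  # True if living with a partner in the household


def required_rooms(household):
    """Minimum number of rooms under a Eurostat-style rule (illustrative)."""
    rooms = 1  # one room for the household as such
    # One room per couple: two partnered members share one room.
    rooms += sum(p.in_couple for p in household) // 2
    singles = [p for p in household if not p.in_couple]
    # One room for each single person aged 18 or older.
    rooms += sum(p.age >= 18 for p in singles)
    # Single persons aged 12-17: one room per same-sex pair, plus one room
    # for anyone left unpaired, i.e. ceil(count / 2) per sex.
    teens_by_sex = Counter(p.sex for p in singles if 12 <= p.age <= 17)
    rooms += sum((n + 1) // 2 for n in teens_by_sex.values())
    # Children under 12: one room per pair. ASSUMPTION: an unpaired child
    # also needs a room, hence ceil(count / 2).
    children = sum(p.age < 12 for p in singles)
    rooms += (children + 1) // 2
    return rooms


def is_overcrowded(household, rooms_available):
    """A household is overcrowded if it has fewer rooms than required."""
    return rooms_available < required_rooms(household)


# Hypothetical household: a couple, one teenager and two young children.
example = [
    Person(40, "F", in_couple=True),
    Person(42, "M", in_couple=True),
    Person(14, "F"),
    Person(8, "M"),
    Person(3, "F"),
]
print(required_rooms(example))     # 4
print(is_overcrowded(example, 3))  # True
```

Shifting the age boundaries, as the format of the register data requires here, only changes the thresholds in the comparisons and leaves the structure of the rule intact.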
Finally, as the statistical procedure employed here does not account for spatial and temporal dependencies in the response variable, there is a risk that we over-interpret the effects of the covariates. The reason is that high counts during a given period might to a large degree be the result of proximity-related disease transmission between DeSOs, rather than sociodemographic structures of the individual DeSOs. We believe that the risk of such over-interpretations is smaller in the case of ICU and death counts, as they are already conditioned on individuals being infected.
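A simple way to gauge how much spatial dependence remains after fitting is to compute a global Moran's I statistic on the model residuals. This was not part of the analyses reported here; the sketch below is purely illustrative, uses binary adjacency weights, and assumes a hypothetical dictionary `neighbours` that maps each DeSO index to the indices of its adjacent DeSOs.

```python
def morans_i(residuals, neighbours):
    """Global Moran's I with binary adjacency weights.

    residuals  : list of model residuals, one per area
    neighbours : dict mapping an area index to a list of adjacent area
                 indices (hypothetical input; each link counts as weight 1)
    """
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    total_ss = sum(d * d for d in dev)

    cross = 0.0   # sum of products of deviations over all neighbour links
    n_links = 0   # sum of all weights (number of directed links)
    for i, linked in neighbours.items():
        for j in linked:
            cross += dev[i] * dev[j]
            n_links += 1

    if total_ss == 0 or n_links == 0:
        raise ValueError("Moran's I is undefined for constant values or no links")
    return (n / n_links) * (cross / total_ss)


# Four hypothetical areas on a line: 0 - 1 - 2 - 3
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
clustered = morans_i([1.0, 1.0, -1.0, -1.0], line)    # about 0.33
alternating = morans_i([1.0, -1.0, 1.0, -1.0], line)  # about -1.0
```

Values clearly above zero would indicate that neighbouring DeSOs have similar residuals, i.e. that proximity-related transmission is not captured by the covariates, whereas values near zero would be consistent with the covariate structure explaining most of the spatial pattern.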