In this prospective cohort study, we used data from the Norwegian Emergency Preparedness Register (11). The register includes data from all polymerase chain reaction (PCR) testing for SARS-CoV-2 in Norway from the beginning of the pandemic, all medical records from primary care (used here: general practitioners and emergency wards) and specialist care (used here: for exclusion of hospitalized individuals and for calculation of the number of comorbidities). It also includes data on all COVID-19 vaccine doses for all individuals, as well as background characteristics such as age, sex and country of birth.
Our inclusion criteria were persons living in Norway on August 1st 2020, aged between 18 and 70 years, who were either tested for SARS-CoV-2 (with a positive or negative result) or not tested between August 1st 2020 and August 1st 2021. We excluded persons who were hospitalized from 2 days before to 14 days after the test date (or a hypothetical test date for the untested), and persons who had a diagnostic code for any of the included operational definitions of post-covid symptoms or complaints (general practitioner or emergency ward) from six months prior to the test date for SARS-CoV-2 to the beginning of the week of testing. In this way, we allowed for prevalent complaints/healthcare use in relation to testing, but excluded persons with pre-test complaints/healthcare use. We also avoided the selection bias that may arise because of routine testing prior to specialist healthcare use (5).
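The exclusion windows described above can be sketched as a simple eligibility check. This is a minimal illustration, not the study's actual code: the function names, the Monday week start, the inclusive age limits and the approximation of six months as 183 days are all assumptions made here.

```python
from datetime import date, timedelta

INCLUSION_START = date(2020, 8, 1)
INCLUSION_END = date(2021, 8, 1)  # assumption: end date treated as inclusive


def week_start(d):
    """Monday of the week containing d (assumption: weeks start on Monday)."""
    return d - timedelta(days=d.weekday())


def is_eligible(age, test_date, hospital_dates, complaint_dates):
    """Apply the age, inclusion-period, hospitalization and washout criteria.

    hospital_dates: dates of hospitalization.
    complaint_dates: dates of primary care contacts with any outcome code.
    """
    if not 18 <= age <= 70:
        return False
    if not INCLUSION_START <= test_date <= INCLUSION_END:
        return False
    # Hospitalization from 2 days before to 14 days after the test date excludes.
    for h in hospital_dates:
        if timedelta(days=-2) <= h - test_date <= timedelta(days=14):
            return False
    # Washout: outcome codes from six months before the test date up to
    # (but not including) the start of the test week exclude.
    washout_start = test_date - timedelta(days=183)  # assumption: ~six months
    washout_end = week_start(test_date)
    for c in complaint_dates:
        if washout_start <= c < washout_end:
            return False
    return True
```

Under these assumptions, a complaint recorded in the test week itself does not exclude a person, whereas the same complaint recorded in the preceding week does.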
Test criteria during our study period included everyone having symptoms or being a close contact of a person with confirmed or suspected SARS-CoV-2 infection. We could not include antigen testing; however, every positive antigen test was required to be followed up by a PCR test, all of which were captured in the current study.
Based on previous findings of increased health care use for 1 to 6 months after positive tests (5-7), we included follow-up data from primary care for four time points: baseline (test week) and 2, 4 and 6 months after the test date. All contacts that occurred between two time points were assigned to the later time point (i.e. all contacts from week 1 to week 7 were included in week 8 together with the contacts in week 8), to allow for clustering of contacts for different causes over time. We required at least six months of follow-up, i.e. the few persons who died or emigrated during follow-up were excluded, ensuring that everyone could be observed throughout the entire study period.
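The assignment of contacts to time points can be illustrated as follows. This is a sketch only: the text states explicitly that weeks 1 to 8 map to the 2-month time point, while the end weeks 16 and 24 for the 4- and 6-month time points, and the names used, are assumptions made here by analogy.

```python
# End week of each time point, keyed by months since the test date.
# Week 0 is the test week (baseline). Weeks 16 and 24 are assumed.
TIME_POINT_END_WEEK = {0: 0, 2: 8, 4: 16, 6: 24}


def assign_time_point(weeks_since_test):
    """Return the time point (in months) a contact is counted under.

    Contacts before the test week or after the last time point
    return None and are not counted.
    """
    if weeks_since_test < 0:
        return None
    for months, end_week in TIME_POINT_END_WEEK.items():
        if weeks_since_test <= end_week:
            return months
    return None
```

For example, a contact in week 5 is counted at the 2-month time point, and a contact in week 9 at the 4-month time point.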
The Ethics Committee of South-East Norway confirmed (June 4th 2020, #153204) that external ethical board review was not required. The data source (the emergency preparedness register for COVID-19, Beredt C19) was established and handled in accordance with the Health Preparedness Act §2-4 (11). All methods were carried out in accordance with relevant guidelines and regulations. No informed consent from participants was required since our study was based on routinely collected register data.
Study groups
The study sample was divided into three mutually exclusive study groups according to test status, as previously described (7):
- Persons testing positive for SARS-CoV-2, including everyone with one or more positive tests in the inclusion period. In the rare cases of several tests with a positive result, we chose the first one. Persons whose first positive test fell outside the inclusion period were excluded.
- Persons testing negative for SARS-CoV-2, including everyone with one or more negative tests in the inclusion period. If there were several tests with a negative result within or outside the inclusion period, we randomly chose one of the tests. Persons whose randomly drawn negative test fell outside the inclusion period were excluded. In this way, frequent and less frequent testers had the same probability of having a test during the inclusion period.
- Untested persons, including everyone who was never tested for SARS-CoV-2, either within or outside the study period, and who was assigned a random, hypothetical test date falling in the inclusion period (equal probability for each date).
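The choice of index test date in the three groups can be sketched as below. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the end of the inclusion period is treated as inclusive, and a seeded `random.Random` instance stands in for whatever random number generator was actually used.

```python
import random
from datetime import date, timedelta

INCLUSION_START = date(2020, 8, 1)
INCLUSION_END = date(2021, 8, 1)  # assumption: end date treated as inclusive


def in_inclusion_period(d):
    return INCLUSION_START <= d <= INCLUSION_END


def index_date_positive(positive_dates):
    """First positive test; None (excluded) if it falls outside the period."""
    first = min(positive_dates)
    return first if in_inclusion_period(first) else None


def index_date_negative(negative_dates, rng):
    """One negative test drawn at random among all of the person's negative
    tests; None (excluded) if the drawn test falls outside the period."""
    chosen = rng.choice(sorted(negative_dates))
    return chosen if in_inclusion_period(chosen) else None


def index_date_untested(rng):
    """Hypothetical test date, uniform over the days of the inclusion period."""
    n_days = (INCLUSION_END - INCLUSION_START).days + 1
    return INCLUSION_START + timedelta(days=rng.randrange(n_days))
```

Passing the generator explicitly, for example `rng = random.Random(2020)`, keeps the random draws reproducible.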
Outcomes: Operational definitions of long-covid
We studied medical symptoms and complaints recorded in primary care, i.e. diagnosed by general practitioners or by medical doctors at emergency wards, that fell into the following categories based on International Classification of Primary Care (ICPC-2) codes:
- Pulmonary complaints: shortness of breath/dyspnea (R02), cough (R05)
- Neurological complaints: impaired concentration, memory problems or brain fog (P20)
- General complaints: fatigue (A04, A05, A28, A29)
Based on these categorizations, we made seven operational definitions of typical post-covid complaints, i.e. we studied the single categories and all combinations of the three categories as separate outcomes, for each study time point: 1) Pulmonary complaints, 2) Neurological complaints, 3) General complaints, 4) Pulmonary + neurological complaints, 5) Pulmonary + general complaints, 6) Neurological + general complaints, and 7) Pulmonary + neurological + general complaints.
In this way, outcomes 1) to 3) were studied as outcomes occurring on their own ("single complaints"), whereas outcomes 4) to 7) were studied as outcomes occurring in combination ("combined complaints"). In the absence of an established definition of a post-covid or long-covid symptom or complaint, the operational definitions were made arbitrarily; however, they were to a large extent in accordance with the recently agreed WHO definition of the post-covid condition (9).
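The mapping from ICPC-2 codes to the seven operational definitions can be sketched as follows. The code lists and the numbering are taken from the text; the function names are hypothetical, and the sketch assumes that the seven outcomes are mutually exclusive at a given time point, which is how the distinction between single and combined complaints reads.

```python
# ICPC-2 codes per complaint category, as listed in the text.
CATEGORY_CODES = {
    "pulmonary": {"R02", "R05"},
    "neurological": {"P20"},
    "general": {"A04", "A05", "A28", "A29"},
}

# Outcome number for each combination of categories (numbering from the text).
OUTCOMES = {
    frozenset({"pulmonary"}): 1,
    frozenset({"neurological"}): 2,
    frozenset({"general"}): 3,
    frozenset({"pulmonary", "neurological"}): 4,
    frozenset({"pulmonary", "general"}): 5,
    frozenset({"neurological", "general"}): 6,
    frozenset({"pulmonary", "neurological", "general"}): 7,
}


def categories_present(codes):
    """Complaint categories covered by a set of ICPC-2 codes."""
    return frozenset(
        name for name, members in CATEGORY_CODES.items() if members & set(codes)
    )


def classify(codes):
    """Outcome number (1 to 7) for the codes recorded at one time point,
    or None if no code falls in any category."""
    return OUTCOMES.get(categories_present(codes))
```

Here `codes` is the set of all codes assigned to one time point, so a cough (R05) and fatigue (A04) recorded at different contacts within the same two-month window would count as outcome 5.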
Reporting of medical records to the national registries is mandated by law in Norway, ensuring no missing outcome data in our study. Norwegian health register data have been demonstrated to have high validity and reliability in a small comparative study of medical journal notes and medical records (12), i.e. they may be used for studying patterns of health care use and complaints leading to health care use. Because seeking healthcare was a requirement for our definitions, we allowed up to two months for complaints from different categories to be recorded and counted as overlapping, as described above.
Statistical analyses
First, at baseline and at 2, 4 and 6 months, we studied the prevalence of the typical post-covid complaints, according to the seven operational definitions above, in the group testing positive for SARS-CoV-2, in the group testing negative and in the untested group. To take into account the dependence between repeated observations of the same person, we determined the groupwise point prevalence of each outcome at each time point, with 95% confidence intervals (CI), from a logistic regression model with robust standard errors (clustered on patient). Second, to compare the prevalence of the complaints at baseline and the follow-ups (2, 4 and 6 months) between the exposure groups, we used a logistic regression model. In the model, the group (testing positive vs negative and testing positive vs untested), the time points (0, 2, 4 and 6 months), and their interaction were included as fixed effects, while the patient was included as a random effect. All regression analyses were adjusted for the following potential confounders: age, sex, education level (in four categories: no education, primary school, upper secondary school or college/university), country of birth (Norway vs abroad), the number of comorbidities (0, 1 or 2 or more, based on risk conditions for severe COVID-19 defined by an expert panel in ethics and prioritisation and identified in data from the Norwegian Patient Register) (13), the number of vaccine doses (0, 1 or 2 or more) and calendar month with year. A separate model was fitted for each outcome.
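For one group comparison (e.g. testing positive vs negative), the second model can be written out as below. The notation is introduced here for illustration only. The normally distributed random intercept is an assumption, since the text states only that the patient was included as a random effect. $Y_{it}$ indicates whether person $i$ has the outcome at time point $t \in \{0, 2, 4, 6\}$ months, $G_i$ indicates the group testing positive, and $\mathbf{x}_{it}$ collects the adjustment variables.

```latex
\operatorname{logit}\,\Pr(Y_{it}=1 \mid u_i)
  = \beta_0 + \beta_G\,G_i
  + \sum_{s \in \{2,4,6\}} \bigl(\beta_s + \beta_{Gs}\,G_i\bigr)\,\mathbb{1}[t=s]
  + \boldsymbol{\gamma}^{\top}\mathbf{x}_{it} + u_i,
\qquad u_i \sim \mathcal{N}(0,\sigma_u^2)
```

In this formulation, $\exp(\beta_G)$ is the adjusted odds ratio between the groups at baseline, and $\exp(\beta_G + \beta_{Gs})$ is the corresponding odds ratio at $s$ months, both conditional on the person-level effect $u_i$.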
Further, to assess whether the estimated group differences could be affected by a previous history of any of the defined complaints, we repeated the analyses with adjustment for medical records from general practitioners and/or emergency wards that were indicative of pulmonary, neurological and/or general complaints during 2017-19, using diagnostic codes as described above (0 (absent) vs 1 (present) for the complaint in question).
To further illustrate the changes in prevalence and overlap of the outcome definitions over time in the two exposure groups, we used proportional Venn diagrams. All analyses were performed in Stata MP v. 17.