Study design and controls: The study population was HCWs in the US. We used a case-control design and compared HCW deaths with three control groups: 1) non-HCW deaths, 2) HCW non-deaths, and 3) non-HCW non-deaths. A COVID-19 death was defined as a death occurring in a laboratory-confirmed COVID-19 case. A dichotomous variable (HCW: Yes/No) was used to separate HCWs from non-HCWs. The purpose of the first control group was to examine demographic and symptom-related differences between HCW deaths and non-HCW deaths. The two non-death control groups, non-deaths among HCWs and non-deaths among non-HCWs (the general population), were used to identify risk factors and symptomatology of COVID-19 deaths. HCW non-deaths are an ideal reference group because they control for important confounders, including occupation, education, and medical knowledge. The third control group, non-deaths among non-HCWs, represents the general healthy population.
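The case and control definitions above amount to a two-by-two classification on HCW status and death status. The following is a minimal Python sketch of that classification, given for illustration only (the analysis itself was done in R); the function name, group labels, and example records are hypothetical and do not come from the CDC dataset.

```python
from collections import Counter


def assign_group(is_hcw: bool, died: bool) -> str:
    """Map a record to the case group or one of the three control groups."""
    if is_hcw and died:
        return "case: HCW death"
    if died:
        return "control 1: non-HCW death"
    if is_hcw:
        return "control 2: HCW non-death"
    return "control 3: non-HCW non-death"


# Hypothetical records; the real dataset uses its own field names and coding.
example_records = [
    {"hcw": True, "death": True},
    {"hcw": False, "death": True},
    {"hcw": True, "death": False},
    {"hcw": False, "death": False},
    {"hcw": False, "death": False},
]

# Tally how many records fall into each group.
group_counts = Counter(assign_group(r["hcw"], r["death"]) for r in example_records)
```

Each comparison in the study then contrasts the case group with one control group at a time.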
Data acquisition: Information regarding COVID-19 confirmed cases, probable cases, and deaths across the US was obtained from the Restricted Access Dataset maintained by the US Centers for Disease Control and Prevention (CDC). This COVID-19 surveillance database includes patient-level data reported by all US states and territories. COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01) on April 5, 2020. COVID-19 data collection started in January 2020, and all states and territories were encouraged to enact laws for submitting case notifications to the CDC through their jurisdictions. At that time, the CDC also requested that public health departments around the country report all COVID-19 cases using the standardized case definitions for laboratory-confirmed or probable cases. These case reports have been routinely submitted using standardized case report forms ever since. This study covers the period from January 1, 2020 to December 31, 2020.
From this dataset, we obtained demographic and medical information for each record, including COVID-19 case status (confirmed or probable case), date of first positive specimen collection, gender, age group, race, ethnicity, county and state of residence, presence of underlying comorbidity or disease, presence of severe COVID-19 symptoms (pneumonia, acute respiratory distress syndrome [ARDS], abnormal chest x-ray, hospitalization status, intensive care unit [ICU] admission status, mechanical ventilation [MV]/intubation status, and death status), and presence of less severe symptoms (fever, subjective fever, chills, myalgia, rhinorrhea, sore throat, cough, shortness of breath, nausea/vomiting, headache, abdominal pain, and diarrhea). To prevent the release of data that could be used to identify persons, the CDC suppresses data cells for records with low reporting counts (< 5) and for uncommon combinations of demographic characteristics (sex, age group, race, ethnicity). Suppressed values are re-coded to the NA answer option.
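To make the suppression rule concrete, the following Python sketch shows one way a small-count rule of this kind could work. It is an assumption-laden illustration, not the CDC's actual procedure: the field names, the choice to count combinations across the whole input, and the decision to recode all four demographic fields together are hypothetical.

```python
from collections import Counter

# Hypothetical field names for the four demographic characteristics.
DEMOGRAPHIC_FIELDS = ("sex", "age_group", "race", "ethnicity")


def suppress_rare_combinations(records, threshold=5):
    """Return copies of the records with rare demographic combinations set to "NA".

    A combination is treated as rare when fewer than `threshold` records share it.
    The input records are not modified.
    """
    combo_counts = Counter(
        tuple(r[f] for f in DEMOGRAPHIC_FIELDS) for r in records
    )
    suppressed = []
    for r in records:
        combo = tuple(r[f] for f in DEMOGRAPHIC_FIELDS)
        out = dict(r)  # shallow copy so the caller's record is left intact
        if combo_counts[combo] < threshold:
            for f in DEMOGRAPHIC_FIELDS:
                out[f] = "NA"
        suppressed.append(out)
    return suppressed
```

Any downstream analysis then has to treat "NA" as missing rather than as a demographic category.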
Outcomes and predictors: The major health outcome of interest in this study is COVID-19-related death. Among all the deaths identified, 97.8% were from confirmed COVID-19 cases and 2.2% were from probable cases. When we examined temporal-spatial death trends, the fatality rate by county was defined as the total number of COVID-19-related deaths divided by the total number of confirmed COVID-19 cases in each US county. We used a total of 20 predictors, including demographic variables (gender, age group, race, ethnicity, month, and season), severe COVID-19 symptoms (hospitalization, admission to ICU, pneumonia, ARDS, abnormal chest x-ray, received MV/intubation, and death status), and less severe COVID-19-related symptoms, such as fever > 100.4 °F (38 °C), chills, muscle aches (myalgia), subjective fever (felt feverish), runny nose (rhinorrhea), sore throat, cough (new onset or worsening of chronic cough), and shortness of breath (dyspnea).
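The county-level fatality rate defined above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the record fields ("county", "status", "death") are hypothetical names, and counties with no confirmed cases are simply left out because the ratio is undefined for them.

```python
from collections import defaultdict


def county_fatality_rates(records):
    """Return {county: COVID-19-related deaths / confirmed cases} per county."""
    confirmed = defaultdict(int)
    deaths = defaultdict(int)
    for r in records:
        if r["status"] == "confirmed":
            confirmed[r["county"]] += 1
        if r["death"]:
            # Deaths are counted whether the case was confirmed or probable,
            # following the definition in the text.
            deaths[r["county"]] += 1
    # Counties without any confirmed case are skipped (zero denominator).
    return {county: deaths[county] / n for county, n in confirmed.items()}


# Hypothetical records for two counties.
example_records = [
    {"county": "A", "status": "confirmed", "death": True},
    {"county": "A", "status": "confirmed", "death": False},
    {"county": "A", "status": "confirmed", "death": False},
    {"county": "A", "status": "confirmed", "death": False},
    {"county": "B", "status": "confirmed", "death": False},
    {"county": "B", "status": "confirmed", "death": False},
]
rates = county_fatality_rates(example_records)
```

In this example, county A has one death among four confirmed cases and county B has none among two.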
Statistical analysis and confounders: We first compared all 20 predictor variables between the death and control groups using chi-square tests. We then fitted logistic regression models by regressing the outcome variable on each predictor while controlling for demographic confounders, which in this study were gender, age group, race, and ethnicity. In addition, we examined and compared the temporal trends of confirmed cases and deaths among HCWs and the general population. Finally, we examined the spatial variation in the fatality rate among HCWs across the country. All data cleaning and analyses were performed using R 3.5.1 (https://www.r-project.org/).
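As an illustration of the first analysis step, the sketch below computes a Pearson chi-square test for one binary predictor in a 2x2 table (death group versus one control group). It is written in Python with the standard library only, whereas the study used R, and it applies no continuity correction, so its result can differ from R's `chisq.test`, which applies Yates' correction to 2x2 tables by default. The function name and any counts passed to it are hypothetical; the confounder-adjusted logistic regression models are not sketched here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom) for the 2x2 table

                     predictor present   predictor absent
        deaths              a                   b
        controls            c                   d

    Returns (statistic, p_value) without continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```

A call such as `chi_square_2x2(10, 20, 20, 10)` returns the statistic and p-value for a table in which the predictor is present in 10 of 30 deaths and 20 of 30 controls.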