Study inclusion
The results of the search and selection process are presented in Figure 1. A total of 202 articles were identified: 40 were MedRxiv preprints and 162 were fully published articles from MEDLINE Complete (EBSCO) and PubMed. Two additional articles were identified from other sources, for example manual search. After title/abstract exclusion and removal of duplicates, 74 articles underwent full-text screening and 31 of these were included in the systematic review. Most articles were excluded because they did not present sufficient data to construct a 2 × 2 table, and 1 article was excluded because it was only available in Chinese. A total of 29 articles describing the results of 99 independent studies/data sets (19, 23 and 57 investigating LFIA, CLIA and ELISA respectively) were eligible for the meta-analysis.
Figure 1. PRISMA flow diagram for selection of articles for meta-analysis.
Characteristics of the studies
The general characteristics of the included articles are presented in Table 1. All the published articles (n=14) included in the review were published in 2020 because COVID-19 is an emerging disease. The 17 unpublished articles were MedRxiv preprints that had been submitted to different journals for publication. Twenty-five articles had a case-control design, comparing a group of well-defined cases with a group of healthy controls, controls with other diseases or COVID-19 rRT-PCR negative patients; only six were cross-sectional studies. One study had no control group and was excluded from the meta-analysis 27. Most of the studies (n=22) were conducted in China, where the COVID-19 pandemic began, and 3 were conducted in Italy, whilst the USA, UK, Denmark, Germany, Spain and Japan each contributed one study.
Most articles (n=26) included in the review clearly stated that the gold standard nucleic acid tests (rRT-PCR or deep sequencing) were used as the reference standard. However, five articles used a combination of epidemiological risk, clinical features, chest CT images and rRT-PCR. In one article the reference standard used was not stated but all the patients in the study were COVID-19 patients 28.
Point-of-care (POC) lateral flow immunoassays (LFIA) were used in 14 articles, CLIA in 9 articles and ELISA in 13 articles. We did not identify any articles using FIA that met our inclusion criteria. One study did not specify the serological assay used and was excluded from the meta-analysis 29. One study used a luciferase immunoprecipitation assay system (LIPS), which is performed in solution, thus maintaining the native antigen conformation 30. Most of the serological assay test kits were commercial (n=21) and 12 were in-house. Three SARS-CoV-2 antigens, spike protein (S), nucleocapsid protein (N) and envelope protein (E), were used together or separately in the studies included in the review. The spike protein and the nucleocapsid protein were used as the antigen in 9 articles and 6 articles respectively. Five articles used both S and N as separate antigens. In 3 articles, S and N antigens (S-N) were used together as the antigen. In 1 article, N and E antigens (N-E) were used together as the antigen. In 7 articles, the antigen used was not named.
Table 1: The general characteristics of the studies included in the review.
Study ID | Country | Antibody type | Antigen type | Commercial | Reference standard | Index test | Control group/comparison group
Kai-Wang To 27 P, CS | China | IgG and IgM | S and N | Inhouse | rRT-PCR | ELISA | No control
Cassaniti 40 P, CS | Italy | IgG, IgM and IgG-IgM | S | Commercial | rRT-PCR | POC LFIA | Patients with fever and respiratory syndrome/RT-PCR negative
Duchuan Lin 41 | China | IgG, IgM and IgG-IgM | N | Inhouse | Epidemiological risk/clinical features/rRT-PCR | CLIA | Healthy individuals and tuberculosis patients
Jie Xiang 42 | China | IgG, IgM and IgG-IgM |  | Commercial | rRT-PCR | ELISA and POC LFIA | Healthy individuals
Li Guo 13 P | China | IgM | N | Inhouse | Deep sequencing and rRT-PCR | ELISA | Adult patients with acute lower respiratory tract infections (ALRTIs)
Rui Liu 29 CS | China | IgM | N | Inhouse | rRT-PCR |  | COVID-19 rRT-PCR negative patients
Wanbing Liu 34 P | China | IgG, IgM and IgG-IgM | S and N | Commercial | rRT-PCR | ELISA | Healthy individuals
Xuefei Cai 43 | China | IgG, IgM and IgG-IgM | S | Inhouse | rRT-PCR | Peptide-based magnetic CLIA | Mixed diseases and healthy controls
Yunbao Pan 44 CS | China | IgG, IgM and IgG-IgM |  | Commercial | rRT-PCR | POC LFIA | COVID-19 rRT-PCR negative patients
Yujiao Jin 45 P | China | IgG, IgM and IgG-IgM | S-N | Commercial | rRT-PCR | CLIA | Patients with suspected SARS-CoV-2 infection but with negative rRT-PCR results
Zhao 46 P | China | IgG and IgM | S | Commercial | Chest CT images/Epidemiological history/Clinical diagnosis/rRT-PCR | ELISA | Healthy individuals
Zhengtu Li 47 P | China | IgG, IgM and IgG-IgM | S | Commercial | rRT-PCR | POC LFIA | Healthy individuals
Rongqing Zhao 28 | China | IgG-IgM | S | Inhouse | Not clear but all cases were confirmed COVID-19 patients | ELISA | Healthy individuals (samples collected before and during the COVID-19 pandemic)
Pingping Zhang 48 | China | IgG-IgM | S | Inhouse | rRT-PCR | POC LFIA | COVID-19 rRT-PCR negative patients
Paradiso 49 | Italy | IgG-IgM | S | Commercial | rRT-PCR | POC LFIA | Patients with COVID-19 disease orienting-symptoms but rRT-PCR negative
Huan Ma 50 | China | IgA, IgG, IgM, IgG-IgM and Ab | S and N | Inhouse | rRT-PCR | CLIA | Healthy individuals, COVID-19 suspected individuals and mixed disease group
Qian 15 | China | IgG and IgM | S-N | Commercial | rRT-PCR | CLIA | Healthy individuals and hospitalised individuals
Ling Zhong 36 P | China | IgG and IgM |  | Inhouse | rRT-PCR | CLIA and ELISA | Healthy individuals
Jiajia Xie 51 P, CS | China | IgG and IgM | E-N | Commercial | Chest CT images/Epidemiological history/Clinical diagnosis/rRT-PCR | CLIA | Clinically confirmed COVID-19 rRT-PCR negative patients
Infantino 52 P | Italy | IgG and IgM | S-N | Commercial | rRT-PCR | CLIA | Mixed diseases patients and blood donors pre-COVID-19
Adams 53 | UK | IgG, IgM and IgG-IgM | S | Inhouse (ELISA) and Commercial (LFIA) | rRT-PCR | ELISA and POC LFIA | Healthy blood and ICU cerebral organ donors before the COVID-19 pandemic
Lassaunière 54 | Denmark | IgA, IgG and Ab | S | Commercial | rRT-PCR | ELISA and POC LFIA | Healthy individuals and mixed diseases patients (including acute respiratory tract infections caused by other coronaviruses and non-coronaviruses)
Qiang Wang 55 P | China | IgG, IgM and IgG-IgM |  | Commercial | Chest CT images/Epidemiological history/Clinical diagnosis/rRT-PCR | ELISA and POC LFIA | COVID-19 clinical negative mixed diseases patients
Fei Xiang 42 P | China | IgG and IgM | N | Commercial | rRT-PCR | ELISA | Healthy blood donors or hospitalized patients with other diseases
Bin Lou 56 | China | IgG, IgM and Ab | S and N | Commercial | rRT-PCR | ELISA, CLIA and POC LFIA | Healthy individuals
Lei Liu 57 | China | IgG-IgM | N | Commercial | rRT-PCR | ELISA | Randomly-selected ordinary patients and healthy blood donors
Imai 58 | Japan | IgG, IgM and IgG-IgM |  | Commercial | rRT-PCR | POC LFIA | Non-COVID-19 patients (from April to October 2019)
Pérez-García 59 | Spain | IgG, IgM and IgG-IgM |  | Commercial | rRT-PCR | POC LFIA | Healthy individuals (samples collected before the COVID-19 pandemic)
Zhenhua Chen 60 P | China | IgG | N | Inhouse | rRT-PCR | POC LFIA | Clinically suspicious for the presence of anti-SARS-CoV-2
Dohla 61 P, CS | Germany | IgG, IgM and IgG-IgM |  | Commercial | RT-qPCR | POC LFIA | COVID-19 RT-qPCR negative patients
Burbelo 30 | USA | Ab | S and N | Inhouse | RT-PCR | LIPS | Subjects with COVID-19-like symptoms or household contacts of persons with COVID-19 (not tested by PCR), and blood donors who donated samples before 2018
Key
- Studies with P superscripts were published articles and without P superscripts were MedRxiv preprints.
- Studies with CS superscripts are cross sectional studies and without CS superscripts are case control studies.
- IgG-IgM means that either one of them or both were detected in serum.
- Ab means total antibodies.
Methodological quality of included studies
The methodological quality of the included studies for the IgG, IgM or IgG-IgM based LFIA, CLIA and ELISA, summarised across all studies, is shown in Figures 2b, 3b and 4b. Figures 2a, 3a and 4a show the risk of bias and applicability concerns for the individual LFIA, CLIA and ELISA studies respectively. None of the studies included in this review had low risk of bias in all four QUADAS-2 domains. Generally, case-control studies were at high risk of bias and of high concern in the patient selection and flow and timing domains, whereas cross-sectional studies were at low risk of bias and of low concern in all domains.
Patient selection domain
Most of the included studies were at high risk of bias and had high concerns regarding applicability. They were mostly case-control studies that did not include a consecutive or random series of participants, implying that the included patients are not representative of those seen in clinical use. All thirteen ELISA studies were at high risk of bias and had high concerns regarding applicability. For CLIA, all 9 included studies had high risk of bias and only 1 cross-sectional study had low applicability concerns. LFIA had more studies (n=4) with low risk of bias and low applicability concerns in the patient selection domain because there were 4 cross-sectional LFIA studies.
Index test domain
The LFIA studies had a high risk of bias (9/14) and high applicability concerns (12/14) in the index test domain. The high risk of bias was due to no blinding between the index test and the reference test. The high applicability concerns were due to tests using serum or plasma instead of whole blood which would make the test less amenable to use at the point of care. The CLIA and ELISA studies generally had a low risk of bias (6/9 and 8/13 respectively). This was because most studies were automated and had a pre-specified threshold (cut-off value to decide whether a test is positive or negative). The studies that had high risk of bias did not have a pre-specified threshold. Likewise, CLIA and ELISA studies generally had low applicability concerns in the index test domain (5/9 and 8/13 respectively) because they used commercial index tests.
Reference standard domain
In the reference standard domain, studies generally had a low risk of bias (10/14, 8/9 and 10/13 for LFIA, CLIA and ELISA respectively). Applicability concerns were also generally low (10/14, 8/9 and 11/13 for LFIA, CLIA and ELISA respectively).
Flow and timing domain
All the CLIA (n=9) and ELISA (n=13) studies were at high risk of bias in the flow and timing domain. These studies were all case-control studies. Most of the LFIA studies were also at high risk of bias; however, 4 cross-sectional LFIA studies were at low risk of bias.
Figure 2. LFIA methodological quality summary table and graph.
Figure 3. CLIA methodological quality summary table and graph.
Figure 4. ELISA methodological quality summary table and graph.
Quantitative synthesis and meta-analysis
Firstly, we considered the performance of the LFIA devices using rRT-PCR-confirmed cases as the reference standard. The forest plots in Figure 5 show the sensitivity and specificity ranges and the heterogeneity for the IgG, IgM and IgG-IgM based LFIA detecting COVID-19 across the included studies. Overall, the sensitivity varied widely across studies, in contrast to the specificity, which varied little except in 2 studies, Yunbao Pan, 2020 and Qiang Wang, 2020, which had the lowest and second lowest specificities respectively. Amongst the IgG based LFIA tests (n=17), the sensitivity estimates ranged from 0.14 (95% CI 0.09-0.21) (Imai, 2020) to 1.00 (95% CI 0.77-1.00) (Qiang Wang, 2020) and specificity estimates ranged from 0.41 (95% CI 0.21-0.64) (Yunbao Pan, 2020) to 1.00 (95% CI 0.97-1.00) (Bin Lou, 2020) (Figure 5a). For the IgM based LFIA tests (n=16), the sensitivity estimates ranged from 0.05 (95% CI 0.01-0.18) (Adams assay 4) to 1.00 (95% CI 0.77-1.00) (Qiang Wang, 2020) and specificity estimates ranged from 0.64 (95% CI 0.41-0.83) (Yunbao Pan, 2020) to 1.00 (95% CI 0.94-1.00) (Adams assays 4 and 5) (Figure 5b). For the IgG-IgM based LFIA tests (n=24), the sensitivity estimates ranged from 0.18 (95% CI 0.08-0.34) (Cassaniti, 2020) to 1.00 (95% CI 0.77-1.00) (Qiang Wang, 2020), with most of the studies having sensitivities over 0.55, and specificity estimates ranged from 0.36 (95% CI 0.17-0.59) (Yunbao Pan, 2020) to 1.00 (95% CI 0.94-1.00) (Adams assays 2 and 3) (Figure 5c).
We then considered the performance of the different IgG, IgM or IgG-IgM based CLIA tests using rRT-PCR-confirmed cases as the reference standard (Figures 6a, 6b and 6c). Considering any positive result (IgM positive, IgG positive or both), CLIA serological tests achieved sensitivities ranging from 0.48 (95% CI 0.29-0.68) (Yujiao Jin, 2020) to 1.00 (95% CI 0.79-1.00), with most studies between 0.80 and 1.00. The specificity was over 0.80 in most tests except for 2 tests, one IgG based and one IgM based, which had the lowest, 0.00 (95% CI 0.00-0.009), and second lowest, 0.15 (95% CI 0.06-0.30), specificities respectively.
Lastly, we evaluated the performance of the different IgG or IgM or IgG-IgM based ELISA tests using rRT-PCR-confirmed cases as the reference standard (Figures 7a, 7b and 7c). The sensitivities and specificities were generally high, ranging from 0.80 to 1.00 and 0.95 to 1.00 in most studies. For all the IgG based ELISA tests (n=10), the sensitivity estimates ranged from 0.65 (95% CI 0.57-0.72) (Zhao, 2020) to 1.00 (95% CI 0.79-1.00) (Kai-Wang To, 2020) and specificity estimates from 0.86 (95% CI 0.51-0.89) to 1.00 (95% CI 0.98-1.00) (Ling Zhong, 2020) (Figure 7a). In the IgM based tests (n=11), the sensitivity and specificity in the individual studies ranged from 0.44 (95% CI 0.32-0.58) (Jie Xiang, 2020) to 1.00 (95% CI 0.77–1.00) (Qiang Wang, 2020) and 0.69 (95% CI 0.57-0.80) (Qiang Wang, 2020) to 1.00 (95% CI 0.99–1.00) (Ling Zhong, 2020), respectively (Figure 7b). The sensitivity across the 5 studies included in the IgG-IgM based ELISA tests ranged from 0.80 (95% CI 0.74-0.85) (Wanbing Liu, 2020) to 0.87 (95% CI 0.77-0.94) (Rongqing Zhao, 2020). On the other hand, specificity across the 5 studies ranged from 0.97 (95% CI 0.92-0.99) (Lei Liu, 2020) to 1.00 (95% CI 0.98-1.00) (Rongqing Zhao, 2020) (Figure 7c).
Figure 5: Forest plot of sensitivity, specificity and heterogeneity of serological LFIA diagnosis of COVID-19. 5a Forest plot for the IgG LFIA. 5b Forest plot for the IgM based LFIA. 5c Forest plot for the IgG-IgM based LFIA.
Figure 6: Forest plot of sensitivity, specificity and heterogeneity of serological CLIA diagnosis of COVID-19. 6a Forest plot for the IgG CLIA. 6b Forest plot for the IgM based CLIA. 6c Forest plot for the IgG-IgM based CLIA.
Figure 7: Forest plot of sensitivity, specificity and heterogeneity of serological ELISA diagnosis of COVID-19. 7a Forest plot for the IgG ELISA. 7b Forest plot for the IgM based ELISA. 7c Forest plot for the IgG-IgM based ELISA.
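Each per-study estimate in the forest plots is derived from a single 2 × 2 table. As an illustration only, the Python sketch below computes sensitivity and specificity with exact (Clopper-Pearson) 95% confidence intervals. The interval method, the function names and the counts are assumptions made for this example; the review does not state how the intervals in Figures 5 to 7 were calculated.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iters=100):
    """Bisection on [0, 1] for a function f that decreases as p increases."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n trials."""
    # Lower limit solves P(X >= x | p) = alpha/2, i.e. P(X <= x-1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else _solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else _solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


def accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each as (estimate, lower, upper)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *exact_ci(tp, tp + fn)),
        "specificity": (spec, *exact_ci(tn, tn + fp)),
    }


# Hypothetical counts, not taken from any included study.
result = accuracy(tp=14, fp=2, fn=0, tn=48)
print(result)
```

When every case tests positive, the exact lower limit has the closed form (alpha/2)^(1/n), which gives a quick sanity check on the bisection.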
We also constructed SROC curves for all three antibody based serological tests (Figure 8). However, we did not calculate the area under the ROC curve (AUROC). From the SROC curves we visually assessed heterogeneity between the different tests. The diagonal line indicates a useless test, and the best tests clustered towards the top left-hand corner.
Figure 8: Summary ROC curves for the three antibody serological test groups. Every symbol reflects a 2 × 2 table, one for each test. One study may have contributed more than one 2 × 2 table. The curves are shown for the different test-types.
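To make the geometry of the SROC plot concrete, the sketch below maps a 2 × 2 table to its point in ROC space (false positive rate on the x-axis, sensitivity on the y-axis). It also computes Youden's index, which is zero on the diagonal and one in the top left-hand corner. Youden's index is introduced here purely for illustration and was not reported in the review; the function names and counts are hypothetical.

```python
def roc_point(tp, fp, fn, tn):
    """Return (false positive rate, sensitivity) for one 2 x 2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 1 - specificity, sensitivity


def youden_index(tp, fp, fn, tn):
    """Sensitivity + specificity - 1: 0 on the diagonal, 1 for a perfect test."""
    fpr, tpr = roc_point(tp, fp, fn, tn)
    return tpr - fpr


# Hypothetical tables, not taken from any included study.
uninformative = dict(tp=50, fp=50, fn=50, tn=50)  # lies on the diagonal
accurate = dict(tp=90, fp=5, fn=10, tn=95)        # near the top left corner

for name, table in (("uninformative", uninformative), ("accurate", accurate)):
    print(name, roc_point(**table), youden_index(**table))
```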
The bivariate model and the hierarchical summary receiver operating characteristic (HSROC) model were fitted to evaluate the diagnostic accuracy of the serological tests. The outputs of the meta-analysis (bivariate and HSROC parameter estimates, as well as the summary values of sensitivity and specificity) are presented in Table 2 and Figure 9. The pooled sensitivities for the IgG, IgM and IgG-IgM based LFIA tests were 0.5856, 0.4637 and 0.6886 respectively compared to rRT-PCR. The pooled sensitivities for the IgG and IgM based CLIA tests were 0.9311 and 0.8516 respectively compared to rRT-PCR. The pooled sensitivities for the IgG, IgM and IgG-IgM based ELISA tests were 0.8292, 0.8388 and 0.8531 respectively compared to rRT-PCR. All the tests had high specificities, ranging from 0.9693 to 0.9991 compared to rRT-PCR. The estimated SROC curves for the bivariate models are not presented.
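Bivariate models of this kind are usually formulated on the logit scale, so a summary sensitivity or specificity and its confidence interval are obtained by back-transforming the estimated mean logit. The sketch below shows that back-transformation only, not the model fit. It assumes the usual logit-normal formulation; the standard error is a made-up value and the function names are hypothetical.

```python
from math import exp, log


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1 / (1 + exp(-x))


def back_transform(mean_logit, se_logit, z=1.96):
    """Return (estimate, lower, upper) on the proportion scale."""
    return (
        inv_logit(mean_logit),
        inv_logit(mean_logit - z * se_logit),
        inv_logit(mean_logit + z * se_logit),
    )


# Start from the pooled IgG LFIA sensitivity quoted in the text (0.5856);
# the standard error of 0.3 is hypothetical and chosen only for illustration.
estimate, lower, upper = back_transform(logit(0.5856), se_logit=0.3)
print(f"{estimate:.4f} ({lower:.4f}-{upper:.4f})")
```

Because the interval is symmetric on the logit scale, it becomes asymmetric around the estimate once expressed as a proportion, and it can never fall outside 0 to 1.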
HSROC curves were also used to visually assess the overall performance and diagnostic accuracy of the tests, and to compare the diagnostic accuracy of the different tests used for diagnosing COVID-19 in the review (Figure 9). The overall diagnostic test accuracy was judged by the proximity of the curve to the top left corner, which represents high sensitivity and specificity: the closer the curve is to the upper left-hand corner, the better the diagnostic accuracy 31. Figure 9 suggests that ELISA and CLIA have better diagnostic accuracy than LFIA, and that IgG-IgM based ELISA tests have the best overall diagnostic accuracy. Importantly, however, the evidence base in this study was too weak to state definitively that one class of test was more accurate than another.
Table 2: Summary estimates of test accuracy.
Test type | Antibody type | Number of studies/tests | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Correlation
LFIA | IgG | 17 | 58.56 (43.97-71.79) | 98.96 (95.61-99.76) | -0.4454
CLIA | IgG | 9 | 93.11 (93.09-93.12) | 97.57 (97.57-97.58) | -0.511
ELISA | IgG | 10 | 82.92 (74.16-89.15) | 99.48 (96.75-99.92) | -0.1709
LFIA | IgM | 16 | 46.37 (30.16-63.39) | 97.34 (92.75-99.05) | -0.7925
CLIA | IgM | 10 | 85.16 (73.56-92.21) | 96.93 (85.5-99.41) | -0.7074
ELISA | IgM | 11 | 83.88 (73.07-90.9) | 99.91 (97.78-100) | -0.7247
LFIA | IgG-IgM | 24 | 68.86 (58.78-77.42) | 97.57 (94.66-98.92) | 0.1011
CLIA | IgG-IgM | 3 | - | - | -
ELISA | IgG-IgM | 5 | 85.31 (78.51-90.23) | 99.01 (92.87-99.87) | -0.6771
Figure 9: Hierarchical summary receiver operating characteristic (HSROC) curves obtained using OpenMeta-Analyst. Every circle represents the sensitivity and specificity estimates of an individual study in the meta-analysis, and the size of the circle reflects the sample size. The black dots indicate the summary points of sensitivity and specificity; the HSROC curve is the line passing through the summary point. The curve is the regression line that summarises the overall diagnostic accuracy. a HSROC for IgG serological tests; b HSROC for IgM serological tests and c HSROC for IgG-IgM serological tests. 1 LFIA HSROC, 2 CLIA HSROC and 3 ELISA HSROC.
We identified one study (Burbelo, 2020) reporting a total antibody (Ab) based luciferase immunoprecipitation assay system (LIPS) using N and S antigens, with sensitivities of 0.91 (95% CI 0.77-0.99) and 1.00 (95% CI 0.80-1.00) and specificities of 1.00 (95% CI 0.92-1.00) and 1.00 (95% CI 0.92-1.00) respectively. We also identified studies reporting other Ab based serological assays and IgA based serological assays, but their results are not reported in this review.
Heterogeneity investigations
Generally, high overall I^2 values above 85 %, which indicate high heterogeneity, were observed for both the sensitivities and specificities when we performed the antigen subgroup meta-analysis, with the exception of IgG-IgM based ELISA. IgG-IgM based ELISA had an overall sensitivity I^2 value of 52.12 %, which is considered moderate heterogeneity, and an overall specificity I^2 value of 0 %, which is considered low heterogeneity. However, it should be noted that only 5 studies were included in this subgroup meta-analysis. Overall I^2 values for sensitivity and specificity heterogeneity in the antigen type subgroup meta-analysis are shown in Table 3. We did not investigate heterogeneity for LFIA because most studies included in the review did not specify the type of antigen used in their serological tests.
Detailed results of heterogeneity for the different antigen type sensitivities and specificities for each test type and antibody type combination are presented in Additional file 2.
Table 3: Overall antibody type subgroup meta-analysis heterogeneity
Test type | Antibody type | Sensitivity I^2 | Specificity I^2
LFIA | IgG | - | -
LFIA | IgM | - | -
LFIA | IgG-IgM | - | -
CLIA | IgG | 93.56 % | 86.5 %
CLIA | IgM | 93.42 % | 95.17 %
CLIA | IgG-IgM | - | -
ELISA | IgG | 78.07 % | 84.97 %
ELISA | IgM | 85.47 % | 90.08 %
ELISA | IgG-IgM | 52.12 % | 0 %
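For readers unfamiliar with I^2, the sketch below computes it from Cochran's Q using the standard Higgins and Thompson definition, I^2 = max(0, (Q - df) / Q) × 100, applied to logit-transformed proportions. This is a generic illustration, not the procedure used to produce Table 3: the continuity correction of 0.5, the function names and the counts are all assumptions.

```python
from math import log


def logit_with_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A continuity correction of 0.5 is added to both cells (an assumption)
    so that proportions of 0 or 1 do not produce infinite logits.
    """
    a = events + 0.5
    b = total - events + 0.5
    return log(a / b), 1 / a + 1 / b


def i_squared(estimates, variances):
    """Higgins I^2 (in percent) from inverse-variance weighted Cochran's Q."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical (true positives, number of cases) pairs for three studies.
studies = [(10, 100), (50, 100), (90, 100)]
logits, variances = zip(*(logit_with_variance(x, n) for x, n in studies))
print(f"I^2 = {i_squared(logits, variances):.1f} %")
```

Identical study results give Q = 0 and therefore I^2 = 0 %, while widely scattered results such as the hypothetical ones above give values close to 100 %.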
Test sensitivity by time since onset of symptoms
Figure 10 shows forest plots of antibody positive rates for IgG (25 tests), IgM (22 tests) and IgG-IgM (30 tests), stratified by days from initial symptom onset to specimen collection. The sensitivity of the serological tests generally increased with time since symptom onset. Regardless of test method (ELISA, CLIA or LFIA), the sensitivities for IgG and IgM based tests were generally lowest in the first week (1-7 days) after symptom onset, higher in the second week (7-14 days) and highest in the third week or later (>14 days). Data on specificity stratified by time from symptom onset to specimen collection were not available for all the studies.
Figure 10: Forest plot of studies evaluating tests for detection of IgG, IgM and IgG-IgM according to days from COVID-19 symptom onset to specimen collection. The number of days from symptom onset to specimen collection is given in brackets (). Artron, Auto Bio and CTK Biotech are test names, all reported in the study by Lassaunière et al.