On 6 July 2020, the search retrieved 14,057 abstracts from 26,980 published articles. As expected, we observed an exponential increase in publications, unprecedented in recent scientific literature (Fig. 1). These articles were published mainly as Journal Articles (60.8%), Letters (17.09%), Editorials (6.84%), Reviews (6.51%) and Comments (2.03%). We excluded the 12,923 articles without an available abstract and applied the word and sentence tokenization methodology to the remainder. Then, using the countrycode R package [19], we calculated how many times each country was cited in the abstracts and in the article affiliations. The United States (43.59%), United Kingdom (16.63%), China (11.25%), Italy (5.71%) and Spain (5.35%) were the main sources of scientific literature. Together, these five countries accounted for about 82.53% of the articles analyzed.
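The country-counting step can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which relied on the countrycode R package: the `COUNTRY_ALIASES` lookup, the function names and the sample texts are all hypothetical stand-ins, and the alias list is deliberately tiny.

```python
import re
from collections import Counter

# Hypothetical alias-to-country lookup (assumption); a small stand-in for
# the name matching that the countrycode R package provides.
COUNTRY_ALIASES = {
    "united states": "United States",
    "usa": "United States",
    "united kingdom": "United Kingdom",
    "uk": "United Kingdom",
    "china": "China",
    "italy": "Italy",
    "spain": "Spain",
}


def count_country_mentions(texts):
    """Count how many times each country is mentioned across the texts."""
    # Try longer aliases first so multi-word names win over shorter ones.
    aliases = sorted(COUNTRY_ALIASES, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, aliases)) + r")\b", re.IGNORECASE
    )
    counts = Counter()
    for text in texts:
        for match in pattern.findall(text):
            counts[COUNTRY_ALIASES[match.lower()]] += 1
    return counts


def shares(counts):
    """Convert raw mention counts into percentages of all mentions."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {country: round(100 * n / total, 2) for country, n in counts.items()}


# Invented affiliation/abstract snippets, for illustration only.
sample_texts = [
    "Department of Medicine, Wuhan University, China",
    "A cohort from Italy and the USA",
    "London, United Kingdom; Milan, Italy",
]
mention_counts = count_country_mentions(sample_texts)
mention_shares = shares(mention_counts)
```

Counting mentions rather than distinct articles mirrors the wording "how many times a country was cited". A per-article count would instead deduplicate the matches within each text before updating the counter.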
The atomization process yielded 75,368 words/terms. Of these, 7,899 common words were excluded, leaving 67,469 words. The ten most cited terms are shown in Table 1. We then selected the 50 most recurrent words in the abstracts to continue the investigation (Supplementary Table 2). Our analysis suggests that the scientific focus, until now, has been on summarizing the main clinical symptoms of COVID-19. It is also possible to infer that many articles aimed to describe the spread of the virus. The remaining scientific efforts addressed the transmission, prevention, treatment, health care management and diagnosis of SARS-CoV-2 and COVID-19.
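As a rough illustration of the atomization and ranking described above, the following Python sketch tokenizes abstracts, drops common words and returns the most frequent terms. The stop-word list, the tokenization rule and the sample abstracts are assumptions made for this example and do not reproduce the study's actual word lists.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); the study excluded
# 7,899 common words, which are not listed in the text.
STOP_WORDS = {"the", "of", "and", "in", "a", "to", "with", "is", "for", "was", "were"}


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hyphenated terms such as "covid-19" or "sars-cov-2" stay whole.
    """
    return re.findall(r"[a-z]+(?:-[a-z0-9]+)*", text.lower())


def top_terms(abstracts, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop-word terms with their counts."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(t for t in tokenize(abstract) if t not in stop_words)
    return counts.most_common(n)


# Invented abstracts, for illustration only.
sample_abstracts = [
    "The disease is severe.",
    "Severe disease in the pandemic.",
    "Disease transmission",
]
ranking = top_terms(sample_abstracts, n=2)
```

Calling `top_terms(abstracts, n=50)` would correspond to the selection of the 50 most recurrent words, under the same assumptions.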
Table 1
The ten most cited words in the COVID-19 literature, globally and by category.
Global | n | Diagnosis | n | Treatment | n | Epidemiology | n | Transmission | n | Signs | n |
disease | 11738 | disease | 776 | treatment | 1935 | disease | 377 | transmission | 991 | disease | 3192 |
pandemic | 10117 | diagnosis | 700 | disease | 1660 | clinical | 308 | disease | 616 | clinical | 2864 |
health | 9418 | clinical | 649 | clinical | 1366 | health | 264 | infection | 520 | pandemic | 2599 |
infection | 7760 | infection | 566 | pandemic | 1256 | infection | 241 | health | 519 | health | 2288 |
clinical | 7598 | pandemic | 548 | severe | 1116 | epidemiological | 236 | pandemic | 455 | infection | 2118 |
respiratory | 6821 | respiratory | 460 | infection | 1069 | pandemic | 235 | respiratory | 400 | severe | 1954 |
severe | 6767 | treatment | 458 | care | 958 | respiratory | 203 | during | 390 | respiratory | 1838 |
care | 6453 | severe | 444 | respiratory | 920 | severe | 197 | virus | 382 | during | 1713 |
during | 6242 | study | 415 | health | 879 | study | 188 | risk | 380 | care | 1668 |
risk | 5311 | during | 406 | during | 812 | risk | 145 | study | 282 | study | 1663 |
Since our platform is based on published data, we do not report available preprint articles. We chose to analyze and share literature that had undergone a strict review process, thus reporting validated findings.
Based on the global word tokenization/atomization of the analyzed abstracts, we sorted the studies into five categories. In total, 1,999, 4,260, 1,038, 1,834 and 8,584 abstracts were classified as Diagnosis, Treatment, Epidemiology, Transmission and Clinical & Signs & Symptoms, respectively (Supplementary Fig. 1). Twenty-eight articles met all five criteria simultaneously (Fig. 2), and 3,374 abstracts were not categorized.
Diagnosis studies have focused on the clinical diagnosis of acute symptoms, mainly respiratory ones. The terms "PCR" and "qPCR" were rarely found in the abstracts. Curiously, molecular diagnosis was seldom cited and, consequently, seldom discussed. We draw attention to this point because molecular and antibody detection tests (qPCR and ELISA/CLIA, respectively) are considered the gold standard for diagnosis.
Treatment studies focused on the clinical treatment of severe acute respiratory syndrome and pneumonia. Health care management was frequently mentioned. The use of antivirals was suggested, but no specific drug stood out as relevant. The words "therapy", "drugs", "trials" and "effective" indicate that investigations into forms of treatment are currently being conducted. Nevertheless, we implemented a Panel Drug at PlatCOVID that lists all cited drugs.
Epidemiology studies have focused on the clinical and infection features of the disease, as well as on transmission risks. Epidemiological data on pneumonia status seem to be relevant to medical prevention and treatment during the COVID-19 pandemic.
Transmission studies have reported how the disease is transmitted by respiratory routes. The terms "transmission", "disease" and "infection" were highly cited in the abstracts, suggesting that forms of infection play an important role in epidemic transmission.
Articles categorized as Clinical & Signs & Symptoms were the most abundant in the general analysis. In detail, these studies discussed severe acute respiratory conditions and pneumonia symptoms in the infected group; "acute", "pneumonia" and "lung" were common terms used to describe the patients' clinical condition.
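The categorization reported above is multi-label: one abstract can fall into several categories, into all five, or into none. A minimal Python sketch of such keyword-based classification follows. The keyword sets, function names and sample abstracts are hypothetical, since the text does not state the terms the platform actually uses.

```python
import re

# Hypothetical keyword sets (assumption); the actual category terms used
# by the platform are not given in the text.
CATEGORY_KEYWORDS = {
    "Diagnosis": {"diagnosis", "diagnostic"},
    "Treatment": {"treatment", "therapy", "drugs"},
    "Epidemiology": {"epidemiology", "epidemiological"},
    "Transmission": {"transmission"},
    "Clinical & Signs & Symptoms": {"clinical", "symptoms", "signs"},
}


def categorize(abstract):
    """Return the set of categories whose keywords occur in the abstract."""
    tokens = set(re.findall(r"[a-z]+", abstract.lower()))
    return {cat for cat, words in CATEGORY_KEYWORDS.items() if tokens & words}


def summarize(abstracts):
    """Count abstracts per category, in all categories, and in none."""
    per_category = {cat: 0 for cat in CATEGORY_KEYWORDS}
    in_all = 0
    uncategorized = 0
    for abstract in abstracts:
        cats = categorize(abstract)
        for cat in cats:
            per_category[cat] += 1
        if len(cats) == len(CATEGORY_KEYWORDS):
            in_all += 1
        elif not cats:
            uncategorized += 1
    return per_category, in_all, uncategorized


# Invented abstracts, for illustration only.
sample_abstracts = [
    "Clinical symptoms and treatment of severe pneumonia.",
    "Diagnosis, treatment, epidemiology, transmission and clinical signs.",
    "Economic impact of lockdowns.",
]
per_category, in_all, uncategorized = summarize(sample_abstracts)
```

Because the labels overlap, the per-category counts need not sum to the number of abstracts, which is consistent with the category totals reported here exceeding the number of abstracts analyzed.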
Moreover, the most frequent terms (Table 1) indicate the importance of determining the clinical aspects of the infection. Taken together, these findings suggest that the primary scientific response during the pandemic has focused on reporting the main clinical signs and symptoms, so that this information can inform appropriate treatment and patient management. Nevertheless, a new perspective on molecular treatment and diagnosis will be critical to face COVID-19.
Translating scientific language is a continuing challenge. Public perception of science, together with the fake news circulating with dramatic frequency in the media and social networks, can distort the real meaning of scientific evidence. Thus, we implemented a Web platform dedicated to the COVID-19 scientific literature that is able to automatically analyze, classify and highlight the important information in published articles.