Preprints are intended to accelerate the scientific community's access to preliminary data, mainly so that authors can receive rapid feedback before entering the peer-review process that most indexed journals require for publication (Mudrak, 2020). MedRxiv and bioRxiv, two widely known preprint servers (Else, 2019), display a disclaimer on their homepages that states: “these are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information”.
Scientific production has increased drastically due to the COVID-19 pandemic. Approximately 900 articles, including published works and preprints, were published prior to March 12, 2020 (Callaway et al., 2020). A significant proportion of the preprint surge could be a result of scholarly journals requiring that works submitted for peer review be made simultaneously available to the public on a preprint server (Fidahic et al., 2020).
Although we identified a considerably shorter time to acceptance during the COVID-19 crisis in the group of journals we analyzed, the publication rate for the preprints included in our sample was only 23.7%. A recent preliminary analysis estimated that the average turnaround time for scholarly journals during the COVID-19 pandemic was 60 days (Horbach, 2020). Measured from the date of posting on a preprint server, our sample had a median follow-up of 119 days (min: 87, max: 150) to determine publication status. Because this follow-up is roughly twice the estimated average turnaround time, we consider it sufficient to identify the majority of preprints that would eventually be published, and it is unlikely that the publication rate in our sample will change significantly.
By comparison, in previous infectious disease outbreaks such as Zika and Ebola, the publication rates of preprints in peer-reviewed journals were approximately 60% and 48%, respectively. However, only 174 preprints were posted during the Zika outbreak (Nov 2015 to Aug 2017), and 75 preprints were posted during the Ebola outbreak (May 2014 to Jan 2016) (Johansson et al., 2018). As of June 28, 2020, 4651 preprints related to COVID-19 had been posted on medRxiv and 1189 on bioRxiv, evidencing the drastic increase in preprint production during the COVID-19 pandemic. In fact, the scientific community has never before produced so much non-peer-reviewed data (Dinis, 2020).
In our sample, preprints that were eventually published in a scholarly journal had a significantly higher number of citations than preprints that remained unpublished. Even though we did not directly evaluate the quality of the preprints, the number of citations is an indicator of scientific impact, which is one of the components of the concept of scientific quality (Aksnes et al., 2019). Thus, considering the publication rate and the lower citation count we identified in our sample, we could assume that some of these preprints may not have the quality needed to withstand the scrutiny of a peer-review process and be published in a scholarly journal. Further studies that directly assess the quality of preprints posted during the COVID-19 pandemic are required.
We did not find significant differences in terms of metrics when we compared unpublished preprints with preprints that were subsequently published in journals. This raises the concern that the scientific community and the general population may read and share preprints that are not of sufficient quality to pass a peer-review process.
We found that half of the preprints that were subsequently published had significant modifications in the results section. This suggests that preprints can change substantially after peer review, and it raises concerns about the possibility of significant errors in the data analysis of preprints that are never peer-reviewed and published, as previously reported (Else, 2019; Heimstädt, 2020).
Some preprints might contain essential and time-sensitive information. For example, one study showed that the basic reproduction number, R0, calculated using data available in preprints did not differ from the one estimated in peer-reviewed articles (Majumder & Mandl, 2020), and preprints on the viral sequence and structure have allowed for early investigation of potential therapeutic options and vaccines (Brainard, 2020; Kwon, 2020).
While there is widespread agreement that preprints could be useful in the current context, there are significant risks associated with the potential spread of faulty data without appropriate third-party screening (Dinis, 2020). The lack of a peer-review process may have important implications, because the basic screening employed by preprint servers may not be enough to prevent the dissemination of flawed information (Rawlinson & Bloom, 2019). For example, a preprint posted on bioRxiv suggested significant molecular similarities between SARS-CoV-2 and HIV (Kwon, 2020). This preprint was later withdrawn from the server; however, by the time that happened, it had already sparked controversy and conspiracy theories. Of concern, we found that during the COVID-19 pandemic multiple preprints have been used in the development of clinical guidelines and public health policies (Bhimraj et al., 2020; Heimstädt, 2020; Majumder & Mandl, 2020; Nicolalde et al., 2020).
Although peer review aims to be an exhaustive and thorough process that improves the quality of a manuscript, articles published in a peer-reviewed journal should not be taken as irrefutable knowledge. To illustrate this, two peer-reviewed articles have recently been withdrawn from two prestigious journals due to significant concerns about the validity of their primary data (M. Mehra et al., 2020; M. R. Mehra et al., 2020).
The main limitations of our study are that we included only preprints on pharmacological interventions against COVID-19 and that we used only medRxiv and bioRxiv as preprint servers to obtain our sample. However, given the length of follow-up in our study, it seems unlikely that the publication rate would be altered significantly.