The outbreak of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first reported in Wuhan, China, in December 2019. Here, we report a timely and comprehensive resource named iCTCF that archives 256,356 chest computed tomography (CT) images, 127 types of clinical features (CFs), and laboratory-confirmed SARS-CoV-2 clinical status from 1170 patients, reaching a data volume of 38.2 GB. To facilitate COVID-19 diagnosis, we integrated the heterogeneous CT and CF datasets and developed a novel framework, Hybrid-learning for UnbiaSed predicTion of COVID-19 patients (HUST-19), to predict negative cases, mild/regular patients, and severe/critically ill patients. Although both CT images and CFs are informative in predicting whether patients have COVID-19 pneumonia, integrating the CT and CF datasets achieved a striking accuracy, with an area under the curve (AUC) value of 0.978, much higher than that obtained using either CT (0.919) or CF data (0.882) alone. Together with HUST-19, iCTCF can serve as a fundamental resource for improving the diagnosis and management of COVID-19 patients.
Authors Wanshan Ning, Shijun Lei, Jingjing Yang, and Yukun Cao contributed equally to this work.