The worldwide outbreak of COVID-19 has become a global pandemic, resulting in millions of confirmed cases and hundreds of thousands of deaths. In facing this crisis, bioinformatics has played a key role in the diagnosis, follow-up, prognosis and treatment of COVID-19 patients.
This paper proposes a novel bioinformatic tool for metagenomic analysis of whole genomes, composed of three projections: global, clustering and genomic index. For each projection, the key modules are described. The global projection provides various combinatorial distributions for a whole genome of length N; the m-mer scheme partitions this sequence into M segments and maps them onto 1D, 2D and 3D density matrices for multiple projections. The clustering projection applies special filters to the distributions from the global projection to extract specific parts as probability eigenvalues.
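As a rough illustration of the m-mer partition, the sketch below splits a sequence of length N into M = floor(N / m) segments, counts each base per segment, and accumulates a 1D distribution of those counts. The function names, the non-overlapping segmentation, and the dropping of any trailing remainder are assumptions made for illustration; they are not taken from the tool itself.

```python
from collections import Counter

BASES = "ACGT"

def segment_counts(sequence, m):
    """Split `sequence` into non-overlapping segments of length m
    and return one (A, C, G, T) count tuple per segment."""
    seq = sequence.upper()
    n_segments = len(seq) // m  # M; any trailing remainder is dropped (assumption)
    counts = []
    for i in range(n_segments):
        segment = seq[i * m:(i + 1) * m]
        tally = Counter(segment)
        counts.append(tuple(tally.get(b, 0) for b in BASES))
    return counts

def count_histogram(counts, base_index, m):
    """1D distribution: number of segments holding k copies of one base, k = 0..m."""
    hist = [0] * (m + 1)
    for c in counts:
        hist[c[base_index]] += 1
    return hist

# Toy sequence with N = 12 and m = 4, giving M = 3 segments.
counts = segment_counts("ACGTACGTAACC", 4)
hist_a = count_histogram(counts, 0, 4)  # distribution of the A count per segment
```

Pairing the counts of two or three bases per segment would give the 2D and 3D analogues of this histogram, again under the same assumptions.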
The genomic index projection provides comprehensive measures grounded in the theory of information entropy, including combinatorial entropy (CE), integrated entropy (IE), mean entropy (ME) and topological entropy (TE). Together, the three projections provide unified information to describe complicated functions, internal structures and refined variations across multiple groups of SARS-CoV-2 variants and other genomes used for comparison.
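The definitions of CE, IE, ME and TE are not given in this section, so the sketch below uses the plain Shannon entropy of a count distribution as a generic stand-in for an entropy-based index. The function name and the choice of base-2 logarithm are assumptions for illustration only.

```python
import math

def shannon_entropy(hist):
    """Shannon entropy in bits of a histogram of non-negative counts."""
    total = sum(hist)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in hist:
        if count > 0:  # empty bins contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

# A uniform distribution over four bins carries 2 bits;
# a distribution concentrated in one bin carries 0 bits.
uniform_bits = shannon_entropy([1, 1, 1, 1])
single_bin_bits = shannon_entropy([4, 0, 0])
```

A scalar of this kind, computed per genome from its segment distributions, is one way a single index value could be placed on a map for comparison.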
The outputs of the three projections are illustrated on variant maps to support categorization, clustering, classification and root-establishing activities for refined quantitative operations under a bottom-up strategy.