COVID-19 patients worldwide are conveniently described by the locations at which samples are collected, and modern GIS maps are useful for showing the spread and the number of patients across the regions affected by a pandemic. From an analytical viewpoint, it is more interesting to organize genomic information into a phylogenetic tree with multiple branches and leaves. Clusters of genomes are organized as phylogenetic trees to represent the intrinsic information of the genomes. However, there are structural difficulties in projecting phylogenetic information naturally onto 2D distributions such as GIS maps.
Among advanced schemes for generating phylogenetic trees, approaches based on information entropy offer favorable properties for global constructions: minimal computational complexity together with greater flexibility, stability, reliability, and quality.
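As a minimal illustration of the kind of entropy quantity referred to here, the sketch below computes the Shannon entropy of the base composition of a sequence. It is an assumption for illustration only, not the measure defined in this paper; the function name `shannon_entropy` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq, alphabet="ACGT"):
    """Shannon entropy (in bits) of the base composition of `seq`.

    Hypothetical helper: characters outside `alphabet` (e.g. 'N') are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in alphabet)
    if total == 0:
        return 0.0
    entropy = 0.0
    for b in alphabet:
        if counts[b]:
            p = counts[b] / total
            entropy -= p * math.log2(p)
    return entropy


# Toy sequences, not real genomes.
uniform = shannon_entropy("ACGT")   # all four bases equally frequent
two_base = shannon_entropy("AACC")  # only two bases, equally frequent
single = shannon_entropy("AAAA")    # a single base
```

The computation is a single pass over the sequence, so its cost grows linearly with genome length.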
In this paper, a novel projection is proposed that arranges SARS-CoV-2 genomes by genomic indexes, giving a structural organization in the form of 2D GIS-like maps. For any genome, there is an invariant that is unique under certain conditions and provides an absolute position within a specific region. In this hierarchical framework, a visual tool can be used to represent any selected region and to cluster genomes at finer resolution. When a diversity measure is applied to a given set of genomes, genomic index maps and phylogenetic trees provide equivalent clusters and complementary visual effects. Sample genomes of three new UK lineages are aligned by BLAST on both RNA-dependent RNA polymerase (RdRp) segments and whole genomes. Selected regions and various projections show the spread of five thousand SARS-CoV-2 genomes from 72 countries on both RdRp segments and whole genomes, and six countries/regions are selected for display on genomic index maps.
Based on genomic index maps, a single SNV between two genomes of the B.1.1.7 lineage can be identified, from a unit of 10^4 in the probability measure to a unit of 10^6 in the difference of genomic indexes, on a special ‘G’ projection that extracts the finest variation.
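To give a feel for how small the effect of one SNV is on a base-level probability measure, the sketch below compares the frequency of ‘G’ in two toy sequences that differ at a single position. This is an illustrative assumption, not the genomic index or the ‘G’ projection defined in this paper: the helper `base_probabilities` is hypothetical, and the 30,000-base toy sequence only approximates the length of a SARS-CoV-2 genome.

```python
from collections import Counter


def base_probabilities(seq):
    """Relative frequency of each base in `seq` (hypothetical helper)."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


# Toy "genome" of 30,000 bases (assumed length, not real data).
genome_a = "ACGT" * 7500

# Same sequence with a single A -> G substitution at the first position.
genome_b = "G" + genome_a[1:]

p_a = base_probabilities(genome_a)
p_b = base_probabilities(genome_b)

# Shift in the probability of 'G' caused by the single substitution.
delta_g = p_b["G"] - p_a["G"]
print(f"P(G) before: {p_a['G']:.6f}, after: {p_b['G']:.6f}, shift: {delta_g:.2e}")
```

With one substitution in 30,000 bases the shift is 1/30,000, which shows why a projection needs fine resolution to separate two genomes that differ by a single variant.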
Further exploration of optimal classification and phylogenetic analysis, using genomic index maps and phylogenetic trees of SARS-CoV-2 genomes worldwide, is discussed.