Background: In microbiome data analysis, unsupervised clustering is often used to identify naturally occurring clusters, which can then be assessed for associations with characteristics of interest. In this work, we systematically compared beta diversity and clustering methods commonly used in microbiome analyses. We applied these to four published datasets where highly distinct microbiome profiles could be seen between sample groups.
Results: Although no single method consistently outperformed the others, we identified key scenarios in which certain methods underperform. Specifically, the Bray-Curtis metric resulted in poor clustering in a dataset where high-abundance OTUs were relatively rare. In contrast, the unweighted UniFrac metric clustered poorly when used on a dataset with a high prevalence of low-abundance OTUs. To test these explanations, we systematically modified the relevant properties of the poorly performing datasets and found that doing so improved Bray-Curtis and unweighted UniFrac performance. Based on these observations, we rationally combined the Bray-Curtis and unweighted UniFrac metrics and found that this new beta diversity metric showed high performance across all datasets. We also evaluated our findings on a clinical dataset in which clusters are less well separated.
Conclusions: Our systematic evaluation of clustering performance in these five datasets demonstrates that no existing clustering method performs best across all datasets. We propose a combined metric of Bray-Curtis and unweighted UniFrac that capitalizes on the complementary strengths of the two metrics.
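
The abstract does not state how the two metrics are combined, so the following is only a minimal sketch of one plausible approach: a weighted average of two pairwise distance matrices computed on the same samples. The function names `bray_curtis` and `combine_distances`, the `weight` parameter, and the example UniFrac values are all hypothetical and are not taken from the paper. Unweighted UniFrac requires a phylogenetic tree, so it is assumed here to be precomputed elsewhere.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity for a samples x OTUs count table."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            total = counts[i].sum() + counts[j].sum()
            diff = np.abs(counts[i] - counts[j]).sum()
            # Two empty samples are treated as identical (distance 0).
            dist[i, j] = dist[j, i] = diff / total if total > 0 else 0.0
    return dist

def combine_distances(d_a, d_b, weight=0.5):
    """Weighted average of two distance matrices (assumed combination rule)."""
    d_a = np.asarray(d_a, dtype=float)
    d_b = np.asarray(d_b, dtype=float)
    if d_a.shape != d_b.shape:
        raise ValueError("distance matrices must have the same shape")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * d_a + (1.0 - weight) * d_b

# Toy count table: 3 samples x 3 OTUs (illustrative values only).
counts = [[10, 0, 5],
          [0, 10, 5],
          [10, 0, 5]]
bc = bray_curtis(counts)

# Placeholder for a precomputed unweighted UniFrac matrix (made-up values).
unifrac = np.array([[0.0, 0.4, 0.0],
                    [0.4, 0.0, 0.4],
                    [0.0, 0.4, 0.0]])

combined = combine_distances(bc, unifrac, weight=0.5)
```

Because both Bray-Curtis and unweighted UniFrac take values in [0, 1], the averaged matrix stays in the same range and can be passed to any clustering method that accepts a precomputed distance matrix.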