Study participants
Eighty-one participants recruited within B-PREDICT provided a FIT tube and a stool nucleic acid collection and preservation tube (Norgen). The median age of patients was 63.4 years and the median BMI was 27.6. Additionally, five healthy volunteers were recruited with a median age of 30.8 years and a median BMI of 23.1 (Table 1).
Table 1 Characteristics of study participants
Norgen and FIT samples produce similar numbers of reads and sequences
The denoised reads contained 6,097 ASVs. Taxonomic classification of all ASVs yielded 241 species, 240 genera, 80 families, 47 orders, 21 classes and 14 phyla, with varying proportions of reads classified at each taxonomic rank (Fig. S1). The median richness was 263 ASVs for FIT samples and 265 ASVs for Norgen samples (Fig. 1D) and the median number of reads after filtering was 60,283 for FIT and 59,266 for Norgen (Fig. 1E).
Of the identified ASVs, 1,029 (16.9%) were detected in more than 5% of the samples. The median prevalence of these ASVs (i.e. the percentage of samples in which an ASV was detected) was 11.5% in FIT samples and 12.5% in Norgen samples (Fig. 1F).
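The prevalence filter described above can be illustrated with a minimal sketch. The count table is invented toy data, and the function name `prevalence` is hypothetical; the sketch only assumes that an ASV counts as detected in a sample when its read count is non-zero.

```python
import numpy as np

# Toy count table (rows = samples, columns = ASVs); values are made up.
counts = np.array([
    [10, 0, 3, 0],
    [5, 0, 0, 0],
    [8, 2, 0, 0],
    [0, 0, 1, 0],
])


def prevalence(counts):
    """Percentage of samples in which each ASV has a non-zero count."""
    return 100.0 * (counts > 0).mean(axis=0)


prev = prevalence(counts)            # per-ASV prevalence in percent
keep = prev > 5.0                    # ASVs detected in more than 5% of samples
median_prev = np.median(prev[keep])  # median prevalence of the retained ASVs
```

Applying the same function separately to the FIT rows and the Norgen rows would give per-sample-type medians of the kind reported above.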
Average CLR abundances similar between FIT and Norgen
The average CLR abundances of ASVs are highly similar between the FIT and the Norgen samples (Fig. 1A). Among the ten ASVs with the largest differences between sample types, seven are more abundant in FIT samples. Of these, the largest difference is observed for an ASV belonging to the genus Escherichia-Shigella of the phylum Proteobacteria, and the rest belong to the genera Enterococcus, Lactococcus, Streptococcus and Leuconostoc. Of the three ASVs with higher abundance in Norgen samples, one belongs to the genus Oscillibacter and two could not be classified at the genus rank. Complete results, including the comparisons at each taxonomic rank, are available in Table S1 and Fig. S1.
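For readers unfamiliar with the centred log-ratio (CLR) transform, the comparison of average CLR abundances can be sketched as follows. The counts are toy values, and the pseudocount of 0.5 used to handle zeros is an assumption made for illustration only; this section does not state how zeros were treated.

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of a samples x ASVs count table.

    The pseudocount is an illustrative assumption for handling zeros.
    """
    logx = np.log(counts + pseudocount)
    # Subtract each sample's mean log abundance (log of its geometric mean).
    return logx - logx.mean(axis=1, keepdims=True)


# Toy paired count tables (rows = subjects, columns = ASVs).
fit = np.array([[120, 30, 0, 50], [90, 45, 5, 60]])
norgen = np.array([[100, 40, 2, 58], [80, 50, 8, 62]])

# In this sketch, positive values mean higher average CLR abundance in FIT.
mean_diff = clr(fit).mean(axis=0) - clr(norgen).mean(axis=0)

# ASV indices ordered by decreasing absolute difference between sample types.
top = np.argsort(-np.abs(mean_diff))
```

Because every CLR-transformed sample sums to zero, the differences describe abundance relative to each sample's geometric mean rather than absolute read counts.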
ICCs positively associated with abundance of ASVs
The ASV-specific ICCs between the FIT and Norgen samples of the patients (Fig. 1B) are positively associated with the summed log abundances of the ASVs. Low summed log abundances are in many cases accompanied by low ICCs and wide confidence intervals, which indicates that the estimates lack precision for many of the rarer ASVs. Overall, the first quartile of the ICCs is 0.759, the median is 0.892 and the third quartile is 0.951. A common interpretation is that an ICC above 0.75 indicates good reliability and an ICC above 0.9 indicates excellent reliability (19). Fig. S1 provides visualizations of this analysis for taxonomic ranks from species to phylum, and Table S2 contains complete ICC estimates and confidence intervals for all bacterial taxa and ASVs. These results confirm an association between abundance and reliability. They also indicate higher reliability at higher ranks, probably because higher ranks involve deeper aggregation and higher proportions of classified reads.
ICCs were also estimated based on the volunteer samples, which consisted of triplicates for each sample type. The “between FIT and Norgen” ICCs were therefore calculated based on the means of the respective samples. The triplicates also allowed the estimation of ICCs within the FIT and within the Norgen samples (Fig. S2). However, these were calculated by treating the triplicate samples as three separate random raters, which resulted in lower and less stable estimates than the “between FIT and Norgen” ICCs and rules out a direct comparison of these results. Nevertheless, this analysis shows that even separate stool samples of the same sample type and from the same subject contain noteworthy heterogeneity.
The ICC estimates for the alpha and beta diversities and their confidence intervals are shown in Fig. 1C. The Shannon, Simpson and Inverse Simpson indices all display ICCs above 0.75, with Shannon providing the highest reliability between FIT and Norgen. Among the beta diversities, the Bray-Curtis dissimilarity and the Jaccard index show almost perfect agreement. Unweighted UniFrac also displays an excellent ICC, while the weighted version reaches only good reliability.
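To make the reliability measure concrete, the following sketch computes one common form of the intraclass correlation coefficient from a subjects x raters matrix, with the two sample types playing the role of raters. The choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption made for illustration; this section does not state which ICC form was used. The function name `icc_2_1` and the toy values are hypothetical.

```python
import numpy as np


def icc_2_1(x):
    """ICC(2,1) for a subjects x raters matrix.

    Two-way random effects, absolute agreement, single measurement.
    The ICC form is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), raters (columns) and error.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy CLR abundances of one ASV: column 0 = FIT, column 1 = Norgen.
paired = np.array([
    [1.2, 1.0],
    [-0.5, -0.3],
    [2.1, 2.4],
    [0.3, 0.1],
    [-1.8, -1.5],
])
icc = icc_2_1(paired)
```

Under this form, a systematic offset between sample types lowers the ICC, since absolute agreement rather than mere consistency is assessed.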
Samples form subject-specific clusters
The inter-subject distances (i.e. all possible distances between two samples from different subjects) had a median of 82.4, a maximum of 113.0 and a minimum of 50.1; this minimum is higher than the maximum of all intra-subject distances, namely 41.2. The intra-subject distances consist of the distances between the FIT and the Norgen samples of each patient (1 distance per patient; median = 26.5) and each volunteer (9 distances per volunteer; median = 26.4), as well as the distances among the FIT triplicates (3 distances per volunteer; median = 25.4) and among the Norgen triplicates (3 distances per volunteer; median = 23.5) of each volunteer (Fig. 2A). The intra-volunteer distances differed significantly (Kruskal-Wallis test: p < 0.001), and of the subsequent pairwise tests only the comparison between “FIT to Norgen” distances and “Norgen to Norgen” distances reached statistical significance (Wilcoxon test: p < 0.001). Based on these distances, a hierarchical clustering was performed on all samples. All samples clustered together according to the subject who provided them before being joined with samples of other subjects (Fig. 2B).
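The separation between intra- and inter-subject distances can be checked with a short sketch like the one below. It assumes Euclidean distances between CLR-transformed profiles, which is an illustrative choice because the distance measure is not named in this section; the profiles and subject labels are toy data.

```python
import numpy as np


def pairwise_euclidean(x):
    """Matrix of Euclidean distances between all rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy CLR profiles: two samples (e.g. FIT and Norgen) per subject.
subjects = np.array(["A", "A", "B", "B", "C", "C"])
profiles = np.array([
    [2.0, -1.0, -1.0],
    [1.8, -0.9, -0.9],
    [-1.0, 2.0, -1.0],
    [-0.8, 1.9, -1.1],
    [-1.0, -1.0, 2.0],
    [-1.1, -0.8, 1.9],
])

dist = pairwise_euclidean(profiles)
i, j = np.triu_indices(len(subjects), k=1)  # each unordered pair once
same = subjects[i] == subjects[j]
intra = dist[i, j][same]    # distances within a subject
inter = dist[i, j][~same]   # distances between different subjects

# True when every inter-subject distance exceeds every intra-subject distance.
separated = inter.min() > intra.max()
```

When `separated` is true, any agglomerative clustering on these distances joins the samples of each subject before merging different subjects, which is the pattern described for Fig. 2B.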
PCA of volunteer samples reveals no separability of FIT and Norgen samples
The first four principal components of the ASVs’ CLR abundances in the volunteer samples are shown in Fig. 3B and reveal no sample type-specific clusters. Only the samples obtained from volunteer no. 4 display slight separability between FIT and Norgen samples. All other samples cluster randomly around a subject-specific centre, regardless of the sample type.
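As a rough illustration of how such principal component scores can be obtained, the sketch below runs a PCA via singular value decomposition on a stand-in matrix. The random input merely mimics the shape of a 30-sample CLR table and is not study data; the function name `pca_scores` is hypothetical.

```python
import numpy as np


def pca_scores(x, n_components=4):
    """PCA via SVD: returns sample scores and explained variance ratios."""
    centred = x - x.mean(axis=0)  # centre each feature (ASV)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / (s ** 2).sum()
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
x = rng.normal(size=(30, 50))  # stand-in for 30 samples x 50 ASVs
scores, explained = pca_scores(x)
```

Plotting pairs of columns of `scores`, coloured by sample type and by subject, would show whether samples group by sample type or around subject-specific centres.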
Differential abundance detected at various taxonomic ranks
Bacterial abundances of the patients’ FIT and Norgen samples were compared at all taxonomic ranks (species to phylum) and the significant results are presented in Fig. 3A as a taxonomic tree. Some branches of the tree display consistent differences between the sample types. For example, all significantly differentially abundant taxa belonging to the phylum Actinobacteriota or the class Bacilli are more abundant in FIT samples, while all the significant taxa belonging to the class Bacteroidales are more abundant in the Norgen samples. However, there are also inconsistent branches, such as the family Lachnospiraceae, which contains both genera more abundant in FIT samples and genera more abundant in Norgen samples. Complete results of the ALDEx2 analysis are available in Table S3.
Table 2 List of full names for taxa displayed in Fig. 3A.
Effect sizes were calculated with ALDEx2 and are only shown for significant (after p-value correction) differences. Negative values indicate higher abundances in FIT samples, while positive values correspond to higher abundances in Norgen samples.
Sample type explains only small proportion of sum of squares
Linear models were fitted on the CLR abundances of the volunteer samples for all ASVs detected in at least 3 of the 30 samples. The resulting proportions of the sum of squares explained by subject and sample type, as well as the residual proportions, are shown in a ternary plot in Fig. 4A and in the corresponding boxplots in Fig. 4B. This shows that most of the variance in the ASVs’ CLR abundances is explained by the subject, whereas only small amounts are explained by the sample type. Some ASVs display a high proportion of residual variance, which overall constitutes a much bigger issue for the repeatability of results than the sample type. This is also evident from the results of separate models for FIT and Norgen using ASVs detected in at least three samples of the respective type (Fig. 4C). This model specification shows that the amount of variance explained by subject is slightly lower (i.e. residual variance is higher) for FIT than for Norgen. For both sample types, there is a peak at proportions near 1, which is slightly less pronounced for FIT and corresponds to a lower mean of 0.930 for FIT, compared to 0.936 for Norgen.
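The decomposition behind the ternary plot can be sketched for a single ASV as follows. The sketch assumes an additive model with subject and sample type as factors and a balanced design (5 subjects x 2 sample types x 3 replicates = 30 samples), in which case the sums of squares of the two factors are orthogonal; the exact model specification is not given in this section, and the simulated values are toy data.

```python
import numpy as np


def ss_proportions(y, subject, sample_type):
    """Proportions of the total sum of squares for subject, type, residual.

    Assumes a balanced design and an additive model (no interaction term).
    """
    grand = y.mean()
    ss_total = ((y - grand) ** 2).sum()

    def ss_factor(labels):
        # Between-level sum of squares for one factor.
        return sum(
            (labels == level).sum() * (y[labels == level].mean() - grand) ** 2
            for level in np.unique(labels)
        )

    ss_subject = ss_factor(subject)
    ss_type = ss_factor(sample_type)
    ss_resid = ss_total - ss_subject - ss_type
    return np.array([ss_subject, ss_type, ss_resid]) / ss_total


# Balanced toy design: 5 subjects x 2 sample types x 3 replicates.
subject = np.repeat(np.arange(5), 6)
sample_type = np.tile(np.repeat(np.array(["FIT", "Norgen"]), 3), 5)

rng = np.random.default_rng(1)
subject_effect = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])[subject]
type_effect = np.where(sample_type == "FIT", 0.1, 0.0)
y = subject_effect + type_effect + rng.normal(scale=0.2, size=30)

props = ss_proportions(y, subject, sample_type)  # subject, type, residual
```

The three proportions sum to 1, which is why each ASV can be placed as a single point in a ternary plot.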