The congruence between the phylogeny of Nitrososphaeria and their ecological preferences was relatively limited, suggesting the existence of a considerable ecotypical intra-diversity within this class [6] due to high diversification rates [17]. These observations corroborate work conducted on marine systems, which shows that closely related AOA isolates could differ in a range of physiological traits [72, 73]. Extensive niche specialization could be due to autotrophic vs. mixotrophic/heterotrophic growth and preferences for C compounds [20] in addition to differences in affinity for ammonia [21, 22]. Observed differences in the balances between lineages at fine phylogenetic scale also reflect a low level of intra-genomic heterogeneity in the 16S rRNA gene [74], and highlight the relevance of using ASVs to study the environmental determinants of soil Nitrososphaeria (more than 10 times as many nitrososphaerial ASVs than OTUs were detected).
By using MRT, we could evaluate the hierarchical effects of environmental filters in the delineation of ecological preferences of nitrososphaerial lineages. The machine learning approach used also allowed us to go beyond simple correlations when exploring drivers of archaeal diversity. Across our broad geographical and environmental gradients, MAT was the most prominent variable in the MRT analysis and ranked among the best predictors of evenness. Temperature affects composition and nitrification activity of AOA communities [25, 26, 75], but since MAT and latitude correlated (Spearman’s r = 0.9), MAT could also reflect the importance of spatial distance. Soil pH was another significant predictor of phylogenetic diversity and evenness of Nitrososphaeria across the European gradient. This agrees with the dominant idea that soil pH drives diversification of terrestrial Thaumarchaeota at broad phylogenetic scales [17–19, 27]. However, the balance shifts in the shallow nodes in our study suggest recent adaptations in relation to soil pH. This would imply multiple independent specializations to acidic pH in Nitrososphaeria, which support the conclusions of a recent study based on analyses of the amoA gene [6]. The limited genomic information available on Nitrososphaerales prevents us using the congruence between 16S rRNA and amoA phylogenies [6, 27] to match the clades observed in this study for identifying amoA-based low-pH AOA lineages (within clades NS-α, -β, -γ and -ζ in [6]). Nevertheless, since pH controls the equilibrium between ammonia and ammonium in soils, these adaptations could reflect differences in substrate affinity [21, 22], possibly through distinct molecular adaptations of the ammonia monooxygenase [76]. Different N-related variables such as CEC, C:N ratio were also important predictors of the diversity and/or evenness of Nitrososphaeria and for the abundance of AOA, it was Total N. These results illustrate the expected selective role played by soil N, and are consistent with demonstrated effects of soil C:N ratio [4, 77] and ammonium supply [78, 79] on the abundance and structure of soil AOA communities. Random forest modeling revealed that nearly of the above-mentioned relationships were non-linear, and some ALE curves suggested the existence of thresholds (TH). The increase of nitrososphaerial PD in the low range of the C:N ratio (TH ~ 10) could be due to preferences for mineralized N from organic matter [80]. This aligns with the observed sharp increase of archaeal nitrification rates at C:N ratios below 20 [81], indicating a coupling between nitrososphaerial diversity levels and nitrification activity. However, the high number of significant shallow nodes in both high and low C:N ratio clusters in Southern Europe shows that niche specialization in Nitrososphaeria extends past preferences in soil C:N. For example, total N was a better predictor for AOA abundances than C:N, showing a strong positive effect across the measured range, although it appeared weaker between 2-3.5 g N kg− 1 soil. The capacity for cation exchange was also important, likely by influencing the retention of ammonium. We identified a threshold above which CEC had a strong negative effect on nitrososphaerial PD (TH ~ 15 cmol kg− 1). Soil CEC is linked to soil texture and clay content has previously been reported to negatively affect the abundance of AOA [13].
For the overall soil archaeal communities, mean annual temperature (MAT) was, similarly as for the Nitrososphaeria, an important factor and the main driver of β-diversity. However, the importance of MAT also reflects the effect of distance on the changes in community composition, which was further supported by the high fitness value of the distance-decay relationship (R2adj = 0.22) along the large gradient studied. This observation contrasts with a recent survey where substantially weaker decays of community similarity were found in archaeal communities in maize and rice fields across Eastern China [82]. At large spatial scales, distance-decay relationships are typically influenced by dispersal limitation (stochastic process) and species sorting (deterministic process), i.e. the combined effect of environmental filtering and biotic interactions [83]. In the present study, dispersal limitation and environmental filtering had comparable effects on the overall β-diversity of archaea, whereas the few other studies of archaeal communities across broad spatial scales in arable soils have reported larger sorting:dispersal effect ratios [40, 82]. The relative importance of dispersal limitation in the present study could have been overestimated, since some of the variation explained by spatial factors alone likely encompass unmeasured environmental variables and we could have missed environmental variables that are relevant for archaea. Nevertheless, the use of ASVs instead of OTUs and the removal of the rare taxa combined with broader environmental and geographic gradients, as we did, should lead to a more accurate assessment of the relative importance of environmental filtering versus dispersal limitation for the assembly of archaeal communities when compared to other studies [40, 82]. Our results thus indicate that the structure of archaeal communities associated to fields under cereal cultivation could be less sensitive to changes in environmental conditions than previously thought. Balanced effects of stochastic versus deterministic processes were recently found to promote diverse, yet uneven ecosystem functions in a study comparing three types of agroecosystems [84]. This interpretation of effects of balanced stochastic and deterministic processes fits with the observed low evenness of the archaeal community and the presence of a high diversity of low abundant groups harboring potentially diverse functional capabilities at the continental scale.
Archaeal classes present at low abundance exhibited different responses to the environmental factors evaluated. Soil pH displayed a negative relationship with both PD and evenness in Group 1.1c thaumarchaeota. Accordingly, these microorganisms have mainly been detected in acidic environments [85, 86]. The few known representatives of this group do not have amoA homologs and do not produce nitrite or nitrate in culture. It is therefore hypothesized that Group 1.1c is a non-ammonia oxidizing lineage within the Thaumarchaeota [71, 87], which is supported by the lack of significant effects of N-related variables observed in this study. Woesearchaeia and the two classes of methanogens shared only a few environmental preferences (i.e. elevation and clay content) with regards to α-diversity, despite recent findings that they that are potential metabolic partners [53]. This partnership or their importance in arable soils thus remain elusive. Finally, diversity and evenness of the Thermoplasmata, which was the second most abundant class in terms of both ASVs and relative abundance, appeared to be predominantly driven by MAT and soil texture along the gradient. Their ecology in agricultural soils is largely unknown as Thermoplasmata have mainly been studied in acid mine drainage [88] and hot environments [89]. Members of this class however account for ca. 5 % of archaeal sequences in global soil samples [3] and have been detected at high abundances in deeper soil layers in both boreal [90] and temperate, acidic deciduous forests [91]. There are few cultivated representatives, but more than 400 genomes available that suggest a versatile metabolic potential for this class, including methanogenesis, sulfur cycling and even dinitrogen fixation [92]. Sulfur cycling was also detected in a genome obtained from peat soil [87], while methanogenic Thermoplasmata appear to be widespread in wetlands [93]. Several low-abundant archaeal lineages could thus be involved in C and S cycling and play a more important role than previously thought in arable soils.