Non-invasive Phenotyping for Water and Nitrogen Uptake by Deep Roots Explored using Machine Learning

doi:10.21203/rs.3.rs-2608651/v1


Research Article


This work is licensed under a CC BY 4.0 License

Journal Publication

published 06 Sep, 2023

Read the published version in Plant and Soil →


Background and aims

Root distribution over the soil profile is important for crop resource uptake. Using machine learning (ML), this study investigated whether measured planar root length density (pRLD) at different soil depths was related to uptake of an isotope tracer (15N) and to a drought stress indicator (13C) in wheat, in order to reveal root function.

Methods

In the RadiMax semi-field root-screening facility, 95 and 120 different winter wheat genotypes were phenotyped for root growth in 2018 and 2019, respectively. Using the minirhizotron technique, root images were acquired across a depth range from 80 to 250 cm in May, June, and July, and pRLD was extracted using a convolutional neural network. We developed ML models to explore whether the pRLD estimates at different soil depths were predictive of the uptake of deep soil nitrogen, using deep placement of 15N tracer, as well as of drought resilience potential, using natural abundance of the 13C isotope. We analyzed the correlations of tracer levels with both an analytical root depth estimate and an ML approach. We further analyzed the genotypic effects on root function using mediation analysis.

Results

Both analytical and ML models demonstrated clear correlations between pRLD distribution and resource uptake. Further, both models demonstrated that deep roots at approx. 150 to 170 cm depth were most important for explaining the plant content of 15N and 13C isotopes. The correlations were higher in 2018 than in 2019.

Conclusions

The results demonstrated that in the semi-field non-invasive root phenotyping setup, analytical and ML-based analysis provided complementary insight into the importance of deep rooting for water and nitrogen uptake.

Machine learning

deep rooting

deep resource uptake

random forest

13C

15N

To achieve sustainable and climate resilient crop production, we need crops to have deep and well-developed root systems, enabling efficient use of resources also from deeper soil layers. One way to improve crop root systems is through plant breeding, but to do this efficient root phenotyping is required, allowing the identification of genetic variability in the relevant root traits. Root measurements are notoriously difficult, as roots and their uptake activity are hidden in the soil and difficult to observe. Therefore, a range of facilities have been developed for root phenotyping (Cai et al. 2016; Eberbach et al. 2013; Svane et al. 2019b), but there is a need for further development as well as for verification of the validity of the results obtained.

Water and nitrogen are important soil resources considered in crop production. Efficient water use is needed for productivity and crop resilience towards dry periods. Efficient nitrogen use, particularly from deeper soil layers is needed to reduce the need for this energy demanding fertilizer and to minimize nitrogen leaching loss to the environment. Water and nitrate are more mobile in the soil medium than most plant nutrients, meaning that in periods with surplus water supply, they tend to move downwards to deeper soil layers (Thorup-Kristensen and Kirkegaard 2016). Thus, deeper rooting will likely increase the ability of crops to take up water and nitrogen when needed, and help to recover the part of these resources at maximum risk of loss.

This leads to a need to develop root phenotyping methods, but also to verify that deeper rooted phenotypes actually lead to increased water and nitrate uptake from the soil. The RadiMax root phenotyping facility (Svane et al. 2019b) was developed specifically for such studies. Images of roots in the soil are obtained through minirhizotrons, and data for planar root length density (pRLD) on individual images are obtained using the AI based image analysis tool RootPainter (Smith et al. 2022). The facility and experiments allow the use of isotope tracers as indicators of root activity, where we used deep injection of a 15N labelled isotope solution to study deep nitrogen uptake (Wacker et al. 2022). Natural abundance of 13C in grains was used to evaluate the genotypic potential for drought resilience. Due to natural isotope discrimination processes (Zhang et al. 2009), 13C composition could be used as an indicator for water stress.

The semi-automated root imaging process and use of fully automated deep learning for image analysis enable us to collect large amounts of root data containing information about root architecture. Using ML on these large datasets allows linking data from root images to root function and to maximize the information we can extract from the data.

Machine learning can be used to analyze hierarchical and non-linear relationships between predictor variables and response variables and this is often more effective than traditional linear regression methods (Baker et al. 2018). The size and quality of experimental multimodal data in agriculture are constantly growing. Thereby, ML offers potential for insight into complex agricultural relationships. Random Forests (RF) (Breiman 2001) is a widely used ML algorithm, that is known for allowing many variables with relatively small numbers of observations and in addition providing an assessment of variable importance (Breiman 2001; Ishwaran 2007; Strobl et al. 2007).

In soil science and hydrology, neural network models have been used extensively to estimate soil retention and hydraulic conductivity using a variety of easy-to-measure data (Pachepsky and Rawls 1999; Schaap and Leij 1998). Even though neural nets have become more popular in deep learning for image segmentation, and pattern recognition, they are less interpretable (Toms et al. 2020) as their complex structure and large number of parameters make it difficult to understand how they make decisions. Neural networks work by learning patterns in the data, and these patterns are represented in the connections between neurons, which are adjusted during the training process (Kiranyaz et al. 2021). As a result, it can be difficult to understand how specific input data is being processed and what features the model is using to make its predictions. Awika et al. (2021) employed machine learning techniques to examine the phenotypic and genetic correlations between root traits and shoot fresh weights of different baby spinach accessions grown in small pots under low and high nitrogen concentrations.

While the root observations are dynamic and include data for different depth increments at different time points, the tracer data typically give one end point measurement at crop harvest, showing an integrated effect of roots at different depths over the crop growth period.

The root images from the RadiMax facility were previously investigated. It was demonstrated that roots could be identified in the images using a deep learning-based method (Smith et al. 2020). This allowed estimation of the planar root length density (pRLD) for each individual image.

This study uses machine learning to investigate the following:

Is maximal root depth related to water and nitrogen uptake?
Does pRLD distribution across soil depths add additional understanding of root function?
What is the genotypic contribution to the variation in root growth and function?

Experimental systems

The RadiMax semi-field facility (Svane et al. 2019b) was designed for root phenotyping of crops grown in soil to maturity. In two experimental units, different winter wheat cultivars and breeding material were grown in two seasons (Wacker et al. 2022). Root imaging was performed with multi-spectral cameras (Svane et al. 2019a) through a minirhizotron system installed in the experimental units. Each image covered an area of 20 cm², and images were taken at every 3.5 cm of soil depth. The range of imaging was from 80 to 250 cm soil depth.

Drought experiment

The RadiMax facility has the possibility to exclude rainwater via large rain-out shelters (Svane et al. 2019b). The rain-out shelters were used in both years from late May to maturity in July, creating a drought condition. The facility is equipped with a sub-surface irrigation system, which can be used to irrigate the crop with increasing soil depths. In 2018, the subsurface irrigation system was used from May onwards, creating water supply with increasing soil depths. In 2019, the subsurface irrigation system was not used. In this year, the total soil water availability increased with depth. More details on soil water supply in the experiment can be found in Wacker et al. (2022).

Tracer injection and plant sampling

To study deep N uptake, an isotopic tracer of 15N was injected into the soil at 180 cm depth (Wacker et al. 2022) in all replicates at anthesis. At harvest, a sample corresponding to the aboveground area above the 15N injection was taken, and grains were analyzed for 15N content via mass-spectroscopy (Wacker et al. 2022). To study the indirect effect of deep water uptake, the same sample was also analyzed for 13C discrimination.

Root image analysis

Before image analysis, incomplete samples due to broken tubes or missing isotope data were identified and eliminated (16 tubes in 2018, 28 in 2019) from the dataset. The visible roots in each image were segmented using a convolutional neural network (Smith et al. 2020). For each image, the total length of segmented roots was extracted (Han et al. 20). For further calculation, the square root of the total root length per image was used (Wacker et al. 2022), and we denote this estimate the planar root length density (pRLD).

Facility Position Correction

Planar root length density and isotope composition measurements showed strong position effects, mainly in one unit of the facility (Fig. 1), which may have been caused by differences in soil compaction arising during the construction of the facility. Therefore, for all the analyses presented, the variables (isotope measurements and pRLD) were linearly corrected for distance from the end of the facility. In other experiments (not reported), piecewise linear corrections were also used, with very similar results.
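A linear position correction of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `linear_position_correction`, the choice to preserve the overall mean, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def linear_position_correction(values, distance):
    """Remove a fitted linear trend along `distance`, keeping the overall mean.

    Hypothetical helper: the source only states that variables were
    'linearly corrected for distance from the end of the facility'.
    """
    values = np.asarray(values, dtype=float)
    distance = np.asarray(distance, dtype=float)
    # Ordinary least-squares line: values ~ slope * distance + intercept
    slope, intercept = np.polyfit(distance, values, deg=1)
    trend = slope * distance + intercept
    # Subtract the trend and add back the mean so the scale stays comparable
    return values - trend + values.mean()


# Synthetic example (assumed numbers): a variable drifting with position
rng = np.random.default_rng(0)
distance_m = np.linspace(0.0, 40.0, 60)
raw = 150.0 + 0.8 * distance_m + rng.normal(0.0, 5.0, size=distance_m.size)

corrected = linear_position_correction(raw, distance_m)
residual_slope = np.polyfit(distance_m, corrected, deg=1)[0]
```

After the correction, a straight line fitted to the corrected values against distance has a slope of essentially zero, so any remaining variation is not explained by a linear position effect.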

Root Distribution Analysis

We investigated the influence of root distribution in two complementary ways. First, we defined analytical models for estimation of the root depth, and analyzed their ability to predict isotope data.

Second, we defined a machine learning approach to investigate whether other aspects of the root distribution could contribute to explaining the root function indicated by the uptake of the isotope tracer 15N enrichment and natural 13C discrimination.

Root Depth Estimates

From minirhizotron image based pRLD data, we designed estimates of the root depth for each position in the facility. In previous studies, root density has been modelled by an exponential model as a function of depth (Zuo et al. 2004) and we implemented that as a baseline comparison.

However, based on inspection of pRLD as a function of soil depth (example shown in Fig. 2), the sigmoid function appeared to fit the profiles better (Fig. 2; goodness of fit $R^{2}=0.78$) and could therefore potentially give a better estimator of root depth.

We fitted these two mathematical models to the pRLD measurements for each replicate in the facility using nonlinear regression (implemented using the “curve_fit” function in the SciPy (Community 2019) optimization library). Specifically, with $y$ being the pRLD at depth $x$, after fitting the exponential function

$$y=a {e}^{- bx},$$

we computed the depth D₅₀ (orange line in Fig. 2) at which 50% of the pRLD is accumulated (Fan et al. 2016), as $D_{50} = \log(2)/b$, where $a$ and $b$ are the parameters of the model.
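For reference, this expression follows from integrating the fitted exponential over depth. The short derivation below assumes that the profile is accumulated from depth 0 downwards without a lower limit, and that $\log$ denotes the natural logarithm:

$$\int_{0}^{D} a\,{e}^{-bx}\,dx=\frac{a}{b}\left(1-{e}^{-bD}\right),\qquad \int_{0}^{\infty } a\,{e}^{-bx}\,dx=\frac{a}{b}.$$

Setting the first integral equal to half of the second gives ${e}^{-bD_{50}}=1/2$, and hence $D_{50}=\log(2)/b$, which does not depend on $a$.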

Similarly, we fitted to the sigmoid function

$$y=\frac{a}{1+ {e}^{- b(x-c)}}$$

and extracted the sigmoid inflection (SI) (green line in Fig. 2) point c as the estimate of the root depth.
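The two fits can be sketched with SciPy's `curve_fit` as below. This is an illustration under stated assumptions rather than the authors' code: the synthetic profile, the noise level, and the starting values `p0` are invented for the example. Note that for a profile that declines with depth, the fitted $b$ of the sigmoid comes out negative in this parameterization.

```python
import numpy as np
from scipy.optimize import curve_fit


def exponential(x, a, b):
    """Exponential model y = a * exp(-b * x)."""
    return a * np.exp(-b * x)


def sigmoid(x, a, b, c):
    """Sigmoid model y = a / (1 + exp(-b * (x - c)))."""
    return a / (1.0 + np.exp(-b * (x - c)))


# Synthetic pRLD profile (assumed values), sampled every 3.5 cm from 80 cm
depth = np.arange(80.0, 250.0, 3.5)
rng = np.random.default_rng(1)
true_a, true_b, true_c = 10.0, -0.08, 165.0
prld = sigmoid(depth, true_a, true_b, true_c) + rng.normal(0.0, 0.3, depth.size)

# Exponential fit and D50 = log(2) / b
(exp_a, exp_b), _ = curve_fit(
    exponential, depth, prld, p0=(prld.max(), 0.01), maxfev=10000
)
d50 = np.log(2.0) / exp_b

# Sigmoid fit; the inflection point c is taken as the root depth estimate (SI)
(sig_a, sig_b, sig_c), _ = curve_fit(
    sigmoid, depth, prld, p0=(prld.max(), -0.05, depth.mean()), maxfev=10000
)
si = sig_c
```

Supplying reasonable starting values matters here, since nonlinear least squares can fail to converge or settle in a poor local optimum when started far from the solution.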

These analytically designed estimates (D₅₀ and SI) were calculated for all the tubes and for all three months of root measurements in the datasets from the winter wheat experiments in the years 2018 and 2019.

Root distribution analysis across soil layers

We accumulated the pRLD in 10 intervals in the deeper soil layers between 119 and 220 cm to provide consistent depth data for all months of the two observation years for the machine learning models. The average pRLD across all facility tubes at different depths from 119 to 220 cm in the two years is illustrated in Fig. 3.

In total from the three observations months (May, June and July), this gives 30 input variables for predicting the outcomes in the form of the isotope measurements (15N or 13C).
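The construction of the 30 input variables can be sketched as follows. The exact interval boundaries are not given in the text, so the equal-width edges used here are an assumption, as are the helper name `interval_features` and the random example profiles.

```python
import numpy as np


def interval_features(depth_cm, prld, edges):
    """Sum pRLD within each depth interval defined by consecutive `edges`.

    Intervals are half-open [lo, hi), except the last one, which also
    includes its upper edge (numpy.histogram convention).
    """
    depth_cm = np.asarray(depth_cm, dtype=float)
    prld = np.asarray(prld, dtype=float)
    sums, _ = np.histogram(depth_cm, bins=edges, weights=prld)
    return sums


# Assumption: 10 equal-width intervals spanning 119-220 cm
edges = np.linspace(119.0, 220.0, 11)
depth = np.arange(80.0, 250.0, 3.5)

# Synthetic per-image pRLD values for one tube in each observation month
rng = np.random.default_rng(2)
months = ("May", "June", "July")
profiles = {m: rng.uniform(0.0, 10.0, depth.size) for m in months}

# 3 months x 10 intervals = 30 predictors for one tube
features = np.concatenate(
    [interval_features(depth, profiles[m], edges) for m in months]
)
feature_names = [
    f"{m}_{edges[i]:.0f}-{edges[i + 1]:.0f}cm" for m in months for i in range(10)
]
```

Images outside the 119 to 220 cm range are simply ignored by the binning, which is what keeps the feature set consistent across months.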

Machine learning algorithms

We evaluated two alternative ensemble ML algorithms, random forest (RF) and gradient boosting (GB), to model the relationship between the pRLD summed into intervals and the isotope tracer measurements. Ensemble learning methods comprise a set of base learners whose predictions are aggregated into the final estimate. Boosting and bagging are the two most used families of ensemble methods. Bagging (used in, e.g., RF) was introduced by Breiman (1996); it trains each learner on a random sample drawn with replacement from the training set. In general, bagging is used with weak learners that exhibit high variance and low bias, whereas boosting (e.g. GB (Friedman 2001)) is used when low variance and high bias are observed. The two algorithms (RF and GB) used in our analysis are described below:

Random Forest

The random forest model, introduced by Breiman (2001), is an ensemble learning method aimed at reducing model variance. It is a collection of weakly correlated decision trees built using bagging and feature randomness. Such decision trees are called regression trees when they are used for regression. Regression trees are constructed by splitting the data into smaller segments at nodes or branches. The trees in the random forest model are trained independently, and their outcomes are then averaged. The prediction variance of the RF model is reduced by averaging the model predictions across the trees.

Gradient Boosting

The Gradient Boosting (GB) model, introduced by Friedman (2001), is a boosting ensemble method. Essentially, the gradient boosting model aims at enhancing model accuracy and robustness by aggregating multiple weak learners. Gradient-boosted trees are constructed iteratively, as in other boosting methods, but they generalize these methods by allowing optimization of an arbitrary differentiable loss function (Friedman 2001).

For the RF and GB models, we combined the pRLD estimates for all three months into a single model.

The RF and GB models have a number of hyperparameters that may be optimized to the task at hand. For most hyperparameters, we used the default values in the scikit-learn (Pedregosa et al. 2011) Python implementation. However, for hyperparameters such as the number of trees, the maximal depth of the trees, and the maximal number of features included, we used nested cross-validation to optimize these parameters.
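Nested cross-validation of this kind can be sketched with scikit-learn as below. The grid values, fold counts, and synthetic data are assumptions chosen to keep the example small; they are not the settings used in the study. The inner loop selects hyperparameters, and the outer loop produces out-of-fold predictions that are compared with the observed values by a Pearson correlation.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict

# Synthetic stand-in for 30 pRLD interval features and one isotope response
rng = np.random.default_rng(3)
X = rng.normal(size=(120, 30))
y = 0.8 * X[:, 13] + 0.5 * X[:, 14] + rng.normal(0.0, 1.0, 120)

# Assumed, deliberately small hyperparameter grid
param_grid = {
    "n_estimators": [50, 100],   # number of trees
    "max_depth": [3, None],      # maximal tree depth
    "max_features": [0.3, 1.0],  # fraction of features tried per split
}

inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)  # tuning
outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)  # evaluation

# GridSearchCV tunes on the inner folds of each outer training set only,
# so the outer test fold never influences the chosen hyperparameters.
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=inner_cv)
y_pred = cross_val_predict(search, X, y, cv=outer_cv)

r, p_value = pearsonr(y, y_pred)
```

Replacing `RandomForestRegressor` with `GradientBoostingRegressor` (and adapting the grid) would give the corresponding GB evaluation under the same scheme.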

Statistical analysis

The performance of the ML predictions was evaluated using cross-validation. The performance metric was the correlation coefficient (R) between model estimates and measured 15N uptake and 13C discrimination. In all cases we tested, the RF models performed slightly better than the GB models, and for simplicity we only investigated the RF model further for feature importance and mediation analysis. Similarly, we only performed mediation analysis for the SI estimate (and not D₅₀). We computed feature importance to investigate which soil layers are most important for 15N uptake and 13C discrimination. This was quantified by the RF average impurity reduction (Breiman 2001).
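In scikit-learn, the average impurity reduction is exposed as the `feature_importances_` attribute of a fitted forest. The sketch below uses synthetic data in which a single, arbitrarily chosen feature carries the signal; the data and settings are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: 30 features, of which only one drives the response
rng = np.random.default_rng(4)
n_samples, n_features = 200, 30
X = rng.normal(size=(n_samples, n_features))
informative = 13  # hypothetical index of the informative depth interval
y = 2.0 * X[:, informative] + rng.normal(0.0, 0.5, n_samples)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Mean decrease in impurity per feature, normalized to sum to 1
importance = model.feature_importances_
ranking = np.argsort(importance)[::-1]  # most important feature first
```

Impurity-based importance is computed on the training data and can be spread across correlated predictors, which is worth keeping in mind when neighbouring depth intervals carry similar information.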

Mediation analysis was carried out using the SI (May, June, and July) and the RF models for predicting $\log \delta^{15}\text{N}$ and $\delta^{13}\text{C}$ as mediators to determine the genotype effect (Fig. 4) of deep root traits on $\log \delta^{15}\text{N}$ and $\delta^{13}\text{C}$, using the approach used in Wacker et al. (2021). Using mediation analysis, we can determine how a genotype (ID) and a dependent variable ($\log \delta^{15}\text{N}$ or $\delta^{13}\text{C}$) are related through a mediator variable.

Planar root length density and sigmoid inflections for different months in 2018 and 2019 are summarized in Table 1. In 2018, the average pRLD per minirhizotron was 94 sqrt(cm) in May, compared to 54 sqrt(cm) in June and 32 sqrt(cm) in July. In 2018, the June measurements showed the highest average rooting depth (SI) of 170 cm, compared to 151 cm in May and 167 cm in July, whereas in 2019 the July measurements showed the highest average rooting depth of 184 cm, compared to 157 cm in May and 177 cm in June.

Table 1

Summary of planar root length density (pRLD) per minirhizotron and sigmoid inflections (SI) in the 2018 and 2019 datasets, for May, June, and July
Descriptive statistics	pRLD per minirhizotron (sqrt(cm))						SI (cm)
	May		June		July		May		June		July
Year	2018	2019	2018	2019	2018	2019	2018	2019	2018	2019	2018	2019
mean	94	119	54	117	32	60	151	157	170	177	167	184
std	28	39	22	38	17	18	15	14	15	16	16	22
min	25	32	8	30	2	13	109	127	131	116	111	75
max	170	143	129	231	75	120	192	186	212	222	208	227

Correlation between root models and isotopes

The correlations between the analytical models (D₅₀ and SI) or the ML models (RF and GB) and the isotope measurements ($\log \delta^{15}\text{N}$ and $\delta^{13}\text{C}$) are shown in Table 2. In general, SI had a higher correlation with $\log \delta^{15}\text{N}$ and $\delta^{13}\text{C}$ than D₅₀, showing SI to be a relevant estimator of rooting depth. The correlation between root-depth estimates and $\log \delta^{15}\text{N}$ was lower in 2019 than in 2018, and there were no statistically significant correlations with $\delta^{13}\text{C}$ ($p \ge 0.05$) in 2019. In the 2018 experiment, the RF algorithm showed a higher correlation ($R = 0.46$ for $\log \delta^{15}\text{N}$, and $R = 0.43$ for $\delta^{13}\text{C}$) between predicted and observed corrected values of isotopes than the GB ($R = 0.36$ for $\log \delta^{15}\text{N}$, and $R = 0.30$ for $\delta^{13}\text{C}$), but in 2019 the correlations were not statistically significant ($p \ge 0.05$).

Table 2

Correlations between the root depths estimated by the sigmoid inflection point (SI), D₅₀, and the random forest (RF) and gradient boosting (GB) models versus the isotope measurements ($\log \delta^{15}\text{N}$ and $\delta^{13}\text{C}$). We trained RF and GB models on pRLD data from May, June, and July across 10 intervals using cross-validation. The correlations which were not statistically significant in 2019 are marked with NS ($p \ge 0.05$); ^* represents $p < 0.05$, ^** $p < 0.01$, ^*** $p < 0.001$
		2018		2019
		$\log \delta^{15}\text{N}$	$\delta^{13}\text{C}$	$\log \delta^{15}\text{N}$	$\delta^{13}\text{C}$
D₅₀	May	0.26^***	-0.17^**	0.17^*	0.0 NS
	June	0.31^***	-0.21^***	0.17^*	0.12 NS
	July	0.30^***	-0.18^**	0.18^**	0.07 NS
SI	May	0.34^***	-0.27^***	0.15^*	-0.01 NS
	June	0.37^***	-0.30^***	0.16^*	0.05 NS
	July	0.39^***	-0.31^***	0.24^***	0.12 NS
RF	May + June + July	0.46^***	0.43^***	0.09 NS	0.03 NS
GB	May + June + July	0.39^***	0.38^***	0.02 NS	-0.03 NS

Importance of Deep Roots for 15N uptake and 13C discrimination

The relationship between the root depth, as estimated by the SI model, and the isotopes allows a simple estimation of the importance of deep root function. For instance, the correlation between $\delta^{13}\text{C}$ and SI in June 2018 (−0.30 in Table 2) is illustrated in Fig. 5 (upper right). The parameters of the regression lines allow a simple interpretation of the impact of deep rooting. From the slopes of the regression plots against 15N (Fig. 5, left), we can see that if rooting depth increases by 1 cm, we would expect $\log \delta^{15}\text{N}$ to be higher on average by 3% in June 2018 (dry year) and by 1% in June 2019.
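One way to read these percentages, assuming (this is not stated explicitly in the text) that the regression is fitted on the natural-log scale as $\log \delta^{15}\text{N} = \alpha + \beta \cdot \text{SI}$ with SI in cm, is as a multiplicative change in $\delta^{15}\text{N}$ per cm of rooting depth:

$$\frac{\delta^{15}\text{N}(\text{SI}+1)}{\delta^{15}\text{N}(\text{SI})}={e}^{\beta }\approx 1+\beta \quad \text{for small } \beta .$$

Under that assumption, a hypothetical slope of $\beta = 0.03$ corresponds to ${e}^{0.03}\approx 1.030$, that is, about 3% higher $\delta^{15}\text{N}$ per additional cm of rooting depth, and $\beta = 0.01$ to about 1%.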

Importance of Root Distributions

The feature importance quantification for the RF model allows an interpretation of the importance of pRLD at different depths across the three observation months. Figure 6 illustrates this and demonstrates that throughout all three months, pRLD at depths of 150–160 cm and 160–170 cm were the most important features for predicting both 15N and 13C in 2018. Based on the figure, the soil depth between 150 and 160 cm had a significant impact on 13C discrimination in May 2018. May's pRLD at 150–160 cm soil depth was most important for predicting $\delta^{13}\text{C}$, followed by June's and July's. In contrast, at 160–170 cm depth, pRLD from all months showed equally high importance for 13C discrimination. For 15N, June data were the most important, and May data the least important.

The feature importance of the RF model in 2019 is not shown in Fig. 6, as the correlation between the RF predictions and $\log \delta^{15}\text{N}$ (or $\delta^{13}\text{C}$) was not statistically significant (Table 2).

Interplay between Genotypes, Models, and Isotopes

The mediation analysis of both direct and indirect effects of genotype on $\log \delta^{15}\text{N}$ and $\delta^{13}\text{C}$ is shown in Tables 3 and 4. In both years, the genotype significantly influenced all deep root estimators and RF models. The genotype effect on individual root depths (SI) in May, June and July was statistically significant ($p<0.05$) in the year 2018. The combined RF model explained 43.4% of the variation in $\log \delta^{15}\text{N}$ and 47.4% of the variation in $\delta^{13}\text{C}$ in 2018, and 56.4% of the variation in $\log \delta^{15}\text{N}$ and 79.4% of the variation in $\delta^{13}\text{C}$ in 2019.

A significant effect of genotype on both 15N uptake and 13C discrimination was mediated by RF estimators in both years. The indirect effects of rooting depth determined as sigmoid inflection point (SI) in May, June, and July explained 3.1%, 3.7%, and 3.2% of the variation in 2018, whereas the RF model explained 5.1% of the variation of 15N uptake. Based on the indirect effects of SI in May, June, and July, 2.7%, 2.5%, and 3.4% of the variation of 13C discrimination was explained, respectively, while 2.3% was explained by the RF model. In 2019, according to the indirect effects of SI in May, June, and July, 1.5%, 1%, and 3.3% of the variation of 15N could be explained by SI.

Table 3

Mediation analysis of deep rooting on the genotype effect on 15N uptake. R² of the total effect and decomposition of R² for direct genotype (ID) and indirect effects (May SI, June SI, July SI, and RF model) predicting 15N uptake
	Genotype effect on Individual deep root depth and RF model		Significance of parameters of the total effect model
Year	Parameter	p value	Parameter	p value
2018	May SI	$5\times 10^{-6}$	ID	$2\times 10^{-9}$
	June SI	$4\times 10^{-7}$	May SI	0.27
	July SI	$6\times 10^{-7}$	June SI	0.61
	RF Model	$6\times 10^{-4}$	July SI	0.52
			RF Model	$2\times 10^{-4}$
2019	May SI	$1\times 10^{-5}$	ID	$3\times 10^{-9}$
	June SI	$2\times 10^{-3}$	May SI	0.60
	July SI	$3\times 10^{-3}$	June SI	0.54
	RF Model	$2\times 10^{-7}$	July SI	0.16
			RF Model	$2\times 10^{-2}$
Year	Parameter	Total effect in percentage	Percentage due to direct effect	Percentage due to indirect effect
2018	ID	43.4	85.8
	May SI			3.1
	June SI			3.7
	July SI			3.2
	RF Model			5.1
2019	ID	56.4	94.9
	May SI			1.5
	June SI			1.0
	July SI			3.3
	RF Model			-0.8

Table 4

Mediation analysis of deep root depths on the genotype effect of 13C discrimination. R² of the total effect and decomposition of R² for direct genotype (ID) and indirect effects (May SI, June SI, July SI, and RF model) predicting 13C discrimination
	Genotype effect on Individual deep root depth and RF model		Significance of parameters of the total effect model
Year	Parameter	p value	Parameter	p value
2018	May SI	$5\times 10^{-6}$	ID	$8\times 10^{-15}$
	June SI	$4\times 10^{-7}$	May SI	0.35
	July SI	$6\times 10^{-7}$	June SI	0.31
	RF Model	$1\times 10^{-2}$	July SI	0.23
			RF Model	$7\times 10^{-4}$
2019	May SI	$1\times 10^{-5}$	ID	$2\times 10^{-16}$
	June SI	$2\times 10^{-3}$	May SI	$5\times 10^{-2}$
	July SI	$2\times 10^{-3}$	June SI	0.37
	RF Model	$3\times 10^{-5}$	July SI	0.81
			RF Model	0.64
Year	Parameter	Total effect in percentage	Percentage due to direct effect	Percentage due to indirect effect
2018	ID	47.4	89.1
	May SI			2.7
	June SI			2.5
	July SI			3.4
	RF Model			2.3
2019	ID	79.4	97.3
	May SI			0.8
	June SI			0.6
	July SI			1.1
	RF Model			0.1

The purpose of the study was to confirm the importance of deep rooting as observed before (Wacker et al., 2022), but also to further investigate whether machine learning methods can reveal additional insight into the importance of root distribution across depths. Finally, we wished to investigate the interplay between wheat genotypes, root growth, and root function.

We proposed a novel model using the sigmoid inflection point (SI) as the estimate for deep rooting. This gives a more directly intuitive depth estimate compared to the D₅₀ estimate in the previously proposed exponential decay model (Fan et al., 2016). In addition, the SI model fitted the pRLD distribution better (illustrated in Fig. 2). The results demonstrated that the SI root depth estimate was also better at explaining the deep root function measured by 15N and 13C isotope measurements than the exponential model (Table 2). These results confirmed the importance of deep root growth.

The correlations for the SI versus $\log \delta^{15}\text{N}$ were positive, while the correlations with $\delta^{13}\text{C}$ were negative. This confirms that increased deep rooting allowed higher uptake of 15N applied deep in the soil, and indirectly, that it also allowed higher water uptake from deep soil layers. The negative correlations of SI with $\delta^{13}\text{C}$ indicate that there was a positive relationship between root depth estimated as SI and drought resilience (Simelton et al. 2009), as deep water uptake allows the plants to keep their stomata more open. According to the regression lines, if root depth increases by 1 cm, $\log \delta^{15}\text{N}$ will increase by 3% (Lindeman 1980; Kasza and Wolfe 2014) in June 2018, and $\delta^{13}\text{C}$ will be lower on average by 1% (slope was −0.01) in June 2018 and 2019.

To investigate whether machine learning could provide stronger models for explaining root function, we evaluated the two ensemble models, RF and GB. Here, the RF model showed a higher correlation between root data and isotope tracer measurements (see Table 2), and this was in most cases also higher than analysis based on the SI root depth estimate. Thereby, the RF could potentially reveal additional insight into root function. The RF feature importance visualization (Fig. 6) showed that roots between 150 and 170 cm were most important for explaining isotope tracer measurements. We can see in Table 1 that this interval was actually very similar to the root depth as estimated by SI.

As 15N was injected specifically at 180 cm soil depth, it is encouraging, but not surprising, to find that roots at depths between 150 and 170 cm were the most important for predicting 15N uptake. However, the 13C results are based on the general plant water stress response, not on tracers added at a specific depth. Finding that the 13C results are also mainly predicted by root data from 150 to 170 cm depth confirms a more general hypothesis that the deepest part of the root system is crucial for genotype differences in water uptake (Thorup-Kristensen and Kirkegaard, 2016). Crops take much of their water from upper soil layers, but in these layers there are high root densities and a longer time for water uptake, so root growth normally does not limit the water uptake from upper layers. However, in deep soil layers there are fewer roots, and roots first appear there late in the growing season, providing the chance to use these water resources in the summer, when little water is available in the shallow soil layers. In a soil layer where root growth and duration limit the water use, genotypic differences become important for actual water use. This was also found by Kirkegaard et al. (2007), who established that the extra water taken up from deep layers was of extra value for yield production, as it is taken up late, during the grain filling phase (Lilley and Kirkegaard 2011).

In combination, these results demonstrate the importance of the deep roots, as opposed to, for instance, higher root density in upper soil layers, as the RF model confirmed the importance of root growth at more than 150 cm depth. However, the RF did still achieve higher correlations for most outcomes in Table 2, indicating that some information was lost in the simple calculation of a root depth estimate by the SI model. It is possible that the RF model did in fact implicitly also just estimate the root depth, but became more robust by integrating pRLD measurements from all three months (May, June, and July), as opposed to the SI model, which estimated these individually. To investigate this, we created models integrating the three SI estimates using either RF or GB for the combination. These models did not perform better than the individual SI estimates (results not shown). We interpret this as showing that the RF model using all root data points does indeed capture more than just root depth, and integrates relevant root characteristics across depths. Although all roots contribute to water uptake, it is mainly the deep roots that contribute to the variation in water and N uptake.

Finally, we investigated the interplay between genotype and the proposed root models in the mediation analysis. The results (in Tables 3 and 4) showed that, as expected, the genotype was strongly related to the isotope tracer measurements. However, for both years and both isotopes, the mediation analysis also demonstrated that both the SI root depth estimate and the RF model contributed additional, independent information. This suggests that these root models can support plant breeding by explaining root function.

In addition to addressing the main study objectives, the results allow further interesting observations as discussed below.

The SI root depth estimate may allow improved estimates of root growth and ability for deep soil exploitation (Burridge et al. 2017). Apart from the rooting depth achieved, the timing of deep root growth may also differ among winter wheat genotypes (Hodgkinson et al. 2017), and this can be significant for resource uptake. Different timing of root growth can be computed by finding the inflection points in June, July, and August on pRLD. Further development in root imaging and image analysis may also directly allow the observation of late-season root growth, which can be of special importance for water and nitrogen uptake during the yield-forming growth stages of flowering and grain filling.

The SI estimates (Table 2) and the RF feature importance analysis revealed additional effects (Fig. 6). The shift in the soil depths important for 15N uptake in 2018, from 150–160 cm in June to 160–170 cm in July, indicates that continued root growth after anthesis was important for uptake of the tracer, which was injected at anthesis. In May 2018, root length at 150–160 cm depth contributed significantly to 13C discrimination in grain, while root length at 160–170 cm depth in July, followed by June and May, was the most important factor for 13C discrimination. This again implies a root profile that continued to develop over time. As 15N was not applied until anthesis, and 13C was measured in grain, which is produced mainly by post-anthesis photosynthesis, roots in May will have had little or no direct influence on the isotopic results. However, root observations in May may be predictive of later root presence and development. Young roots are more easily identified on root images than older roots, as can be seen from the declining root observations in many soil layers from May to June to July. Therefore, roots observed in May may give a relatively good estimate of the roots that will be present in June and July, when the actual isotopic signatures are created by deep water and 15N uptake.
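A depth-wise importance ranking of the kind discussed here can be illustrated with a random forest and scikit-learn's `permutation_importance`. The sketch uses synthetic pRLD values in which a single layer drives the response; the layer labels, data, and choice of permutation importance (rather than the impurity-based measure) are assumptions for the example, and the study's own importance measure may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(2)
n_plots = 300

# Hypothetical 10 cm layers spanning the imaged range (80 to 250 cm).
layer_labels = [f"{top}-{top + 10} cm" for top in range(80, 250, 10)]

# Synthetic pRLD per plot and layer (assumption); only the 150-160 cm layer
# influences the simulated tracer signal.
prld = rng.uniform(0.0, 1.0, (n_plots, len(layer_labels)))
signal_layer = layer_labels.index("150-160 cm")
tracer = 2.0 * prld[:, signal_layer] + rng.normal(0.0, 0.3, n_plots)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(prld, tracer)

# Permutation importance: drop in model score when one layer is shuffled.
result = permutation_importance(rf, prld, tracer, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
top_layer = layer_labels[ranking[0]]
```

For brevity the importance is computed on the training data; with real measurements, scoring on held-out plots would give a less optimistic ranking, and strongly correlated neighbouring layers can share or swap importance.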

The mediation analysis provides insight into the deep root functions of crops. By establishing a causal chain from genotype to deep root traits to phenotype, mediation analysis provides explicit hypotheses for further functional exploration (Yang et al. 2022). Moreover, in both 2018 and 2019 the direct effect of genotype on 15N uptake and 13C discrimination was greater than the effect mediated by the root variables, which suggests that genotypic traits other than those captured by the image-based root data also influenced 15N uptake and 13C discrimination (Wacker et al. 2022).
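The split into direct and mediated effects can be illustrated with the classical product-of-coefficients decomposition for linear models. The sketch below is a simplification under stated assumptions: genotype is reduced to a single numeric score (in the study it is a categorical factor), the mediator is one root depth estimate, the data are synthetic, and the helper `ols_slopes` is hypothetical. It is not the mediation procedure used by the authors.

```python
import numpy as np


def ols_slopes(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, dropped)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]


rng = np.random.default_rng(3)
n = 500

# Synthetic chain (assumption): genotype -> root depth -> tracer, plus a
# direct genotype -> tracer path that bypasses the root trait.
genotype = rng.normal(size=n)
root_depth = 0.6 * genotype + rng.normal(scale=0.8, size=n)
tracer = 0.5 * root_depth + 0.3 * genotype + rng.normal(scale=0.5, size=n)

(total,) = ols_slopes(tracer, genotype)                # total effect
(a,) = ols_slopes(root_depth, genotype)                # genotype -> mediator
direct, b = ols_slopes(tracer, genotype, root_depth)   # direct effect, mediator -> tracer
indirect = a * b                                       # effect mediated by root depth
```

For ordinary least squares fitted on the same sample, `total` equals `direct + indirect`, so a large `direct` relative to `indirect` corresponds to the situation described above, in which traits other than the measured root variable carry much of the genotype effect.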

Our study proposed a novel estimate of root depth based on a sigmoid fitted to pRLD measurements across depths. This SI root depth estimate improved the prediction of root function as investigated by the stable isotopes 15N and 13C. The importance of the deep roots was further confirmed by the random forest machine learning model, which integrated pRLD across depths into a single model. This RF model was slightly better than the SI estimator at explaining isotope tracer measurements, and the RF feature importance analysis revealed that this was achieved with a strong focus on the deepest roots.

Further, the mediation analysis showed that both the SI and RF models provided additional information on top of the genotype, showing a potential for applying these in plant breeding experiments.

Code and Data Availability

Code is available on GitHub at https://github.com/satyasaran/CropML.git

Data will be made available on request.

Conflict of interest

The authors declare no conflict of interest.

References

Awika HO, Mishra AK, Gill H et al (2021) Selection of nitrogen responsive root architectural traits in spinach using machine learning and genetic correlations. Sci Rep 11:9536. https://doi.org/10.1038/s41598-021-87870-z
Baker RE, Pena J-M, Jayamohan J, Jérusalem A (2018) Mechanistic models versus machine learning, a fight worth fighting for the biological community? Biol Lett 14:20170660. https://doi.org/10.1098/rsbl.2017.0660
Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
Breiman L (1996) Out-of-bag estimation. Technical Report, Department of Statistics, UC Berkeley
Burridge JD, Schneider HM, Huynh B-L et al (2017) Genome-wide association mapping and agronomic impact of cowpea root architecture. Theor Appl Genet 130:419–431. https://doi.org/10.1007/s00122-016-2823-y
Cai G, Vanderborght J, Klotzsche A et al (2016) Construction of minirhizotron facilities for investigating root zone processes. Vadose Zone J 15. https://doi.org/10.2136/vzj2016.05.0043
SciPy Community (2019) SciPy documentation: scipy.optimize.curve_fit
Eberbach PL, Hoffmann J, Moroni SJ et al (2013) Rhizo-lysimetry: facilities for the simultaneous study of root behaviour and resource use by agricultural crop and pasture systems. Plant Methods 9:1–19. https://doi.org/10.1186/1746-4811-9-3
Fan J, McConkey B, Wang H, Janzen H (2016) Root distribution by depth for temperate agricultural crops. Field Crops Research 189:68–74. https://doi.org/10.1016/j.fcr.2016.02.013
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232. https://doi.org/10.1214/aos/1013203451
Han E, Smith AG, Kemper R et al (2021) Digging roots is easier with AI. J Exp Bot 72:4680–4690. https://doi.org/10.1093/jxb/erab174
Hodgkinson L, Dodd IC, Binley A et al (2017) Root growth in field-grown winter wheat: some effects of soil conditions, season and genotype. Eur J Agron 91:74–83. https://doi.org/10.1016/j.eja.2017.09.014
Ishwaran H (2007) Variable importance in binary regression trees and forests. Electron J Stat 1:519–537. https://doi.org/10.1214/07-EJS039
Kasza J, Wolfe R (2014) Interpretation of commonly used statistical regression models. Respirology 19:14–21. https://doi.org/10.1111/resp.12221
Kiranyaz S, Avci O, Abdeljaber O, Ince T, Gabbouj M, Inman DJ (2021) 1D convolutional neural networks and applications: A survey. Mech Syst Signal Process 151:107398. https://doi.org/10.1016/j.ymssp.2020.107398
Kirkegaard JA, Lilley JM, Howe GN, Graham JM (2007) Impact of subsoil water use on wheat yield. Aust J Agric Res 58:303–315. https://doi.org/10.1071/AR06285
Lilley JM, Kirkegaard JA (2011) Benefits of increased soil exploration by wheat roots. Field Crops Research 122:118–130. https://doi.org/10.1016/j.fcr.2011.03.010
Lindeman RH (1980) Introduction to bivariate and multivariate analysis
Pachepsky YA, Rawls WJ (1999) Accuracy and reliability of pedotransfer functions as affected by grouping soils. Soil Sci Soc Am J 63:1748–1757. https://doi.org/10.2136/sssaj1999.6361748x
Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: Machine learning in Python. J Mach Learn Res 12:2825–2830
Schaap MG, Leij FJ (1998) Using neural networks to predict soil water retention and soil hydraulic conductivity. Soil Tillage Res 47:37–42. https://doi.org/10.1016/s0167-1987(98)00070-1
Simelton E, Fraser ED, Termansen M et al (2009) Typologies of crop-drought vulnerability: an empirical analysis of the socio-economic factors that influence the sensitivity and resilience to drought of three major food crops in China (1961–2001). Environ Sci Policy 12:438–452. https://doi.org/10.1016/j.envsci.2008.11.005
Smith AG, Han E, Petersen J et al (2022) RootPainter: deep learning segmentation of biological images with corrective annotation. New Phytol 236:774–791. https://doi.org/10.1111/nph.18387
Smith AG, Petersen J, Selvan R, Rasmussen CR (2020) Segmentation of roots in soil with U-Net. Plant Methods 16:1–15. https://doi.org/10.1186/s13007-020-0563-0
Strobl C, Boulesteix A-L, Zeileis A, Hothorn T (2007) Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 8:1–21. https://doi.org/10.1186/1471-2105-8-25
Svane SF, Dam EB, Carstensen JM, Thorup-Kristensen K (2019a) A multispectral camera system for automated minirhizotron image analysis. Plant Soil 441:657–672. https://doi.org/10.1007/s11104-019-04132-8
Svane SF, Jensen CS, Thorup-Kristensen K (2019b) Construction of a large-scale semi-field facility to study genotypic differences in deep root growth and resources acquisition. Plant Methods 15:1–16. https://doi.org/10.1186/s13007-019-0409-9
Thorup-Kristensen K, Kirkegaard J (2016) Root system-based limits to agricultural productivity and efficiency: the farming systems context. Ann Botany 118:573–592. https://doi.org/10.1093/aob/mcw122
Toms BA, Barnes EA, Ebert-Uphoff I (2020) Physically interpretable neural networks for the geosciences: Applications to earth system variability. J Adv Model Earth Syst 12:e2019MS002002. https://doi.org/10.1029/2019MS002002
Wacker TS, Popovic O, Olsen NA et al (2022) Semifield root phenotyping: Root traits for deep nitrate uptake. Plant Cell Environ 45:823–836. https://doi.org/10.1111/pce.14227
Yang Z, Xu G, Zhang Q et al (2022) Genome-wide mediation analysis: an empirical study to connect phenotype with genotype via intermediate transcriptomic data in maize. Genetics 221:iyac057. https://doi.org/10.1093/genetics/iyac057
Zhang Y, Tobias HJ, Brenna JT (2009) Steroid isotopic standards for gas chromatography-combustion isotope ratio mass spectrometry (GCC-IRMS). Steroids 74:369–378. https://doi.org/10.1016/j.steroids.2008.10.001
Zuo Q, Jie F, Zhang R, Meng L (2004) A generalized function of wheat’s root length density distributions. Vadose Zone J 3:271–277. https://doi.org/10.2113/3.1.271


Editorial decision: Major revisions (30 Mar, 2023)
Reviewers agreed at journal (02 Mar, 2023)
Editor invited by journal (21 Feb, 2023)
Editor assigned by journal (21 Feb, 2023)
First submitted to journal (21 Feb, 2023)


		2018		2019	
Model	Month(s)	log δ15N	δ13C	log δ15N	δ13C
D₅₀	May	0.26***	-0.17**	0.17*	0.0 NS
D₅₀	June	0.31***	-0.21***	0.17*	0.12 NS
D₅₀	July	0.30***	-0.18**	0.18**	0.07 NS
SI	May	0.34***	-0.27***	0.15*	-0.01 NS
SI	June	0.37***	-0.30***	0.16*	0.05 NS
SI	July	0.39***	-0.31***	0.24***	0.12 NS
RF	May + June + July	0.46***	0.43***	0.09 NS	0.03 NS
GB	May + June + July	0.39***	0.38***	0.02 NS	-0.03 NS