The treatment factors planting dates, fertilization doses can be treated as both qualitative (P, F) and quantitative (D, N) characteristics. Additionally, the factor variety (V) and year (Y) can be explored by its quantitative FAO classification (M) and time trend (T), respectively. For the latter two we do not expect that quantification can explore all variation seen within the factors. However, the aim of the current analysis is to explore the quantitative nature of all four factors. A general overview on model development is shown in Fig. 4. Two final models were developed: One model with a single response surface curve fitted across years and another model fitting separate response surface curves for good and bad years, with below average and above average yields, respectively.
2.3.1 Model development for single surface model
Block model
The statistical model used for analysis is developed here by first considering a single year and then extending to multiple years. To represent the field layout and allocation of treatments to experimental units, the following block model (Piepho et al. 2003) is used for a single year, using the block factors defined before
SR + SC + SR.SC + SR.LR + SR.SC.CL + SR.SC.LR.CL (1)
Here, SR.SC corresponds to main plots, and SR.SC.LR.CL corresponds to subplots. Hence, all six block factors shown in Fig. 3 are represented. All design effects in this block model are modelled as random. We extend the model to multiple year by extending each effect with the factor year (Y):
SR.Y + SC.Y + SR.SC.Y + SR.LR.Y + SR.SC.CL.Y + SR.SC.LR.CL.Y (2)
Year is denoted as the repeated factor (Piepho et al. 2004) because it indexes repeated observations on the same design unit. The design unit itself is identified by the level of the effect in (1) that is extended by factor Y in (2). For example, main plots are identified by levels of the effect SR.SC, and all observations on the same main plot are assumed to be serially correlated. Here, we use the first order autoregressive AR(1) model with all design effects in (2) except for the subplots where we additionally allow for heterogeneous variances (ARH(1)).
Treatment model
The treatment model is developed using the P, F and V factors included in the experiment. The basic treatment model for a single year is
V × F × P = V + F + P + V.F + V.P + F.P + V.F.P (3)
This model is modified to account for the random factor Y by adding a random main effect for Y and also adding all effects in (3) crossed with Y as random. Hence, the extended model is
V × F × P × Y = V + F + P + V.F + M.P + F.P + V.F.P + (4)
Y + V.Y + F.Y + P.Y + V.F.Y + V.P.Y + F.P.Y + V.F.P.Y
where random effects are listed after a colon. The full model is obtained by combining models (2) and (4).
Response surface regression
Models so far treated P, V, F and Y as qualitative factors. However, P can be quantified by day of year (D) and F by the amount of nitrogen (N). Additionally, V can be quantified by the FAO number M and Y can be quantified by the continuous variable T for calendar year. In the latter two cases, we do not expect that all differences in V and Y can be covered by M and T, respectively. Furthermore, note that preliminary inspection revealed that N shows a response with diminishing returns reaching a plateau and then dropping only slowly with further increasing N. A quadratic model in N may not represent this well. Hence, we experimented with different powers of N and decided to replace N with N1/2. To simplify the presentation, we replaced N values by their square root but kept the symbol N to represent the factor. We fitted a second-order response surface model with all four variables D, N, M and T. Such a model is satisfying if there are no serious deviations from the response surface. To check this assumption, we first fitted a second-order response surface regression to single year data using D, M and N as quantitative regressor variables. In that first step, we fitted the treatment model given by
D + D.D + M + M.M + N + N.N + D.M + D.N + M.N + V.F.P (5)
separately for each year (Y) and assessed the lack of fit using the fixed effect V.F.P. In addition, the model comprised all random design effects in Eq. (1). The model fit was satisfactory (see Supplementary Material), hence we considered an extension of the second-order response surface model by inclusion of the continuous factor T for the calendar year. In this case, no lack-of-fit test (test of V.P.F.Y) was performed for the across-year analysis as we assumed that the trend will not explain years completely and there is for sure year-by-year variation that will be captured by the random effects involving Y in Eq. (4). Additionally, both variables Y and V were now assumed as random, while quantitative variables M and T were taken as fixed. The full model therefore included the second-order response surface regression on variables M, D, N and T as fixed effects, all effects from (4) including either V or Y as random effects, and all effects of (2) as random effects.
After fitting the response surface regression for all four quantitative variables, the fixed effects were subsequently pruned by discarding non-significant effects, observing the marginality principle. Thus, we started by inspecting the highest-order interaction and removed it if it was not significant (at α = 0.05), in which case we moved on to the nearest lower-order interactions to proceed with the next tests, etc. A summary of the covariates used in the regression analyses is presented in Table SM1.
2.3.2 Model development to fit separate response surface curves for good and bad years
For the final model developed above we inspected best linear unbiased predictions (BLUPs) for the random deviations of Y from trend T. Based on these BLUPs we fitted a separate response surface regression for (i) years with positive (= good years) and negative BLUP (= bad years). Again, fixed effects crossed with group were selected via backward selection.
The fitted models are reported as contour plots for two of the four variables (two out of the four variables D, M, N and T), fixing the other two at specific values. For analysis, we centered M at 350, T at 2010 and D at 120. This linear shift is intended to numerically stabilize the regression analysis. It does not affect slope estimates but does shift the intercept. The fitted values are not affected.
The statistical analysis were performed using ASReml 4.2 standalone for analysis and PROC RSREG in SAS for graphics.
Note, that in exploring possible temporal trends the effect of a total of 58 annual and monthly environmental factors (see examples in Fig. SM1) was also investigated. These factors were included as covariates in the single response surface model. As detailed in the Discussion section, weather related covariates (Table SM2.) were not used in the final model, since year factor (Y) has been shown to be a reasonably good integrator that aggregates the impact of weather factors and their variations with sufficient statistical confidence.
Though there are many reasons why results from a LTE can be useful for providing useful information on promising measures to adapt to a changing climate; yet, there are also limitations in view of various aspects: (i) shifts in future seasonality (i.e. shifts in rainfall patterns but also frost risk patterns, etc.) that are part of climate change projections, need to be considered when deriving potentially promising adaptation options from experiments conducted historical (past) weather conditions; (ii) effects of elevated atmospheric CO2 concentration: although these have to be considered most for crops of the C3 photosynthetic type (like wheat or barley) (e.g. Lobell and Gourdji 2012; Rötter et al., 1999) elevated CO2 also has beneficial effects on C4 crops like maize, in particular in improving their water use efficiency under drought conditions (e.g. Durand et al., 2018) as is also the case for C3 crops ( O'Leary et al., 2015). To overcome these shortcomings crop growth simulation was used for extrapolating the patterns detected in the LTE results for the future. Crop modeling has the potential to help us understand the relative influence of environmental factors (such as climate and soil) and genotype and management on the outcomes of long-term experiments. For example, Dobermann et al. (2000) explored this in their research.