Identification of a reference gene is integral to the conduct of copy number estimations by relative quantification. The ideal reference for these purposes is a single or low copy gene exhibiting a stable copy number in the studied species [6, 17]. Given that there is limited genomic data available from white clover [18], identification of such a gene is challenging. With this objective in mind, BLASTN searches of the NCBI white clover EST database were performed to identify putative orthologues of single copy genes reported in Arabidopsis [19]. Eight white clover ESTs with high sequence identity to the genes from Arabidopsis were identified (Table 1).
Table 1 List of candidate single copy reference genes, and the results of BLASTN searches of the NCBI white clover EST database with sequences of single copy genes reported in Arabidopsis
Primers were designed and evaluated by qPCR for consistent amplification (Cq values below 30 in undiluted samples) among the 30 transgenic events available.
Only the ATP-dependent protease (FY464051.1), Pyruvate dehydrogenase (PDH) (FY466505.1), and Ribosomal protein (FY458968.1) genes exhibited consistent amplification among the 30 samples analysed using the primers and probes in Table 2.
Table 2 Primers and probes directed to transgenes and candidate reference genes used for copy number estimations
These three genes were compared for the stability of their copy number between samples. Variation of Cq values between the genes, calculated as ΔCt [20], was estimated for each sample. In this analysis, we assumed that the difference in the Cq of two genes remains constant across samples only if the copy number of both genes is constant among samples. To calculate the Cq difference, each gene was compared with the remaining two genes across the 30 samples. The SD of ΔCt over the 30 samples was then calculated for each gene pair, and the mean of the SDs was estimated in order to select the gene whose copy number was most stable among samples. The lowest SD of ΔCt was observed between PDH and ATP-dependent protease (0.98), and higher values were observed when PDH or ATP-dependent protease was compared to Ribosomal protein (2.5 in both cases). Based on these data, PDH and ATP-dependent protease were selected and evaluated for use as references for copy number estimation by dPCR.
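The pairwise stability comparison described above can be sketched as follows. This is a minimal illustration, assuming Cq values are stored per gene in sample order; the function names `delta_cq_sd` and `mean_sd_per_gene` and the toy Cq values are hypothetical and are not data from this study.

```python
from itertools import combinations
from statistics import mean, stdev

def delta_cq_sd(cq_by_gene):
    """Return the sample SD of the per-sample Cq difference for each gene pair."""
    result = {}
    for gene_a, gene_b in combinations(sorted(cq_by_gene), 2):
        deltas = [a - b for a, b in zip(cq_by_gene[gene_a], cq_by_gene[gene_b])]
        result[(gene_a, gene_b)] = stdev(deltas)
    return result

def mean_sd_per_gene(pair_sd):
    """Average, for each gene, the SDs of all pairs that include that gene."""
    genes = {g for pair in pair_sd for g in pair}
    return {g: mean(sd for pair, sd in pair_sd.items() if g in pair) for g in genes}

# Hypothetical Cq values for three candidate genes across four samples
cq = {
    "ATP": [24.0, 25.0, 26.0, 25.0],
    "PDH": [23.0, 24.0, 25.0, 24.0],
    "RIB": [20.0, 24.0, 21.0, 25.0],
}
pair_sd = delta_cq_sd(cq)
gene_sd = mean_sd_per_gene(pair_sd)
```

In this toy data set the ATP and PDH values shift together, so their ΔCt is constant and its SD is zero, whereas any pair involving the third gene has a larger SD.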
All the transgenic plants used in this work contain the transgenes Isopentenyl transferase (IPT), nodule enhanced Malate dehydrogenase (TrneMDH), and Alfalfa mosaic virus coat protein gene (CP-AMV) in a single T-DNA (Fig. 1) [15]. Duplexed reactions were optimized for transgene copy number estimation by qPCR and dPCR. Two pairs of primers directed to each of the three transgenes were designed and tested in duplex reactions. FAM-labelled probes were used to detect the transgenes, and HEX-labelled probes to detect the reference gene. Amplification efficiency of primer/probe combinations for each transgene and reference gene was evaluated by estimating signal-to-noise ratios of dPCR results, calculated by dividing the mean fluorescence amplitude of positive droplets by the mean amplitude of negative droplets. High signal-to-noise ratios indicate better separation between positive and negative droplet fluorescence signals, which enables a more reliable copy number estimation. The single primer pair and probe with the highest signal-to-noise ratio per transgene were selected for the analysis (Table 1 and Online Resource 2). All the primer-probe combinations directed to the target transgenes and the two reference genes exhibited signal-to-noise ratios higher than two (Online Resource 2), indicative of an optimal separation between positive and negative reactions [21].
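The signal-to-noise calculation described above can be illustrated with a short sketch. It assumes that droplets are classified as positive or negative by a fixed amplitude threshold; the `threshold` parameter, the function name, and the amplitude values are hypothetical, since the classification procedure is not detailed here.

```python
from statistics import mean

def signal_to_noise(amplitudes, threshold):
    """Mean amplitude of positive droplets divided by that of negative droplets."""
    positives = [a for a in amplitudes if a >= threshold]
    negatives = [a for a in amplitudes if a < threshold]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative droplet")
    return mean(positives) / mean(negatives)

# Hypothetical fluorescence amplitudes (arbitrary units)
droplets = [1000, 1100, 900, 4000, 4200, 3800]
ratio = signal_to_noise(droplets, threshold=2500)
```

A ratio above two, as reported for all primer-probe combinations, corresponds to positive droplets that are on average more than twice as bright as negative ones.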
Transgenes were digested with restriction enzymes to improve transgene copy number resolution (Fig. 1), because dPCR cannot otherwise discriminate between one and several copies of a target transgene in a single droplet. Enzyme digestion within the transgene but outside the amplicon resolves this issue, as it increases the probability that the target copies are partitioned into different droplets. For qPCR, primer efficiencies were estimated using the formula E = 10^(-1/slope) from a plot of Cq versus log cDNA dilution [22]. Primer efficiencies were 0.96 for IPT, 0.94 for TrneMDH, 0.93 for CP-AMV, 0.91 for PDH, and 0.98 for ATP-dependent protease, which are near optimal values for relative quantification [23]. However, ATP-dependent protease exhibited a low R² of 0.84.
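As a worked illustration of the efficiency formula, the sketch below fits a least-squares line to Cq versus log10 dilution and applies E = 10^(-1/slope). The dilution series is hypothetical. Efficiencies close to 1, such as those reported above, are consistent with the common convention of expressing the result as E - 1 (the fraction of template duplicated per cycle); that reading is an assumption and is returned here as a second value.

```python
import math

def fit_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def primer_efficiency(log10_dilution, cq):
    """Return (E, E - 1) where E = 10 ** (-1 / slope) of Cq vs log10 dilution."""
    slope = fit_slope(log10_dilution, cq)
    amplification_factor = 10 ** (-1 / slope)
    return amplification_factor, amplification_factor - 1

# Hypothetical ten-fold dilution series with perfect doubling per cycle
cycles_per_tenfold = 1 / math.log10(2)  # about 3.32 cycles per ten-fold dilution
log_dilution = [0, -1, -2, -3]
cq_values = [20 + cycles_per_tenfold * i for i in range(4)]
factor, fractional = primer_efficiency(log_dilution, cq_values)
```

With perfect doubling the slope is about -3.32, so the amplification factor is 2 and the fractional efficiency is 1.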
Transgene copy number was estimated by dPCR by calculating the ratio between the concentrations of the target transgene and the reference gene. For qPCR, copy number was calculated by the ΔΔCt method [23]. Event 34, which exhibited a single T-DNA insertion and a low SD among replicates by dPCR, was selected from the 30 events available as the calibrator sample for estimating transgene copies by qPCR. Given that white clover is allotetraploid and has two sub-genomes [18], a gene used as a reference, if single copy, would have two copies at a single locus in each sub-genome. Therefore, if amplification occurs in the two homeologous genes, the expected ratio between the reference gene and the transgene of interest will be 4:1 when there is a single transgene insertion.
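The two calculations can be sketched as follows. This is an illustrative sketch only: the function names and input values are hypothetical, the factor of four assumes that both homeologous copies of the reference gene are amplified (as discussed above), and the ΔΔCt form shown assumes perfect doubling per cycle.

```python
def dpcr_copy_number(target_conc, reference_conc, reference_copies=4):
    """Scale the target/reference concentration ratio by the reference copy count.

    reference_copies=4 assumes a single-copy reference gene amplified from
    both sub-genomes of an allotetraploid (two copies in each sub-genome).
    """
    return reference_copies * target_conc / reference_conc

def qpcr_copy_number(cq_target, cq_ref, cq_target_cal, cq_ref_cal,
                     calibrator_copies=1):
    """Delta-delta-Ct estimate relative to a calibrator of known copy number.

    Assumes 100% amplification efficiency (a factor of 2 per cycle).
    """
    delta_sample = cq_target - cq_ref
    delta_calibrator = cq_target_cal - cq_ref_cal
    return calibrator_copies * 2 ** -(delta_sample - delta_calibrator)

# Hypothetical dPCR concentrations (copies per microlitre): a 4:1 ratio
single_copy = dpcr_copy_number(target_conc=250.0, reference_conc=1000.0)

# Hypothetical Cq values: the sample transgene amplifies one cycle earlier
# than in the single-copy calibrator, relative to the reference gene
two_copies = qpcr_copy_number(24.0, 22.0, 25.0, 22.0)
```

A reference-to-transgene concentration ratio of 4:1 therefore maps to one transgene copy, and a one-cycle shift relative to the calibrator maps to two copies.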
To assess the agreement between copy number estimates obtained with each of the reference genes, dPCR was performed on the transgenic events using both the ATP-dependent protease and PDH genes as references. Estimates of copy number for each of the transgenes using the ATP-dependent protease and PDH genes were highly correlated (R² = 0.96–0.99) (Fig. 2).
Figure 2 Correlation of copy number estimations for each of the three transgenes when using the reference genes ATP-dependent protease versus Pyruvate dehydrogenase (PDH), measured by dPCR
Copy number among the 30 transgenic events, estimated by dPCR using either reference gene, ranged between 1 and 10 (Fig. 3). Forty-three percent of the generated events exhibited putative single copy transgene insertions, whilst no events with transgene copy numbers below one were identified.
The coefficient of variation (%CV) of the estimated copy number was markedly lower for dPCR than for qPCR in most of the events analysed (Fig. 3). Mean %CV values for each transgene across events were 1.8- to 3.7-fold lower in dPCR (5.3, 5.5, and 5.2 for IPT, TrneMDH, and CP-AMV, respectively) than in qPCR (11.6, 10.3, and 19.4).
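For reference, the %CV of a set of replicate copy number estimates can be computed as in the sketch below. It assumes the sample standard deviation (n - 1 denominator) is used, which is not stated here; the function name and replicate values are hypothetical.

```python
from statistics import mean, stdev

def percent_cv(replicates):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100 * stdev(replicates) / mean(replicates)

# Hypothetical replicate copy number estimates for one event
tight_replicates = [1.00, 1.05, 0.95]
loose_replicates = [1.00, 1.30, 0.70]
cv_tight = percent_cv(tight_replicates)
cv_loose = percent_cv(loose_replicates)
```

Replicates that scatter more widely around the same mean give a proportionally larger %CV, which is the sense in which a lower %CV indicates higher precision.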
Results generated by qPCR and dPCR were compared, and accuracy and precision were estimated. A linear correlation was observed between transgene copy number estimates obtained by qPCR and dPCR, with the highest R² value observed for the CP-AMV transgene (Fig. 4). The slopes of the linear functions were greater than one for all three transgenes, which indicates a tendency towards higher copy number estimates by qPCR than by dPCR.
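The comparison can be illustrated with a simple least-squares fit. The sketch assumes that dPCR estimates are placed on the x-axis and qPCR estimates on the y-axis, so that a slope above one corresponds to higher qPCR estimates; the axis assignment, the function name, and the data are assumptions made for illustration.

```python
def linear_fit(x, y):
    """Return (slope, intercept, r_squared) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical copy number estimates for four events
dpcr_estimates = [1.0, 2.0, 3.0, 4.0]
qpcr_estimates = [1.2, 2.4, 3.6, 4.8]
slope, intercept, r_squared = linear_fit(dpcr_estimates, qpcr_estimates)
```

In this toy example every qPCR estimate is 20% higher than its dPCR counterpart, which gives a slope of 1.2 with a perfect linear correlation.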
Furthermore, transgene copy number estimates were closer to integer values when obtained by dPCR. This was observed in 19 cases for IPT, 18 for TrneMDH, and 17 for CP-AMV, among the 29 of the 30 events evaluated. Values closer to integers reduce the ambiguity involved in rounding copy number estimates up or down.
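One way to make this comparison concrete is to measure how far each estimate lies from its nearest integer and count the events in which the dPCR estimate is closer. The sketch below is a hypothetical illustration of that idea, not the procedure used in this study; the function names and values are invented.

```python
def integer_deviation(estimate):
    """Absolute distance between an estimate and its nearest integer."""
    return abs(estimate - round(estimate))

def count_dpcr_closer(dpcr_estimates, qpcr_estimates):
    """Count events where the dPCR estimate is strictly closer to an integer."""
    return sum(
        integer_deviation(d) < integer_deviation(q)
        for d, q in zip(dpcr_estimates, qpcr_estimates)
    )

# Hypothetical paired estimates for three events
dpcr_values = [1.05, 2.90, 4.40]
qpcr_values = [1.30, 3.00, 4.45]
closer = count_dpcr_closer(dpcr_values, qpcr_values)
```

An estimate of 2.5 sits at the maximum possible deviation of 0.5, where rounding up or down is equally defensible; smaller deviations leave less ambiguity.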
Figure 3 Comparison of copy number estimates with dPCR and qPCR. Copy numbers measured using dPCR and qPCR by relative quantification (left y-axis). dPCR data was generated using ATP-dependent protease as a reference, and qPCR data was generated using PDH as a reference. %CV dPCR and %CV qPCR are the coefficients of variation of measured copy numbers using either technique (right y-axis)
Figure 4 Scatter plots for correlation of copy number of each transgene using dPCR and qPCR