Cotton (Gossypium spp.) is one of the most important industrial and economic crops.1 Cottonseed, the main by-product of cotton production, can be used to produce food, animal feed, and other products. Cottonseed contains many kinds of nutrients, including proteins, oils, fatty acids, and amino acids, making it a potential food resource for humans as the global population grows rapidly.2 However, Gossypium species are characterized by the presence of gossypol, which is toxic to humans and monogastric animals,3 so the utilization of cottonseed products is limited.
Gossypol, 1,1’,6,6’,7,7’-hexahydroxy-5,5’-diisopropyl-3,3’-dimethyl-(2,2’-binaphthalene)-8,8’-dicarbaldehyde, is a terpenoid compound that helps cotton defend against biotic stress.4–6 Because gossypol is toxic, breeding for both lower gossypol content in cottonseeds and higher gossypol content in cotton plants has been practiced in many cotton-growing countries. Cottonseed breeding work often requires analyzing a large number of cottonseed samples to measure gossypol content. Conventionally, gossypol content is assayed by UV spectrophotometry, which not only involves highly toxic reagents but is also neither accurate nor reliable. High-performance liquid chromatography (HPLC) offers high accuracy and sensitivity for gossypol determination, but it is generally expensive and time-consuming. In addition, both classical analytical methods destroy the test samples, which frequently need to be planted in cotton breeding programs. A rapid and non-destructive method for gossypol determination is therefore required.
Near infrared (NIR) spectroscopy combined with chemometrics is a rapid, convenient, and environmentally friendly analytical technique for the quality analysis of crops.7–20 However, determining gossypol content in intact cottonseeds by NIR is challenging because (i) cottonseeds are bigger than other crop seeds, so large voids are left between the packed seeds in sample cells; (ii) immature and wizened cottonseeds can be mixed into the samples, which can introduce irrelevant information into the spectral data; and (iii) the tough, thick shell of the cottonseed can impede the penetration of NIR light, resulting in a lower signal-to-noise (S/N) ratio and less informative spectra. Because of these factors, the spectral data of intact cottonseeds are far more complex than those of other crop seeds and may contain a large amount of useless and uncorrelated information such as noise and background. To overcome these difficulties, sophisticated chemometric methods are applied to extract useful information from NIR spectra and to calibrate robust models for gossypol content in intact cottonseeds. These include regression methods such as principal component regression (PCR),21 partial least squares (PLS),22 support vector machines (SVM),23 least squares support vector machines (LS-SVM),24 and artificial neural networks (ANN),25 coupled with spectral pretreatments such as standard normal variate (SNV),26 Savitzky-Golay (SG) smoothing,27 multiplicative scatter correction (MSC),28 and the first derivative.29
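As an illustration of two of the pretreatments named above, the following minimal sketch shows how SNV and MSC are commonly computed. It is not the procedure used in this study: the function names `snv` and `msc`, the synthetic spectra, and the choice of the mean spectrum as the MSC reference are all assumptions made for the example.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (row = intercept + slope * ref),
    then corrected as (row - intercept) / slope. The mean spectrum is used as
    the reference when none is given (an assumption of this sketch).
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

# Synthetic example (hypothetical data): one absorption band distorted by
# a sample-specific multiplicative factor and additive offset.
rng = np.random.default_rng(0)
wavelengths = np.linspace(1000, 2500, 200)          # nm, illustrative range
base = np.exp(-((wavelengths - 1700) / 150) ** 2)   # idealized band shape
scale = rng.uniform(0.8, 1.2, size=(5, 1))
offset = rng.uniform(-0.1, 0.1, size=(5, 1))
raw = scale * base + offset

snv_spectra = snv(raw)
msc_spectra = msc(raw)
```

In this idealized case the five spectra differ only by scatter-like scaling and offset, so both pretreatments collapse them onto a single curve; real cottonseed spectra would retain the chemical variation that the regression model is meant to capture.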
Because they require destruction of the test sample, previous NIR models, which were developed for detecting gossypol in cottonseed meal, can hardly be applied in breeding trials.30 In the present study, the feasibility of determining gossypol in intact cottonseeds by NIR spectroscopy was investigated. The main aim was to establish an optimal model that could provide technical support for cotton breeders and others who work with cottonseeds.