Patients and stool samples
Methodological comparisons were carried out using an evaluation cohort with stool samples from 24 patients (nine cancers, eight polyps, and seven controls) who underwent colonoscopy at Akershus University Hospital (Ahus) from 2022 to 2023. After establishing the best-performing protocol for RNA extraction and RT-PCR analysis, the chosen protocol was tested with 68 stool samples from a test cohort of patients (22 cancers, 24 polyps, and 22 controls) who underwent colonoscopy from 2014 to 2017 (Table 1). The study design is depicted in Fig. 1. Stool samples were collected prior to bowel preparation or 1-2 weeks after colonoscopy, immediately preserved in RNAlater, and stored at -80 °C after 1-3 days. Colonoscopies were scheduled for various medical reasons, such as gastrointestinal bleeding, weight loss, alterations in bowel habits, or the detection of polyps or malignancies through CT colonography. Based on the colonoscopy findings, the patients were grouped into cancer, polyps, or controls. Prior to colonoscopy, the patients were invited to participate in the study and received written and oral information about the additional samples that would be collected and about their right to withdraw from the study at any time. Written informed consent was obtained from all patients. The study was approved by the regional committee for medical and health-related research ethics (REK 2012/1944) and the data protection manager at Ahus.
(PLACE TABLE 1 HERE)
(PLACE FIG. 1 HERE)
RNA extraction from stool samples using three different methods
RNA was extracted from the 24 stool samples of the evaluation cohort with three different extraction methods: Stool total RNA purification kit (Norgen Biotek Corp., Ontario, Canada), miRNeasy Mini kit (Qiagen, Hilden, Germany), and the NucliSENS EasyMAG system with the generic protocol for stool samples (BioMérieux, Lyon, France). The methods were chosen because they are based on different extraction principles, as described in Table 2.
(PLACE TABLE 2 HERE)
Standardization of the extraction process was achieved by using 200 µL of stool preserved in RNAlater combined with the respective lysis buffer from each kit (Table 2). The mixtures were homogenized in respective bead tubes using a Vortex Genie 2 (Scientific Industries, Bohemia, NY) at a speed of 2850 rpm for five minutes (Table 2).
Following homogenization, the protocols showed minor variations. In the Norgen and EasyMAG protocols, the tubes were centrifuged at 17000 g for three minutes. Afterward, 600 µL of the supernatant was carefully transferred to a new tube for extraction, following the manufacturer's specifications. In the Qiagen protocol, tubes were kept at room temperature for 5 minutes, after which 140 µL of chloroform was added and the contents were mixed thoroughly for phase separation. Tubes were centrifuged at 12000 g at 4 °C for 15 minutes before the supernatant (350 µL) was transferred for further extraction following the manufacturer's instructions. For quality control of the extraction processes, 3 µL of cel-miR-39 from the microRNA Cel-miR-39 Spike-In Kit (Norgen Biotek) was added to the supernatant of each sample prior to further processing. RNA extraction was performed in duplicate for each sample with all three extraction kits.
DNase treatment to eliminate DNA contamination
DNA contamination was removed from the stool RNA extracts by enzymatic DNase treatment. For samples extracted with the Norgen and Qiagen kits, the Qiagen RNase-Free DNase Set was used, with the kit components applied directly to the spin columns to degrade any potential DNA contaminants. For the nucleic acids obtained through the EasyMAG system, RQ1 RNase-Free DNase (Promega, Madison, WI, USA) was used for DNA degradation following the manufacturer's instructions.
Assessment of RNA quantities and qualities
To measure the concentration and evaluate the purity of RNA extracted from stool samples, the NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and the Qubit 3.0 fluorometer with a Qubit RNA HS assay kit (Life Technologies, CA, USA) were employed. Furthermore, for a selection of samples from all kits, the integrity and size distribution of the RNA molecules were analyzed using the Agilent 2100 Bioanalyzer (Agilent, Palo Alto, CA, USA) with the Agilent RNA 6000 Nano Kit.
Real-time reverse transcription PCR protocols
Based on previous results showing high expression of the housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) in mucosal samples, this target was selected for analysis in stool [17]. Three RT-PCR protocols were evaluated. The first protocol was a two-step procedure based on cDNA synthesis with the iScript cDNA synthesis kit (Bio-Rad Laboratories, CA, USA), followed by PCR with SYBR green detection using the QuantiNova SYBR Green PCR kit (Qiagen, Hilden, Germany) and primers from OriGene qSTAR (Table 3). The second protocol was also a two-step protocol with the same iScript cDNA synthesis, followed by a TaqMan probe PCR using Brilliant III Ultra-Fast QPCR Master Mix (Agilent) and a premade GAPDH assay from ThermoFisher (Table 3). The third protocol was a one-step protocol using the Superscript III one-step RT-PCR kit (Invitrogen) and the TaqMan probe GAPDH assay from ThermoFisher. Details of each protocol are described in Table 3.
Quality controls: Negative controls were included for all assays and with each set-up. To test for false-positive results due to genomic DNA contamination, samples were also tested without reverse transcription. Additionally, agarose gel electrophoresis was performed to check for extra bands, and melting curve analysis was performed for the SYBR green assay. To assess whether cDNA synthesis was successful, RNA from all samples was spiked with 1 µL of the 200 base-pair RNA control (oligo-IC) (Qiagen) prior to reverse transcription, and the QuantiNova SYBR Green PCR kit (Qiagen) was used to detect this control oligo. To detect the cel-miR-39 "spike-in" control RNA that was added prior to RNA extraction, primers from the same kit (microRNA Cel-miR-39 Spike-In Kit, Norgen Biotek) were applied with the miRCURY LNA RT kit and miRCURY LNA miRNA SYBR Green PCR kit for detection (Qiagen).
(PLACE TABLE 3 HERE)
Detection of cancer-associated gene transcripts in stool from cancer, polyp, and control groups
Based on the results from the methodological comparisons, RNA extraction with Norgen and one-step Superscript III RT-PCR were used for the analysis of the test cohort with 68 patient samples from three groups: cancer patients (n = 22), polyp patients (n = 24), and controls (n = 22) (Table 1). Immune-related genes that have been shown to be highly expressed in tumors were compared between the patient groups; these were CXCL1, IL1B, IL6, IL8 (CXCL8), PTGS2, and SPP1 [17]. The following primer/probe assays were used: CXCL1 Assay ID: Hs00236937_m1, IL1B Assay ID: Hs01555410_m1, IL6 Assay ID: Hs00174131_m1, IL8/CXCL8 Assay ID: Hs00174103_m1, PTGS2 Assay ID: Hs00153133_m1, and SPP1 Assay ID: Hs00959010_m1 (ThermoFisher). All RT-PCR experiments were performed in technical duplicates. PCR efficiency was evaluated and corrected for each PCR assay using LinRegPCR [36]. Negative controls were included in all experiments.
Statistical analysis
To assess the normality of the distribution of differences between the extraction methods, the Shapiro-Wilk test was employed. The criteria for a normal distribution were not met across all analyses; hence, subsequent analyses utilized a non-parametric statistical approach. For comparative evaluation of the extraction methods and different PCR protocols, box plots were constructed, and a one-way repeated measures analysis of variance by ranks, specifically Friedman's test for non-parametric samples, was performed. To determine which specific methods were significantly different, we performed post-hoc pairwise comparisons using Dunn's test.
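The comparison described above can be illustrated with a minimal sketch using scipy. It assumes the measurements are arranged as one row per stool sample and one column per extraction method or PCR protocol; the function name compare_methods and this layout are hypothetical and are not taken from the study's analysis scripts. The post-hoc Dunn's test is not part of scipy and is not reproduced here.

```python
import numpy as np
from scipy import stats


def compare_methods(values):
    """Shapiro-Wilk on paired differences, then Friedman's test.

    values: array of shape (n_samples, n_methods); each row holds the
    measurements of one sample under every method (hypothetical layout).
    """
    values = np.asarray(values, dtype=float)
    n_methods = values.shape[1]

    # Normality of the paired differences for every pair of methods.
    normality_p = {}
    for i in range(n_methods):
        for j in range(i + 1, n_methods):
            diff = values[:, i] - values[:, j]
            _, p = stats.shapiro(diff)
            normality_p[(i, j)] = p

    # Friedman's test treats each sample as a block and ranks the
    # methods within it; scipy requires at least three methods.
    statistic, p_value = stats.friedmanchisquare(*values.T)
    return normality_p, statistic, p_value


if __name__ == "__main__":
    # Simulated data for illustration only: 24 samples, 3 methods.
    rng = np.random.default_rng(0)
    simulated = rng.lognormal(mean=[3.0, 2.5, 2.0], sigma=0.5, size=(24, 3))
    normality_p, statistic, p_value = compare_methods(simulated)
    print(normality_p)
    print(statistic, p_value)
```

A significant Friedman result only indicates that at least one method differs; the pairwise post-hoc comparisons would still have to be run separately.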
In the test cohort, variations in sample quantity were normalized using GAPDH. Transcription profiles were compared using the 2^(-ΔCt) method [37], with normality tested by Shapiro-Wilk. Statistical differences across groups were assessed using the Kruskal-Wallis test and specific comparisons through the Mann-Whitney U test.
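As an illustration of the normalization and group comparison, the sketch below computes 2^(-ΔCt) with ΔCt = Ct(target) - Ct(GAPDH) and then applies the Kruskal-Wallis and Mann-Whitney U tests from scipy. Averaging the technical duplicates before taking ΔCt, the function names, and the Ct values are assumptions made for the example. No LinRegPCR efficiency correction or multiple-testing adjustment is included.

```python
import numpy as np
from scipy import stats


def relative_expression(ct_target, ct_reference):
    """Return 2^(-ΔCt) per sample.

    Both inputs have shape (n_samples, n_replicates); replicates are
    averaged first (an assumption of this sketch).
    """
    ct_target = np.asarray(ct_target, dtype=float)
    ct_reference = np.asarray(ct_reference, dtype=float)
    delta_ct = ct_target.mean(axis=1) - ct_reference.mean(axis=1)
    return 2.0 ** (-delta_ct)


def compare_groups(expression_by_group):
    """Kruskal-Wallis across all groups, then pairwise Mann-Whitney U.

    expression_by_group: dict mapping group name -> 1-D array of
    relative expression values (hypothetical layout).
    """
    names = list(expression_by_group)
    _, kw_p = stats.kruskal(*(expression_by_group[n] for n in names))

    pairwise_p = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            _, p = stats.mannwhitneyu(
                expression_by_group[first],
                expression_by_group[second],
                alternative="two-sided",
            )
            pairwise_p[(first, second)] = p
    return kw_p, pairwise_p


if __name__ == "__main__":
    # Invented Ct values for two samples measured in duplicate.
    target_ct = [[30.1, 30.3], [27.9, 28.1]]
    gapdh_ct = [[24.0, 24.2], [24.1, 23.9]]
    print(relative_expression(target_ct, gapdh_ct))
```

Because 2^(-ΔCt) is a monotone transformation of ΔCt, the rank-based tests give the same p-values whether they are run on ΔCt or on the transformed values.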
Statistical analyses were conducted using Python (version 3.10.9). The following libraries were utilized: pandas for data manipulation and analysis [38], numpy for numerical operations [39], matplotlib for data visualization [40], scikit-learn for machine learning and statistical modeling [41], and scipy for scientific and technical computing [42].