Study population
A total of 50 unrelated Chinese patients with symptoms consistent with DLSS and 25 unrelated healthy controls were consecutively recruited between the year 2016 January and 2017 December from the Orthopaedic Department in China-Japan Friendship Hospital. This study was conducted with the approval of the ethics committee of the China-Japan Friendship Hospital Institutional Review Board. Written informed consent was obtained from all patients whose specimens and clinical information were used for this study. All patients were examined both clinically and radiologically to confirm the degenerative nature of the stenosis. All DLSS patients met the all following three criteria: (1) diagnosis of DLSS by magnetic resonance imaging (MRI); (2) conservative treatment and monitoring for more than 3 months by spinal surgeons; and (3) a history of typical LSS symptoms, consisting of self-reported intermittent neurogenic claudication, stenotic symptoms extending to the lower extremities upon extension of the lumbar spine, or numbness or weakness of the lower extremities. Patients with neurological disorders, spinal fracture, spondylolisthesis, spinal tumor, presentation with trauma, and diagnosis with an infectious disease were excluded. 25 biologically unrelated healthy individuals, with a similar ethnic background, were enrolled as the control group.
Whole exome sequencing
Genome DNA was extracted from whole blood with TIANamp Blood DNA kit (TIANGEN BIOTECH,Beijing, China). Exome capture was performed using Agilent SureSelect Human All Exon V6 kit (Agilent Technologies, Santa Clara, CA) as per the manufacturer’s instructions. A whole exome sequencing was performed on an Illumina HiSeq Xten platform. Sequencing‑derived raw image files were processed by Illumina base calling Software with default parameters, and sequence data for each individual were generated as paired-end reads, defined as ‘raw data’. The experiment was performed according to manufacturer’s protocol. One sample was loaded per flow cell lane to obtain a minimum of 10× read depth across ~96% of the target regions.
Exome data analysis
Bioinformatics analysis started with the raw sequencing data from the Illumina pipeline. Raw data were processed and filtered and low-quality reads were discarded. Low-quality sequencing reads were filtered by the following criteria:1) removal of reads containing the abnormal sequencing adapter, 2) removal of reads with a low-quality base ratio (base quality less than or equal to 5 that was more than 50%; 3) removal of reads with an unknown base (‘N’ base) ratio that was more than 10%. Variants located in exon regions or alternative splicing regions were retained. Variant frequency was compared with the 1000G SNP database (http://www.1000genomes.org/). Exome Aggregation Consortium (ExAC) (http://exac.broadinstitute.org;) Analysis for potential deleterious mutations was performed using various algorithms, such as PolyPhen2 (http://genetics.bwh.harvard.edu/pph2/) (8), SIFT (http://sif.jcvi.org/) (9), Mutation Taster (http://www.mutationtaster.org/) (10) as well as CADD (Combined Annotation Dependent Depletion). We used the identification of rare combined with at least two of the four algorithms above to prioritize potential pathogenic variants. R software (http://www.bioconductor.org/packages/release/bioc/html/maftools.html) was used to perform the clustering analysis to determine statistical significance in genotype and allele frequencies between patients and controls (FDR, false discovery rate < 0.05). Phenolyzer (Phenotype Based Gene Analyzer, http://phenolyzer.wglab.org/) was used to discover genes based on user-specific disease/phenotype terms.
Read Mapping to reference sequences
All single nucleotide variants (SNVs) of 75 samples were obtained by comparing the valid sequencing data with the human reference genome (UCSC Genome Browser hg19) using Burrows-Wheeler Aligner (BWA) software to gain the primary mapping results. And then, the aligned data were sorted by SAMtools to select the best mapping positions, and the duplicated reads were marked by Picard (http://sourceforge.net/ projects/picard/) so that they could be used in the next analysis.
In addition, the functional annotation was conducted to find the genetic variation associated with DLSS. First, all variants were annotated using the ANNOVAR software tool. Then, the normal population variant databases, including 1000g2015aug_all, 1000g2015aug_Chinese,esp6500siv2_all,NovoDb_WES_SNP and ExAC database, were performed to exclude the common variations occurring with no more than 1% minor allele frequency (MAF).
Candidate susceptible variants and gene selection
The SIFT, PolyPhen-2 and MutationTaster were performed to predict whether the substitution of amino acid affected the function of the protein. The mutation was selected as candidate susceptible variant if one of the three software showed it was pathological.
Gene function enrichment analysis
Toppgene suite was used for gene list enrichment analysis and candidate gene prioritization based on functional annotations and protein interactions network (https://toppgene.cchmc.org/). A web-based portal named Metascape (http://metascape.org/) was used to do Gene Ontology (GO) enrichment analysis. GSEA software (version 3.0) with “c2.cp.kegg.v6.1symbols.gmt” gene sets database was used to do Gene Set Enrichment Analysis (GSEA) (http://www.broadinstitute.org/gsea/index.jsp), to identify significantly enriched or depleted genes that show statistically significant, concordant differences between two given clusters.
Statistical analysis
A Fisher's exact test was used for evaluating the association between rare variants and disease phenotype (case/control data) with Benjaminiand-Hochberg method, which is a type of false discovery rate (FDR) test and is used in multiple hypothesis testing to correct for multiple comparisons.