Blood samples collection
Blood samples were collected from both 20 healthy controls (H_B) and 20 SLE patients (SL_B) in Shenzhen People’s Hospital (Guangdong, China). Recruited criteria: SL_B patients diagnosed as having SLE disease while H_B without SLE or other immune diseases or treated with immunosuppressant. The study was approved by the ethics committee of Shenzhen People's Hospital, and the experiments were undertaken and carried out by the guidelines of the Declaration of Helsinki of 1995 (as revised in Edinburgh 2000). Besides, all participants signed informed consent forms.
PBMCs preparation
PBMCs from each donor were obtained by using the Ficoll density gradient separation method (Sigma, USA). Then PBMCs was stored at –80 °C for further treatment.
Whole-genome bisulfite sequencing (WGBS)-DNA methylation
Genomic DNA library construction and sequencing
Total DNA was extracted by QIAamp Fast DNA Tissue Kit (Qiagen, Dusseldorf, Germany) and measured by spectrophotometer. The fragmented DNA samples by using sonication were subjected to bisulfite conversion. The Accel-NGS Methyl-Seq DNA Library Kit (Swift, MI, USA) was utilized for attaching adapters to single-stranded DNA fragments. Then, the pair-end 2×150bp sequencing was conducted by an Illumina Hiseq 4000 platform.
Bioinformatics analysis for WGBS data in PBMCs
Sequence quality verification was performed by the FastQC online analysis tool (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). WALT was used for mapping qualified reads to the reference genome[21], and samtools was used for further deduplication of reads[22]. The DNA methylation level was determined by using perl scripts and MethPipe[23]. The R package-MethylKit was utilized for calculation of differentially methylated regions (DMRs)[24].
Whole transcriptomics
Small-RNA library construction and sequencing
Total RNA was extracted from PBMCs by using TRIzol reagent (Invitrogen, CA, USA). TruSeq Small RNA Sample Prep Kits (Illumina, San Diego, USA) was used for small RNA sequencing library construction, and then the library was sequenced by an Illumina HiSeq 2500 with 50 base single-end reads.
Differential expression analysis for RNAs
The raw datasets were processed by the program of and Cutadapt and ACGT101-miR (LC Sciences, Houston, TX, US)[25]. Qualified reads were mapped to the human reference genome (GRCh38) by Bowtie 2[26] and TopHat2[27]. Differential expression levels of mRNAs, miRNAs, circRNAs and lncRNAs were calculated [28] and Fisher’s exact test according to |log2 (fold change) |> 1 and p-value < 0.05.
Bioinformatics analysis for mRNAs and lncRNAs
Transcripts Assembly: Firstly, qualified reads were sorted by Cutadapt[25], then verified by FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads were mapped to the genome of humans by the usage of Bowtie2[26] and Tophat2[27]. The mapped reads of each sample were assembled using StringTie[28]. Then, all PBMSs’ transcriptomes were merged to reconstruct a comprehensive transcriptome. Finally, tools of StringTie[28] and Ballgown [29] were utilised for calculating transcripts’ expression.
LncRNAs identification
First of all, the known mRNAs and transcripts shorter than 200 bp were castoff. Then CPC[30], CNCI [31] and Pfam[32] were used for prediction of transcripts.
Localization of lncRNAs on genome
With the help of program Circos[33], localization and abundance of lncRNAs in the genome could be present in the way of diagrams. The lncRNAs in diagram were subdivided into six categories (I, J, O, U, X, K) according to their class code generated by StringTie: (I) A transfrag falling entirely within a reference intron (Intronic); (J) Potentially novel isoform or fragment at least one splice junction is shared with a reference transcript; (O) Generic exonic overlap with a reference transcript; (U) Unknown, intergenic transcript (intergenic); (X) Exonic overlap with reference on the opposite strand (antisense); (K) The known, lncRNAs .
Prediction of miRNA targets
TargetScan[34] and Miranda[35] were tools for prediction of miRNA targets.
Bioinformatics analysis for circRNAs
Firstly, Cutadapt[25] was used for un-quality reads filtering, then sequence quality was verified by online analysis tool (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Bowtie2[26] and Tophat2[27] were used for humans genome mapping. Unmapped reads were mapped by using tophat-fusion[28]. CIRCExplorer [26] was used to assemble the mapped reads to circRNAs; Then, back splicing reads were identified in unmapped reads. The expression level of circRNAs from different samples or groups was calculated. By the R package edgeR[27], the comparisons with p-value < 0.05 were selected.
Bioinformatics analysis of miRNAs
RNA families (rRNA, tRNA, snRNA, snoRNA) and repeats were applied to filter non-miRNA sequences in the ACGT101-miR program (LC Sciences, Houston, Texas, USA). Subsequently, the remaining sequences were mapped to human precursors in miRBase 22.0 by BLAST to identify both known miRNAs and novel 3p- and 5p- derived miRNAs.
The Prediction of Target Genes of miRNAs
Two computational target prediction algorithms (TargetScan 50 and Miranda 3.3a) were used for the prediction of genes targeted by these most abundant miRNAs.
Proteomics-iTRAQ
Samples were eluted in lysis buffer and further processed by the procedures in the previous report [20] and finally get the supernatant aliquot. Protein was digested with Trypsin Gold (Promega). Then peptides were dried and reconstituted in 0.5 M TEAB and processed according to the manufacture’s introduction. The peptides were subjected to nanoelectrospray ionization. Intact peptides were detected in the orbitrap. The details of the protocol were referred from a previous study[20].
Bioinformatics analysis for proteomics
Msquant search engine (version 2.3.02) and related database were utilized for protein identification[36]. Each confident protein contains at least one unique peptide. The median peptide ratio in Mascot was used for normalization of the quantitative protein ratios.
Functional enrichment analysis
The databases of Gene Oncology (GO) (http://www.geneontology.org) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www. genome.jp/kegg) were utilized for predicting the underlying biological functions and pathways of the candidate target genes. DNA methylation, ceRNAs (lncRNAs, circRNAs), miRNAs, mRNAs and proteins related genes were put into the database for annotation, visualization and integrated discovery (DAVID) for GO and KEGG pathway analysis.
Statistical analysis
Before further analysing of data, all variables were go through normality and equal tests. Unpaired Student’s t test was used for analysis the difference of normal distributed data while nonparametric tests were used for non-normal distributed data. Data were analysed by GraphPad Prism (Version 6, CA) and GraphPad Prism uses the abbreviation mean±SEM. P <0.05 was considered as statistically different.
Signaling networks construction
TargetScan (v. 5.0)[37], miRanda (v. 3.3a) [38] were used for predicting potential regulation networks of miRNA-mRNA and lncRNA-miRNA-mRNA. The regulatory networks of methylation-mRNA, methylation-miRNA-mRNA and protein-mRNA were unveiled in PBMCs-SLE. The regulation networks in proteins-proteins was predicted by the STRING database (http://string-db.org/) and KEGG database. The overlapping predictions between two datasets were considered as effective target pairs, and all signaling networks were constructed by Cytoscape (v. 3.5.0) [39].
qRT-PCR verified the identified genes in the PBMCs from SLE patients and healthy people
The technical verification of identified RNA sequencing data in the PBMCs of SL_B and H_B was under the help of the Quantitative real-time polymerase chain reaction (qRT-PCR). Total RNA was extracted from the PBMCs described above and subjected to first-strand cDNA synthesis using the TRUEscript 1st Stand cDNA SYNTHESIS Kit (Aidlab, Beijing, China). The specific primers used in the qRT-PCR are listed in supplement material (Table S10), and qRT-PCR was performed using an SYBR PrimeScript miRNA RT-PCR Kit (TianGen Biotech, Beijing, China). GAPDH and U6 were used as internal controls. Three independent biological replicates for each gene. The comparative 2-ΔΔCt analyzed method was used in this study.