Data reporting
No statistical methods were used to predetermine sample size. The experiments were not randomized, and investigators were not blinded to allocation during experiments and outcome assessment.
Enrichment and cultivation conditions
The oily sludge and oil-produced water used as inoculum were collected from the Shengli oilfield in China (37°54’N, 118°33’E). Upon collection, samples were stored anaerobically at 4 °C. Additional details and parameters of the oily sludge were described previously 17. The basal medium for incubation consisted of 9 g L−1 NaCl, 0.3 g L−1 MgCl2·6H2O, 0.15 g L−1 CaCl2·2H2O, 0.3 g L−1 NH4Cl, 0.2 g L−1 KH2PO4, 0.5 g L−1 KCl, 0.5 g L−1 cysteine-HCl, 1 mL L−1 resazurin solution and 2 mL L−1 trace element solution 42. The medium was boiled for 30 min and then dispensed into serum bottles (Shuniu) sealed with butyl rubber stoppers (Bellco Glass). The headspace was replaced with 99.999% oxygen-free N2. After sterilization at 121 °C for 30 min, the basal medium was supplemented with filter-sterilized vitamin mixture (2 mL L−1) 43, vitamin B12 (2 mL L−1) and vitamin B1 (2 mL L−1) solutions. Fresh medium consisted of the basal medium supplemented with 10 mg L−1 filter-sterilized 2-mercaptoethanesulfonic acid (CoM), 1% selenite-tungstate solution 44, 0.5 g L−1 yeast extract, and 20 mM 2-morpholinoethanesulfonic acid monohydrate (MES) buffer. The pH of the media was adjusted to 6.5-7.0 with 1 M HCl or NaOH solutions prior to use. Enrichments were performed in basal medium at temperatures from 25 °C to 75 °C. The oil-produced water was incubated with autoclaved 20 mM acetate, 20 mM propionate, and 20 mM butyrate as substrates, and the oily sludge was incubated with n-hexadecane (1 mL L−1, Sigma-Aldrich), eicosane (1 mL L−1, Sigma-Aldrich), an alkane mix including n-docosane (1 g L−1, Sigma-Aldrich), n-hexadecylcyclohexane (1 mL L−1, TCI), and n-hexadecylbenzene (1 mL L−1, TCI), or crude oil (1 g L−1) as substrates. The enrichments with a relative abundance of Verstraetearchaeia above 20% were chosen for further isolation (Extended Data Fig. 1).
These cultures were serially diluted 10-fold in 96-well microplates with fresh medium containing different substrates (10 mM acetate, 10 mM glucose, 10 mM methanol, 10 mM lactate, 10 mM pyruvate, 10 mM succinate, 10 mM formate, 0.5 g L−1 casamino acids, 0.5 g L−1 rumen fluid) at 55 °C. Verstraetearchaeia in the 96-well microplates were detected by PCR with Verstraetearchaeia-specific primers (MSR4F/MSR4R, Supplementary Table 6). After three transfers at a dilution of 1:100000 in mixed substrates of acetate, methanol, and lactate, Verstraetearchaeia was detected. These Verstraetearchaeia-positive cultures in 96-well microplates were transferred into 10 mL Hungate tubes. Consecutive subcultures with dilutions ranging from 0.1% to 10% in fresh medium supplemented with 10 mM methanol were then conducted in the Hungate tubes. To remove bacteria, 10-fold serial dilutions were supplemented with 500 mg L−1 lysozyme and several antibiotics, including 100 mg L−1 ampicillin, 10 mg L−1 chloramphenicol, 100 mg L−1 gentamicin, 100 mg L−1 kanamycin, 100 mg L−1 streptomycin, and 50 mg L−1 vancomycin, applied separately or in combination.
Unless otherwise stated, the Verstraetearchaeial cultures were incubated in 120 mL serum bottles with 50 mL fresh medium, 10 mM methanol, and 10 kPa hydrogen at 55 °C. 2-Bromoethanesulfonate (BES), when used, was added to a concentration of 10 mM. Methylated compound utilization was examined by adding 5 mM dimethylamine, 5 mM methanol, 5 mM methanethiol, 5 mM monomethylamine or 5 mM trimethylamine to fresh medium at 55 °C. Temperature, pH, and salt tolerance and the requirement for essential growth factors were tested as previously described 45. All incubations were performed in triplicate in the dark without shaking.
In the stable isotope tracer experiment, 1, 2, 4, and 8% 13C-labeled methanol (Sigma, USA) was made by mixing 0.1 mL, 0.2 mL, 0.4 mL, and 0.8 mL of 0.05 M 13C-labeled methanol with 0.495 mL, 0.49 mL, 0.48 mL, and 0.46 mL of 1 M unlabeled methanol, respectively. As the control, 0.5 mL of 1 M unlabeled methanol was used. Methanol was added to a final concentration of 10 mM. The actual content of the initial labeled methanol was measured by Delta V advantage (Thermo Fisher). To determine the utilization of carbon sources, 5 mM 1,2-13C-acetate (Sigma, USA), 1% (w/w) 13C-NaHCO3 (Sigma, USA), or 0.5 g L−1 13C-yeast extract was added to fresh medium with 10 mM unlabeled methanol. Incubations with 10 mM 13C-methanol were performed in fresh medium. The 13C-yeast extract was made from the 13C-labeled cell extract of Pichia pastoris 46. Briefly, P. pastoris was incubated in a minimal medium 46 with 2 g L−1 13C6-D-glucose (Sigma, USA) as the sole carbon source. After incubation for 4 days at 28 °C, the cells were centrifuged at 8000 g for 10 min and washed three times with phosphate-buffered saline (PBS, 1 M, pH 6.8). The washed cells were then ultrasonicated on ice for 50 min. The supernatant from the sonicated cells was collected, and the pellet was ultrasonicated twice in 50% methanol for 10 min and centrifuged. All the supernatants were freeze-dried for further use. All the incubations were performed in triplicate at 55 °C in the dark without shaking.
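The mixing volumes given for the labeled methanol series can be checked with a short calculation. The sketch below is illustrative only: it assumes the 0.05 M stock is fully 13C-labeled and ignores the natural 13C abundance of the unlabeled stock, which is one reason the actual label content was measured rather than taken from the nominal values. The function and variable names are hypothetical.

```python
# Stock concentrations (M) and mixing volumes (mL) as stated in the text.
LABELED_STOCK_M = 0.05     # 13C-labeled methanol stock
UNLABELED_STOCK_M = 1.0    # unlabeled methanol stock

# nominal label (%) -> (mL labeled stock, mL unlabeled stock)
MIXES = {
    1: (0.1, 0.495),
    2: (0.2, 0.49),
    4: (0.4, 0.48),
    8: (0.8, 0.46),
}


def label_fraction(v_labeled_ml, v_unlabeled_ml):
    """Return (labeled mole fraction, total methanol in mmol) for one mix.

    Assumption: the labeled stock is 100% 13C and the unlabeled stock
    contributes no 13C (natural abundance is ignored).
    """
    n_labeled = v_labeled_ml * LABELED_STOCK_M        # mL x mol/L = mmol
    n_unlabeled = v_unlabeled_ml * UNLABELED_STOCK_M  # mmol
    total = n_labeled + n_unlabeled
    return n_labeled / total, total


for nominal, (v_lab, v_unlab) in MIXES.items():
    fraction, total_mmol = label_fraction(v_lab, v_unlab)
    print(f"nominal {nominal}%: {fraction * 100:.2f}% labeled, "
          f"{total_mmol:.3f} mmol methanol in total")
```

Under these assumptions each mix contains the same total amount of methanol as the control (0.5 mL of 1 M stock, that is 0.5 mmol), so only the labeled fraction varies across the series.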
To determine fermentative growth, the Verstraetearchaeial co-culture (strain LWZ-6/CY-2) was incubated with 20 mM acetate, 20 mM glucose, 20 mM pyruvate, or 5 g L−1 yeast extract in fresh medium; with a 4 mM 20-amino-acid mix and 10 mM BES in fresh medium (with the 0.5 g L−1 yeast extract reduced to 0.05 g L−1); or with 5 g L−1 casamino acids, 5 g L−1 keratin hydrolysates or 5 g L−1 yeast extract and 10 mM BES in fresh medium (without the 0.5 g L−1 yeast extract). The 20 amino acids were alanine, arginine, aspartate, cysteine, methionine, glutamate, glycine, histidine, lysine, L-asparagine, L-glutamine, L-isoleucine, L-leucine, L-phenylalanine, L-proline, L-serine, L-threonine, L-tryptophan, L-tyrosine, and L-valine. All the incubations were performed in triplicate at 55 °C in the dark without shaking.
Methane production and qPCR were used to determine the growth of strain LWZ-6. The specific growth or methane production rate (µ) of strain LWZ-6 in the log phase was calculated according to equation (1), which describes the increase in copy numbers or methane production from X1 to X2 over the incubation time from t1 to t2. The methanogenic rate was calculated according to the equation previously described 47.
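For orientation, a specific rate of this kind is commonly computed from two log-phase measurements as µ = (ln X2 − ln X1)/(t2 − t1). The sketch below uses that standard exponential-growth form as an assumption; equation (1) itself is not reproduced here and may be written differently. The function names and example values are hypothetical.

```python
import math


def specific_rate(x1, x2, t1, t2):
    """Specific rate (per unit time) between two log-phase measurements.

    Assumes exponential increase, X(t) = X1 * exp(mu * (t - t1)).
    x1, x2: gene copy numbers or cumulative methane at times t1, t2.
    """
    if x1 <= 0 or x2 <= 0:
        raise ValueError("measurements must be positive")
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    return (math.log(x2) - math.log(x1)) / (t2 - t1)


def doubling_time(mu):
    """Time needed for X to double at specific rate mu."""
    return math.log(2) / mu


# Hypothetical example: copies rise 8-fold (three doublings) in 3 days.
mu = specific_rate(1e5, 8e5, t1=0.0, t2=3.0)
print(f"mu = {mu:.4f} per day, doubling time = {doubling_time(mu):.2f} days")
```

Because the rate depends only on the ratio X2/X1, the same function applies to qPCR copy numbers and to methane amounts, provided both points lie within the log phase.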
Chemical analysis
The concentration of CH4 and CO2 in the headspace was measured by using an Agilent GC 7820A gas chromatography (GC) system equipped with a Porapak Q column (length, 3 m; inner diameter, 0.32 mm) and a thermal conductivity detector (TCD) 48 using hydrogen (99.999%) as carrier gas at a flow rate of 27 mL min−1. The GC temperature of injection, column, and TCD was 65 °C, 120 °C, and 130 °C, respectively. The concentration of H2 in the headspace was determined using the same Agilent GC 7820A parameters with nitrogen as carrier gas at a flow rate of 38 mL min−1. The gas pressure in Hungate tubes or serum bottles was measured using a barometer (Ashcroft) at room temperature. Each measurement was carried out by injection of 0.1 mL gas using a pressure lock syringe (Vici) at room temperature.
Methanol was measured using an Agilent GC 7890A gas chromatography system equipped with a DB-WAX column (length, 30 m; inner diameter, 0.32 mm) and a flame ionization detector (FID), with nitrogen (99.999%) as carrier gas at a flow rate of 27 mL min−1. The initial oven temperature was 50 °C, held for 1 min, and then increased to 220 °C at a rate of 60 °C min−1; the FID temperature remained at 250 °C. Acetate was determined by high-performance liquid chromatography (Agilent HPLC 1200) 49.
The carbon isotopic compositions of CH4 and CO2 in the headspace were determined by gas chromatography coupled to a mass spectrometer (IsoPrime100) 50. Briefly, CH4 and CO2 in gas samples were first separated by GC (Agilent GC 7890B) with an HP-PLOT/Q column (30 m; inner diameter, 0.32 mm; film thickness, 20 μm). The carrier gas was helium (99.999%) at a flow rate of 2.514 mL min−1. The oven and injector temperatures were 60 °C and 105 °C, respectively. The CH4 was then converted to CO2 in a combustion furnace (IsoPrime) at a temperature of 1050 °C. The carbon isotopic composition of CO2 was measured by the IsoPrime100 isotope ratio mass spectrometer. The carbon isotope composition of methanol was determined by Delta V advantage (Thermo Fisher) coupled with TraceGC 3000 and equipped with an HP-INNOWAX column (30 m × 0.25 mm × 0.25 μm, USA) using helium (99.999%) as carrier gas at a flow rate of 1.5 mL min−1. BCR-653 WINE (EtOH, low level) (JRC-IRMM) was used as standard.
The labeled methanol in the medium and labeled CH4 in the headspace were calculated by equations (2) and (3), respectively.
DAPI staining
Cells were collected by centrifugation at 16000 g for 10 min. The cells were then washed twice with PBS (1 M, pH 6.8) and fixed in 4% formaldehyde for 3 h. The fixed cells were washed again and resuspended in Milli-Q water. Cells were dropped on the slide and stained with 2-(4-Amidinophenyl)-6-indolecarbamidine dihydrochloride (DAPI) staining solution (Beyotime Biotechnology). The stained cells were dried and mounted with antifade mounting medium (Beyotime Biotechnology). The mounted slide was imaged with a laser scanning confocal microscope (ZEISS).
CARD-FISH
Samples were fixed in 4% formaldehyde for 3 h and then washed with PBS (1 M, pH 6.8). The cells were collected on a 0.22 μm filter and embedded with 0.2% agar on the filter. The embedded filter was then dehydrated with 96% ethanol. Permeabilization of the samples was performed with lysozyme solution (0.05 M EDTA pH 8.0, 0.1 M Tris/HCl pH 8.0, 10 mg mL−1 lysozyme, Sigma-Aldrich) at 37 °C for 60 min and proteinase K solution (0.05 M EDTA pH 8.0, 0.1 M Tris/HCl pH 8.0, 15 μg mL−1 proteinase K, Merck) at room temperature for 5 min. Endogenous peroxidases were inactivated by incubation in a methanol-H2O2 solution (10 mL methanol, 50 mL 30% H2O2) at room temperature for 30 min. The samples were hybridized with 16S rRNA gene-specific oligonucleotide probes, EUB-338 targeting Bacteria and ARCH-915 26,51 targeting Archaea (Takara Bio). The optimal formamide concentration in the hybridization buffer for probe stringency was tested with increasing formamide concentrations (20%-45%). Formamide concentrations of 20% and 35% were chosen for hybridization with EUB-338 and ARCH-915, respectively. Double hybridization was carried out after inactivation of peroxidases from the first hybridization. Signal amplification was then performed using tyramide in the dark. Alexa Fluor 488 and Alexa Fluor 594 (AAT Bioquest, Inc) were used for laser scanning confocal microscope observation (ZEISS).
SEM
Cells were fixed with 2.5% glutaraldehyde for 1 h at room temperature. Samples were then washed twice with sterilized ultra-pure water for 5 min. The samples were dehydrated through a graded ethanol series (30%, 50%, 70%, 80%, 90%, 95%, and 100%, each for 10 min). Cells were finally sputter coated with osmium (Hitachi, Ltd) and observed under SEM (FEI Company).
TEM
Samples were fixed with 3% glutaraldehyde and then with 1% osmium tetroxide. The cells were dehydrated with acetone by a serial concentration of 30%, 50%, 70%, 80%, 90%, and 95% once and 100% three times. Then the cells were incubated in a mixture of acetone and resin (Quetol-812: Nissin) with a ratio of 3:1, 1:1, 1:3, and polymerized with resins. The polymerized cells were ultrathin sectioned at 60-90 nm by ultramicrotome (UC&rt, LEICA) and sections were mounted on copper grids. The grids were stained with uranyl acetate for 15 min and then stained with lead stain solution for 2 min, after which the grids were imaged by a transmission electron microscope (JEOL).
NanoSIMS
The cells were fixed in 2.5% glutaraldehyde for 2 h. The fixed samples were then rinsed three times in PBS buffer to remove excess glutaraldehyde. The rinsed samples were dropped onto silicon wafers, and then dehydrated sequentially with 25%, 50%, 80%, and 100% solutions of ethanol. SEM imaging was used to localize regions of interest (ROI) with cocci cells of LWZ-6 for further NanoSIMS analysis. Prior to SEM imaging, the wafers were coated with 5 nm platinum. SEM was conducted on a Zeiss EVO18 at a working distance of 6 mm and an electron high tension of 2.0 kV. The wafer samples were further analyzed with NanoSIMS 50L (CAMECA) using a Cs+ primary source 52,53. Secondary ions of 12C and 13C were collected by electron multipliers. The samples were scanned in 10 × 10 µm2 area with 256 × 256 pixel raster. The 13C/12C ratios of the ROI were measured with the OpenMIMS plugin in ImageJ. All images were corrected for the electron multiplier dead time (44 ns) as well as drift corrected.
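As an illustration of the last two processing steps, the sketch below applies a dead-time correction to raw ion counts and converts the corrected 12C and 13C counts of a region of interest into a 13C/12C ratio and a 13C atom fraction. It assumes a common non-paralyzable detector model, N_true = N_obs / (1 − rate × τ); whether the software used applies exactly this form is an assumption, and the dwell time and count values are hypothetical because they are not given in the text.

```python
DEAD_TIME_S = 44e-9  # electron multiplier dead time stated in the text (44 ns)


def dead_time_correct(counts, dwell_s, dead_time_s=DEAD_TIME_S):
    """Correct observed counts for detector dead time.

    Assumes a non-paralyzable detector; dwell_s is the counting time
    for the pixel or ROI (a hypothetical parameter here).
    """
    observed_rate = counts / dwell_s            # counts per second
    live_fraction = 1.0 - observed_rate * dead_time_s
    if live_fraction <= 0:
        raise ValueError("count rate too high for this dead-time model")
    return counts / live_fraction


def ratio_13c_12c(c12_counts, c13_counts):
    """13C/12C ratio from corrected secondary ion counts."""
    return c13_counts / c12_counts


def atom_fraction_13c(c12_counts, c13_counts):
    """13C atom fraction, 13C / (12C + 13C)."""
    return c13_counts / (c12_counts + c13_counts)


# Hypothetical ROI: raw counts accumulated over 1 ms of counting time.
c12 = dead_time_correct(1000, dwell_s=1e-3)
c13 = dead_time_correct(50, dwell_s=1e-3)
print(f"13C/12C = {ratio_13c_12c(c12, c13):.4f}, "
      f"13C atom fraction = {atom_fraction_13c(c12, c13):.4f}")
```

The correction matters most for the abundant 12C signal, whose higher count rate loses proportionally more counts, so leaving it out would slightly overestimate the 13C/12C ratio.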
Lipid-SIP
Lipids of the cultured cells were extracted using an acid hydrolysis extraction method 54. In brief, the cultured cells were collected by centrifugation at 12000 g for 10 min. The cells were then acid hydrolyzed with 10% hydrochloric acid in methanol, and the phases were separated by adding dichloromethane (DCM) and ultrapure water. The total lipid extracts (TLEs) were filtered and dried prior to analysis.
The TLEs were analyzed by a Waters ACQUITY I-Class ultra-performance liquid chromatography (UPLC) system coupled to a SYNAPT G2-Si quadrupole time-of-flight (qTOF) high-resolution mass spectrometer (HRMS) through an electrospray ionization (ESI) source. The MS settings were identical to those previously described 55. The mass acquisition mode was Fast-DDA (data-dependent acquisition) with a mass range of m/z 100–2000 for MS and m/z 50–2000 for MS2.
The raw data were processed with Masslynx software (V4.1). The unlabeled samples were employed for lipid structure identification, according to exact molecular mass and MS2 fragment spectra. The degree of 13C stable isotope labeling into lipids was estimated based on their MS1 mass spectra following calculations by Thiele and Matsubara 56. Two representative archaeal lipids 57, archaeol (C43H88O3) and Me-GMGT (C87H160O6), were selected for the calculation of labeling degree. The mass range of isotopologs was narrowed to m/z 650-710 for archaeol and m/z 1300-1400 for Me-GMGT. The base peak intensity of an isotopolog with i 13C atoms (BPIi) was calculated by normalizing it to the highest intensity of all isotopologs, which was further normalized to the sum BPI of all isotopologs, resulting in the BPIi(norm). The degree of 13C labeling of an isotopolog with i 13C atoms (DoLi) and the total degree of 13C labeling of a lipid (DoL) were calculated with the following equations:
i refers to the number of 13C atoms; n refers to the number of 13C in a molecule.
DNA and RNA extraction
DNA was extracted by bead-beating (Sigma, ≤106 μm) methods combined with an Ezup Column Bacteria Genomic DNA Purification Kit (Sangon Biotech) according to the manufacturer’s protocol. DNA concentration was measured using a NanoDrop 2000 spectrophotometer (Thermo Scientific), and DNA was stored at −80 °C until further processing.
RNA was released from cells by bead-beating and extracted using an RNAprep Pure Cell /Bacteria Kit (TIANGEN Biotech Co, Ltd.) according to the manufacturer’s protocol.
qPCR
Quantitative PCR (qPCR) was performed to quantify bacteria and Methanoculleus using primers targeting their 16S rRNA genes, 519F/907R 58 and ZC2F/ZC2R (Supplementary Table 6), respectively, and to quantify strain LWZ-6 using primers targeting its 16S rRNA gene (MSR4F/MSR4R) and mcrA gene (mcrA4F/mcrA4R) (Supplementary Table 6). The qPCR primers for strain LWZ-6 and Methanoculleus were designed using NCBI/Primer-BLAST 61. The qPCR was performed as previously described 16. The standard DNA for qPCR of bacteria, Methanoculleus and strain LWZ-6 was obtained from the 16S rRNA genes of E. coli DH5α and Methanoculleus receptaculi ZC-2T, and the 16S rRNA gene and mcrA gene of strain LWZ-6, respectively. The qPCR for each targeted gene was performed in triplicate. PCR and qPCR products were sequenced by Sanger sequencing (Sangon Biotech).
16S rRNA gene amplicon sequencing
For amplicon sequencing, DNA was extracted from 1 mL of culture. The 16S rRNA gene was amplified using a barcoded primer set for bacteria (341F/806R) 27,59, a primer set for archaea, Arch519F/Arch915R (Supplementary Table 6) 60, and barcoded universal primers targeting both bacteria and archaea, 515FmodF/806RmodR 61 (Supplementary Table 6). The amplicon products were then sequenced on a NovaSeq6000 sequencer (Illumina) in paired-end 250 bp mode (PE250) at Novogene Bioinformatics Technology (Novogene). All the sequence data were first filtered to remove low-quality reads 62,63. The remaining reads were analyzed according to the Qiime2 pipeline 64. The sequences were clustered into operational taxonomic units (OTUs; 97% similarity). The taxonomy was determined using the Naive Bayes method in Qiime2 with the Silva NR99 database (release 138) as the reference 65,66.
Metagenome sequencing, assembly, genome binning, and annotation
Genomic DNA was extracted and sent to Novogene for library sequencing on a NovaSeq 6000 platform with PE150, generating raw metagenomic sequencing data. The raw reads were processed as follows. Quality-trimmed reads were obtained using Trimmomatic v.0.38 67. De novo assembly was performed using metaSPAdes (v.3.12.0) with k-mer sizes of 21, 33, 55, and 77 68. Genome binning of the assemblies was performed using BBMap (v.38.24) and MetaBAT (v.2.12.1). Ambiguous and redundant contigs were excluded from binning. Completeness, contamination, and strain heterogeneity were estimated using CheckM to evaluate the quality of each recovered MAG 69. Taxonomic classification of MAGs was performed according to the GTDB database (release 95.0, July 2020) 70. A total of 23 MAGs grouping with Verstraetearchaeia (i.e. Ca. Methanomethylicia) were retrieved from the enrichments. The similarity of each MAG to the other MAGs and to publicly available Verstraetearchaeia MAGs was determined using the 16S rRNA gene sequence identity, the average amino acid identity (AAI), the average nucleotide identity (ANI), and the percentage of conserved proteins (POCP). Barrnap was used to obtain the 16S rRNA gene sequences (https://github.com/tseemann/barrnap). ANI of the MAGs was calculated with the OrthoANIu (Orthologous ANI using USEARCH) tool 71. The AAI was calculated with CompareM (V0.1.2) (https://github.com/dparks1134/CompareM). POCP was calculated following Qin et al 72. The 23 MAGs could be dereplicated 73 into 5 species-level clusters by using an ANI cut-off of 97% (Supplementary Table 3). One MAG was chosen per cluster for downstream analyses. Genes in these MAGs were first predicted with Prodigal (v.2.6.3) 74, and the open reading frames (ORFs) were then annotated using the KEGG server (BlastKOALA) 75 and eggNOG-mapper v2 76. The predicted genes assigned to methanogenesis were further verified using UniProt 77 and CD-search (conserved domain) in NCBI 78.
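The idea of grouping MAGs into species-level clusters at an ANI cut-off can be sketched as single-linkage clustering over pairwise ANI values. This is an illustrative stand-in, not the dereplication tool cited above, which may use a different clustering strategy; the MAG names and ANI values below are hypothetical.

```python
def cluster_by_ani(names, pairwise_ani, cutoff=97.0):
    """Single-linkage clustering: MAGs joined by any ANI >= cutoff share a cluster.

    names: list of MAG identifiers.
    pairwise_ani: dict mapping (name_a, name_b) -> ANI in percent.
    Returns a sorted list of sorted clusters.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), ani in pairwise_ani.items():
        if ani >= cutoff:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise ANI values (percent) for four MAGs.
mags = ["MAG_A", "MAG_B", "MAG_C", "MAG_D"]
ani = {
    ("MAG_A", "MAG_B"): 98.5, ("MAG_A", "MAG_C"): 90.0,
    ("MAG_A", "MAG_D"): 89.0, ("MAG_B", "MAG_C"): 91.0,
    ("MAG_B", "MAG_D"): 88.0, ("MAG_C", "MAG_D"): 97.2,
}
for members in cluster_by_ani(mags, ani):
    print(members)
```

One representative per cluster would then be carried forward, for example the MAG with the best completeness and contamination estimates; that selection rule is an assumption and is not specified in the text.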
Nanopore for closed genome binning and analysis
Metagenome sequencing of the simple culture (stage 3 in Fig. 1) was conducted as described above. The Illumina metagenome was assembled using metaSPAdes v3.13.0 68 under default parameters. Subsequent binning of the contigs using MetaWRAP v1.3 79 gave a Verstraetearchaeial bin (1.54 Mb, 2 contigs). Nanopore data were generated to close the genome: the library was prepared with 60 ng genomic DNA using the ligation sequencing gDNA kit (ONT) following the manufacturer’s instructions with minor modifications. In short, DNA repair and end-prep were performed using the NEBNext FFPE DNA Repair Mix and NEBNext Ultra II End repair/dA-tailing Module (New England Biolabs), followed by adapter ligation using Quick T4 ligase (New England Biolabs). Subsequently, a clean-up using AMPure beads (Beckman Coulter) and the short fragment buffer (SFB) was performed to retain DNA fragments of all sizes, with a final incubation at 37 °C for 10 min. The library was then directly loaded onto a primed SpotON R9.4.1 flow cell in a MinION Mk1C (1366 pores available). Sequencing was carried out for 44 h, resulting in 1.57 M reads and 2.73 Gb of raw data. Raw Nanopore reads were corrected using Canu version 2.2 80 and used to perform a metaSPAdes hybrid assembly with the Illumina data. After binning the contigs with MetaWRAP, this again resulted in a Verstraetearchaeial bin (1.54 Mb, 2 contigs). Additionally, corrected Nanopore reads were assembled using metaFlye version 2.9.1 81, which resulted in a single Verstraetearchaeial contig of 1.43 Mb. This contig was corrected with the raw Illumina reads in three iterations using Pilon version 1.23 82. Analysis showed that the overlapping areas of the 5 contigs of the 3 Verstraetearchaeial bins were identical and could be combined into a single contig. To this end, raw Illumina reads were mapped onto the 5 contigs using Burrows-Wheeler Aligner (bwa, https://github.com/lh3/bwa), after which mapped reads were extracted from the original Illumina dataset. These reads were used in a SPAdes v3.14.0 assembly to which the 5 contigs were added under the flag --trusted-contigs. This gave a single circular contig (1538194 bp) of strain LWZ-6.
Phylogenomic and phylogenetic tree construction
To phylogenomically place the 5 representative MAGs, publicly available MAGs classified as Ca. Verstraetearchaeota, Ca. Culexarchaeia, Ca. Nezhaarchaeota and Ca. Methanomethylicia were downloaded from GTDB, JGI, and NCBI repositories, together with a set of 92 high-quality genomes spanning archaeal diversity. All MAGs were first analyzed with checkM2 (https://github.com/chklovski/CheckM2). MAGs with completeness below 60% and contamination above 10% were removed from the dataset. Subsequently, the proteomes of all remaining MAGs were predicted using Prodigal 83. The phylogenomic tree was constructed using a set of 76 archaeal markers (Archaea76) 84. To retrieve orthologues from the proteomes and create single marker alignments, GToTree was run with standard parameters. Maximum likelihood (ML) individual protein phylogenies were generated using IQ-TREE v2.0.3 85 under the LG+C20+F+G substitution model with 1000 ultrafast bootstraps. Phylogenetic trees were manually inspected for erroneous inclusion of paralogous or contaminated sequences. If present, such sequences were removed from the dataset, after which remaining sequences were realigned, and phylogenies were re-estimated, as described above. The final single marker alignments were then concatenated into one. Columns with gaps in more than 90% of the sequences were removed using trimAl v1.4.rev22 86, resulting in an alignment of 12084 positions. A first maximum likelihood (ML) phylogenetic tree was generated using IQ-TREE v2.0.3 85 (-bb 1000 -alrt 1000) with the model LG+C60+F+G. The resulting ML tree was then used to generate a posterior mean site frequency ML tree (-tbe -b 100 flags).
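The gap-based column filter applied to the concatenated alignment can be illustrated with a minimal sketch: a column is dropped when the fraction of sequences carrying a gap in it exceeds a threshold (0.9 here). This only mirrors the rule as described in the text and is not the trimming tool's implementation; the function name and toy alignment are hypothetical.

```python
def drop_gappy_columns(sequences, max_gap_fraction=0.9, gap_char="-"):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    sequences: list of equal-length aligned sequences (strings).
    Returns the trimmed sequences in the same order.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(sequences)
    keep = [
        col for col in range(length)
        if sum(seq[col] == gap_char for seq in sequences) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in sequences]


# Toy alignment of 10 sequences and 3 columns:
# column 0 has no gaps, column 1 has 9/10 gaps, column 2 has 10/10 gaps.
toy = ["AC-"] + ["A--"] * 9
trimmed = drop_gappy_columns(toy)
print(trimmed[0], trimmed[1], "alignment length:", len(trimmed[0]))
```

In this toy case only the all-gap column is removed, because a column with gaps in exactly 90% of the sequences does not exceed the threshold. Lowering `max_gap_fraction` makes the filter stricter.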
Furthermore, a McrA phylogenetic tree was constructed using 115 McrA sequences retrieved from the 6 representative Verstraetearchaeial MAGs and draft genomes from publicly available (potential) methanogenic taxa of the TACK superphylum, Euryarchaeota, and Helarchaeota. The sequences were first aligned using MAFFT v7.310 87. Columns with gaps in more than 10% of the sequences were removed using trimAl v1.4.rev22 86. The maximum likelihood (ML) phylogenetic tree was then generated using IQ-TREE v2.0.3 85 (-bb 1000 -alrt 1000) with the model Q.pfam+C40+F+G8, as chosen by the Bayesian information criterion (BIC) using ModelFinder Plus 88. The resulting ML tree was then used to generate a posterior mean site frequency ML tree (-tbe -b 100 flags).
Transcriptome sequencing and data analysis
Total RNA was extracted at the exponential methane production stage of strain LWZ-6, when the Verstraetearchaeial culture was incubated with methanol and hydrogen in fresh medium. The RNA was sequenced at Novogene (https://en.novogene.com) on a NovaSeq6000 instrument with PE150. The raw data were first trimmed by removing adaptors and low-quality sequences using Trimmomatic 67, and the mRNA was retrieved with SortMeRNA 89 with default settings after removal of tRNA and rRNA.
The transcriptional activity of strain LWZ-6 was evaluated using the complete circular LWZ-6 genome as the reference. Reads were mapped to the genome using Burrows-Wheeler Aligner (BWA, v. 0.7.17-r1188) 90. The SAM mapping files were converted into BAM files with SAMtools (v. 1.13) 91. The read coverage was calculated using BEDTools (v. 2.30.0) 92. The number of reads was normalized to the length of the genome. Fragments per kilobase of transcript per million mapped reads (FPKM) was used to normalize the expression level. The transcription rank of genes was calculated from log2[FPKM].
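The FPKM normalization and the log2-based ranking can be sketched as follows, using the usual definition FPKM = fragments × 10^9 / (gene length in bp × total mapped fragments). The gene names, counts, and lengths are hypothetical, and taking the total as the sum over the listed genes is a simplifying assumption of this example.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def rank_by_log2_fpkm(counts, lengths):
    """Return [(gene, log2 FPKM)] sorted from highest to lowest expression.

    counts: gene -> mapped fragment count; lengths: gene -> length in bp.
    Genes with zero fragments are skipped because log2(0) is undefined.
    """
    total = sum(counts.values())
    ranked = [
        (gene, math.log2(fpkm(n, lengths[gene], total)))
        for gene, n in counts.items() if n > 0
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical fragment counts and gene lengths.
counts = {"geneA": 100, "geneB": 300, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 1500, "geneC": 900}
for gene, value in rank_by_log2_fpkm(counts, lengths):
    print(f"{gene}\tlog2(FPKM) = {value:.2f}")
```

Dividing by gene length keeps long genes from ranking highly simply because they collect more reads, which is why the ranking is built on FPKM rather than on raw counts.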
Screening 16S rRNA of Verstraetearchaeia in public databases
The 16S rRNA gene reference sequences extracted from the Verstraetearchaeia genomes were submitted to the Integrated Microbial Next Generation Sequencing (IMNGS) platform (https://www.imngs.org/). SRA sequences were retrieved using a similarity cutoff of 90% at a minimum length of 200 bp. A total of 10774 sequences were obtained from IMNGS, of which 5187 were present at a relative abundance of at least 0.1% in the dataset. These were grouped into 692 clusters with a 95% similarity cutoff using CD-HIT 93. The Verstraetearchaeial reference sequences and the representative sequence of each CD-HIT cluster, together with near-complete 16S rRNA gene sequences from other Archaea and Bacteria, were aligned with MAFFT v7.310 87. This alignment was used to construct a phylogenetic tree using IQ-TREE v2.0.3 (-bb 1000 -alrt 1000) 85. From this tree, CD-HIT cluster representative sequences robustly grouping with Verstraetearchaeales were linked back to the unclustered dataset of 5187 sequences. This resulted in 1091 sequences clustering with Verstraetearchaeales. The metadata from the sequencing datasets in which these sequences were found were used to analyze the global distribution of Verstraetearchaeales. The world map showing the geographical coordinates of Verstraetearchaeales 16S rRNA gene sequences and genomes was created using ggmap 94.