Study design, participants, and data collection procedures
This cross-sectional analysis used baseline data from a prospective cohort of 286 women from the U54CA221205 project. The details of recruitment and enrollment for this study have been described previously [20]. Eligible participants were recruited from Jos University Teaching Hospital (JUTH) and Lagos University Teaching Hospital (LUTH) between March 2018 and September 2022. Women were eligible if they were aged 18 years or older, were not pregnant, had no history of hysterectomy, and were not receiving cervical cancer treatment at the time of recruitment. Enrolled participants completed an interviewer-administered survey, in their language of choice (English or Hausa), that assessed clinical and sociodemographic data, personal behaviors, and practices.
HIV diagnosis and care information
For this study, the HIV status of participants who received care and treatment in the President's Emergency Plan for AIDS Relief (PEPFAR) program at the two participating institutions was obtained from the adult HIV treatment and care database, as previously described [21, 22]. HIV testing followed the national serial algorithm, which involves the use of Rapid Determine Test (Abbott, California, USA), Unigold (Trinity Biotech Plc., Ireland), and STAT Pack (Chembio Diagnostic Systems, Inc., New York, USA) rapid HIV diagnostic test kits. All HIV-positive women who were receiving care in the PEPFAR program at both study sites were on antiretroviral therapy (ART) at the time of study enrollment. Women whose HIV infection was diagnosed during enrollment received HIV counseling, were linked to care, and were initiated on ART in the PEPFAR program of the participating institutions.
Specimen collection and processing
Suspected cases of cervical cancer seen at the gynecologic oncology units of JUTH and LUTH were evaluated by the oncology team of investigators at both institutions. The evaluation followed the standard of care for the diagnostic assessment of suspected cervical cancer at both institutions, as previously described [21, 22]. This included examination under anesthesia (EUA), colposcopy, clinical staging, and cervical tissue biopsy for histopathological diagnosis. The consent form for this project provided details of these evaluations and procedures, and only those who provided written informed consent to participate were enrolled. Women suspected of having cervical cancer and presenting at the gynecologic oncology unit underwent colposcopy. Tissue biopsy forceps were used to obtain three punch biopsy specimens. Two pieces of cervical tissue were immediately placed in transport medium and sent to the genomic laboratory at JUTH and LUTH, where they were stored at −80°C until DNA extraction. The third biopsy specimen was fixed in formalin and transported to the histopathology laboratory for processing and histologic examination by a trained pathologist. Histopathological diagnosis, clinical staging, and tumor grading were subsequently performed by expert pathologists at the two enrollment institutions, with quality control through telepathology review by Northwestern University’s Pathology Core [23].
Cervical tissue DNA extraction and quantification
DNA was extracted from approximately 25–30 mg of tumor and normal cervical tissue biopsies using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany), following our previously described method [20]. DNA was quantified using a Qubit 4.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) with a dsDNA BR Assay (Life Technologies, Grand Island, NY, USA). The DNA samples were stored at −80°C until shipment. All the DNA samples were shipped on dry ice to the Pathogenomic Core facility at Northwestern University and stored at −20°C. For this study, 10 µL of each sample at a concentration of 5 ng/µL was transferred into 96-well microplates (Thermo Fisher Scientific, Waltham, MA, USA). The samples in the 96-well microplates were subsequently transferred on dry ice to the Genomics and Microbiome Core Facility at Rush University for the detection and genotyping of HPV.
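The plating step above implies 50 ng of DNA per well (10 µL × 5 ng/µL). The text does not say how samples were brought to 5 ng/µL, so the following is only a minimal sketch of the standard C1·V1 = C2·V2 dilution arithmetic; the function name and the 40 ng/µL stock concentration are hypothetical.

```python
def normalization_volumes(stock_ng_per_ul, target_ng_per_ul=5.0, final_volume_ul=10.0):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. The defaults mirror the 10 uL at 5 ng/uL stated in
    the text; how the authors actually normalized samples is an assumption.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical Qubit reading of 40 ng/uL for one extract.
stock_ul, diluent_ul = normalization_volumes(40.0)
dna_per_well_ng = 5.0 * 10.0  # total DNA mass per well implied by the text
print(stock_ul, diluent_ul, dna_per_well_ng)
```

A stock below the target concentration cannot be diluted to it, so the sketch raises an error rather than returning a negative diluent volume.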
Detection of human papillomavirus using next-generation sequencing
The cervical tissue DNA was processed by next-generation sequencing (NGS) using a two-stage PCR protocol, as previously reported [24]. The DNA was amplified with pooled PGMY primers (Integrated DNA Technologies, Coralville, IA, USA) targeting the 450 bp L1 gene fragment. These primer sequences were originally published by Dube et al. (Additional File 1: Table S1) [25]. The pools consisted of five PGMY11 and 14 PGMY09 primers, as described previously [25], but were modified with Fluidigm CS1 (PGMY11) and CS2 (PGMY09) linkers [24]. The PGMY amplicons were generated using Tough Mix PCR Master Mix (Quantabio, Beverly, MA, USA) with the following thermocycling conditions: initial denaturation at 98°C for 120 s, followed by 32 or 40 cycles of 98°C for 10 s, 50°C for 1 s, and 68°C for 1 s. Samples that generated no amplification at 32 cycles were re-amplified with 40 cycles of PCR. A negative control was generated using 1 µL of DNA-free water as the template. The amplicons generated during the first stage of PCR were subsequently used as the template for the second stage of PCR amplification with Fluidigm primers containing sequencing adapters and sample-specific barcode sequences, using the same master mix described above [24]. The thermocycling conditions were the same as those described above, except that the annealing temperature was 60°C and only 8 cycles were performed. The final libraries containing the PGMY amplicons from the HPV L1 region were sequenced on an Illumina MiSeq sequencer (Illumina, Inc., San Diego, CA, USA) using V3 chemistry and 2 × 300 base reads. The mean and median sequencing depths were both approximately 16,500 clusters per sample (range 834–29,571).
To verify that the samples contained amplifiable DNA, PCR reactions were also performed with primers targeting human beta-actin (GH2O_FP and PC04_RP) [24]. The PCR conditions were the same as those for the PGMY amplicons, with the exception that only 28 cycles were performed. Amplicons were evaluated using agarose gel electrophoresis.
Bioinformatics
We counted the number of HPV sequences per genotype for each sample using a data analysis pipeline implemented within the software package CLC Genomics Workbench (v22). Briefly, raw reads were imported and trimmed at the Q20 level. Forward and reverse reads were merged using the read merging function with default settings. Subsequently, sequences lacking either the forward or the reverse primer sequence in the proper orientation were removed from the dataset. The merged, primer-trimmed, and quality-trimmed sequence data were mapped against a reference database of 34 reference HPV sequences (Additional File 2: Table S2) to identify the HPV genotypes within each sample.
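The final tallying step can be illustrated outside CLC Genomics Workbench. The sketch below is not the authors' pipeline; it assumes that the mapping result has been exported as one (sample, genotype) pair per mapped read, and the function name, sample identifiers, and read assignments are hypothetical.

```python
from collections import Counter, defaultdict


def count_reads_per_genotype(assignments):
    """Tally mapped reads per HPV genotype within each sample.

    `assignments` is an iterable of (sample_id, genotype) pairs, one pair per
    read that mapped to a reference HPV sequence (an assumed export format).
    """
    counts = defaultdict(Counter)
    for sample_id, genotype in assignments:
        counts[sample_id][genotype] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {sample: dict(per_type) for sample, per_type in counts.items()}


# Hypothetical read assignments for two samples.
example = [
    ("S01", "HPV16"),
    ("S01", "HPV16"),
    ("S01", "HPV18"),
    ("S02", "HPV45"),
]
table = count_reads_per_genotype(example)
print(table)
```

Samples with no mapped reads do not appear in the result, so a caller that needs every sample represented would have to add empty entries explicitly.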
Data management
All clinical and survey data were retrieved from REDCap (Research Electronic Data Capture) and analyzed using Stata/SE version 17 for Windows (StataCorp LLC, College Station, TX, USA). We have previously reported the details of our experience using REDCap to manage research data for this study cohort [20].
Statistical analysis
The study participants were categorized into three groups based on their HIV and ICC status: HIV-negative women with ICC, HIV-positive women with ICC, and HIV-positive women without ICC. We compared the baseline sociodemographics, personal behaviors, and practices of the participants across the three groups. Continuous variables were compared across groups using ANOVA or the Kruskal–Wallis test, as appropriate. Categorical variables were compared using Pearson's chi-square test, or Fisher’s exact test when cell sizes were small.
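The choice between the chi-square and Fisher’s exact tests depends on cell sizes. The sketch below computes Pearson's chi-square statistic for a groups-by-categories table and flags small expected counts. The threshold of 5 is a common rule of thumb and an assumption here, since the text does not define "small", and the counts are hypothetical rather than study data.

```python
def chi_square_summary(table, min_expected=5.0):
    """Return (statistic, degrees_of_freedom, has_small_expected_cell).

    `table` is a list of rows of observed counts. Every row and column total
    must be positive, otherwise an expected count of zero causes a division
    error. `min_expected` is an assumed rule-of-thumb threshold.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    has_small_cell = False
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected < min_expected:
                has_small_cell = True
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, has_small_cell


# Hypothetical 3 x 2 table: three study groups by a binary characteristic.
stat, df, small = chi_square_summary([[20, 30], [25, 25], [10, 40]])
print(round(stat, 2), df, small)
```

When the flag is true, an exact test would be the safer choice. This sketch does not compute p-values.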
Our primary outcome was HR-HPV infection, which was defined (yes vs. no) according to the recommendations of the International Agency for Research on Cancer (IARC) [26]. The primary exposure/covariate of interest was HIV status (positive vs. negative), and all other covariates were selected a priori as possible conceptual confounders on the basis of their demonstrated relationships with HR-HPV, HIV, and cervical cancer [1-3]. These covariates included age, body mass index (BMI), marital status, socioeconomic status (employment, educational attainment, and income), age at sexual initiation, smoking history, self-reported history of treatment for any sexually transmitted infections (STIs), parity, and total number of lifetime sex partners.
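As a rough illustration of how a binary HR-HPV outcome could be derived from per-genotype read counts, the sketch below uses the 12 HPV types that IARC classifies as Group 1 carcinogens. Whether the authors used exactly this set is an assumption, and the `min_reads` threshold is hypothetical because the text does not state a read cutoff.

```python
# IARC Group 1 (carcinogenic to humans) HPV types. Using exactly this set as
# the study's HR-HPV definition is an assumption of this sketch.
IARC_GROUP1_TYPES = frozenset({16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59})


def has_hr_hpv(genotype_reads, hr_types=IARC_GROUP1_TYPES, min_reads=1):
    """Return True if any high-risk type has at least `min_reads` reads.

    `genotype_reads` maps an HPV type number to its read count for one sample.
    `min_reads` is a hypothetical threshold, not a value taken from the text.
    """
    return any(
        hpv_type in hr_types and reads >= min_reads
        for hpv_type, reads in genotype_reads.items()
    )


# Hypothetical samples: one with HPV16 reads, one with low-risk types only.
print(has_hr_hpv({16: 120, 6: 4}))
print(has_hr_hpv({6: 50, 11: 3}))
```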
We categorized parity as ≤3, 4–5, 6–7, or >7 term pregnancies based on the literature supporting the importance of these cutoff points in relation to cervical cancer [27]. The total lifetime number of sex partners was categorized as 1, 2–3, or ≥4. The CD4+ T-cell count was dichotomized (<350 cells/μL vs. ≥350 cells/μL) following WHO recommendations [28]. Income was dichotomized as earning <N100,000 (<$250) per month or >N100,000 (>$250) per month using the Nigerian Central Bank exchange rate for dollars [29].
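The cutoffs above can be written as simple binning functions. This is a minimal sketch with hypothetical function names and labels. Treating the top partner category as four or more is an assumption, and income is omitted because the text does not say how a value exactly at the threshold is classified.

```python
def categorize_parity(term_pregnancies):
    """Bin parity into the four categories described in the text."""
    if term_pregnancies <= 3:
        return "<=3"
    if term_pregnancies <= 5:
        return "4-5"
    if term_pregnancies <= 7:
        return "6-7"
    return ">7"


def categorize_partners(lifetime_partners):
    """Bin lifetime sex partners; the top bin (four or more) is an assumption."""
    if lifetime_partners < 1:
        raise ValueError("no category is defined for fewer than one partner")
    if lifetime_partners == 1:
        return "1"
    if lifetime_partners <= 3:
        return "2-3"
    return ">=4"


def categorize_cd4(cells_per_ul):
    """Dichotomize the CD4+ T-cell count at 350 cells/uL."""
    return "<350" if cells_per_ul < 350 else ">=350"


print(categorize_parity(6), categorize_partners(2), categorize_cd4(410))
```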
We first performed bivariate analyses and identified variables significant at the p < 0.10 level for inclusion in our multivariable regression models. Additionally, we used a single stratified analysis to inform which variables should be included in the multivariable models. A robust (modified) Poisson regression model was used to estimate prevalence rate ratios (PRRs) and identify factors potentially associated with HR-HPV [30]. A backward selection procedure was used to select a parsimonious model, and variables significant at p < 0.05 were retained in the multivariable models. Models were evaluated using the Akaike information criterion (AIC), with a lower AIC indicating a better-fitting and more parsimonious model [31]. The crude and adjusted PRRs and their corresponding 95% confidence intervals (95% CIs) are reported.
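For a single binary exposure, the point estimate from a modified Poisson model equals the ratio of the two group prevalences, so a crude PRR can be illustrated without fitting a regression. The sketch below is not the authors' Stata analysis. It uses the standard log-scale (delta-method) confidence interval, and the counts are hypothetical rather than study data.

```python
import math


def prevalence_ratio(exposed_cases, exposed_total,
                     unexposed_cases, unexposed_total, z=1.96):
    """Return (ratio, lower, upper) for a crude prevalence ratio.

    The interval is computed on the log scale with the usual standard error
    sqrt(1/a - 1/n1 + 1/c - 1/n0). Both case counts must be positive.
    """
    ratio = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    se_log = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts: 60 of 80 exposed and 45 of 90 unexposed are positive.
pr, lo, hi = prevalence_ratio(60, 80, 45, 90)
print(round(pr, 2), round(lo, 2), round(hi, 2))
```

Adjusted PRRs require the multivariable model itself. This calculation covers only the unadjusted, two-group case.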