Study population
The Multi-modality Independent Screening Trial (MIST) of breast cancer aimed to evaluate and compare the screening performance of clinical breast examination (CBE), breast ultrasonography (BUS), and mammography (MAM) among Chinese women. Briefly, a total of 33,234 asymptomatic women aged 45 to 65 years who had lived in local communities for at least 3 years were initially recruited from 5 cities in China (Tianjin, Beijing, Nanchang, Shenyang, and Feicheng) to receive first-round screening between July 2008 and December 2010. After informed consent and a questionnaire interview, all women received CBE, BUS, and MAM concurrently. Baseline blood samples were collected from 10,852 women in Tianjin and Feicheng. All three screening modalities were performed following unified screening protocols. Physicians performed the examinations and interpreted the results independently and in a blinded manner. Women with findings classified as suspicious for malignancy or highly suggestive of breast cancer on any of the three screening modalities were recommended for pathological examination. All breast cancers were confirmed by a combination of pathological examination, clinical diagnosis, and active or passive follow-up within one year after screening. Detailed information on MIST is available in our previous studies.[13, 21]
All participants were invited to receive second-round screening from October 2013 to May 2015. Breast cancer incidence and mortality were ascertained through linkage to the local cancer registry and death registry. Cancer diagnoses and survival for women in Tianjin were further linked to the Tianjin Electronic Medical Record Information System (TEMRIS) until October 2021. TEMRIS covered 95% of hospitals in Tianjin, and diagnoses from different hospitals were coded according to the International Classification of Diseases (10th Revision). To develop a long-term risk prediction model for breast cancer, women in Tianjin from MIST (MIST-TJ, N=7826) were included in the final analyses. This study was reviewed and approved by the institutional review board of Tianjin Medical University Cancer Institute and Hospital (TMUCIH).
Sociodemographic and epidemiologic information
After informed consent, all women received a face-to-face questionnaire-based interview conducted by trained investigators to collect information on sociodemographic characteristics (age at enrollment, race, marital status, education, family income, and insurance), family history of breast cancer in first- and second-degree relatives, history of benign breast disease, diet and lifestyle factors, and female-specific factors (age at menarche, age at first birth, menopausal status, duration of breastfeeding, oral contraceptive use, and hormone replacement therapy). Body weight (kg) and height (m) were measured by trained investigators, and the body mass index (BMI) was calculated as the weight in kilograms divided by the square of the height in meters (kg/m2). Women were classified into four BMI groups: underweight (<18.5 kg/m2), normal weight (18.5–23.9 kg/m2), overweight (24.0–27.9 kg/m2), and obese (≥28.0 kg/m2). Family history of breast cancer referred to having at least one first- or second-degree relative with breast cancer. Regular cigarette smoking was defined as smoking at least one cigarette per day for six months or more.
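The BMI calculation and grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code. The function names are hypothetical, and the cut-offs 18.5, 24.0, and 28.0 kg/m2 are taken from the text and the Table 1 row labels. The label "normal" for the 18.5–23.9 kg/m2 range is an assumption.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kg divided by the square of height in m."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_group(value: float) -> str:
    """Assign a BMI group using the cut-offs 18.5, 24.0 and 28.0 kg/m^2."""
    if value < 18.5:
        return "underweight"
    if value < 24.0:
        return "normal"  # label assumed; the table lists this range as reference
    if value < 28.0:
        return "overweight"
    return "obese"


# Example: a woman weighing 60 kg with a height of 1.60 m.
example_bmi = bmi(60.0, 1.60)
example_group = bmi_group(example_bmi)
```

Half-open intervals keep a value exactly on a boundary (for example 24.0 kg/m2) in the higher group, which matches the "≥" cut-offs.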
Screening methods and assessment of mammographic density
CBE, BUS, and MAM were performed by physicians with at least 5 years of work experience. Bilateral MAM was conducted with a full-field digital mammography system. Bilateral BUS was performed with color Doppler and high-resolution transducers with a maximum frequency of at least 10 MHz. Results of CBE and BUS were classified into four groups: 1, normal; 2, abnormal benign; 3, suspicious malignancy; and 4, highly suggestive of malignancy. Results of MAM were classified into six groups according to the Breast Imaging Reporting and Data System (BI-RADS) of the American College of Radiology (ACR): 0, additional imaging needed; 1, negative; 2, benign finding; 3, probably benign finding; 4, suspicious malignancy; and 5, highly suggestive of malignancy. All assessments of MAM and BUS were double-checked at local screening sites. Disagreements between the two MAM or BUS physicians were reassessed by a third, more experienced physician. During mammography screening, both craniocaudal and mediolateral oblique views were used to determine mammographic density (MD) according to BI-RADS. Qualitative assessment of MD was classified into four groups: 1, fatty breast (<25% glandular); 2, scattered fibroglandular breast (25%–50% glandular); 3, heterogeneously dense breast (51%–75% glandular); and 4, extremely dense breast (>75% glandular). Detailed information is available in our previously published papers.[13, 21]
Single Nucleotide Polymorphism selection and genotyping
By 2016, a total of 93 single nucleotide polymorphisms (SNPs) had reached genome-wide significance for association with breast cancer in 34 published GWAS.[22] Among these GWAS-identified SNPs, 9 SNPs were initially identified in East Asians,[23-27] while another 16 SNPs were initially identified in Europeans[28-32] and were further validated in large East Asian populations.[28-32] Among the 25 initially selected SNPs, 2 SNPs with high linkage disequilibrium with other SNPs (r2>0.8) and low risk allele frequency in East Asian populations were further excluded. Finally, 23 SNPs were selected for subsequent genotyping. A total of 5 ml of EDTA-anticoagulated venous blood was collected from each participant. Leukocytes were separated from plasma and stored in cryotubes at -80°C for DNA extraction. The QIAGEN DNA Extraction Kit (QIAGEN Inc.) was used to extract genomic DNA, and the Wafergen SmartChip platform was used to genotype the 23 targeted SNPs.[34, 35] To ensure the accuracy and reliability of the genotyping results, approximately 5% of the samples were randomly selected for retesting. Because rs6472903 was not successfully genotyped in most samples, it was also excluded from the final analysis, leaving 22 SNPs.
Statistical analysis
The log-rank test based on Kaplan-Meier curves was used to compare the incidence risk of breast cancer between subgroups within each CRF, MD, and PRS. Because few CRF were associated with the risk of breast cancer at the significance level of 0.05 in log-rank tests (Table 1), CRF with a P value <0.20 in log-rank tests were included in the risk prediction models. Four risk prediction models, with CRF (ModelCRF), CRF and MD (ModelCRF+MD), CRF and PRS (ModelCRF+PRS), and all three components (ModelFULL), were developed to identify different high-risk groups of breast cancer and to compare screening performance between these high-risk groups. Relative risks were estimated as hazard ratios (HRs) with 95% confidence intervals (95% CI) using Cox regression models.
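The P < 0.20 screening rule described above can be illustrated with the log-rank P values reported for the CRF rows of Table 1. This is only a sketch of the threshold rule. The dictionary keys and the function name are hypothetical, and the source does not state whether any covariates (such as age) were retained in the models regardless of their P value.

```python
# Log-rank P values for the CRF rows, copied from Table 1.
LOG_RANK_P = {
    "age_at_enrollment": 0.644,
    "age_at_menarche": 0.607,
    "age_at_first_birth": 0.893,
    "breastfeeding": 0.139,
    "menopausal_status": 0.005,
    "family_history": 0.363,
    "benign_breast_disease": 0.058,
    "hormone_replacement_therapy": 0.251,
    "oral_contraceptives": 0.595,
    "body_mass_index": 0.815,
    "ever_smoking": 0.575,
    "negative_events": 0.140,
}


def select_factors(p_values, threshold=0.20):
    """Return the factor names whose P value is strictly below the threshold."""
    return sorted(name for name, p in p_values.items() if p < threshold)


SELECTED = select_factors(LOG_RANK_P)
```

Under this rule alone, four factors pass the threshold: breastfeeding, menopausal status, history of benign breast disease, and negative events.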
Table 1
Long-term risk of breast cancer by baseline characteristics
Characteristic* | No. (%) of women | Follow-up, 1000 woman-years | No. of cancers | IR per 1000 woman-years | P value for K-M curve | Age-adjusted HR (95% CI) |
Overall | 7794 (100.0) | 78 | 217 | 2.8 | | |
Age at enrollment | | | | | 0.644 | |
45–50, years | 2514 (32.3) | 24 | 60 | 2.5 | | |
51–60, years | 4328 (55.5) | 44 | 134 | 3.0 | | |
≥ 61, years | 952 (12.2) | 10 | 23 | 2.3 | | |
Age at menarche | | | | | 0.607 | |
≤ 12, years | 1101 (14.2) | 11 | 28 | 2.5 | | 0.91 (0.61, 1.35) |
> 12 years | 6652 (85.8) | 67 | 188 | 2.8 | | Ref. |
Age at first birth | | | | | 0.893 | |
Nulliparous | 143 (1.9) | 1 | 3 | 3.0 | | 0.80 (0.26, 2.51) |
< 30, years | 6004 (79.6) | 60 | 465 | 7.8 | | Ref. |
≥ 30 years | 1395 (18.5) | 14 | 42 | 3.0 | | 1.04 (0.74, 1.47) |
Breastfeeding | | | | | 0.139 | |
No | 1532 (20.0) | 15 | 51 | 3.4 | | Ref. |
Yes | 6143 (80.0) | 62 | 161 | 2.6 | | 0.79 (0.58, 1.08) |
Menopausal status | | | | | 0.005 | |
Premenopausal | 2332 (30.7) | 24 | 75 | 3.1 | | 1.85 (1.31, 2.61) |
Postmenopausal | 5264 (69.3) | 53 | 136 | 2.5 | | Ref. |
Family history of breast cancer | | | | | 0.363 | |
No | 7545 (96.8) | 76 | 207 | 2.7 | | Ref. |
Yes | 249 (3.2) | 3 | 10 | 3.3 | | 1.34 (0.71, 2.52) |
History of benign breast disease | | | | | 0.058 | |
No | 5562 (74.3) | 55 | 143 | 2.6 | | Ref. |
Yes | 1924 (25.7) | 20 | 67 | 3.4 | | 1.32 (0.99, 1.77) |
Hormone replacement therapy | | | | | 0.251 | |
No | 6409 (96.3) | 64 | 180 | 2.8 | | Ref. |
Yes | 246 (3.7) | 3 | 5 | 1.7 | | 0.59 (0.24, 1.45) |
Oral contraceptives | | | | | 0.595 | |
No | 6383 (88.3) | 64 | 181 | 2.8 | | Ref. |
Yes | 843 (11.7) | 8 | 24 | 3.0 | | 0.90 (0.59, 1.38) |
Body mass index | | | | | 0.815 | |
< 18.5 | 151 (1.9) | 1 | 4 | 4.0 | | 1.11 (0.41, 3.03) |
18.5–23.9 | 3641 (47.0) | 36 | 95 | 2.6 | | Ref. |
24.0-27.9 | 3054 (39.4) | 31 | 92 | 3.0 | | 1.11 (0.83, 1.48) |
≥ 28.0 | 898 (11.6) | 9 | 25 | 2.5 | | 0.91 (0.59, 1.42) |
Ever smoking | | | | | 0.575 | |
No | 6998 (94.0) | 70 | 200 | 2.9 | | Ref. |
Yes | 444 (6.0) | 4 | 11 | 2.8 | | 0.84 (0.46, 1.55) |
Negative events | | | | | 0.140 | |
No | 6690 (89.3) | 67 | 176 | 2.6 | | Ref. |
Yes | 804 (10.7) | 8 | 31 | 3.9 | | 1.33 (0.91, 1.96) |
Mammographic density | | | | | 0.002 | |
Fatty | 803 (11.7) | 8 | 15 | 1.9 | | Ref. |
Scattered | 3005 (43.8) | 30 | 82 | 2.7 | | 1.76 (1.01, 3.07) |
Heterogeneous | 2943 (42.9) | 30 | 98 | 3.3 | | 2.69 (1.53, 4.74) |
Dense | 114 (1.7) | 1 | 4 | 4.0 | | 2.87 (0.93, 8.83) |
22-locus PRS quartiles | | | | | < 0.001 | |
1st quartile | 897 (20.2) | 9 | 12 | 1.3 | | Ref. |
2nd quartile | 1135 (25.6) | 11 | 29 | 2.6 | | 1.71 (0.87, 3.36) |
3rd quartile | 1377 (31.0) | 14 | 43 | 3.1 | | 2.23 (1.18, 4.24) |
4th quartile | 1026 (23.1) | 10 | 65 | 6.5 | | 4.75 (2.57, 8.81) |
Note: *, the unknown group for each index variable is not shown; IR, incidence rate; HR (95% CI), hazard ratio (95% confidence interval); K-M, Kaplan-Meier; PRS, polygenic risk score.
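The incidence rates in Table 1 are consistent with the number of cancers divided by the follow-up time, expressed per 1000 woman-years. A minimal sketch follows, with a hypothetical function name. Because the table reports follow-up rounded to the nearest 1000 woman-years, recomputed rates can differ slightly from the published values.

```python
def incidence_rate_per_1000(cases: int, woman_years: float) -> float:
    """Incidence rate per 1000 woman-years of follow-up."""
    if woman_years <= 0:
        raise ValueError("follow-up time must be positive")
    return 1000.0 * cases / woman_years


# Overall row of Table 1: 217 cancers over about 78,000 woman-years.
overall_ir = round(incidence_rate_per_1000(217, 78_000), 1)

# Fourth PRS quartile: 65 cancers over about 10,000 woman-years.
top_quartile_ir = round(incidence_rate_per_1000(65, 10_000), 1)
```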
Polygenic risk score (PRS) was calculated to measure the cumulative effect of multiple genetic risk variants with the following formula:
$$\mathrm{PRS} = \sum_{k=1}^{n} \beta_k x_k$$
where βk is the per-allele log OR for breast cancer associated with SNPk from univariate Cox regression, xk is the allele dosage for SNPk (0, 1, or 2), and n is the total number of SNPs included in the PRS.
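The weighted sum described above, in which each SNP's allele dosage is multiplied by its per-allele weight and the products are added over all SNPs, can be sketched as follows. The SNP identifiers and weights in the example are invented placeholders, not values from this study, and the function name is hypothetical.

```python
from typing import Mapping


def polygenic_risk_score(betas: Mapping[str, float],
                         dosages: Mapping[str, int]) -> float:
    """Sum of per-allele weights times allele dosages over all SNPs."""
    score = 0.0
    for snp, beta in betas.items():
        dosage = dosages[snp]  # raises KeyError if a genotype is missing
        if dosage not in (0, 1, 2):
            raise ValueError(f"dosage for {snp} must be 0, 1 or 2")
        score += beta * dosage
    return score


# Placeholder weights and genotypes for three hypothetical SNPs.
example_betas = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_prs = polygenic_risk_score(example_betas, example_dosages)
```

Raising an error on a missing genotype, rather than silently treating it as zero, makes incomplete samples visible. How the study handled missing genotypes is not stated in this section.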
Discrimination of the risk prediction models was measured by the area under the receiver operating characteristic curve (AUC). Calibration of the 10-year risk prediction models was assessed by comparing the observed and expected numbers of cases overall and within risk categories.[36, 37] CIs for observed-to-expected ratios (O/E) were calculated by assuming a Poisson distribution for the observed numbers of cases with the following formula:
O/E=1 would indicate perfect calibration. The 10-year risk prediction models for breast cancer were further visualized using nomograms. Because the extent of missing data differed across CRF, MD, and PRS, sensitivity analyses in the subgroup with complete data were conducted to further compare the discrimination of the different risk models.
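As an illustration of a Poisson-based interval for the O/E ratio, the sketch below uses a common log-scale approximation in which the variance of log(O) is taken as 1/O, giving limits of (O/E) × exp(±1.96 × sqrt(1/O)). This specific form is an assumption and may differ from the formula the authors applied. The function name is hypothetical.

```python
import math


def oe_ratio_ci(observed: int, expected: float, z: float = 1.96):
    """O/E ratio with an approximate CI, treating the observed count as Poisson.

    Assumed form: (O/E) * exp(+/- z * sqrt(1/O)).
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected counts must be positive")
    ratio = observed / expected
    half_width = z * math.sqrt(1.0 / observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


# Example with made-up counts: 100 observed cases and 100 expected cases.
ratio, lower, upper = oe_ratio_ci(100, 100.0)
```

An interval that contains 1 is compatible with good calibration. The interval narrows as the observed number of cases grows.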
To allow comparison with the 10-year breast cancer risk reported in previous studies,[38] the 10-year breast cancer risk in MIST-TJ was divided into the following five categories: below average risk (<0.4%), average risk (0.4% to <0.6%), above average risk (0.6% to <1.0%), moderately increased risk (1.0% to <2.0%), and high risk (≥2.0%). Moreover, the women in MIST-TJ were further dichotomized into high-risk and low-risk groups according to the optimal cut-off values on the receiver operating characteristic curves of the different risk prediction models.
The relative risk of breast cancer mortality, measured by HR, for high-risk groups compared with low-risk groups was calculated to determine the potential benefit of risk-reducing interventions. The detection rate (DR), accuracy, and cancer stage for CBE, BUS, and MAM were compared to determine the optimal screening method for high-risk groups.
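The five risk categories and a high-risk/low-risk split can be sketched as below. The category boundaries come from the text. The criterion for the "optimal" cut-off is not stated in this section, so the sketch assumes the Youden index (sensitivity + specificity - 1), which is one common choice. Function names and the example data are hypothetical.

```python
def risk_category(ten_year_risk_percent: float) -> str:
    """Map a 10-year risk (in percent) to one of the five categories."""
    if ten_year_risk_percent < 0.4:
        return "below average"
    if ten_year_risk_percent < 0.6:
        return "average"
    if ten_year_risk_percent < 1.0:
        return "above average"
    if ten_year_risk_percent < 2.0:
        return "moderately increased"
    return "high"


def youden_cutoff(risks, outcomes):
    """Return (cut-off, J) maximizing the Youden index; high risk is risk >= cut-off.

    Assumption: the optimal ROC cut-off is the one with the largest J.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one case and one non-case")
    best_cut, best_j = None, -1.0
    for cut in sorted(set(risks)):
        tp = sum(1 for r, y in zip(risks, outcomes) if r >= cut and y == 1)
        fp = sum(1 for r, y in zip(risks, outcomes) if r >= cut and y == 0)
        j = tp / positives - fp / negatives  # sensitivity - (1 - specificity)
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Made-up predicted risks (percent) and outcomes (1 = breast cancer).
example_risks = [0.2, 0.5, 0.8, 1.5, 2.5, 3.0]
example_outcomes = [0, 0, 0, 1, 0, 1]
cutoff, j_stat = youden_cutoff(example_risks, example_outcomes)
```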
The analyses were conducted with R software (version 4.0.3) and SPSS software (version 24). All statistical tests were two-sided, and a P value equal to or less than 0.05 was considered statistically significant.