Study population
This study used cross-sectional NHANES data from the 1999–2000 through 2017–2018 cycles. The NHANES data are openly available online[3]. In brief, NHANES is an ongoing complex sample survey that includes in-home interviews, physical examinations, and laboratory tests. It collects information on the demographics, health, and nutritional status of the US civilian noninstitutionalized population. Since 1999, data have been collected continuously and released in 2-year cycles. The NHANES study protocols were approved by the National Center for Health Statistics institutional review board, and written informed consent was obtained from each participant[4]. We limited the analysis to participants aged 20 years or older and excluded those without a blood cadmium measurement or without information on cancer diagnosis.
Exposure and outcome measurements
The exposure of interest was the whole-blood cadmium level. During physical examinations, blood samples were collected from participants, processed, and sent to a central laboratory for testing. A detailed description of the laboratory methodology for whole-blood cadmium measurement can be found on the NHANES website. In brief, whole-blood cadmium concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS), a multi-element analytical technique based on quadrupole ICP-MS technology. Values below the detection limit were replaced with the detection limit divided by the square root of two.
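The substitution rule for values below the detection limit can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the detection limit, and the readings are hypothetical, and the rule shown is the common convention of replacing such values with the detection limit divided by the square root of two.

```python
import math


def impute_below_lod(value, lod):
    """Return the measured value, or lod / sqrt(2) when it is below the limit."""
    if value < lod:
        return lod / math.sqrt(2)
    return value


# Hypothetical detection limit and readings (ug/L), for illustration only.
lod = 0.10
readings = [0.05, 0.10, 0.42]
imputed = [impute_below_lod(v, lod) for v in readings]
```

Only the first reading falls below the assumed limit, so only that value is replaced; the other two pass through unchanged.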
The outcome was based on self-reported personal interview data collected with the medical condition questionnaire. Participants who answered “yes” to the question “Have you ever been told by a doctor or other health professional that you had a cancer or a malignancy of any kind” were then asked “What kind of cancer”. We created a binary variable distinguishing participants with NMSC from those without any cancer.
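One way to picture the outcome coding is the sketch below. It rests on assumptions the text does not settle: the label "nmsc" is a hypothetical placeholder rather than an NHANES code, any report of NMSC is counted as a case even alongside another cancer, and participants reporting only other cancers are left out of the comparison.

```python
def nmsc_outcome(ever_cancer, cancer_types):
    """Code the outcome: 1 = NMSC, 0 = no cancer, None = other cancer only.

    ever_cancer  -- answer to the "ever told you had cancer" question
    cancer_types -- reported cancer types (hypothetical string labels)
    """
    if not ever_cancer:
        return 0
    if "nmsc" in cancer_types:
        return 1
    # Assumption: participants with other cancers only are excluded.
    return None


# Hypothetical participants, for illustration only.
participants = [
    (False, []),
    (True, ["nmsc"]),
    (True, ["breast"]),
]
outcomes = [nmsc_outcome(ever, kinds) for ever, kinds in participants]
analytic_outcomes = [o for o in outcomes if o is not None]
```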
Covariates
Important covariates were identified from the literature on NMSC and included age, gender, race/ethnicity, education, income, body mass index (BMI), hypertension, diabetes, dyslipidemia, current smoking, alcohol use, hours of physical activity per week, and use of sunscreen or protective clothing. There were four categories of race/ethnicity: non-Hispanic white, non-Hispanic black, Mexican American, and others. Education was categorized as less than high school, high school graduate, some college, and college graduate or higher. Income was determined by the income-to-poverty ratio, defined as annual family income divided by the poverty threshold after adjusting for inflation and family size. BMI was calculated as weight (kilograms) divided by the square of height (meters). Hypertension was defined as systolic BP of 130 mm Hg or higher, diastolic BP of 80 mm Hg or higher, or the use of antihypertensive medications. Diabetes was defined as hemoglobin A1c of 6.5% or higher, or the use of antidiabetic medications. Dyslipidemia was defined as total cholesterol of 240 mg/dL or higher, or the use of lipid-lowering medications. Current smoking was evaluated based on questions about whether participants were currently smoking. Alcohol use was classified as 1 drink or fewer per week, more than 1 and up to 5 drinks per week, and more than 5 drinks per week. A drink was defined as a 12 oz. beer, a 5 oz. glass of wine, or one and a half ounces of liquor. Physical activity was calculated as hours of moderate-intensity activity plus twice the hours of vigorous-intensity activity[6]. Use of sunscreen or protective clothing was based on questions about whether participants used sunscreen or wore a long-sleeved shirt.
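The derived covariates can be expressed as small functions. The sketch below is illustrative only: all function names are hypothetical, BMI uses the standard definition (weight in kilograms divided by the square of height in meters), the boundary of exactly 5 drinks per week is assumed to fall in the middle alcohol category, and the activity score assumes moderate and vigorous activity are both recorded in hours.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def has_hypertension(sbp, dbp, on_medication):
    """SBP >= 130 mm Hg, DBP >= 80 mm Hg, or antihypertensive medication use."""
    return sbp >= 130 or dbp >= 80 or on_medication


def alcohol_category(drinks_per_week):
    """Three-level alcohol use; a value of exactly 5 is assumed to be '1-5'."""
    if drinks_per_week <= 1:
        return "<=1"
    if drinks_per_week <= 5:
        return "1-5"
    return ">5"


def activity_hours(moderate_hours, vigorous_hours):
    """Weekly activity score: moderate hours plus twice the vigorous hours."""
    return moderate_hours + 2 * vigorous_hours


# Hypothetical participant, for illustration only.
example = {
    "bmi": bmi(70.0, 1.75),
    "hypertension": has_hypertension(128, 82, False),
    "alcohol": alcohol_category(3),
    "activity": activity_hours(2.0, 1.5),
}
```

The diabetes and dyslipidemia definitions follow the same "threshold or medication" pattern as `has_hypertension`.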
Statistical analysis
All statistical analyses were conducted in R version 4.1.3 using the “survey” package. The complex sampling design was accounted for in all analyses. Our analysis followed NHANES recommendations for combining survey cycles and constructing sample weights for subsamples to ensure accurate association and variance estimation. All statistical tests were 2-sided, and P < 0.05 was considered statistically significant.
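As an illustration of how weights are rescaled when cycles are pooled, the sketch below follows the general NHANES guidance for combining 1999–2002 with later cycles: the 4-year weight is used for the first two cycles and multiplied by 2 over the number of cycles, while each later 2-year weight is divided by the number of cycles. The function and argument names are hypothetical, and the text does not state which specific weight variables the authors used, so this should be read as an assumption-laden sketch rather than their procedure.

```python
def combined_weight(cycle, weight_2yr, weight_4yr=None, n_cycles=10):
    """Rescale a participant's weight for an analysis pooling n_cycles cycles.

    cycle      -- survey cycle label such as "1999-2000"
    weight_2yr -- the 2-year sample weight
    weight_4yr -- the 4-year weight, needed only for 1999-2002
    """
    if cycle in ("1999-2000", "2001-2002"):
        if weight_4yr is None:
            raise ValueError("a 4-year weight is required for 1999-2002")
        return weight_4yr * 2 / n_cycles
    return weight_2yr / n_cycles


# Hypothetical weights, for illustration only.
early = combined_weight("1999-2000", weight_2yr=1000.0, weight_4yr=2000.0)
later = combined_weight("2005-2006", weight_2yr=1000.0)
```

In R, the rescaled weight would then be passed to the survey design object together with the strata and primary sampling unit variables.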
Continuous variables were presented as means and categorical variables as percentages. Cadmium levels were right-skewed and were natural log-transformed for the analysis. Differences in continuous variables between participants with NMSC and those without cancer were examined with a t-test, and differences in categorical variables were examined with a Chi-squared test. Binomial logistic regression was used to investigate the association between cadmium level and the odds of NMSC. First, the natural log-transformed cadmium level was modeled as a continuous variable. Second, it was divided into quartiles and modeled as an ordinal variable with the first quartile as the reference. Third, to identify a possible non-linear relationship between cadmium level and the odds of NMSC, restricted cubic splines (RCS) were used in the regression analysis. Three knots were used in the RCS to fit the model well while avoiding overfitting. The P-value for the non-linear trend was determined by the Wald test for the RCS coefficient. In addition, to test whether the association between cadmium and NMSC differed across subgroups defined by age, gender, race, education, and income, a two-way interaction term between the quartile of cadmium level and subgroup status was added to the regression model. Further stratified analysis was conducted if a difference was detected in a specific subgroup.
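The three codings of the exposure can be illustrated with a short, unweighted Python sketch. It is not the authors' R code. The cadmium values and knot locations are hypothetical, the quartile cut points ignore survey weights, and the spline basis uses Harrell's parameterization of restricted cubic splines, which is an assumption about the form used. With three knots the basis has one linear and one non-linear term, and it is linear beyond the outer knots.

```python
import math
import statistics


def quartile_index(value, cuts):
    """Return 1..4 given the three quartile cut points."""
    return 1 + sum(value > c for c in cuts)


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns [x, nonlinear_1, ..., nonlinear_{k-2}] for k knots.
    """
    k = len(knots)
    t_last, t_prev = knots[-1], knots[-2]
    scale = (knots[-1] - knots[0]) ** 2  # keeps terms on the scale of x

    def cube_plus(u):
        return max(u, 0.0) ** 3

    terms = [x]
    for t_j in knots[: k - 2]:
        term = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        terms.append(term / scale)
    return terms


# Hypothetical cadmium levels (ug/L), for illustration only.
levels = [0.12, 0.20, 0.31, 0.45, 0.60, 0.85, 1.30, 2.40]
log_levels = [math.log(v) for v in levels]

# Coding 2: quartiles of the log-transformed level (unweighted cut points).
cuts = statistics.quantiles(log_levels, n=4)
quartiles = [quartile_index(v, cuts) for v in log_levels]

# Coding 3: spline terms with three hypothetical knots on the log scale.
knots = [math.log(0.20), math.log(0.45), math.log(1.30)]
spline_terms = [rcs_basis(v, knots) for v in log_levels]
```

Each row of `spline_terms` would enter the logistic model as two columns; a test of the second column's coefficient corresponds to the test for non-linearity described above.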
We developed 3 models for the regression analysis. Model 1 was unadjusted. Model 2 was adjusted for age, gender, race/ethnicity, education, and income. Model 3 was further adjusted for BMI, hypertension, diabetes, dyslipidemia, current smoking, alcohol use, hours of physical activity, and use of sunscreen or protective clothing, in addition to the covariates in model 2.
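The nesting of the three models can be made explicit by building their formulas from two covariate blocks. The sketch below only assembles R-style formula strings; the variable names are hypothetical placeholders, not NHANES variable names or the authors' code.

```python
# Hypothetical variable names, for illustration only.
DEMOGRAPHIC = ["age", "gender", "race_ethnicity", "education", "income"]
CLINICAL_LIFESTYLE = [
    "bmi", "hypertension", "diabetes", "dyslipidemia",
    "current_smoking", "alcohol_use", "physical_activity", "sun_protection",
]


def formula(outcome, exposure, covariates):
    """Build an R-style model formula string."""
    return f"{outcome} ~ " + " + ".join([exposure] + covariates)


models = {
    "model1": formula("nmsc", "log_cadmium", []),
    "model2": formula("nmsc", "log_cadmium", DEMOGRAPHIC),
    "model3": formula("nmsc", "log_cadmium", DEMOGRAPHIC + CLINICAL_LIFESTYLE),
}
```

Building model 3 from the model 2 block guarantees that every covariate in model 2 also appears in model 3.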