Search strategy and inclusion criteria
Two investigators (W.R., L.C.) independently searched PubMed and Web of Science from their inceptions (1966 and 1947, respectively) to June 1, 2020, for cohort studies on family SEP in childhood and lung cancer in adulthood. The search itself was run without language restrictions, but only papers published in English were included. Keywords were “socioeconomic position (SEP)”, “cohort”, “lung cancer”, and “risk”, together with their Medical Subject Headings (MeSH) terms. Details of the search terms and inclusion/exclusion criteria are shown in Figure 1.
The inclusion criteria were: (i) cohort study design; (ii) all cases were lung cancer patients, and data were classified according to SEP, regardless of lung cancer subtype; (iii) sufficient data on the association between family SEP in childhood and lung cancer in adulthood. The exclusion criteria were: (i) small sample size; (ii) non-English publication; (iii) incomplete or insufficient data on the association between family SEP in childhood and lung cancer in adulthood.
All included studies were divided into two groups: (I) studies that reported adjusted estimates of the hazard ratio (HR) (including relative risks and odds ratios) with 95% confidence intervals (95% CI); (II) studies that did not report these estimates. We also included studies that graded data based on parents' financial status and family life in childhood, and overall impact. Where exposure was graded by quintiles or quartiles, the first level was used as the reference.
Data acquisition and quality assessment
Data were extracted independently by two investigators (W.R., H.Z.), and any disagreements were resolved by consensus after discussion. Basic data were recorded from all eligible studies, including the first author’s name, publication year, country, study period, period of follow-up, number of lung cancer patients, and HRs with their 95% CIs. The results were reviewed by two senior investigators (L.W., H.J.).
Quality assessment
The quality of the studies was evaluated using a scoring system designed with reference to the Newcastle-Ottawa Scale (NOS) tool.[32-34] The system uses a 0-9 point scale, with 9 reflecting the highest quality and 0 the lowest. One point was allocated for each of the following: (I) representativeness of the exposure arm(s), (II) selection of the comparative arm(s), (III) origin of the exposure source, (IV) demonstration that the outcome of interest was not present at the start of the study, (V) control for the most important factors, (VI) control for other main factors, (VII) independent assessment of the outcome, (VIII) adequacy of follow-up length (to assess the outcome), (IX) acceptable loss to follow-up (less than 10% and reported).
Two researchers (W.R. and G.F.) independently evaluated the methodological quality of each included published cohort study. The results of the quality assessment were used for descriptive purposes, to provide an overall assessment of the quality of the included studies.[32, 35-38]
Statistical analyses in Meta-analysis
A meta-analysis was performed following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines.[39, 40] We collected study-specific HRs with 95% CIs for lung cancer and combined them. Heterogeneity across studies was examined with Cochran’s Q test and the I² statistic, and an I² ≥ 50% was taken to indicate statistical heterogeneity. A random-effects model was used to synthesize hazard ratios (HRs) for lung cancer-specific mortality if high heterogeneity existed (p<0.5, I²>50%); otherwise, a fixed-effect model was used.
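The pooling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Stata procedure used in the study: the function names are hypothetical, the standard errors are back-calculated from the reported 95% CIs, the model switch uses the I² threshold alone (the Q-test p-value is not computed), and the DerSimonian-Laird estimator is assumed for the random-effects model because the text does not name one.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% CI


def log_hr_and_se(hr, lower, upper):
    """Convert an HR and its 95% CI into a log HR and its standard error."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z_95)


def pool_hazard_ratios(studies, i2_threshold=50.0):
    """Pool (hr, lower, upper) tuples by inverse-variance weighting.

    Assumption: a random-effects (DerSimonian-Laird) model is used when
    I2 >= i2_threshold, otherwise a fixed-effect model.
    """
    effects = [log_hr_and_se(*s) for s in studies]
    y = [e[0] for e in effects]
    w = [1.0 / e[1] ** 2 for e in effects]
    sum_w = sum(w)

    # Fixed-effect estimate, Cochran's Q and I2
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    model = "fixed"
    if i2 >= i2_threshold:
        # DerSimonian-Laird between-study variance (assumed estimator)
        c = sum_w - sum(wi ** 2 for wi in w) / sum_w
        tau2 = max(0.0, (q - df) / c)
        w = [1.0 / (1.0 / wi + tau2) for wi in w]
        model = "random"

    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return {
        "model": model,
        "q": q,
        "i2": i2,
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - Z_95 * se), math.exp(pooled + Z_95 * se)),
    }
```

Calling `pool_hazard_ratios([(1.5, 1.2, 1.875), (1.3, 1.0, 1.69)])` with made-up inputs returns the chosen model, Q, I² and the pooled HR with its 95% CI.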
We used funnel plots, Begg’s test and Egger’s test to evaluate publication bias.[41-43] In addition, subgroup analyses were performed by gender and by type of data. Sensitivity analysis was conducted by excluding each study in turn to assess the stability of the results and potential sources of heterogeneity. All statistical analyses were performed with Stata software (version 12, StataCorp, TX, USA). All P values were 2-tailed; statistical significance was set at P <0.05.[44]
Dose-response analysis was performed with Benchmark Dose Software 3.1.2 (BMDS 3.1.2).[45] All steps followed the BMDS guidelines.[46-49] Multiple models were selected for analysis, with an extra-risk assumption for background and a benchmark response of 5%.[33, 34] According to the included studies, the dose response was divided into two categories: family economic status in childhood and childhood housing conditions. In addition, according to whether the original data were adjusted, studies were divided into adjusted and unadjusted groups for subgroup analysis and publication bias analysis. In all dose-response analyses, the Exponential and Hill models provided the main results,[50] and the other models were used for verification. Duplicate studies were excluded.
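The leave-one-out sensitivity analysis mentioned above can be illustrated as follows. This is a minimal sketch, not the Stata routine used in the study: the function names are hypothetical, and a fixed-effect inverse-variance pool is assumed for simplicity.

```python
import math


def pooled_log_hr(log_hrs, ses):
    """Fixed-effect inverse-variance pooled log HR (assumed for simplicity)."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * y for w, y in zip(weights, log_hrs)) / sum(weights)


def leave_one_out(labels, log_hrs, ses):
    """Return the pooled HR obtained when each study is omitted in turn."""
    results = {}
    for i, label in enumerate(labels):
        kept_y = log_hrs[:i] + log_hrs[i + 1:]
        kept_se = ses[:i] + ses[i + 1:]
        results[label] = math.exp(pooled_log_hr(kept_y, kept_se))
    return results
```

If the pooled HR changes markedly when a particular study is omitted, that study is a candidate source of heterogeneity or instability.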
MR Analysis Using Summary Statistics
The MR method was based on the following three assumptions: (i) the instrumental variables are strongly associated with family SEP in childhood; (ii) the instrumental variables affect cancer only through their effect on family SEP in childhood and not through any alternative causal pathway; and (iii) the instrumental variables are independent of any confounders.[51, 52] To assess potential violations of these assumptions, we evaluated directional pleiotropy based on the intercept obtained from the MR-Egger analysis.[53] We also performed a leave-one-out analysis, in which we sequentially omitted one SNP at a time, to evaluate whether the MR estimate was driven or biased by a single SNP.
The analysis was conducted to estimate the effect of family SEP in childhood (X) on the risk of lung cancer (Y) using genetic variants (g), and the causal estimate is equal to Yg/Xg.[54] For the association between genetic variants and family SEP in childhood (Xg), summary data were taken from published Genome-Wide Association Studies (GWASs),[55-70] including the Social Science Genetic Association Consortium (SSGAC) (1,060,068 individuals), the MRC Integrative Epidemiology Unit (MRC-IEU) (249,790 individuals), UK Biobank (75,244 individuals) and Neale Lab (455,571 individuals) (Supplement Table 1).[71-73] Summary statistics for the association between genetic variants and lung cancer (Yg) were taken from the International Lung Cancer Consortium (ILCCO) (27,209 individuals) (Supplement Table 2).[62]
We selected uncorrelated variants to construct the instrumental variables in the two-sample MR analysis. Using the summary statistics for both Yg and Xg, an Inverse Variance-Weighted (IVW) meta-analysis was employed to estimate the effect of genetically determined family SEP in childhood on the risk of lung cancer using the method of Burgess et al.[74]:
$$\beta_{IVW} = \frac{\sum_g X_g Y_g \sigma_{Y_g}^{-2}}{\sum_g X_g^2 \sigma_{Y_g}^{-2}}, \qquad se(\beta_{IVW}) = \sqrt{\frac{1}{\sum_g X_g^2 \sigma_{Y_g}^{-2}}}$$
where Xg is the beta estimate for the association between the SNP and family SEP in childhood, Yg is the beta estimate for the association between the SNP and lung cancer, and σYg is the standard error of Yg. The corresponding HR and 95% CI were calculated using βIVW and se(βIVW).
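As an illustration of the IVW estimator with the quantities defined above, the following sketch computes βIVW as the sum of Xg·Yg/σYg² divided by the sum of Xg²/σYg², with se(βIVW) as the square root of the reciprocal of that denominator. The function names are hypothetical, and the code is only a small numerical sketch of the standard summary-data formula, not the software used in the study.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% CI


def ivw_estimate(x_g, y_g, se_y):
    """Inverse variance-weighted causal estimate from per-SNP summary data.

    x_g:  SNP-exposure beta estimates (Xg)
    y_g:  SNP-outcome beta estimates (Yg)
    se_y: standard errors of the SNP-outcome estimates (sigma_Yg)
    """
    numerator = sum(x * y / s ** 2 for x, y, s in zip(x_g, y_g, se_y))
    denominator = sum(x ** 2 / s ** 2 for x, s in zip(x_g, se_y))
    beta = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return beta, se


def to_ratio_with_ci(beta, se):
    """Exponentiate a log-scale estimate into a ratio with its 95% CI."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )
```

For example, `to_ratio_with_ci(*ivw_estimate(x_g, y_g, se_y))` converts the pooled log-scale estimate into a ratio and its 95% CI.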
In addition, MR-Egger and Weighted Median analyses were conducted to assess causality. The Wald ratio was used only when the first three models could not be applied because of a lack of SNPs.[75] Leave-one-out analysis was conducted to estimate whether the result was driven by a single SNP. MR-Egger regression was also performed to assess pleiotropy.
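The Wald ratio and the MR-Egger intercept can be sketched as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, the Wald ratio standard error uses the first-order approximation σYg/|Xg|, and MR-Egger is written as a weighted linear regression of Yg on Xg (weights 1/σYg²) after orienting each SNP so that Xg is positive. Standard errors and p-values for the MR-Egger terms are omitted.

```python
def wald_ratio(x_g, y_g, se_y):
    """Single-SNP causal estimate Yg/Xg with a first-order standard error."""
    return y_g / x_g, se_y / abs(x_g)


def mr_egger(x_g, y_g, se_y):
    """Weighted regression of Yg on Xg with an intercept.

    Returns (intercept, slope). The intercept estimates the average
    directional pleiotropy; the slope is the MR-Egger causal estimate.
    """
    if len(x_g) < 3:
        raise ValueError("MR-Egger needs at least three SNPs")
    # Orient every SNP so that its exposure association is positive
    xs = [abs(x) for x in x_g]
    ys = [y if x >= 0 else -y for x, y in zip(x_g, y_g)]
    w = [1.0 / s ** 2 for s in se_y]

    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, xs)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, ys)) / sum_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, xs))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, xs, ys))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

An intercept close to zero is consistent with no directional pleiotropy, whereas a clearly non-zero intercept suggests that the instruments affect the outcome through pathways other than the exposure.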