Overview of Mendelian randomisation approaches
We applied four approaches that use MR to assess the effect of exposure to a relative’s smoking, and therefore ETS exposure (Table 1.A), and then meta-analysed their estimates (Table 1.B). The MR analyses were conducted using a multivariable extension of the ‘proxy gene-by-environment’ MR design (21).
Traditional ‘proxy gene-by-environment’ MR designs use the index individual’s genotype as a proxy of a parent’s genotype, and thereby estimate the parental genetic liability to smoke in settings where parental phenotypic but not genetic data are available. This is possible because children inherit half of each parent’s variants, and these variants also affect the respective parental phenotype. Because this approach uses the offspring’s genetic variants, an advantage of its two-sample MR implementation is that outcome data can be extracted from conventional genome-wide association studies (GWASs), which are much larger than current within-family studies. However, because it uses inherited variants, the design can be biased by the direct effect that a variant has on the index individual’s smoking, and therefore on the outcome (Fig. 1) (22). We address this with multivariable MR (MVMR) by additionally including the index individual’s genetic liability to smoke in the model. MVMR estimates are interpreted as the direct effect of the exposure conditional on the other phenotype. MVMR has been used to adjust MR estimates for postulated biases, by ensuring that the effect of interest is conditionally independent of a phenotype of concern (24–26). We call this application of MVMR ‘multivariable gene-by-environment MR’. We consider six outcomes (lung cancer, COPD, stroke, coronary heart disease, hypertension, and depression).
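As a minimal sketch of the MVMR idea described above, the following Python code fits an inverse-variance weighted multivariable regression of variant-outcome associations on the variant associations with two exposures (a parent’s and the index individual’s liability to smoke). It is illustrative only and is not the implementation used in this study: the function name `mvmr_ivw`, the simulated effect sizes, and the fixed-effect standard errors are all assumptions made for the example.

```python
import numpy as np

def mvmr_ivw(beta_exposures, beta_outcome, se_outcome):
    """Multivariable IVW estimate (illustrative).

    Regresses variant-outcome effects on variant-exposure effects with no
    intercept, weighting each variant by 1 / se_outcome**2.
    """
    X = np.asarray(beta_exposures, dtype=float)   # shape (n_variants, n_exposures)
    y = np.asarray(beta_outcome, dtype=float)     # shape (n_variants,)
    w = 1.0 / np.asarray(se_outcome, dtype=float) ** 2
    xtwx = X.T @ (X * w[:, None])
    xtwy = X.T @ (w * y)
    estimates = np.linalg.solve(xtwx, xtwy)
    # Fixed-effect standard errors (residual variance fixed at 1): an assumption.
    std_errors = np.sqrt(np.diag(np.linalg.inv(xtwx)))
    return estimates, std_errors

# Simulated summary statistics (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n_variants = 120
b_index = rng.normal(0.0, 0.05, n_variants)
# Inherited variants make the parental and index associations correlated.
b_parent = 0.5 * b_index + rng.normal(0.0, 0.03, n_variants)
se_out = np.full(n_variants, 0.01)
true_effects = np.array([0.1, 0.4])               # (parent, index individual)
b_out = (true_effects[0] * b_parent + true_effects[1] * b_index
         + rng.normal(0.0, se_out))

est, se = mvmr_ivw(np.column_stack([b_parent, b_index]), b_out, se_out)
```

The first element of `est` is the parental effect conditional on the index individual’s liability, which is the quantity of interest in the first two approaches.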
In the first approach we perform multivariable gene-by-environment MR by including both maternal genetic liability to smoke and the index individual’s liability to smoke as exposures in the model. This approach explores the effect of maternal genetic liability to smoke on the index individual’s outcomes independent of the offspring’s liability to smoke (Table 1.A1).
In the second approach we include the paternal and offspring’s respective liabilities to smoke as exposure phenotypes. This approach assesses the effects of paternal genetic liability to smoke on the index individual’s outcomes conditional on the index individual’s liability to smoke (Table 1.A2).
The third and fourth approaches include both maternal genetic liability to smoke and paternal genetic liability to smoke as the exposures in a multivariable gene-by-environment MR model. The third approach investigates paternal outcomes, and assesses the direct effects of maternal genetic liability to smoke on the paternal outcome, independent of paternal liability to smoke (Table 1.A3). The fourth approach investigates maternal outcomes, and assesses the direct effects of paternal genetic liability to smoke on maternal outcomes conditional on maternal genetic liability to smoke (Table 1.A4).
The last two approaches leveraged GWASs (described in the next section and the Supplementary Methods) for maternal smoking and paternal outcomes to explore the effects that one parent has on the other parent. Here the likely source of bias is assortative mating, where people tend to partner with others who have a similar smoking status. A naive association between one parent’s smoking and the other’s outcomes could represent the combined effects of first-hand smoking and assortative mating (13). Using similar logic to the first two approaches, assortative mating can be controlled for by adjusting for the other parent’s genetic liability to smoke in a multivariable gene-by-environment MR model.
Data sources
Data on maternal smoking and index individual smoking were taken from publicly available GWAS summary statistics derived from a sub-sample of genetically unrelated European ancestry participants in the UKB (n ~ 400,000), which have been described elsewhere (27, 28). Variants associated with paternal smoking were also selected using a multivariate European ancestry UKB GWAS (29). Since this GWAS made strong parametric assumptions (30), the variant-paternal smoking associations used in the analysis were estimated in ALSPAC (n = 5,766, see Supplementary Methods and (31, 32) for details).
The index individual outcome GWASs combined the UKB with FinnGen r10 and independent, but demographically similar, samples of the same trait taken from available genetic consortia and meta-analyses. These include the International Lung Cancer Consortium for lung cancer, the Psychiatric Genomics Consortium for depression, the European sub-sample of the Global Biobank Meta-Analysis Initiative for COPD, CARDIoGRAMplusC4D for coronary heart disease, and the European ancestry sub-sample of GIGASTROKE for stroke. This resulted in 20,359 cases and 810,746 controls for lung cancer; 58,559 cases and 937,358 controls for COPD; 170,756 cases and 329,443 controls for depression; 1,234,808 cases and 1,308,460 controls for stroke; 239,785 cases and 1,355,114 controls for coronary heart disease; and 242,724 cases and 649,418 controls for hypertension. Parental outcome data were also taken from European ancestry UKB GWASs of index individual reported maternal and paternal outcomes (see Supplementary Figure S1). To be consistent with the CARDIoGRAMplusC4D study, we use the term “coronary heart disease” when describing the results of our meta-analysis, although the UK Biobank questionnaire of parental disease outcomes used the less precise term “heart disease”.
The data sources, and number of participants and variants, included in each analysis for the 4 MR approaches are summarised in Supplementary Figure S1. Additional details such as information about the data sources and statistical methods used in the MR approaches can be found in the Supplementary Methods and at the following references (28, 33–41).
Instrument construction
For each of the MR approaches, we created a list of variants that were genome-wide significantly associated with either of the smoking phenotypes used in the analysis, which we then clumped (r² = 0.001, kb = 10,000) after ranking SNPs by the smallest p-value that they had with either exposure. The analysis of maternal smoking on the index individual (Table 1.A1) therefore used the maternal smoking and index individual smoking GWASs, which resulted in at most 129 variants. Analogously, the procedure resulted in 127 variants for the second approach (Table 1.A2), and 19 variants in the third and fourth approaches (Table 1.A3 and 1.A4). Winner’s Curse in the maternal and index individual smoking GWASs was addressed by using false discovery rate inverse quantile transformation Winner’s Curse correction (42).
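The selection and clumping step can be sketched as a greedy procedure: variants are ranked by their smallest p-value across the two exposures, and a variant is kept only if it is not in linkage disequilibrium with a variant already kept. The code below is a simplified illustration, not the software used in this study. The function name `clump`, the in-memory `r2_matrix`, and the genome-wide significance threshold of 5e-8 are assumptions made for the example.

```python
import numpy as np

def clump(chrom, pos, p_exposure_1, p_exposure_2, r2_matrix,
          p_threshold=5e-8, r2_threshold=0.001, window_kb=10_000):
    """Greedy LD clumping on the minimum p-value across two exposures.

    Returns the indices of retained variants, most significant first.
    """
    chrom = np.asarray(chrom)
    pos = np.asarray(pos)
    r2_matrix = np.asarray(r2_matrix, dtype=float)
    # Rank each variant by the smaller of its two exposure p-values.
    p_min = np.minimum(np.asarray(p_exposure_1, dtype=float),
                       np.asarray(p_exposure_2, dtype=float))
    order = np.argsort(p_min, kind="stable")
    kept = []
    for i in order:
        if p_min[i] >= p_threshold:
            continue  # not genome-wide significant for either exposure
        in_ld_with_kept = any(
            chrom[i] == chrom[j]
            and abs(int(pos[i]) - int(pos[j])) <= window_kb * 1000
            and r2_matrix[i, j] > r2_threshold
            for j in kept
        )
        if not in_ld_with_kept:
            kept.append(int(i))
    return kept

# Toy example with five hypothetical variants.
chrom = [1, 1, 1, 2, 2]
pos = [1_000, 2_000, 50_000_000, 1_000, 5_000]
p_exp1 = [1e-10, 1e-9, 1e-3, 1e-20, 1e-4]
p_exp2 = [1e-8, 1e-12, 1e-9, 0.5, 1e-3]
r2 = np.eye(5)
r2[0, 1] = r2[1, 0] = 0.6   # variants 0 and 1 are in LD
retained = clump(chrom, pos, p_exp1, p_exp2, r2)
```

In this toy example variant 0 is dropped because it is in LD with the more significant variant 1, and variant 4 is dropped because it is not significant for either exposure.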
We used the TwoSampleMR R package to harmonise the exposure and outcome data (43). Palindromic SNPs were only excluded if TwoSampleMR could not use the allele frequency to infer which strand was positive. Where possible, we additionally imputed LD proxy variants with TwoSampleMR when SNPs were missing in the outcome dataset, using an r² of 0.8 from the European subsample of the 1000 Genomes Project.
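The logic of harmonisation with palindromic SNPs can be illustrated with a short Python sketch. It aligns the outcome effect to the exposure effect allele, and for palindromic SNPs (A/T or C/G) uses the effect allele frequency to infer the strand, dropping the SNP when the frequency is too close to 0.5 to decide. This is not the TwoSampleMR code: the function name `harmonise_beta` and the `tolerance` value of 0.08 are assumptions made for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def harmonise_beta(exp_ea, exp_oa, exp_eaf,
                   out_ea, out_oa, out_eaf, out_beta, tolerance=0.08):
    """Return the outcome beta aligned to the exposure effect allele,
    or None if the SNP cannot be harmonised.

    ea = effect allele, oa = other allele, eaf = effect allele frequency.
    """
    exp_ea, exp_oa = exp_ea.upper(), exp_oa.upper()
    out_ea, out_oa = out_ea.upper(), out_oa.upper()
    palindromic = COMPLEMENT[exp_ea] == exp_oa

    same = (exp_ea, exp_oa)
    swapped = (exp_oa, exp_ea)
    if not palindromic and (out_ea, out_oa) not in (same, swapped):
        # Alleles may be reported on the opposite strand: take complements.
        out_ea, out_oa = COMPLEMENT[out_ea], COMPLEMENT[out_oa]

    if (out_ea, out_oa) == swapped:
        # Effect and other alleles are swapped: flip the sign and frequency.
        out_beta, out_eaf = -out_beta, 1.0 - out_eaf
    elif (out_ea, out_oa) != same:
        return None  # incompatible alleles

    if palindromic:
        # The allele labels cannot reveal the strand, so use the frequency.
        if min(abs(exp_eaf - 0.5), abs(out_eaf - 0.5)) <= tolerance:
            return None  # frequency too close to 0.5 to infer the strand
        if (exp_eaf > 0.5) != (out_eaf > 0.5):
            out_beta = -out_beta  # datasets are on opposite strands
    return out_beta
```

For example, `harmonise_beta("A", "T", 0.2, "A", "T", 0.8, 0.1)` flips the sign of the outcome effect because the frequencies imply that the two datasets report the SNP on opposite strands.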
Sensitivity analyses for the MR approaches
We ran three sensitivity analyses for the MR approaches. First, as a positive control, we ran a univariable MR analysis of the index individual’s own cigarette smoking on their own outcomes. This also demonstrates that there was sufficient statistical power to detect an effect on the outcomes. Second, we used hair colour as a negative control outcome for residual population structure. Natural hair colour is known to vary by population sub-group in the UK but should not be causally related to any of the outcomes or smoking status. Therefore, if any of the genetic instruments used here (i.e., for the index individual’s smoking, maternal smoking, or paternal smoking) associate with hair colour, then this will most likely be due to residual population structure (44). Third, UK Biobank participants at the extremes and middle of the distributions for smoking and lung function (recruited into the UK BiLEVE study) were genotyped using a different genotyping chip from the rest of the sample. This could potentially introduce confounding into genotype-phenotype associations, and previous guidance suggests adjusting for the genotyping chip in analyses (27). Although our previous research indicates that there is little bias induced in MR estimates of first-hand smoking (45), we contrast MR estimates using chip-adjusted and non-chip-adjusted GWASs for lung-function related phenotypes.
Meta-analysis of MR approaches
A literal interpretation of our MR estimates requires strong, and implausible, assumptions, such as that ETS exposure does not vary across the life course (46). One can alternatively treat MR as a test of a causal null hypothesis (47). In this context the approaches test the joint null hypothesis that exposure to another individual’s genetic liability to smoke does not increase an index individual’s disease risk independent of the index individual’s genetic liability to smoke. It has been similarly argued that meta-analyses of randomised controlled trials should be used to “determine whether or not some type of treatment - tested in a wide range of trials - produces any effect”, rather than “provide exact quantitative estimates” (48). Power to reject the joint null hypothesis can therefore be increased by a quantitative meta-analysis of the four MR estimates.
For such a meta-analysis to be valid we must assume that the estimates are independent of each other. In the context of a two-sample MR analysis, this requires that either the instruments are uncorrelated, the genetically predicted exposures are conditionally independent (by adjustment in the MVMR model), or the outcome samples are independent. The first and second are analogous to a factorial experiment in which people are independently randomised to two separate interventions with the same intended therapeutic effect (e.g., two different drugs that lower blood pressure). The third is equivalent to the assumption of no participant overlap made in traditional meta-analyses of randomised controlled trials.
Supplementary Table S1 presents the six pairwise comparisons of the MR approaches and the rationale for the independence, or lack of independence, between them. Independence was only questionable for the pair comprising the two approaches that assess parental smoking and index individual outcomes. This is because assortative mating and the social transmissibility of smoking mean that one parent’s liability to smoke might predict the other parent’s liability. We accounted for potential within-pair correlations by meta-analysing the unweighted average of the effect estimates and 95% CIs of potentially correlated pairs (49, 50). Of the remaining 5 pairs, 2 had both non-overlapping outcome GWASs and independent primary instruments, while 3 only had non-overlapping outcome GWASs.
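One way to read the averaging step for a potentially correlated pair is sketched below: the two point estimates and the two sets of 95% CI bounds are each averaged without weights, and a standard error is then recovered from the averaged interval. Under this reading the resulting standard error equals the mean of the two standard errors, which is the standard error of the average if the two estimates were perfectly correlated, so the pair gains no precision from being combined. The function name `average_correlated_pair` and the input values are hypothetical.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # approximately 1.96

def average_correlated_pair(est_a, ci_a, est_b, ci_b):
    """Unweighted average of two estimates and of their 95% CI bounds.

    ci_a and ci_b are (lower, upper) tuples on the log odds scale.
    Returns the averaged estimate, the averaged CI, and the implied SE.
    """
    est = (est_a + est_b) / 2
    lower = (ci_a[0] + ci_b[0]) / 2
    upper = (ci_a[1] + ci_b[1]) / 2
    se = (upper - lower) / (2 * Z95)
    return est, (lower, upper), se

# Hypothetical log odds estimates for the maternal and paternal approaches.
maternal = (0.10, (0.10 - Z95 * 0.05, 0.10 + Z95 * 0.05))
paternal = (0.20, (0.20 - Z95 * 0.07, 0.20 + Z95 * 0.07))
pair_est, pair_ci, pair_se = average_correlated_pair(*maternal, *paternal)
```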
Since our aim is to test the joint null hypothesis, we follow Peto and others and use a common effect meta-analysis (of the IVW MR estimates) as our primary analysis (51, 52). The summary measure of the association in the meta-analyses was the log odds of the outcome per genetically predicted standard deviation increase in exposure to ETS.
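For reference, a common effect (inverse-variance weighted) meta-analysis of log odds estimates can be sketched in a few lines of Python. The function name `common_effect` and the input values are hypothetical, and the sketch assumes the estimates being pooled are independent.

```python
import math
from statistics import NormalDist

def common_effect(estimates, std_errors):
    """Inverse-variance weighted common effect meta-analysis.

    Returns the pooled log odds, its standard error, and a two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return pooled, pooled_se, p_value

# Hypothetical log odds estimates and standard errors.
pooled, pooled_se, p_value = common_effect([0.1, 0.3], [0.1, 0.1])
odds_ratio = math.exp(pooled)  # back-transform for reporting
```

With equal standard errors the pooled estimate is the simple mean of the inputs, and the pooled standard error shrinks by a factor of the square root of the number of estimates.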
Sensitivity and additional analyses for the meta-analysis
Heterogeneity was assessed using the I² and τ² statistics, and we used the random-effects model as a secondary estimator. We also used a leave-one-out sensitivity analysis to explore the potential effect of, and robustness to, outliers in the primary analysis. Finally, we control for multiple testing across the 6 outcomes using the Benjamini and Hochberg correction (53).
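These sensitivity analyses can be sketched as follows. The code computes Cochran’s Q, I², and τ², re-estimates the common effect with each estimate left out in turn, and applies the Benjamini and Hochberg adjustment to a set of p-values. The DerSimonian-Laird estimator of τ² is an assumption made for the example, since the estimator is not specified above, and all function names are hypothetical.

```python
import numpy as np

def pooled_common_effect(est, se):
    """Inverse-variance weighted pooled estimate."""
    w = 1.0 / np.asarray(se, dtype=float) ** 2
    return float(np.sum(w * np.asarray(est, dtype=float)) / np.sum(w))

def heterogeneity(est, se):
    """Cochran's Q, I^2 (as a proportion), and DerSimonian-Laird tau^2."""
    est = np.asarray(est, dtype=float)
    w = 1.0 / np.asarray(se, dtype=float) ** 2
    pooled = np.sum(w * est) / np.sum(w)
    q = float(np.sum(w * (est - pooled) ** 2))
    df = len(est) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, float((q - df) / c))
    return q, i2, tau2

def leave_one_out(est, se):
    """Pooled estimate with each study removed in turn."""
    est = np.asarray(est, dtype=float)
    se = np.asarray(se, dtype=float)
    return [pooled_common_effect(np.delete(est, i), np.delete(se, i))
            for i in range(len(est))]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downwards.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted
```

A random-effects estimate would reuse the same weights with τ² added to each variance, that is, weights of 1 / (se² + τ²).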
Certainty assessment for the meta-analysis
Certainty was evaluated for each outcome using the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) approach for the primary estimate (i.e., common effect meta-analysis) (54, 55). Since our estimates are from theoretically low risk of bias natural experiments, we follow Kim and colleagues and start all meta-analyses at high certainty and then downgrade them where appropriate (56). To aid with the certainty evaluation we developed a bespoke risk of bias tool described in the Supplementary Methods. Although we otherwise adhere to the GRADE guidelines (57), judgements of imprecision, heterogeneity, and publication bias are inherently somewhat subjective.