Full-waveform adjoint seismic tomography. Seismograms were obtained for 206 earthquakes occurred between 2004 and 2021, at 6099 individual stations (Extended Data Fig. 1). Magnitudes are mainly between 5.5 and 6.8, to both avoid low-quality data and satisfy the point source assumption which is often not valid for larger earthquakes16. All 206 earthquakes are used during the coarse-grid stage which took 126 iterations; among those, 110 earthquakes are selected for the following 90-iteration fine-grid stage (Extended Data Fig. 1). Due to strong noise on horizontal components, only vertical components were used for ocean-bottom seismometers, and both tilt and compliance noise were removed54.
We used multiple body wave phases including P, S, PP, SS, PPP, SSS and their depth phases, as well as surface waves. For each body wave phase, we cut a waveform segment that starts 30 s before and ends 50 s after the arrival time for waveform fitting. During the coarse-grid stage, body waves were filtered between 17-100 s, and for the fine-grid stage, the filter is 14-100 s. Rayleigh and Love waves were filtered in multiple period bands, including 20-40 s, 40-60 s, 60-80 s, and 80-150 s. Waveform segments were also cut for surface waves to contain their full envelopes. In total, we used 1.5 × 106 waveform segments.
The inversion scheme is similar to the one in ref.16. During the inversion, we tried to maximize the correlation coefficient between synthetic and observed waveforms, which guides the velocity structure to fit phase shapes, while not being strongly affected by less constrained factors like attenuation, and compared to methods based purely on travel times, this method can exploit more information in the detailed shape of waveforms. A preconditioned conjugate gradient method is used to update the model16, and a diagonal Hessian is approximated by calculating the sensitivity kernel difference after perturbing the model55. Weights for different segments are based on four metrics: correlation coefficients, maximum cross-correlation values, time differences between synthetic and observed arrivals, and the signal-to-noise ratio of observations16. In this study, a segment is weighted as unity when these four criteria are met: correlation coefficient > 0.8, maximum cross-correlation value > 0.95, time shift < 4 s, and signal-to-noise ratio > 12; in contrast, it is weighted as zero, if one of these criteria is met: correlation coefficient < 0.55, maximum cross-correlation value < 0.7, time shift > 8 s, or signal-to-noise ratio < 8. For metrics in between these thresholds, the weight is determined by the cosine-shape function in ref.16.
To overcome the cost of wavefield simulations that often limits adjoint tomography to a few tens of iterations16,55, and to reduce the possibility of being trapped in local minima, we included the mini-batch algorithm by using only a portion of earthquakes for each iteration. This algorithm is commonly used for stochastic gradient descent in machine learning18. Here, the algorithm is similar to the one in ref.19, where mini-batches greatly enhance the convergence rate by reducing the redundancy among sensitivity kernels from different earthquakes. However, unlike ref.19, the Hessian is not estimated via the L-BFGS algorithm56, so we do not need to worry about the positive definiteness of the Hessian during mini-batch selection. Hence, there only remain two key procedures for mini-batch selection: 1) choosing earthquakes to be excluded for the next iteration; 2) adding earthquakes not used in the current iteration.
We used a similar algorithm to ref.19 to select earthquakes to be excluded. The angular distance between the summed kernel using all earthquakes in the current iteration and the summed kernel for all preserved earthquakes with one earthquake removed is calculated, and the earthquake with the smallest angular distance is removed. Such earthquake exclusion is performed iteratively19 until the angular distance reaches 30º or the total number of excluded stations exceeds 55% of the earthquakes used for the current iteration.
The way to add new earthquakes for the next iteration is more stochastic to avoid some earthquakes being constantly chosen. Specifically, we give each candidate earthquake a probability based on: 1) the average distance from a candidate earthquake to its three nearest earthquakes that have secured a spot for the next iteration, 2) the number of iterations that this earthquake was used in. Hence, earthquakes either far from other earthquakes or not frequently used have a better chance of being selected.
We used model US2257 as our starting structure. To account for attenuation, we assume the 3D attenuation model QRFSI1258, and while SpecFem3D Globe handles attenuation with a constant Q across frequencies59, velocities shown in this study are for 1 s. We assume CRUST1.060 for Moho topography, and S362ANI61 for the topography of the 410- and 660-discontinuities.
Accurate earthquake sources are crucial for tomography. In this study, we invert all source parameters16 within the gCMT solution62 before inverting for structure, or once the structure inversion has produced substantial updates from the model used for the last source inversion. Sources were updated eight times, and each time, the number of iterations for different earthquakes depends on their convergence rates. Meanwhile, to overcome the uneven distribution of stations, in addition to the original weight based on the four metrics, the weight for a certain station is further normalized by the sum of weights for all stations around it with azimuth difference < 7.5º and epicentral distance difference < 6º. A water level of 2‰ the summed weight of all stations is applied during normalization to not overweight isolated stations.
The forward and adjoint simulations for both structural and source inversions were performed using SpecFem3D Globe v8.0.014 on a server with 4 Nvidia A100 GPUs. For each earthquake, the duration of simulated wavefields is determined by the end time of the 20-40 s Rayleigh wave segment at the farthest station. The simulation domain has a dimension of 57º × 79º (Extended Data Fig. 1), and during the more influential fine-grid stage, horizontal element size is ~44 km on the free surface, making the spacing between Gauss-Lobatto-Legendre interpolation points ~11 km, and the minimum resolved period 11 s.
In this study, we mainly focus on isotropic velocity anomalies. During simulations, we make the structure isotropic below the 410-discontinuity, and transversely isotropic above it. Hence, both isotropic velocity and radial anisotropy were solved for during inversion, and here we used the Voigt average63 to approximate the isotropic velocity. To compare velocity anomalies at different depths, seismic velocities are shown as velocity perturbations with respect to a reference model. A 1D mantle reference model for VS and VP is obtained by averaging the final velocity model for all regions with good resolution (Extended Data Fig. 6), while in the crust, 3D velocities from CRUST1.060 are assumed as the reference.
Resolution tests of the model. Two resolution tests are performed separately for VS and VP (Extended Data Fig. 5). In each test, the same point-shape velocity perturbations were added to the final ΔVS /VS and ΔVP /VP models. These perturbations are horizontally 4º and vertically 200 km apart from each other and have reversed polarization between neighbors (Extended Data Fig. 5). The maximum amplitude of the perturbations is 0.01, and their shapes are characterized by a Gaussian with a standard deviation of 40 km. Such spatial extent is comparable with the wavelength of body waves used in this study, making the test challenging, and as set up, preferable to checkerboard tests64.
During the test, we generated waveforms for the perturbed model as “observations”, and then started from the unperturbed model to invert for the perturbations. Meanwhile, to resemble the resolution of real data, we use the same set of weighting for waveform segments as in the last tomography iteration. Due to computation resources, we did not replicate the 206 iterations for tomography, but 16 iterations were still performed for each test with the mini-batch strategy (Extended Data Fig. 5). Compared to commonly used one-iteration tests for full-waveform tomography16,55, this multi-iteration test refines structures at places with lower data coverage to better indicate the actual resolution.
A specific test is also designed for the resolution on the high VS dripping bodies (Extended Data Fig. 7). Here, due to computation limits, we use the point-spread function method16,65 with one iteration. We first made a model with all high VS anomalies around the dripping region removed, and then calculate the VS sensitivity kernel for this modified model. After that, we obtain the difference in preconditioned VS sensitivity kernel between the original model and the modified one, which represents the VS change that the data would favor to compensate for the removed dripping body.
Based on results from these tests, the dripping bodies are not likely to be artifacts based on their geometries. Though the general shape of these bodies is sub-vertical, they also contain secondary features that are horizontally aligned (Fig. 2a, b), which cannot be produced by smearing along ray paths, and the resolution test shows no distortion of structures around the region (Extended Data Fig. 5). Also, given the earthquake-station distribution (Extended Data Fig. 1), there are no ray paths following the eastward dripping direction of high VS bodies on the east side (Fig. 2b).
Composition and temperature estimation. Because there are two observables (VS and VP), and temperature is one variable to be constrained, only one compositional variable can be solved for, so we model compositions between two endmembers. Endmember compositions are expressed as oxide weight percentages for the SiO2-Al2O3-MgO-FeO-CaO-Na2O system. We take one endmember to be pyrolite following ref.26 with SiO2 = 45%, Al2O3 = 4.45%, MgO = 37.8%, FeO = 8.05%, CaO = 3.55%, and Na2O = 0.36%. For the endmember composition of a depleted cratonic mantle lithosphere, we compiled whole rock major element compositions for peridotite xenolith samples from four cratons (Extended Data Fig. 9): Greenland66-68, Slave69, Kaapvaal70-72 and Siberia73-75. Based on these samples, clear negative correlations are seen between the Mg number (Mg# = 100×Mg/(Mg+Fe)) and the FeO content; MgO and SiO2 content; MgO and Al2O3 content; MgO and CaO content, and all these relationships could be generalized through linear regression (Extended Data Fig. 9). Therefore, by setting the Mg# to 93.7 for the craton endmember, we can get a corresponding FeO from the first relationship, and a MgO content is obtained based on that Mg# and FeO. With the MgO, based on other relationships, other major element contents are determined. The amount of Na2O is very small and can be ignored. Together, the craton endmember has SiO2 = 46.55%, Al2O3 = 1.1%, MgO = 45.99%, FeO = 5.7%, CaO = 0.62%.
With these endmember compositions, we estimated the dependence of ΔVS /VS and ΔVP /VP on composition and temperature at different depths. In this study, we performed the conversion between 200 and 380 km depths, because dripping bodies are clearly observed there (Extended Data Fig. 3); the depth is not too deep to be affected by potential inaccuracy from the assumed 410-discontinuity topography; and ΔVP /VP is best resolved at ~300 km depth (Extended Data Fig. 5).
To estimate the temperature dependence, we calculated seismic velocities at different temperature and pressure conditions. We assumed an ambient mantle potential temperature of 1350ºC76, and an adiabatic temperature gradient of 0.4ºC/km77. At each depth, we estimate the elastic VS and VP 50ºC above and below the adiabat using Perple_X27 with thermodynamic data and solutions in ref.28, and pressures from PREM78. Then, we estimated the anelasticity effect for these two temperatures at the period of 20 s based on the relationship in ref.29 through the very-broadband rheology calculator79 with the solidus from ref.80 assumed for calculation in ref.29. We assume the attenuation effect on bulk modulus to be negligible, and how complex anelastic compliance in ref.29 would affect VP is derived in the same way as in ref.81 for VS. The period of 20 s is chosen as it is close to the frequency of body waves used in this study. After that, gradients of VS and VP with respect to temperature are obtained by calculating the differences in anelastic VS and VP between these two temperatures and dividing them by the temperature difference of 100ºC. These gradients are further divided by the anelastic VS and VP at adiabatic temperatures for further conversion.
The compositional dependence was estimated similarly. By assuming mechanical mixing82, at different depths along the assumed 1350ºC adiabat, we calculated the difference in anelastic VS and VP between the two endmember compositions, and these differences are divided by unity to represent the compositional gradients. These gradients of VS and VP are then divided by the 1D reference VS and VP of SATONA (Extended Data Fig. 6) at these depths for conversion.
VS and VP perturbations are then converted to temperature and compositional perturbations. Here, we convert velocity perturbations rather than absolute velocities, because absolute velocities could be strongly influenced by the assumed reference anelasticity model and the exact composition of mantle rocks, while perturbations are less dependent on these reference conditions. Before conversion, to avoid misalignment of independently constrained VS and VP features, ΔVS /VS and ΔVP/VP are smoothed by convolving with a 3D Gaussian, whose horizontal and vertical standard deviations are 120 km and 20 km. Velocities are also converted to 20 s based on the attenuation model58 for consistency. Velocity perturbations are expressed by temperature perturbation (ΔT) and compositional perturbation obtained by solving these two equations (Fig. 2d, e, and Extended Data Fig. 9). When converting for the dripping anomalies (e.g., Fig. 2d, e), only geographic locations within the dripping area with positive ΔVS /VS and ΔVP/VP are considered.
The range of converted compositional perturbation appears to be large for the whole model (Extended Data Fig. 9i,m). However, considering these endmember compositions might not be suitable for other areas; the anelasticity model could be more complicated for hot environments29 such as in the western US and Mexico (Extended Data Fig. 3). Since our goal is mainly to show whether overall VS and VP patterns for dripping bodies could correspond to an increased amount of cratonic material, we focus more on the relative compositional difference between the dripping area and regions excluding the dripping area rather than the exact values.
Estimation of thinning. An approximate estimate of the amount of lithospheric thinning follows from assuming the amount of cratonic material is proportional to ΔVS /VS, and all previously dripped lithosphere is still present in the tomographic model as fast anomalies. We first set ΔVS /VS at 180 km as a reference for craton material, since it is within the craton while not too shallow to be strongly influenced by the cold temperature. Then we calculated the sum of positive ΔVS /VS in the dripping area (Fig. 2c) between 250 and 550 km depths and obtained the ratio between them and the reference summed ΔVS /VS at 180 km depth, which represents the volume proportion of cratonic materials at those depths and is on average ~20%. Hence, the thinning would be (550 km – 250 km) × 20% = 60 km if dripping locally. Then, since horizontal flow induced by the slab (Fig. 3a versus c) could focus material from the whole cratonic region covered by this model (Fig. 1) with an area of ~4×106 km2, and the dripping area only has an area of ~1.5×106 km2 (~38% of the cratonic area), the average amount of thinning is calculated to be 60 km × 38% ~ 23 km. However, some regions may experience more thinning than others, for instance, flanks at the western edge of the craton in Fig. 2a, b could partly be caused by the thinning. Meanwhile, although we have estimated compositional perturbation, and it acts as a powerful way to show the significant distinction in composition, we did not use it for the thinning estimation as what is obtained there are perturbations with large uncertainties and are also subject to many assumptions.
Geodynamic modeling of mantle convection. Mantle flow fields driven by different density structures were modeled. The calculation is based on ref.33,83 up to spherical harmonic degree 63. The 1D viscosity distribution from ref.84 and the surface MORVEL85 plate motion are assumed, and we only consider vertical variations in viscosity, for simplicity, which leads us to trust patterns more than amplitudes of flow, e.g. ref.86.
We first estimated flow fields from two density models based on the global tomographic model TX2019slab34. Starting from its VS model, we removed all velocity perturbations around the dripping area at 280-600 km depths (Fig. 3b,d) to eliminate local buoyancy flows. Then for one model, at each depth slice below 600 km, we outlined the geometry of the high-velocity Farallon slab and removed all perturbations within it (Fig. 3d). After that, two VS models with or without the Farallon slab are converted to density based on a VS-density relationship84, and their corresponding flow fields were calculated (Fig. 3).
To evaluate whether the downward flow induced by the Farallon slab is strong enough to drag down the dripping materials, we calculated flow fields after merging SATONA with TX2019slab. For SATONA, we only used reliable regions based on resolution tests (Extended Data Fig. 5). For locations that are >200 km within the resolution boundary (Extended Data Fig. 5), ΔVS /VS in model SATONA is used, and for locations >200 km outside the boundary, TX2019slab34 is used. For locations within 200 km of the boundary, we weighted averaged ΔVS /VS between these two models to make the merged model continuous at ± 200 km from the boundary. A similar merging strategy is designed for the bottom of SATONA, with depths < 700 km using SATONA, while using TX2019slab34 velocity for depths > 900 km, and in between, weighted averaged values are used. After smoothing, ΔVS /VS of the dripping area is about 1% (Extended Data Fig. 9e), in agreement with some longer wavelength previous models (Extended Data Fig. 7). Then, to understand how dripping body densities could affect flows, we tested four scenarios by making the VS-density scaling factor at 260-600 km depths for positive VS anomalies within the dripping area to be 1, 0.5, 0, -0.5 times the factor for the rest of the region84 (Extended Data Fig. 10), which in the last scenario, makes dripping bodies positively buoyant (Extended Data Fig. 10d).
We also used past slab locations to help our interpretation (Fig. 1). Mantle structures in history were predicted in ref.36 by advecting velocity structures in TX2019slab34 back in time. Because the present-day Farallon slab does not always appear above the 660-discontinuity, we chose to use the contour of 0.12% density anomaly at 800 km depth to outline the slab for those paleo-density models. Then, to show the relative location of the Farallon slab to the North American plate, these outlines were translated to their present location based on the assumed plate motion87 in ref.36. These past slab locations were analyzed with dated magmatism (Fig. 1) in Kansas88, Arkansas89, Monroe uplift90, Kentucky91, and Virginia92.
Methods References
54 Janiszewski, H. A., Gaherty, J. B., Abers, G. A., Gao, H. & Eilon, Z. C. Amphibious surface-wave phase-velocity measurements of the Cascadia subduction zone. Geophysical Journal International 217, 1929-1948 (2019).
55 Zhu, H., Bozdağ, E. & Tromp, J. Seismic structure of the European upper mantle based on adjoint tomography. Geophysical Journal International 201, 18-52 (2015).
56 Liu, D. C. & Nocedal, J. On the limited memory BFGS method for large scale optimization. Mathematical programming 45, 503-528 (1989).
57 Zhu, H., Komatitsch, D. & Tromp, J. Radial anisotropy of the North American upper mantle based on adjoint tomography with USArray. Geophysical Journal International 211, 349-377 (2017).
58 Dalton, C. A., Ekström, G. & Dziewoński, A. M. The global attenuation structure of the upper mantle. Journal of Geophysical Research: Solid Earth 113 (2008).
59 Komatitsch, D. & Tromp, J. Spectral-element simulations of global seismic wave propagation—I. Validation. Geophysical Journal International 149, 390-412 (2002).
60 Laske, G., Masters, G., Ma, Z. & Pasyanos, M. in Geophysical research abstracts. 2658 (EGU General Assembly 2013, Vienna, Austria).
61 Kustowski, B., Ekström, G. & Dziewoński, A. Anisotropic shear‐wave velocity structure of the Earth's mantle: A global model. Journal of Geophysical Research: Solid Earth 113 (2008).
62 Ekström, G., Nettles, M. & Dziewoński, A. The global CMT project 2004–2010: Centroid-moment tensors for 13,017 earthquakes. Physics of the Earth and Planetary Interiors 200, 1-9 (2012).
63 Panning, M. & Romanowicz, B. A three-dimensional radially anisotropic model of shear velocity in the whole mantle. Geophysical Journal International 167, 361-379 (2006).
64 Rawlinson, N. & Spakman, W. On the use of sensitivity tests in seismic tomography. Geophysical Journal International 205, 1221-1243 (2016).
65 Fichtner, A. & Leeuwen, T. v. Resolution analysis by random probing. Journal of Geophysical Research: Solid Earth 120, 5549-5573 (2015).
66 Bernstein, S., Kelemen, P. B. & Brooks, C. K. Depleted spinel harzburgite xenoliths in Tertiary dykes from East Greenland: restites from high degree melting. Earth and Planetary Science Letters 154, 221-235 (1998).
67 Hanghøj, K., Kelemen, P., Bernstein, S., Blusztajn, J. & Frei, R. Osmium isotopes in the Wiedemann Fjord mantle xenoliths: a unique record of cratonic mantle formation by melt depletion in the Archaean. Geochemistry, Geophysics, Geosystems 2 (2001).
68 Wittig, N. et al. Origin of cratonic lithospheric mantle roots: A geochemical study of peridotites from the North Atlantic Craton, West Greenland. Earth and Planetary Science Letters 274, 24-33 (2008).
69 Liu, J. et al. Plume-driven recratonization of deep continental lithospheric mantle. Nature 592, 732-736 (2021).
70 Griffin, W., Graham, S., O'Reilly, S. Y. & Pearson, N. Lithosphere evolution beneath the Kaapvaal Craton: Re–Os systematics of sulfides in mantle-derived peridotites. Chemical Geology 208, 89-118 (2004).
71 Simon, N. S., Carlson, R. W., Pearson, D. G. & Davies, G. R. The origin and evolution of the Kaapvaal cratonic lithospheric mantle. Journal of Petrology 48, 589-625 (2007).
72 Simon, N. S., Irvine, G. J., Davies, G. R., Pearson, D. G. & Carlson, R. W. The origin of garnet and clinopyroxene in “depleted” Kaapvaal peridotites. Lithos 71, 289-322 (2003).
73 Boyd, F. et al. Composition of the Siberian cratonic mantle: evidence from Udachnaya peridotite xenoliths. Contributions to Mineralogy and Petrology 128, 228-246 (1997).
74 Ionov, D. A., Doucet, L. S. & Ashchepkov, I. V. Composition of the lithospheric mantle in the Siberian craton: new constraints from fresh peridotites in the Udachnaya-East kimberlite. Journal of petrology 51, 2177-2210 (2010).
75 Ionov, D. A., Doucet, L. S., Carlson, R. W., Golovin, A. V. & Korsakov, A. V. Post-Archean formation of the lithospheric mantle in the central Siberian craton: Re–Os and PGE study of peridotite xenoliths from the Udachnaya kimberlite. Geochimica et Cosmochimica Acta 165, 466-483 (2015).
76 Herzberg, C. & Gazel, E. Petrological evidence for secular cooling in mantle plumes. Nature 458, 619-622 (2009).
77 Katsura, T. A revised adiabatic temperature profile for the mantle. Journal of Geophysical Research: Solid Earth 127, e2021JB023562 (2022).
78 Dziewonski, A. M. & Anderson, D. L. Preliminary reference Earth model. Physics of the earth and planetary interiors 25, 297-356 (1981).
79 Havlin, C., Holtzman, B. K. & Hopper, E. Inference of thermodynamic state in the asthenosphere from anelastic properties, with applications to North American upper mantle. Physics of the Earth and Planetary Interiors 314, 106639 (2021).
80 Hirschmann, M. M. Mantle solidus: Experimental constraints and the effects of peridotite composition. Geochemistry, Geophysics, Geosystems 1 (2000).
81 McCarthy, C., Takei, Y. & Hiraga, T. Experimental study of attenuation and dispersion over a broad frequency range: 2. The universal scaling of polycrystalline materials. Journal of Geophysical Research: Solid Earth 116 (2011).
82 Xu, W., Lithgow-Bertelloni, C., Stixrude, L. & Ritsema, J. The effect of bulk composition and temperature on mantle seismic structure. Earth and Planetary Science Letters 275, 70-79 (2008).
83 HC, a global mantle circulation solver (2009).
84 Steinberger, B. & Calderwood, A. R. Models of large-scale viscous flow in the Earth's mantle with constraints from mineral physics and surface observations. Geophysical Journal International 167, 1461-1481 (2006).
85 DeMets, C., Gordon, R. G. & Argus, D. F. Geologically current plate motions. Geophysical journal international 181, 1-80 (2010).
86 Ghosh, A., Becker, T. & Humphreys, E. Dynamics of the North American continent. Geophysical Journal International 194, 651-669 (2013).
87 Torsvik, T. H. et al. Pacific‐Panthalassic reconstructions: Overview, errata and the way forward. Geochemistry, Geophysics, Geosystems 20, 3659-3689 (2019).
88 Blackburn, T. J., Stockli, D. F., Carlson, R. W. & Berendsen, P. (U–Th)/He dating of kimberlites—A case study from north-eastern Kansas. Earth and Planetary Science Letters 275, 111-120 (2008).
89 Eby, G. N. & Vasconcelos, P. Geochronology of the Arkansas alkaline province, southeastern United States. The Journal of Geology 117, 615-626 (2009).
90 Baksi, A. K. The timing of Late Cretaceous alkalic igneous activity in the northern Gulf of Mexico basin, southeastern USA. The Journal of Geology 105, 629-644 (1997).
91 Heaman, L. M., Kjarsgaard, B. A. & Creaser, R. A. The temporal evolution of North American kimberlites. Lithos 76, 377-397 (2004).
92 Zartman, R. E. Geochronology of some alkalic rock provinces in eastern and central United States. Annual Review of Earth and Planetary Sciences 5, 257-286 (1977).
93 Boyce, A. et al. A New P‐Wave Tomographic Model (CAP22) for North America: Implications for the Subduction and Cratonic Metasomatic Modification History of Western Canada and Alaska. Journal of Geophysical Research: Solid Earth 128, e2022JB025745 (2023).
94 Bedle, H., Lou, X. & van der Lee, S. High-resolution imaging of continental tectonics in the mantle beneath the United States, through the combination of USArray data sets. Authorea Preprints (2022).
95 Wessel, P. et al. The generic mapping tools version 6. Geochemistry, Geophysics, Geosystems 20, 5556-5564 (2019).
96 Kennett, B. L., Engdahl, E. & Buland, R. Constraints on seismic velocities in the Earth from traveltimes. Geophysical Journal International 122, 108-124 (1995).