Abstract
A geologically rapid Neoproterozoic oxygenation event is commonly linked to the appearance of marine animal groups in the fossil record. However, there is still debate about what evidence from the sedimentary geochemical record—if any—provides strong support for a persistent shift in surface oxygen immediately preceding the rise of animals. We present statistical learning analyses of a large dataset of geochemical data and associated geological context from the Neoproterozoic and Palaeozoic sedimentary record and then use Earth system modelling to link trends in redox-sensitive trace metal and organic carbon concentrations to the oxygenation of Earth’s oceans and atmosphere. We do not find evidence for the wholesale oxygenation of Earth’s oceans in the late Neoproterozoic era. We do, however, reconstruct a moderate long-term increase in atmospheric oxygen and marine productivity. These changes to the Earth system would have increased dissolved oxygen and food supply in shallow-water habitats during the broad interval of geologic time in which the major animal groups first radiated. This approach provides some of the most direct evidence for potential physiological drivers of the Cambrian radiation, while highlighting the importance of later Palaeozoic oxygenation in the evolution of the modern Earth system.
Similar content being viewed by others
Main
Earth’s oceans and atmosphere are traditionally thought to have undergone two major episodes of oxygenation1. During the Great Oxidation Event (2.2–2.4 billion years ago), atmospheric oxygen increased from trace concentrations to above 0.1% of present atmospheric levels (PAL)2,3. A second, the Neoproterozoic Oxygenation Event (NOE), has been inferred around the Ediacaran/Cambrian boundary (~538.8 million years ago (Ma)), originally interpreted as an increase in atmospheric oxygen from Proterozoic to near-modern levels (generally from 1–10% to ~100% PAL)1,4. The NOE has received extensive attention due to its broad temporal coincidence with the first unambiguous macroscopic animal fossils5. Ocean–atmosphere oxygenation has been widely invoked as a plausible environmental driver of early animal evolution since all extant animal groups rely on environmental oxygen to complete their life cycle, and observations from modern oceans indicate that many Cambrian animal body plans and ecologies would not have been permitted at very low oxygen levels6.
The inferred trajectory of Neoproterozoic oxygenation has been increasingly challenged. Multiple lines of evidence support the general persistence of anoxic water masses in the deep ocean (and intermittently on the shelves) until the mid-Palaeozoic era7,8,9,10,11, punctuated by a series of extreme 1–10 Myr-scale oscillations in the oxygenation of Earth’s oceans and atmosphere (oceanic oxygenation events) through much of the Neoproterozoic era and into the Cambrian period12,13,14. These data call into question a late Neoproterozoic \(p_{{\mathrm{O}}_2}\) increase to near-modern levels. They further highlight the importance of spatial and temporal scale in framing questions about Neoproterozoic environmental change and the evolution of animals. Many geochemical proxies for palaeoredox record fluctuations in bottom-water oxygenation along the outer shelf and slope, whereas the majority of modern marine animals and those recorded in the fossil record live in shallow shelf environments. It is therefore the oxygenation of these shallow environments that would control the degree of hypoxic stress experienced by marine animals. While Earth system boundary conditions such as continental configuration can play a major role in global deep-ocean oxygenation at 10 Myr timescales via impacts on ocean circulation15, dissolved [O2] in shallow marine habitats is expected to be controlled primarily by ocean–atmosphere gas exchange. Furthermore, geochemical proxies commonly record fluctuations on 10 kyr to 1 Myr timescales, whereas Neoproterozoic–Palaeozoic animal radiations were evolutionary singularities, with new body plans and ecologies remaining viable on 100 Myr timescales. To test permissive environment hypotheses relating to environmental oxygen and the diversification of early animals, we must therefore specifically investigate the long-term oxygenation of Earth’s continental shelf environments.
Dramatic changes in organic carbon burial have also been inferred across the Neoproterozoic–Palaeozoic transition. On short-term, physiological timescales, an enhanced benthic carbon flux is expected to increase food supply to benthic marine ecosystems, potentially acting as an alternative or synergistic physiological driver of ecological and evolutionary change16. If sustained over geologic timescales, increased organic carbon burial would also have driven increased oxygen flux to the atmosphere17, in turn increasing the oxygenation of shallow-water habitats where dissolved [O2] is dominated by air–sea gas exchange. Organic carbon flux also impacts the sequestration of trace metals such as uranium (U) and molybdenum (Mo) in sediments18, complicating inferences based on some of the most commonly used geochemical proxies for late Neoproterozoic ocean oxygenation9,14,19,20,21.
In this Article, we combine approaches from statistical learning, biogeochemical modelling and ecophysiology to better constrain changes in global ocean biogeochemistry and marine animal habitats through the Neoproterozoic and Palaeozoic eras. We analyse geochemical data and associated geological context assembled by the Sedimentary Geochemistry and Paleoenvironments Project22 (SGP), a community sedimentary geochemical database, for Tonian–Carboniferous shales (~1,000–300 Ma; Extended Data Fig. 1)22. We investigate sedimentary records of the trace metals molybdenum and uranium, proxies widely used to investigate global ocean oxygenation because of their high burial rates in modern anoxic settings relative to modern oxygenated settings14,19,20,21,23. The magnitude of authigenic trace metal enrichments in sedimentary archives (for example, anoxic shales) is positively correlated with the dissolved seawater inventory, which in turn is negatively correlated with the global spatial extent of reducing conditions1,18,19,20,21. Thus, sediments deposited beneath anoxic water columns during times of globally widespread anoxia will record muted authigenic enrichments, whereas higher enrichments will occur in globally well-oxygenated oceans. We also investigate trends in the local redox state of anoxic environments using the iron speciation proxy24 and in organic carbon burial flux using shale total organic carbon (TOC). Reconstructing global-scale biogeochemical processes from sedimentary geochemical proxies is complicated by local biogeochemical and physical factors, as well as uneven geological sampling in space and time22,25. Our statistical analyses are designed to address these issues, enabling us to reconstruct meaningful biogeochemical trends from the aggregated sedimentary geochemical record.
First, we reconstruct temporal trends in the mean distributions of sedimentary geochemical data, accounting for sampling biases using a weighted bootstrap analysis that incorporates the spatial and temporal proximity of samples in the dataset25 (Methods). Improving on previous data compilations9,16,19,22,26, this analysis accommodates the impact of geographical sampling bias to better generate global mean trends in sedimentary concentrations25, as well as benefiting from the improved data density in the SGP dataset22. We then conduct a statistical learning analysis designed to isolate global long-term trends in the biogeochemical processes that justify our use of these proxies, separated from the complex caveats and interactions that affect interpretations of raw sedimentary data. We use a Monte Carlo random forest framework (with proxy-specific hyperparameter tuning; Methods) to generate partial dependence analyses that isolate the marginal effect of geologic time on the mean value of each geochemical proxy with all identified confounding geochemical and geologic context variables held constant. Although the absolute magnitudes of these results reflect the sampled sedimentary record, the directional trends produced in these statistical learning analyses enable us to reconstruct changes in global biogeochemical cycles. Our partial dependence analyses therefore track seawater metal inventories for Mo and U, sulfide levels in sampled shelf–slope settings for iron speciation and organic carbon fluxes to sampled shelf–slope settings for TOC.
Finally, we combine our statistical reconstructions with Earth system modelling to investigate trends in atmospheric \(p_{{\mathrm{O}}_2}\), nutrient levels, bottom-water redox and the oxygenation of shallow marine habitats over the Neoproterozoic and Palaeozoic eras. We conduct an ensemble biogeochemical modelling experiment, integrating three-dimensional ocean simulations from the cGENIE Earth system model of intermediate complexity27 with a Mo–U mass balance model28 and the CANOPS biogeochemical model29,30. We use this to investigate the impacts of different stable atmospheric oxygen and marine productivity scenarios30 on the three-dimensional distribution of global marine dissolved [O2], seawater Mo and U concentrations and organic carbon burial rates. We combine our statistical reconstructions of trends in seawater trace metal inventories and organic carbon burial with the results of these model simulations to provide new long-term (~10 Myr timescale) estimates of atmospheric oxygen, marine productivity, seafloor redox and shallow shelf dissolved [O2] through the Neoproterozoic and Palaeozoic eras.
Geochemical proxy records
Trace metal concentrations in anoxic black shales do increase in the late Neoproterozoic in our temporal–spatial weighted analyses of bootstrapped means, but they subsequently decrease in the early Palaeozoic before increasing again in the Devonian (Fig. 1a,b). Distributions of both Mo in euxinic shale and U in anoxic shale show similar trends when temporal and spatial sampling biases are accounted for. When trace metals are standardized to TOC, there is considerable noise in bootstrap means and no clear trend through the Neoproterozoic and early Palaeozoic, although both Mo/TOC and U/TOC increase in the later Devonian (Fig. 1c,d; note low number of units sampled for Mo/TOC in the Carboniferous). These analyses alone, therefore, suggest that there was no major sustained increase in marine Mo or U concentrations until the Devonian, contrasting sharply with previous interpretations of trace metal data.
A major (~130%) Ediacaran–Cambrian increase in TOC is also captured in our spatial–temporal bootstrap analyses, followed by further increases in the mid-Palaeozoic (Fig. 1f). There is a low proportion of euxinic (anoxic, sulfidic) samples for most of the Tonian–Carboniferous according to iron speciation data, indicating that most anoxic samples in the Neoproterozoic and Palaeozoic were deposited under ferruginous (anoxic, non-sulfidic) bottom-water conditions (Fig. 1e; see further discussion of the iron speciation proxy in Methods). The late Ediacaran and Devonian are notable exceptions, with relatively high proportions of anoxic shale samples classified as euxinic. Some transient trends, such as high late Tonian (~800 Ma) Mo and TOC values are difficult to interpret due to the limited number of euxinic shale units sampled through those intervals (Fig. 1), although the late Tonian merits further investigation given correlated shifts in other proxy records5. Distributions of raw data without temporal–spatial bootstrap analyses show similar long-term trends in central tendency for all proxies (Extended Data Fig. 2).
Deconvolved biogeochemical trends
Temporal reconstructions of mean global Mo and U, isolated from the impacts of local redox, organic carbon, detrital input and depositional and post-depositional processes (Methods) show minor, partially transient increases in trace metal concentrations in the late Neoproterozoic followed by major increases in the mid-Palaeozoic (Fig. 2a,b). This remains the case when iron speciation data are used as predictor data rather than thresholds (Extended Data Fig. 3), indicating the robustness of these trends to specific proxy interpretations (compare ref. 31). The trajectories of the Mo and U differ slightly in the timings and rates of change, although linking subtle temporal contrasts to proxy-specific redox sensitivities would require careful consideration of proxy sample distributions. Notably, these analyses are designed to investigate changes on relatively long geologic timescales and are therefore not expected to capture <~10 Myr redox instabilities, such as those observed in the Ediacaran–Cambrian U cycle13 and produced by biogeochemical models at similar timescales12. Variable importance plots (Extended Data Fig. 4) illustrate that TOC (followed by geologic age and [Al]) is the most important predictor of Mo and U concentrations in our random forest models. Model fit plots indicate that on average our hyperparameter-tuned random forest analyses account for >65% of observed variance in these datasets (Extended Data Fig. 5).
In contrast to our trace metal reconstructions, deconvolved TOC reconstructions exhibit a major sustained increase in the late Neoproterozoic and early Cambrian (Fig. 2d). This ~70% increase in deconvolved mean values broadly agrees with compilations of raw data16 but is of lower magnitude than previously estimated after accounting for confounding variables and sampling biases. This Ediacaran–Cambrian TOC increase is followed by a steady increase from the Middle Ordovician onwards in analyses including all shale samples. Alternative TOC treatments (restricted to anoxic shales to isolate deconvolved trends in organic carbon burial in anoxic settings, or with FePy/FeHR as a predictor variable to isolate deconvolved trends in organic carbon burial in anoxic settings independent of temporal changes in the Fe–S biogeochemistry; Extended Data Table 1) also show a major stepwise Ediacaran–Cambrian increase (Extended Data Fig. 6). Analyses of variable importance (Extended Data Fig. 4) indicate that Al concentrations, lithology and geographic coordinates (latitude and longitude) are the most important predictors of TOC in our primary analyses. The proportion of euxinic samples moderately increases in the late Ediacaran before decreasing in the early Palaeozoic, then increases dramatically in the Silurian–Devonian in agreement with previous statistical analyses8.
Implications for oxygen and productivity
In our ensemble Earth system modelling experiment, we estimate the impact of atmospheric oxygen concentrations and marine productivity on the biogeochemical processes that we aim to isolate in our statistical learning analyses and the oxygenation of both the global ocean and shallow marine habitats. Our cGENIE simulations of three-dimensional ocean biogeochemistry show that the oxygenation of shallow shelf and global bottom-water environments respond differently to changes in atmospheric \(p_{{\mathrm{O}}_2}\) and marine PO4 (as a key modulator of marine productivity) combinations predicted to be stable on geologic timescales30 (Figs. 3 and 4 and Methods). Most of the global seafloor is overlain by reducing bottom waters at relatively low atmospheric O2 levels (below ~25–100% PAL depending on marine PO4), with the proportions of anoxic and suboxic seafloor varying as a function of atmospheric oxygen and marine productivity (Fig. 3a,b and Extended Data Fig. 7). At relatively high atmospheric oxygen levels (~25–100% PAL, again depending on marine PO4), the majority of the global seafloor—including the deep ocean—is overlain by oxygenated bottom waters (Fig. 4). By contrast, dissolved [O2] in shallow shelf environments scales essentially linearly with atmospheric oxygenation (Figs. 3f and 4), with only minor decreases in shelf oxygenation with increasing productivity. Changing the continental configuration, global climate state or biological pump strength in cGENIE does not substantially impact these results (Extended Data Fig. 8). Although continental configuration may have played a role in structuring deep-ocean ventilation during the Palaeozoic15, this does not challenge our key observation that shallow shelf oxygenation scales essentially linearly with atmospheric oxygen while deep-ocean oxygenation exhibits a contrastingly nonlinear relationship with atmospheric oxygen (Fig. 4).
We link these oceanographic trends to the biogeochemical dynamics isolated in our statistical learning analyses by coupling cGENIE outputs to biogeochemical models of molybdenum, uranium and organic carbon. The predominance of reducing seafloor conditions at low atmospheric \(p_{{\mathrm{O}}_2}\) levels results in a limited sensitivity of seawater Mo and U concentrations to changing atmospheric oxygen below ~25% PAL atmospheric \(p_{{\mathrm{O}}_2}\) (Fig. 3c-d). Our simulations demonstrate that both anoxic and suboxic conditions can act as major controls on the trace metal inventory of seawater, with the same marine Mo and U concentrations simulated by a range of biogeochemical landscapes defined by the balance of the two reducing sinks28 (Extended Data Fig. 9). Seawater trace metal concentrations therefore broadly track the extent of oxic seafloor (or, equivalently, fanox+subox, the combined extent of reducing seafloor), and the seawater inventories of Mo and U are much more sensitive to both atmospheric oxygen concentrations and marine productivity above ~25% PAL \(p_{{\mathrm{O}}_2}\). In contrast to modelled trace metal concentrations, organic carbon burial rates are sensitive to changing atmospheric oxygen and marine productivity across the whole logarithmic range of \(p_{{\mathrm{O}}_2}\)-productivity scenarios investigated here (Fig. 3e). Global average organic carbon burial rates (and therefore average TOC in shales, unless separately impacted by other factors) are predicted to increase relatively continuously with increases in atmospheric oxygen and marine productivity, similar to shallow shelf dissolved oxygen concentrations (Figs. 3f and 4 and Extended Data Figs. 7–9).
Discussion
By integrating statistically deconvolved proxy records with a combined biogeochemical modelling approach, we provide new long-term reconstructions of Neoproterozoic–Palaeozoic ocean–atmosphere oxygenation and marine productivity. Coupling increased sampling with improved statistical treatment of confounding geologic context and geochemical variables enables us to both re-evaluate previous trace metal evidence for a stepwise Neoproterozoic oxygenation and to realize the potential of global TOC proxy records. Although previous studies have inferred a Palaeozoic shift in marine redox8,9,10,11, our combined trace metal and organic carbon analyses enable us to reconcile classical evidence for a Neoproterozoic oxygenation event19,20 with mid-Palaeozoic ocean–atmosphere oxygenation. Our analyses further establish the existence of sustained changes in ocean–atmosphere oxygenation and productivity at longer timescales than Ediacaran–Cambrian oscillations in ocean–atmosphere oxygenation12,13,14,21.
The contrasting sensitivities of modelled seawater trace metal concentrations and organic carbon burial rates to atmospheric oxygenation and marine productivity reveal a plausible environmental mechanism for the two major transitions observed in our deconvolved reconstructions of Mo, U and TOC (Fig. 2). In the late Neoproterozoic, we reconstruct a minor increase in marine trace metal concentrations but a major stepwise increase in organic carbon burial. These contrasting responses probably indicate a late Neoproterozoic increase in atmospheric oxygen, marine productivity and organic carbon burial without a major change in global deep-ocean oxygenation (Figs. 4 and 5). Relatively low seawater trace metal concentrations were maintained because most bottom waters remained reducing, although raw shale metal concentrations may have increased on average19,20 due to the impact of increased organic carbon loading and accompanying metal sequestration. In the mid-Palaeozoic, we reconstruct major increases in both marine trace metal concentrations and organic carbon burial, indicating that atmospheric oxygen and marine productivity increased to levels at which both proxies are expected to be sensitive. Deconvolved TOC and Mo–U records can therefore be reconciled if the Ediacaran–Cambrian increase in atmospheric oxygen and marine productivity was insufficient to oxygenate deep-ocean water masses and the global ocean did not become persistently oxygenated to near-modern levels until the Silurian–Devonian (Fig. 5).
Although trace metal proxies are commonly used to study ocean oxygenation through Earth history, our results highlight the indirect view that trace metals provide of shallow marine animal habitats (Fig. 4). While we refute suggestions that most of the global ocean became fully oxygenated in the late Neoproterozoic1, we do reconstruct a late Neoproterozoic increase in dissolved [O2] in shallow shelf environments. We therefore argue that whether there was an NOE depends on spatial and temporal perspective. From a global oceanographic viewpoint encompassing the deep ocean, there was essentially no NOE. From an ecophysiological perspective, considering the shallow shelf environments where most animals live and most of our fossil record is preserved, there was an NOE (Figs. 4 and 5). Specifically, our analyses indicate that shallow marine habitats were probably suboxic or severely hypoxic (0–22 µmol kg–1 O2) on average for most of the Neoproterozoic, hypoxic (22–63 µmol kg–1) in the early Palaeozoic and generally would not be considered oxic from a modern ecophysiological perspective (≥63 µmol kg–1) (ref. 6) until the Devonian (Fig. 5e). Across spatial oxygen gradients in the modern ocean, the differences in shallow marine oxygen concentrations that we reconstruct between the Neoproterozoic and Palaeozoic eras correspond to substantial differences in the functional and taxonomic diversity of marine animal communities6.
Establishing drivers of the sustained Ediacaran–Cambrian and Silurian–Devonian shifts in oxygenation and productivity is beyond the scope of this study, although we suggest that biogeochemical feedbacks related to volcanic reductant flux32 and the ecological expansion of land plants9,11 warrant particular research attention. Changes in proxy records may also have been modulated by changes in temperature and continental configuration15; however, these are unlikely to explain the scale of changes reconstructed in our statistical analyses (Extended Data Fig. 8). Moreover, changes in continental configuration impact primarily deep-water oxygenation15. Our multiproxy trace metal and organic carbon approach emphasizes a key role for atmospheric oxygen in controlling the oxygenation of shallow-water marine animal habitats.
The increase in marine productivity that we reconstruct in the late Neoproterozoic also has physiological implications for early animals, supporting hypotheses linking food supply to the Cambrian radiation16,33,34. The second major increase in atmospheric \(p_{{\mathrm{O}}_2}\), marine primary production, and shallow marine dissolved [O2] that we reconstruct for the Silurian–Devonian has similar implications for the potential roles of oxygen and food supply in the Devonian radiation of fishes35. We expect mechanistic ecophysiological modelling approaches—similar to those designed to investigate the capability of other inferred environmental changes to drive specific reconstructed biodiversity dynamics (for example, refs. 36,37)—to be critical in establishing the specific roles that oxygen and food supply may have played as drivers of early animal evolution. Nonetheless, these analyses establish sustained directional changes in both dissolved oxygen and export-driven food supply in late Neoproterozoic shallow marine habitats, with similar magnitudes to environmental gradients that play key roles in structuring the biodiversity and composition of modern marine ecosystems6. We thus demonstrate persistent reductions in key ecophysiological stressors at an appropriate time to have been bottom-up drivers of the polyphyletic radiation of marine animal groups during the Ediacaran–Cambrian transition.
Methods
Data processing
The complete Phase 1 dataset was downloaded from the SGP website (sgp-search.io; data API—{“type”:“nhhxrf”,“filters”:{},“show”:[“fe”,“fehr_fe_t”,“fe_py_fe_hr”,“toc”,“alu”,“mo”,“u”,“fe_t_al”,“coord_lat”,“coord_long”,“basin_type”,“meta_bin”,“environment_bin”,“lithology_name”,“max_age”,“min_age”,“interpreted_age”,“site_type”,“strat_name”,“strat_name_long”]}). This dataset was filtered to include only sedimentary samples with interpreted geological ages between 300 and 1,000 Ma (approximately Tonian–Carboniferous). The dataset was then further filtered to include only samples described as having shale-like or fine-grained lithologies (argillite, clay, claystone, dolomudstone, lime mudstone, meta-argillite, metapelite, metasiltstone, mud, mudstone, oil shale, pelite, phosphorite, shale, silt, siltite, siltstone and slate, plus samples assigned no lithology) and samples assigned marine depositional environments (basinal, outer shelf and inner shelf, plus samples assigned no depositional environment). We further removed samples with exceptionally high Mo or U concentrations that would be considered ore-grade metalliferous rocks (using a cut-off of 1,000 ppm). The distribution of categorical geological context variables in this primary dataset through geologic time is shown in Extended Data Fig. 1. The full filtering process and number of samples in the dataset after each filtering step as shown in our associated R Markdown files (Code availability).
Spatial–temporal weighted bootstrap analysis
For our spatial–temporal weighted analyses of bootstrapped means, we removed samples without geographic coordinate data and subsetted the primary SGP dataset using proxy-specific filters (Extended Data Table 1). Next, we binned the shale samples for each proxy into 25 Myr time bins. In each 25 Myr time bin, we assigned samples weights based on their temporal and spatial proximity to other samples in the bin using the inverse weighting algorithm of ref. 25. We then used these weights to generate distributions of 1,000 weighted bootstrapped means for each time bin and plotted those distributions as box and whisker plots. We present histograms illustrating the number of lithostratigraphic units sampled per time bin for each weighted bootstrap analysis. We also present box and whisker plots of the raw data used in these analyses (Extended Data Fig. 2). Full spatial–temporal weighted bootstrap methods are shown in our associated R Markdown files (Code availability).
Random forest analyses
We conducted Monte Carlo random forest analyses to generate statistical reconstructions of changes in four key marine biogeochemical variables from the Tonian to Carboniferous: seawater [Mo], seawater [U], proportion of sampled anoxic depositional environments that were euxinic and organic carbon burial. Each of these variables is semi-quantitatively reconstructed as a statistical model of changes in the mean value of a sedimentary geochemical proxy (shale [Mo], shale [U], proportion euxinic based on iron speciation, TOC) with geological time when all geologic context and geochemical variables known to impact the incorporation and preservation of the desired biogeochemical signal in sedimentary rocks are held constant. For each biogeochemical variable, a different combination of predictor variables is used, and additional filtering steps may be applied on the basis of the geological and biogeochemical processes associated with the specific proxy (Extended Data Table 1).
For Mo and U, these analyses are designed to identify trends in the seawater inventories of these metals, thus tracking the oxygenation of global bottom waters. Because authigenic trace metal enrichment will be strongly controlled by local redox state (which determines the reduction of a given metal) and organic carbon (which often acts as the sedimentary host18), these analyses include geochemical proxies (iron speciation and TOC) capable of tracking these processes. For iron speciation, these analyses are designed to reconstruct how sulfide levels have varied in sampled low-oxygen shelf–slope settings independent of other oceanographic factors. For TOC, these analyses are designed to reconstruct trends in organic carbon flux to sampled environments independent of other oceanographic factors, with supplementary analyses further indicating how organic carbon delivery to anoxic settings changed through this time interval. These analyses are targeted at directional trends rather than absolute values because the magnitude of deposition in shelf–slope environments is often higher than global averages. For example, mean organic carbon concentrations (Fig. 2) are higher than expected global averages because of the high concentrations found in shelf–slope settings where most of our sedimentary record comes from.
To avoid overfitting and to maximize model performance, we conduct hyperparameter tuning for each of the 100 random forest runs conducted for each biogeochemical variable. In this hyperparameter tuning process, we randomly subdivide our subsetted geochemical dataset into a training dataset (two-thirds of the data) and test dataset (one-third of the data) and then test model performance for a range of model parameters. The number of trees is varied (1, 4, 8, 16, 32, 64, 128 and 256 trees), the number of variables randomly sampled at each tree split is varied (mtry values of 2, 4, 6, 8 and 10) and the maximum node size is varied in logarithmic steps between 2 and the number of samples divided by 4. These three parameters are varied such that all combinations are explored and the variance and mean squared error are recorded for each random forest model. Of these models, the model with the lowest mean squared error is selected as the best model for that iteration (that is, that randomly assigned training dataset and associated assignments of geologic age and partial data; details follow). Extended Data Fig. 5 illustrates the mean squared errors and hyperparameters associated with each best-fit model. Extended Data Fig. 5 also shows the variance accounted for by each of these best-fit models: 60-70% of the data variance is accounted for by the Mo, U and TOC models, while 50–60% of the data variance is accounted for by the iron speciation model.
Some of the filtering steps used in data subsetting for our analyses (also in the preceding bootstrap analyses) involve iron speciation parameters that are interpreted according to standard thresholds24,42. These thresholds have recently been questioned31, although that compilation involved a large number of samples that would be deemed inappropriate for iron speciation according to criteria in ref. 42, and thus continued re-analyses of these thresholds are necessary. To test whether these iron speciation thresholds were influencing our results, we ran our molybdenum analyses (the analyses most dependent on iron speciation) with iron speciation values both as explicit filters on the dataset to isolate anoxic samples (Fig. 2) and as predictor variables in the analysis (Extended Data Fig. 3). The results are qualitatively very similar, indicating that ongoing debate about the use of specific iron speciation thresholds does not impact our results.
To maximize the number of samples that can be used in these models, samples that have data entered for the variable of interest, estimated age and any other necessary geochemical data for redox classification (FeHR/FeT and FePy/FeHR for Mo; FeHR/FeT or FeT/Al for U) but are missing other variables incorporated in the random forest analyses are included. For samples with partial data, missing data for categorical geologic context (lithology, site type, basin type, environmental bin, metamorphic bin) and geographic coordinates are randomly assigned in each Monte Carlo simulation from the full range of feasible values for each variable (as random forest analyses require a complete data matrix). For samples missing [Al] (in analyses that include [Al]), we impute [Al] by generating a random value between the 25th and 75th percentiles of [Al] values within the sample’s assigned lithology (on the basis of the entire primary dataset described in the preceding). For samples without an assigned lithology that are missing [Al], a random [Al] value between the 25th and 75th percentiles of [Al] values for all lithologies is imputed, and then lithology is randomly assigned. In our Monte Carlo approach, these random value assignments are conducted 100 times for each analysis, corresponding to 100 separate random forest models per proxy/biogeochemical process. The results of these 100 random forest models are then summarized as partial dependence plots and box plots to visualize the uncertainty associated with geologic age models (description follows) and samples with partial data.
All samples in our Monte Carlo random forest analyses are assigned ages using section-based age models and assigned geological age uncertainties. Samples that were not assigned maximum and minimum ages by SGP contributors are assigned age uncertainties of 25 Myr (±12.5 Myr relative to assigned interpreted age) for Neoproterozoic samples and 10 Myr (±5 Myr relative to assigned interpreted age) for Palaeozoic samples. In each Monte Carlo random forest analysis, we randomly assign each sample a geologic age based on its estimated maximum and minimum ages and its relationship with other samples in the sampled section to accommodate both uncertainties in geologic age models and the principle of stratigraphic superposition. These ages are assigned separately for each of 100 random forest models independently to incorporate uncertainty in assigned geologic ages. Age models are assigned by categorizing sections into eight categories under three broad classifications:
-
1.
Sections with height/depth data
-
a.
Samples are assigned continuously ascending interpreted age estimates.
-
b.
Samples are all assigned the same interpreted age estimate.
-
c.
Samples are assigned interpreted age estimates in clusters or steps.
-
a.
-
2.
Sections without height/depth data
-
a.
Samples are assigned continuously ascending interpreted age estimates.
-
b.
Samples are all assigned the same interpreted age estimate.
-
c.
Samples are assigned interpreted age estimates in clusters or steps.
-
a.
-
3.
Sections with only one sample
-
a.
Sample has stratigraphic height/depth.
-
b.
Sample does not have stratigraphic height/depth.
-
a.
In each scenario, ages are randomly assigned within younging upward sequences to incorporate both age uncertainty and the individual sample’s stratigraphic relationship to other samples in the section. Full age-model methods are available in the R Markdown code associated with this study (Code availability).
We present the results of our Monte Carlo random forest analyses as partial dependence plots. Partial dependence plots show the isolated marginal effect of a variable of interest (geologic time in the analyses presented in Fig. 2) on the predicted outcome of the random forest model43. We use partial dependence plots rather than other feature effect methods (for example, accumulated local effects plots) for both statistical and biogeochemical reasons. (1) Correlations between geologic time and model variables are consistently low in all model treatments. (2) While the mean tendency of proxy variables may shift through geologic time, observations that are less likely in a certain geological time interval (for example, rift basins or low TOC shale in the late Palaeozoic) are still represented in our dataset (Extended Data Fig. 1). This means that we never ask our models to make predictions in unrealistic feature space where the random forest model has not been trained. (3) Most important, we use global feature effects models because they exclude the impacts of other confounding variables linked to authigenic enrichments in sedimentary rocks. Partial dependence plots allow us (at least to a first approximation) to isolate changes in marine biogeochemistry by asking questions such as how do molybdenum concentrations in fine-grained siliciclastic rocks change independent of other secular changes in sampled sedimentary environments, including changes in average organic carbon loading, local redox and depositional environment? Accumulated local effects plots and other local feature effects models, however, would be expected to better represent average sedimentary enrichments for each geologic time interval (without, for example, standardizing for long-term variations in average local redox) because they do factor in changes in other predictor variables (in our case, characteristics of sedimentary rocks). Consequently, they are less likely to independently track the oceanographic and Earth system changes that we are most interested in investigating. We further choose to present partial dependence plots rather than individual conditional expectation plots because the research questions we address in this study are global in nature and benefit from the changes in global averages we can reconstruct from partial dependence analyses, although individual conditional expectation analyses have exciting potential to provide new insights into local-to-regional datasets. We present partial dependence plots as envelopes, summarizing the 100 random forest models in each Monte Carlo analysis (Fig. 2). The plotted envelopes are generated by linear interpolation of the 100 individual partial dependence plots generated per analysis at 0.1 Myr time intervals and computing the 5th, 25th, 75th and 95th percentiles of the interpolated partial dependence plot populations at each time step. Full Monte Carlo random forest methods are shown in our associated R Markdown files (Code availability).
Earth system modelling
We generated three-dimensional realizations of feasible ancient ocean biogeochemistry using the cGENIE Earth system model of intermediate complexity27. We conducted an ensemble modelling experiment, varying atmospheric \(p_{{\mathrm{O}}_2}\) and marine PO4 in 11 logarithmic increments between 1% and 100% of present atmospheric/oceanic levels (1.0%, 1.6%, 2.5%, 4.0%, 6.3%, 10.0%, 16.0%, 24.0%, 40.0%, 63.0%, 100.0%). Our main text results (ensemble #1) use a Cryogenian–Ediacaran (635 Ma) continental configuration, a shallow (50% of modern, 294.9745 m) e-folding depth to parameterize organic remineralization, 3,336 ppm (12 PAL) atmospheric \(p_{{\mathrm{CO}}_2}\) and appropriate Cryogenian–Ediacaran estimate of the solar constant (1,295.9701 W m–2, a 5.2653% reduction from modern), following ref. 44. All ensembles use a single limiting nutrient (PO4) scheme to parameterize biological export, following ref. 45. We conduct additional ensemble experiments as sensitivity analyses to investigate the impacts of the marine biological carbon pump (#2), continental configuration (#3), global climate (#4, #5) and combined continental configuration and global climate (#6). Ensemble #2 uses a modern e-folding depth (589.9451 m) and is otherwise identical to ensemble #1. Ensemble #3 uses an Ordovician continental configuration and solar constant39 and is otherwise identical to ensemble #1. Ensemble #4 and ensemble #5 use atmospheric \(p_{{\mathrm{CO}}_2}\) concentrations of 834 ppm (3 PAL) and 5,560 ppm (20 PAL), respectively, and are otherwise identical to ensemble #1. Ensemble #6 uses an Ordovician continental configuration and solar constant39, an atmospheric \(p_{{\mathrm{CO}}_2}\) concentration of 834 ppm (3 PAL) and is otherwise identical to ensemble #1. All cGENIE simulations are run for 10,000 years, at which point benthic oxygen concentrations have reached steady state.
For each experiment, we extract the mean annual dissolved oxygen concentrations for the final model year. We then categorize the bottom-water cells of each ocean realization into three categories on the basis of dissolved oxygen concentrations: anoxic (dissolved [O2] ≤ 0 μmol kg–1), suboxic (dissolved [O2] > 0 μmol kg–1; dissolved [O2] ≤ 4.8 μmol kg–1) and oxic (dissolved [O2] ≥ 4.8 μmol kg–1) (ref. 6). As all cGENIE configurations in this study use equal-area grid cells, the proportional extents of anoxic, suboxic and oxic seafloor are then computed as a direct proportion of all (non-land) model grid cells. For each experiment, we also calculate mean shelf [O2] by computing the mean dissolved [O2] of all ocean cells adjacent to land in the top three layers of the cGENIE ocean (<283.8 m water depth)38,39. All cGENIE experiments conducted for this study employ a 36 × 36 longitude–latitude grid with 16 depth layers. The spatial resolution of cGENIE results in a relatively coarse bathymetric grid that is not well suited for explicitly resolving bottom-water environments across steeply sloping shallow shelves because of the spatial coarsening applied to higher-resolution palaeogeographic reconstructions. We therefore employ an approximation of marine shelf environments (ocean cells adjacent to land in the top three layers of the cGENIE ocean, following refs. 38,39), that is, not dependent on a coarsened bathymetric grid and therefore not vulnerable to these spatial limitations. Our application of cGENIE thus enables us to capture well-defined oxygen minimum zones and decoupling of shallow- and deep-water oxygenation for a range of Earth system boundary conditions (Figs. 3 and 4 and Extended Data Fig. 7). These results are not impacted by the coarsened shelf–slope bathymetry inherent to intermediate complexity models. As the bathymetric gradients of deep-water environments are much shallower than on continental shelves, spatial coarsening does not dramatically impact our global estimates of seafloor anoxia, suboxia and oxic conditions. This approximation technique is therefore employed only for modelling shelf environments.
Stable atmospheric \(p_{{\mathrm{O}}_2}\) and marine productivity states
The comparatively coarse spatial resolution and associated computational efficiency of CANOPS means that it is viable to run large model ensembles with an open oxygen cycle on geologic timescales, whereas computational constraints mean that it would be unrealistic to run large-ensemble cGENIE experiments at the scale conducted here for the timescales required for an open oxygen cycle to reach equilibrium. We therefore use results from an ensemble of model experiments using the CANOPS Earth system model30 to establish a subset of our cGENIE Earth system model experiments that are expected to be stable on geologic timescales.
Trace metal mass balance
We use a three-sink Mo–U mass balance model28 to simulate seawater [Mo] and [U] for each cGENIE Earth system simulation. We use the extents of anoxic, suboxic and oxic seafloor calculated from our cGENIE ocean realizations (Extended Data Fig. 7) as forcings for the extents of redox-sensitive sinks in the mass balance model. The naming conventions fanox (fractional extent of anoxic seafloor), fsubox (fractional extent of suboxic seafloor) and foxic (fractional extent of oxic seafloor) are used equivalently to feux, fred and foxic in ref. 28 to directly match the parameters extracted from our cGENIE ocean simulations. The modern biogeochemical data underlying the parameterization of these sinks and associated fluxes remain the same. All other flux subscripts are correspondingly updated such that:
Thus, the change in the seawater inventory of Mo or U (Nsw) with respect to time is a function of the metal flux into the global ocean via rivers (Friv), minus the metal fluxes into redox-sensitive depositional environments (Foxic, Fsubox, Fanox). The pseudospatial scaling algorithm of ref. 28 (based on ref. 46) is also used here to incorporate the estimated attenuation of trace metal burial rates with water depth for scenarios with expansive reducing conditions in deep marine environments. For all parameters in the model analyses presented in this study, we use the mean value (midpoint) of the uniform distributions used to define model parameters such as riverine input and fluxes into redox-sensitive sinks in ref. 28. Our integrated cGENIE and mass balance approach reproduces modern seawater Mo and U concentrations at modern (for example, 100% PAL) levels of oxygen and phosphate (Fig. 3c,d).
We further present a global sensitivity analysis of this Mo–U mass balance model, illustrating contours of equal seawater [Mo] or [U] for varying balances of fanox and fsubox (Extended Data Fig. 9). This analysis uses the same parameters as the model applied to the cGENIE simulations, but fanox and fsubox are varied in 31 logarithmic steps between 0.1% and 100.0%.
Organic carbon burial
We use results from the ensemble CANOPS Earth system model experiment used to establish stable Earth system states30 to estimate organic carbon burial rates for each atmospheric \(p_{{\mathrm{O}}_2}\) and marine productivity scenario. Global marine organic carbon burial rates (as a function of present oceanic levels) are binned into the logarithmically spaced atmospheric \(p_{{\mathrm{O}}_2}\) and marine productivity scenarios, and the mean simulated organic carbon burial flux is calculated for each scenario. We refer the reader to ref. 30 for full details of model parameters and stability calculations.
Data availability
The Sedimentary Geochemistry and Paleoenvironments Project dataset used in this study is publicly available from sgp-search.io. The full API details for the Phase 1 dataset used in this study are described in Methods.
Code availability
The R code used to run the statistical analyses is available in R Markdown format and rendered as HTML and PDF files at https://github.com/richardstockey/sgp.trace.metals. This R Markdown code is available via Zenodo (https://doi.org/10.5281/zenodo.11426206)47. The code for the version of the ‘muffin’ release of the cGENIE Earth system model used in this paper is tagged as v.0.9.46 and is available via Zenodo (https://doi.org/10.5281/zenodo.10041805)48. Configuration files for the specific experiments presented in the paper can be found in the directory genie-userconfigs/PUBS/submitted/Stockey_et_al.NatGeo.2023. Details of the experiments, plus the command line needed to run each one, are given in the readme.txt file in that directory. All other configuration files and boundary conditions are provided as part of the code release. A manual detailing code installation, basic model configuration, tutorials covering various aspects of model configuration, experimental design and output, plus the processing of results, is available via Zenodo (https://doi.org/10.5281/zenodo.7545814 (v.0.9.35))49.
References
Lyons, T. W., Reinhard, C. T. & Planavsky, N. J. The rise of oxygen in Earth’s early ocean and atmosphere. Nature 506, 307–315 (2014).
Farquhar, J., Bao, H. & Thiemens, M. Atmospheric influence of Earth’s earliest sulfur cycle. Science 289, 756–758 (2000).
Crockford, P. W. et al. Triple oxygen isotope evidence for limited mid-Proterozoic primary productivity. Nature 559, 613–616 (2018).
Kump, L. R. The rise of atmospheric oxygen. Nature 451, 277–278 (2008).
Cole, D. B. et al. On the co-evolution of surface oxygen levels and animals. Geobiology 18, 260–281 (2020).
Sperling, E. A., Knoll, A. H. & Girguis, P. R. The ecological physiology of Earth’s second oxygen revolution. Annu. Rev. Ecol. Evol. Syst. 46, 215–235 (2015).
Stolper, D. A. & Keller, C. B. A record of deep-ocean dissolved O2 from the oxidation state of iron in submarine basalts. Nature 553, 323–327 (2018).
Sperling, E. A. et al. A long-term record of early to mid-Paleozoic marine redox change. Sci. Adv. 7, eabf4382 (2021).
Dahl, T. W. et al. Devonian rise in atmospheric oxygen correlated to the radiations of terrestrial plants and large predatory fish. Proc. Natl Acad. Sci. USA 107, 17911–17915 (2010).
Lu, W. et al. Late inception of a resiliently oxygenated upper ocean. Science 361, 174–177 (2018).
Wallace, M. W. et al. Oxygenation history of the Neoproterozoic to early Phanerozoic and the rise of land plants. Earth Planet. Sci. Lett. 466, 12–19 (2017).
Krause, A. J., Mills, B. J. W., Merdith, A. S., Lenton, T. M. & Poulton, S. W. Extreme variability in atmospheric oxygen levels in the late Precambrian. Sci. Adv. 8, eabm8191 (2022).
Wei, G. Y. et al. Global marine redox evolution from the late Neoproterozoic to the early Paleozoic constrained by the integration of Mo and U isotope records. Earth Sci. Rev. 214, 103506 (2021).
Sahoo, S. K. et al. Oceanic oxygenation events in the anoxic Ediacaran ocean. Geobiology 14, 457–468 (2016).
Pohl, A. et al. Continental configuration controls ocean oxygenation during the Phanerozoic. Nature 608, 523–527 (2022).
Sperling, E. A. & Stockey, R. G. The temporal and environmental context of early animal evolution: considering all the ingredients of an ‘explosion’. Integr. Comp. Biol. 58, 605–622 (2018).
Berner, R. A. The Phanerozoic Carbon Cycle (Oxford Univ. Press, 2004); https://doi.org/10.1093/oso/9780195173338.001.0001
Tribovillard, N., Algeo, T. J., Lyons, T. & Riboulleau, A. Trace metals as paleoredox and paleoproductivity proxies: an update. Chem. Geol. 232, 12–32 (2006).
Scott, C. et al. Tracing the stepwise oxygenation of the Proterozoic ocean. Nature 452, 456–459 (2008).
Partin, C. A. et al. Large-scale fluctuations in Precambrian atmospheric and oceanic oxygen levels from the record of U in shales. Earth Planet. Sci. Lett. 369–370, 284–293 (2013).
Sahoo, S. K. et al. Ocean oxygenation in the wake of the Marinoan glaciation. Nature 489, 546–549 (2012).
Farrell, Ú. C. et al. The Sedimentary Geochemistry and Paleoenvironments Project. Geobiology 19, 545–556 (2021).
Morford, J. L. & Emerson, S. The geochemistry of redox sensitive trace metals in sediments. Geochim. Cosmochim. Acta 63, 1735–1750 (1999).
Poulton, S. W. & Canfield, D. E. Ferruginous conditions: a dominant feature of the ocean through Earth’s history. Elements 7, 107–112 (2011).
Mehra, A. et al. Curation and analysis of global sedimentary geochemical data to inform Earth history. GSA Today 31, 4–9 (2021).
Sperling, E. A. et al. Statistical analysis of iron geochemical data suggests limited late Proterozoic oxygenation. Nature 523, 451–454 (2015).
Ridgwell, A. et al. Marine geochemical data assimilation in an efficient Earth system model of global biogeochemical cycling. Biogeosciences 4, 87–104 (2007).
Stockey, R. G. et al. Persistent global marine euxinia in the early Silurian. Nat. Commun. 11, 1804 (2020).
Ozaki, K. & Tajika, E. Biogeochemical effects of atmospheric oxygen concentration, phosphorus weathering, and sea-level stand on oceanic redox chemistry: implications for greenhouse climates. Earth Planet. Sci. Lett. 373, 129–139 (2013).
Cole, D. B., Ozaki, K. & Reinhard, C. T. Atmospheric oxygen abundance, marine nutrient availability, and organic carbon fluxes to the seafloor. Global Biogeochem. Cycles 36, e2021GB007052 (2022).
Pasquier, V., Fike, D. A., Révillon, S. & Halevy, I. A global reassessment of the controls on iron speciation in modern sediments and sedimentary rocks: a dominant role for diagenesis. Geochim. Cosmochim. Acta 335, 211–230 (2022).
Alcott, L. J., Mills, B. J. W. & Poulton, S. W. Stepwise Earth oxygenation is an inherent property of global biogeochemical cycling. Science 366, 1333–1337 (2019).
Brocks, J. J. The transition from a cyanobacterial to algal world and the emergence of animals. Emerg. Top. Life Sci. 2, 181–190 (2018).
Laakso, T. A., Sperling, E. A., Johnston, D. T. & Knoll, A. H. Ediacaran reorganization of the marine phosphorus cycle. Proc. Natl Acad. Sci. USA 117, 11961–11967 (2020).
Dahl, T. W. & Hammarlund, E. U. Do large predatory fish track ocean oxygenation? Commun. Integr. Biol. 4, 92–94 (2011).
Boag, T. H., Gearty, W. & Stockey, R. G. Metabolic tradeoffs control biodiversity gradients through geological time. Curr. Biol. 31, 2906–2913.e3 (2021).
Penn, J. L., Deutsch, C., Payne, J. L. & Sperling, E. A. Temperature-dependent hypoxia explains biogeography and severity of end-Permian marine mass extinction. Science 362, eaat1327 (2018).
Pohl, A. et al. Why the early Paleozoic was intrinsically prone to marine extinction. Sci. Adv. 9, eadg7679 (2023).
Stockey, R. G., Pohl, A., Ridgwell, A., Finnegan, S. & Sperling, E. A. Decreasing Phanerozoic extinction intensity as a consequence of Earth surface oxygenation and metazoan ecophysiology. Proc. Natl Acad. Sci. USA 118, 2021 (2021).
Kocsis, Á. T., Reddin, C. J., Alroy, J. & Kiessling, W. The R package divDyn for quantifying diversity dynamics using fossil sampling data. Methods Ecol. Evol. 10, 735–743 (2019).
D’Antonio, M. P., Ibarra, D. E. & Boyce, C. K. Land plant evolution decreased, rather than increased, weathering rates. Geology 48, 29–33 (2020).
Raiswell, R. et al. The iron paleoredox proxies: a guide to the pitfalls, problems and proper practice. Am. J. Sci. 318, 491–526 (2018).
Molnar, C. Interpretable Machine Learning—A Guide for Making Black Box Models Explainable 2nd edn christophm.github.io/interpretable-ml-book/ (2020).
Reinhard, C. T. et al. The impact of marine nutrient abundance on early eukaryotic ecosystems. Geobiology 18, 139–151 (2020).
Meyer, K. M., Ridgwell, A. & Payne, J. L. The influence of the biological pump on ocean chemistry: implications for long-term trends in marine redox chemistry, the global carbon cycle, and marine animal ecosystems. Geobiology 14, 207–219 (2016).
Reinhard, C. T. et al. Proterozoic ocean redox and biogeochemical stasis. Proc. Natl Acad. Sci. USA 110, 5357–5362 (2013).
Stockey, R. Richardstockey/sgp.trace.metals: submission for publication. Zenodo https://doi.org/10.5281/zenodo.11426206 (2024).
Ridgwell, A. et al. Derpycode/cgenie.muffin: v0.9.46. Zenodo https://doi.org/10.5281/zenodo.10041805 (2023).
Ridgwell, A. et al. Derpycode/muffindoc: v0.9.35. Zenodo https://doi.org/10.5281/zenodo.7545814 (2023).
Acknowledgements
We thank A.-S. Ahm, W. Chan, M. Clarkson, K. Doyle, J. Dumoulin, T. Fraser, J. Husson, B. Johnson, F. Kurzweil, A. Lenz, X. Lu, Y. Liu, A. Miller, P. Petrov, S. Richoz, P. Sack, C. Scott, S. Slotznick, S. Spinks, S. Tecklenburg, D. Thompson, H. Wang, L. Xang and J. Wang for their contribution to the Sedimentary Geochemistry and Paleoevironments Project dataset used in this study. We thank A. Ridgwell and A. Pohl for helpful discussions. A.J.M.J. publishes with permission of the Senior Executive Director, Northern Territory Geological Survey. We thank Stanford University, the Stanford Research Computing Center, the IRIDIS High Performance Computing Facility and associated High Performance Computing support services at the University of Southampton for providing computational resources and support during this research. This research and the SGP are funded by NSF grants EAR-1922966 and EAR-2143164 to E.A.S. We further thank the donors of The American Chemical Society Petroleum Research Fund for partial support of this research (61017-ND2).
Author information
Authors and Affiliations
Contributions
R.G.S., E.A.S. and N.J.P. designed the research. U.C.F. conducted all geoinformatics tasks associated with building and compiling our geochemical database. R.G.S. led the research, performed statistical analyses and conducted palaeo-oceanographic modelling. R.G.S., D.B.C., U.C.F., H.A., T.H.B., J.J.B., D.E.C., M.C., P.W.C., H.C., T.W.D., L.D.M., K.D., S.Q.D., J.F.E., R.R.G., T.M.G., B.C.G., G.J.G., K.G., R.G., G.H., E.U.H., K.H., M.A.H., C.M.H., M.S.W.H., A.J.M.J., D.T.J., P.K., J.K., A.H.K., M.K., M.A.L., C.L., D.K.L., F.A.M., J.M.M., N.T.M., L.M.O., B.O., A.P., S.E.P., S.M.P., S.W.P., S.R.R., A.D.R., S.S., E.F.S., J.V.S., G.J.U., T.W., R.A.W., C.R.W., I.Y., N.J.P. and E.A.S. provided detailed geological context information on samples, provided unpublished geochemical data and helped interpret results. R.G.S. and E.A.S. wrote the manuscript, with important discussion and contributions from all authors.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Geoscience thanks Zunli Lu, Edel O’Sullivan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Xujia Jiang, in collaboration with the Nature Geoscience team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Histograms showing the distribution of geologic context variables for the primary SGP dataset used in this study.
A) Lithology, B) Site Type, C) Basin Type, D) Environmental Bin, E) Metamorphic Bin. Details regarding these geological and geographic context variables can be found on the SGP wiki (https://github.com/ufarrell/sgp_phase1/wiki/Database-description#geological-context).
Extended Data Fig. 2 Distributions of raw geochemical data in sub-datasets used in spatial-temporal weighted bootstrap analyses (Fig. 1 of main text).
A) molybdenum concentrations in euxinic shales, B) uranium concentrations in anoxic shales, C) proportion of anoxic shales that are euxinic based on iron speciation, D) total organic carbon (TOC) in all shales. Box and whisker plots illustrate the distribution of all data. Central box lines correspond to the median of the distribution for each time bin; lower/upper box boundaries correspond to the 25th and 75th percentiles; lower/upper whiskers correspond to the smallest/largest value no further than 1.5 times the interquartile range from the lower/upper box boundary; points indicate outliers from the whisker range. As box and whisker plots of proportion euxinic data are not very informative about the structure of the data, we overlay red data points illustrating the mean of the data for each time bin for panel C. Histograms show the number of lithostratigraphic units used in the bootstrap analyses for each time bin.
Extended Data Fig. 3 Impact of using iron speciation values as predictors rather than filters in molybdenum random forest analyses.
These analyses eliminate any assumptions linked to specific iron speciation thresholds used in main text analyses (Extended Data Table 1).
Extended Data Fig. 4 Random forest variable importance plots.
Panels A, C, E, and G show increased mean squared error estimates for molybdenum, uranium, proportion euxinic, and total organic carbon analyses, respectively. Panels B, D, F, and H show increased node purity estimates for molybdenum, uranium, proportion euxinic, and total organic carbon analyses, respectively. Box and whisker plots summarize the results from the 100 Monte Carlo random forest analyses presented in the main text for each proxy. Central box lines correspond to the median of the distribution for each time bin; lower/upper box boundaries correspond to the 25th and 75th percentiles; lower/upper whiskers correspond to the smallest/largest value no further than 1.5 times the interquartile range from the lower/upper box boundary; points indicate outliers from the whisker range. For both variable importance metrics, higher values indicate that a variable is more important in determining the predictions of the model.
Extended Data Fig. 5 Summaries of model fit, parameters and performance for random forest analyses, illustrating the distribution for the best fit model across 100 Monte Carlo random forest iterations.
Panels A-C show mean squared error for best fit model for A) molybdenum (ppm) and uranium (ppm), B) TOC (wt %), C) iron speciation proportion euxinic, denoted by Fepy (range of 0-1). Mean squared error is plotted across three panels due to the different units of different proxies. Panel D shows the percentage variance described by the best fit model for each proxy. Panel E shows the number of trees used in the best fit model for each proxy. Panel F shows the mtry value for the best fit model for each proxy. Panel G shows the node size for the best fit model for each proxy. Central box lines correspond to the median of the distribution for each time bin; lower/upper box boundaries correspond to the 25th and 75th percentiles; lower/upper whiskers correspond to the smallest/largest value no further than 1.5 times the interquartile range from the lower/upper box boundary; points indicate outliers from the whisker range.
Extended Data Fig. 6 Impact of alternative redox treatments on random forest analyses.
Partial dependence plots illustrate the marginal effect of geologic time on A) molybdenum, B) uranium, C) proportion of euxinic depositional environments, D) total organic carbon. All analyses included here include the full suite of geologic context variables used in all random forest analyses. Redox filters are color-coded for each proxy. Envelopes represent the 25th to 75th percentiles of the distribution of interpolated partial dependence plot values from 100 Monte Carlo random forest analyses for each timestep. See Extended Data Table 1 for full model predictor variables.
Extended Data Fig. 7 Spatial illustrations of cGENIE models arrays.
A) Maps of redox classifications for cGENIE bottom waters. Maps illustrate bottom water dissolved oxygen concentrations binned into oxic (dissolved [O2] ≥ 4.8 μmol/kg), suboxic (dissolved [O2] > 0 μmol/kg; dissolved [O2] ≤ 4.8 μmol/kg) and anoxic (dissolved [O2] ≤ 0 μmol/kg) categories for the cGENIE ensemble experiment presented in Fig. 3. B) Hypoxia classification of cGENIE equatorial transects. Ocean transects show equatorial cross sections of the three-dimensional ocean models presented in Fig. 3 binned into anoxic, suboxic, severe hypoxia, hypoxia and oxic categories (following Sperling et al.6) to illustrate the impact of oxygen-productivity scenarios on broad physiological thresholds for marine animals. In panel B only the stable oxygen-productivity scenarios depicted in Fig. 3 are shown.
Extended Data Fig. 8 cGENIE configuration sensitivity.
In this sensitivity analysis, we replicate Fig. 3 for A) Cryogenian-Ediacaran continental configuration at 12 PAL CO2 with shallow remineralization depth (Fig. 3), B) Ordovician continental configuration at 12 PAL CO2 with shallow remineralization depth, C) Cryogenian-Ediacaran continental configuration at 12 PAL CO2 with modern remineralization depth, D) Cryogenian-Ediacaran continental configuration at 3 PAL CO2 with shallow remineralization depth, E) Cryogenian-Ediacaran continental configuration at 20 PAL CO2 with shallow remineralization depth, F) Ordovician continental configuration at 3 PAL CO2 with shallow remineralization depth.
Extended Data Fig. 9 Mo-U mass balance global sensitivity analysis.
Impact of fanox and fsubox on the estimated concentrations of molybdenum and uranium in seawater using the Mo-U mass balance model of ref. 28.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Stockey, R.G., Cole, D.B., Farrell, U.C. et al. Sustained increases in atmospheric oxygen and marine productivity in the Neoproterozoic and Palaeozoic eras. Nat. Geosci. 17, 667–674 (2024). https://doi.org/10.1038/s41561-024-01479-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s41561-024-01479-1