OPTiMAL: a new machine learning approach for GDGT-based palaeothermometry

Research output: Contribution to journal › Article › peer-review


In the modern oceans, the relative abundances of glycerol dialkyl glycerol tetraether (GDGT) compounds produced by marine archaeal communities show a significant dependence on the local sea surface temperature at the site of deposition. When preserved in ancient marine sediments, the measured abundances of these fossil lipid biomarkers thus have the potential to provide a geological record of long-term variability in planetary surface temperatures. Several empirical calibrations have been made between observed GDGT relative abundances in late Holocene core-top sediments and modern upper ocean temperatures. These calibrations form the basis of the widely used TEX86 palaeothermometer. There are, however, two outstanding problems with this approach: first, the appropriate assignment of uncertainty to estimates of ancient sea surface temperatures based on the relationship of the ancient GDGT assemblage to the modern calibration dataset, and second, the problem of making temperature estimates beyond the range of the modern empirical calibrations (> 30 °C). Here we apply modern machine learning tools, including Gaussian process emulators and forward modelling, to develop a new mathematical approach we call OPTiMAL (Optimised Palaeothermometry from Tetraethers via MAchine Learning) to improve temperature estimation and the representation of uncertainty based on the relationship between ancient GDGT assemblage data and the structure of the modern calibration dataset. We reduce the root mean square uncertainty on temperature predictions (validated using the modern dataset) from ~±6 °C using TEX86-based estimators to ±3.6 °C using Gaussian process estimators for temperatures below 30 °C. We also provide a new quantitative measure of the distance between an ancient GDGT assemblage and the nearest neighbour within the modern calibration dataset, as a test for significant non-analogue behaviour.
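As a rough illustration of two quantities mentioned in the abstract, the sketch below computes the standard TEX86 ratio from four GDGT abundances and a nearest-neighbour distance between an ancient assemblage and a modern calibration set. It is a minimal sketch under stated assumptions, not the OPTiMAL implementation: the function names are hypothetical, the abundance values are invented for illustration, and the plain Euclidean distance on normalised abundances is a stand-in for whatever distance measure the paper defines.

```python
import math


def tex86(gdgt1, gdgt2, gdgt3, cren_isomer):
    """Standard TEX86 ratio from GDGT-1, GDGT-2, GDGT-3 and the crenarchaeol isomer."""
    numerator = gdgt2 + gdgt3 + cren_isomer
    return numerator / (gdgt1 + numerator)


def normalise(assemblage):
    """Convert raw abundances to fractional abundances that sum to 1."""
    total = sum(assemblage)
    return [value / total for value in assemblage]


def nearest_neighbour_distance(sample, calibration):
    """Smallest Euclidean distance from the sample to any calibration assemblage.

    ASSUMPTION: unweighted Euclidean distance on fractional abundances;
    the published method may weight or scale each GDGT differently.
    """
    fractions = normalise(sample)
    return min(math.dist(fractions, normalise(row)) for row in calibration)


# Invented abundances, for illustration only (not real core-top data).
modern_calibration = [
    [10.0, 5.0, 2.0, 1.0],
    [8.0, 6.0, 3.0, 2.0],
]
ancient_sample = [4.0, 6.0, 5.0, 5.0]

print(tex86(*ancient_sample))
print(nearest_neighbour_distance(ancient_sample, modern_calibration))
```

A large distance relative to the spread of the calibration set would flag the ancient sample as a possible non-analogue assemblage; where to set that threshold is a choice this sketch leaves open.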

Bibliographic note

Funding Information: Acknowledgements. Tom Dunkley Jones, James A. Bendle, Ilya Mandel, Kirsty Edgar and Yvette L. Eley acknowledge NERC grant NE/P013112/1. Sarah E. Greene was supported by NERC Independent Research Fellowship NE/L011050/1 and NERC large grant NE/P01903X/1. William Thomson acknowledges the Wellcome Trust (grant 1516ISSFFEL9) for funding a parameterisation workshop at the University of Birmingham (UK). William Thomson, Tom Dunkley Jones and Ilya Mandel would like to thank the BBSRC UK Multi-Scale Biology Network grant no. BB/M025888/1. Ilya Mandel is a recipient of the Australian Research Council Future Fellowship, FT190100574. We are grateful for the extensive and constructive reviews of Yige Zhang, Jessica Tierney, Peter Bijl and an anonymous reviewer, the comments of Huan Yang, and the great patience of Editor Alberto Reyes. We would also like to give special thanks to Elizabeth Thomas and Isla Castañeda for providing primary GDGT data during a global pandemic when access to laboratories was far from straightforward.


Original language: English
Pages (from-to): 2599–2617
Number of pages: 19
Journal: Climate of the Past Discussions
Issue number: 6
Publication status: Published - 23 Dec 2020