There are dozens of disciplines and subdisciplines within the broad ambit of climate science, studying everything from ancient geology to the spread of disease. But one discipline in particular is exposed to intense public scrutiny, the subject of long-running political and legal disputes: modeling.
As interesting as the details of climate science may be, what society most needs from it is an answer to a simple question: What the hell is going to happen? What are we in for? That’s the question models seek to answer.
It turns out that attempting to understand, model, and predict the entire global biophysical/atmospheric system is complicated. It’s especially tricky because there’s no way to run tests. There’s no second Earth to use as an experimental control group. The best scientists can do is use their knowledge of climate history and climate physics to build models of Earth systems and then test the models against future emission scenarios.
This reliance on models has always been a bête noire for climate change deniers, who have questioned the models’ accuracy as a way of casting doubt on their dire projections. For years, it has been a running battle between scientists and their critics, with the former rallying to defend one dataset and model after another. (The invaluable site Skeptical Science has a page devoted to attacks on modeling, with links to further reading.)
Now, for the first time, a group of scientists (Zeke Hausfather of UC Berkeley, Henri Drake and Tristan Abbott of MIT, and Gavin Schmidt of the NASA Goddard Institute for Space Studies) has done a systematic review of climate models, dating back to the early 1970s. Published in Geophysical Research Letters, it tests model performance against a simple metric: how well they predicted global mean surface temperature (GMST) through 2017, the most recent year for which observational data was available.
Long story short: “We find that climate models published over the past five decades were generally quite accurate in predicting global warming in the years after publication.”
This is contrary to deniers, who claim that models overestimate warming, and contrary to the bizarre op-ed the New York Times ran in November, which claimed that scientists underestimate warming. As it happens, models have roughly hit the mark all along. It’s just, nobody listened.
The good news, as the authors say, is that this result “increases our confidence that models are accurately projecting global warming.” As uncertain as we may be about our future emissions (more on that later), we have a pretty good handle on how the planet is going to respond to them.
The bad news is that the projections from those models are unrelentingly grim, so accuracy isn’t very reassuring.
Let’s take a quick look at how the review worked.
Five decades of climate models, more or less on point
The researchers did a comprehensive literature review for pre-1990 models; for post-1990 models, they followed the literature reviews of the Intergovernmental Panel on Climate Change (IPCC). They ended up choosing 17 models to closely analyze, dating from 1970 through 2007 — models old enough to be testable against decades of observational data.
To be clear, almost all the models chosen are no longer in use, having since been superseded by more sophisticated models. Some of the earlier ones, especially those from the 1970s and early ’80s, are fairly crude energy-in, energy-out models, with a single variable for forcing (CO2) and a crude measure of climate sensitivity (the amount temperature rises for a given increase in atmospheric CO2, conventionally expressed per doubling of concentration). It wasn’t until the late ’80s that James Hansen and other scientists developed multivariable general-circulation models.
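To make "energy-in, energy-out" concrete, here is a minimal sketch of that kind of single-forcing calculation. It is not any specific model from the review. The logarithmic CO2 forcing fit, the 280 ppm baseline, and the 3.0 degrees C per doubling sensitivity are all assumptions chosen for illustration, and the sketch ignores the lag from ocean heat uptake entirely.

```python
import math

# Assumption: a widely used simplified fit for CO2 radiative forcing,
# F = 5.35 * ln(C / C0), in watts per square meter.
FORCING_COEFF = 5.35
F_2X = FORCING_COEFF * math.log(2.0)  # forcing from one doubling of CO2


def co2_forcing(c_ppm, c0_ppm=280.0):
    """Radiative forcing (W/m^2) of CO2 at c_ppm relative to a baseline c0_ppm."""
    return FORCING_COEFF * math.log(c_ppm / c0_ppm)


def equilibrium_warming(c_ppm, sensitivity_per_doubling=3.0, c0_ppm=280.0):
    """Warming (deg C) for a given CO2 level, scaling forcing by sensitivity.

    sensitivity_per_doubling is a hypothetical illustrative value.
    """
    return sensitivity_per_doubling * co2_forcing(c_ppm, c0_ppm) / F_2X


if __name__ == "__main__":
    # A doubling from 280 to 560 ppm returns the assumed sensitivity
    # (approximately 3.0) by construction.
    print(equilibrium_warming(560.0))
    print(equilibrium_warming(420.0))
```

Everything interesting about a model like this lives in two numbers: the forcing you feed it and the sensitivity you assume.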
It turns out that even those crude early models were fairly accurate, which is remarkable given how unsophisticated the science was and how little computing power was available at the time. None of the models the authors analyzed got it badly wrong.
There is one important nuance to keep in mind here, which helps illuminate the ways that climate models are evolving and improving over time.
Predicting physics vs. predicting humans
There are two basic factors that contribute to the accuracy of a model’s projections. The first is physics — how various biophysical systems like the ocean and atmosphere respond to external radiative “forcings” like carbon dioxide and other greenhouse gases. That’s the stuff we expect climate scientists to get right.
The second is the level of forcings, i.e., how many tons of GHGs are actually pumped into the atmosphere. That’s not a matter of physics; it’s more about demographics, economics, history, and sociology. It’s about how human societies and technologies develop, which depends on endless variables that climate scientists can’t possibly be expected to predict (not that anyone else can either).
Scientists generally project a range of forcings, with high, low, and medium scenarios, but they can still be off in one direction or another — and it’s not fair to blame the models when those projections of forcings turn out to be mistaken. It’s the physics for which we should be holding models accountable.
With that in mind, the authors tested the models against two different metrics. One is “temperature vs. time,” which is simply, how closely did the model predict observed changes in global mean surface temperature (GMST) over time?
The second is “temperature vs. change in radiative forcing” (or “implied TCR,” for transient climate response), which asks, how accurately did the model predict the amount of warming per unit of change in radiative forcing? This is arguably a fairer assessment of a model, since it measures it purely on the accuracy of its physics, not on the accuracy of its predictions about pollution.
Here are both metrics, plotted against observed GMST:
On the first metric, temperature vs. time, 10 of the 17 models were consistent with observed GMST. Three predicted temperatures too low, four too high.
On the second (and better) metric, implied TCR, 14 of 17 models were consistent with the observed relationship between forcings and temperature change. Two had an implied TCR that was too high, one too low. That’s extremely accurate overall.
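For readers who want to see the difference between the two metrics, here is a rough sketch of how each could be computed from a series of years, forcings, and temperatures. It is not the authors' code. It assumes implied TCR is the least-squares slope of temperature against forcing, scaled by a conventional 3.7 W/m^2 for a doubling of CO2, and the function names and sample numbers are hypothetical.

```python
F_2X = 3.7  # W/m^2 per CO2 doubling; a conventional value, assumed here


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def warming_rate_per_decade(years, temps):
    """Metric 1: temperature vs. time, in deg C per decade."""
    return 10.0 * ols_slope(years, temps)


def implied_tcr(forcings, temps, f_2x=F_2X):
    """Metric 2: temperature vs. forcing, scaled to deg C per CO2 doubling."""
    return f_2x * ols_slope(forcings, temps)


if __name__ == "__main__":
    # Made-up illustrative series, not data from the review.
    years = [2000, 2005, 2010, 2015]
    forcings = [2.0, 2.2, 2.4, 2.6]    # W/m^2
    temps = [0.40, 0.50, 0.60, 0.70]   # deg C anomaly
    print(warming_rate_per_decade(years, temps))
    print(implied_tcr(forcings, temps))
```

The same temperature series feeds both functions. Only the first depends on how fast forcings grew over time, which is why the second isolates the physics.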
Models are dialing in their physics
The authors assigned “skill scores” to models based on their accuracy, where 1 is perfect prediction and zero is no better than chance. On both metrics, temperature vs. time and implied TCR, the “average of the median skill scores across all the model projections evaluated” was 0.69.
But this obscures something important.
Early models performed quite well along the first metric; they predicted actual temperature change fairly accurately. But it turns out there was a bit of luck involved.
They got some physics wrong, in that they didn’t accurately estimate how much heat the ocean would absorb, so they overestimated how much surface temperatures would rise for a given forcing. Their implied TCR was too high. On the flip side, they badly underestimated the change in radiative forcings, mainly because most of them only included CO2, not other GHGs.
“These early models end up getting their temperature vs. time projections mostly right because the too-high transient climate response is counteracted by too-low future forcings,” Hausfather told me. So the overall median of scores for temperature vs. time projections was good, but maybe slightly by accident.
What this obscures is that implied TCR performance is generally getting better over time. Models are getting better at estimating the amount of climate change that will result from a given ton of GHGs. They’re getting their physics dialed in.
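The offsetting errors Hausfather describes are easy to see with a little arithmetic. The sketch below assumes warming scales as TCR times the change in forcing, divided by the forcing from a doubling of CO2 (taken as 3.7 W/m^2). Every number in it is hypothetical, picked only so that the two errors cancel.

```python
F_2X = 3.7  # W/m^2 per CO2 doubling; a conventional value, assumed here


def projected_warming(tcr, delta_forcing, f_2x=F_2X):
    """Warming (deg C) implied by a transient response and a forcing change."""
    return tcr * delta_forcing / f_2x


# Hypothetical "real world": moderate TCR, larger forcing change.
actual = projected_warming(tcr=1.8, delta_forcing=1.5)

# Hypothetical early model: TCR 50 percent too high, forcing a third too low.
early_model = projected_warming(tcr=2.7, delta_forcing=1.0)

if __name__ == "__main__":
    # The two temperature projections agree, even though both inputs are off.
    print(actual, early_model)
    # The physics error only shows up when you compare TCR directly.
    print(2.7 / 1.8)
```

On a temperature vs. time chart this model would look nearly perfect. Only the implied TCR comparison reveals that it was right for the wrong reasons.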
It will always be difficult to predict future radiative forcings, because they are dominated by anthropogenic emissions, anthropogenic emissions depend on human behavior, and human behavior is inscrutable. We don’t know how human societies are ultimately going to respond to climate change, whether they will manage to change course or continue blundering ahead into catastrophe.
All we can ask of climate models is that they accurately tell us what Earth’s biophysical systems will do in response to our behavior. And every indication suggests that models are doing just that. For five decades now, they have warned us that we are marching toward ruin, and we have, for the most part, ignored them. We cannot claim that we did not know what we were doing. We knew.