The weekend of July 30, a group of intellectual heavyweights met at a beautiful vineyard in California's Napa Valley. Their agenda was modest: learn how to predict the future.
The "class," organized by Edge, was led by Philip Tetlock, a University of Pennsylvania psychologist who has made the study of prediction his life's work. For the past several years, Tetlock and his colleagues have been running a project supported by the US intelligence community. Their goal is to find ways to accurately predict major events in world affairs, such as whether Vladimir Putin will lose power in Russia.
Now they're sharing what they've found with the world. The results are astonishing: Tetlock's team found that some people are "superforecasters" who, when placed in teams, can produce a surprisingly good track record at predicting the future of world affairs. And Tetlock thinks he might know why.
Who is good at predicting future events? Who isn't?
Tetlock's project was born out of failure. Failure, specifically, of people like me: pundits and subject matter experts.
Between 1987 and 2003, Tetlock asked 284 people who "comment[ed] or offer[ed] advice on political and economic trends" professionally to make a series of predictive judgments about the world: 82,361, in total. Sample questions, according to the New Yorker's Louis Menand, included things like "Would there be a nonviolent end to apartheid in South Africa?" and "Would Gorbachev be ousted in a coup?"
Tetlock's findings, documented in a 2005 book called Expert Political Judgment, are pretty interesting. Tetlock asks us to imagine a bunch of chimpanzees throwing darts at a board full of predictions — a metaphor for random choice. These "dart-throwing chimps," in Tetlock's famous phrasing, would be more accurate than the so-called experts. Knowing a lot about Russia didn't help the experts predict what was going to happen in the country — in fact, it seemed to make them worse.
That seemed concerning. Foreign policy analysis depends on the idea that people with more information can formulate better policy. Obviously, there is more to policy than predicting certain outcomes in foreign countries, but it's certainly part of it. If the experts tend to be wrong about those predictions, it calls into question all of their analysis as well as the policies they come up with.
Tetlock wanted to know more. In 2011, the Intelligence Advanced Research Projects Activity (IARPA) — the US intelligence community's equivalent of DARPA — announced a contest. Researchers were invited to submit proposals on how they might go about developing a better way to predict future events.
Tetlock's group, founded along with Penn's Barbara Mellers and UC Berkeley's Don Moore, aimed to harness the wisdom of crowds. The Good Judgment Project, as it was called, asked huge numbers of people (it ended up being around 20,000) to make judgments about future world events. The researchers then used a variety of algorithms to sort through all of the individual predictions and come up with an aggregate prediction.
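The project's actual aggregation algorithms aren't described here, so the following is only a minimal sketch of what combining many individual forecasts into one number can look like. The function names, the sample probabilities, and the accuracy-based weights are all hypothetical, chosen purely for illustration.

```python
from statistics import mean, median


def aggregate_mean(probabilities):
    """Simple average of individual probability forecasts."""
    return mean(probabilities)


def aggregate_median(probabilities):
    """Middle forecast; less sensitive to a single extreme guess."""
    return median(probabilities)


def aggregate_weighted(probabilities, weights):
    """Weighted average of forecasts.

    The weights are a hypothetical stand-in for something like each
    forecaster's past accuracy; they are not taken from the project.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per forecast")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(probabilities, weights)) / total


if __name__ == "__main__":
    # Five made-up forecasts for a single yes/no question.
    forecasts = [0.6, 0.7, 0.55, 0.9, 0.65]
    # Made-up weights: more weight for forecasters assumed to be more reliable.
    weights = [1, 2, 1, 0.5, 2]

    print("mean:    ", round(aggregate_mean(forecasts), 3))
    print("median:  ", round(aggregate_median(forecasts), 3))
    print("weighted:", round(aggregate_weighted(forecasts, weights), 3))
```

The three approaches can disagree: one confident outlier pulls the plain average up, the median ignores it, and the weighted version depends entirely on how much each forecaster is trusted.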
To be clear, these were not crystal ball–style predictions in which participants were asked to guess, say, who would win the World Series. Rather, they were specific questions on known issues with plenty of readily available information, from which participants could draw reasoned conclusions about future outcomes.
One question, for example, was, "Who will be inaugurated as President of Russia in 2012?" Another: "Will the United Nations General Assembly recognize a Palestinian state by Sept. 30, 2011?" A third: "What will be the lowest end-of-day price of Brent Crude Oil between Oct. 16, 2011 and Feb. 1, 2012?"
The Good Judgment Project won IARPA's contest for predictive capability. More than won, in fact: The researchers found something fascinating.
How the project was able to get much more accurate predictions
Here's a chart of the 100 best-performing forecasters who participated in the Good Judgment Project versus the 100 worst-performing, by Brier score (a statistical means of measuring prediction accuracy; all you need to know is that negative scores are better than positive ones). The x-axis is the number of questions posed; what you see is that as people were asked to make more predictions, the gap between the top predictors and the worst ones actually widened. That suggests that it wasn't a statistical fluke:
The results got even more interesting when Tetlock's team looked closer. They separated participants into three groups: "superforecasters" (the most accurate 2 percent, who were asked to work together to make predictions in groups), "top team individuals" (the next 3 to 5 percent, working individually), and everyone else. The superforecaster teams dramatically outperformed everyone else — and as the project's years went on, the gap widened:
The results here seem potentially quite significant. If these superforecasters can be identified, and their judgment can be harnessed with the right statistical tools, that would seem to be a step toward significantly improving our ability to foresee future geopolitical events.
To be clear, no one is imagining that we'll be able to see the future or offer Nostradamus-style predictions. Rather, the skill on display here is analytical judgment on known issues in world affairs, which can be applied to better anticipate, say, how a foreign election seems likely to turn out, or whether North Korea looks like it's planning to conduct another nuclear test. These are analytical questions that just happen to be about future events, and how often the answers turn out to be right is a good metric for gauging the quality of the analysis.
The study is meant to produce lessons about prediction-making that can be used in the real world; remember that the intelligence community is sponsoring Tetlock's research.
It's not hard to understand why US intelligence agencies would want to get better at predicting future events. Part of that is knowing what's coming and being able to prepare for it, but part of it is avoiding mistakes, a point Tetlock made with reference to the 2003 invasion of Iraq.
"Would a forecasting tournament have saved us a multitrillion-dollar mistake that could have cost tens of thousands of lives? I don't know," Tetlock said at the Edge meeting. "I would say that if you have a tool that can increase the accuracy of probability estimates — by 30, 40, 50, 60 percent— as much as has been demonstrated in the IARPA tournaments, it's worth investing many millions of dollars even to reduce, to a small degree, the probability of multitrillion-dollar mistakes."
But how do you find a superforecaster — let alone enough of them to build a team like the ones that so excelled in Tetlock's project?
Four lessons for how to predict more accurately
Consider Bill Flack.
Flack is a retired irrigation specialist, with no obvious experience with world politics. Tetlock calls him "a nobody in Nebraska." (Flack's family would probably dispute this.)
And yet, Flack is a "scientifically documented, officially certified IARPA tournament superforecaster." In Tetlock's project, "he did a great job assigning probability estimates to hundreds of questions posed over four years in the IARPA forecasting tournament, a superb performance. This is with neutral umpires, no room for fudging, this is objective scoring."
But how? What do Flack and people like him have that many of the so-called experts don't? It turns out there are a few things:
1) They change their mind, frequently and in little bits. According to Good Judgment Project researchers Pavel Atanasov and Angela Minster, one of the best things you can do is be open to change. People who slightly and frequently adjust their estimates about how likely something is to happen, based on new information, are much more likely to end up calling events correctly than people who never change their minds or who flip-flop more dramatically.
"Frequent, small belief updates are the marks of an accurate forecaster," Atanasov and Minster write.
2) Work in teams. One of the reasons the "superforecasters" did better than the other two groups is that Tetlock and company asked them to work in teams. According to Atanasov and Minster, grouping people into teams where everyone had "equal rights and responsibilities ... increased their level of engagement and produced highly accurate forecasts."
Even superforecasters, it seems, benefit from working with people who have different ideas and perspectives to make their predictions better. And people who were more open to working in groups tended to do better than people who preferred to work alone.
"I like to think of forecasting tournaments as intellectual ecosystems that require different types of creatures," Tetlock says.
3) Make actual predictions. One of the reasons this experiment worked, according to Tetlock, is that people were asked to make specific hard predictions. Instead of asking, "Is the Bashar al-Assad regime in Syria stable?" (the sort of vague question pundits like to weigh in on), they asked, "Will Bashar al-Assad remain president of Syria through Jan. 31, 2012?"
By forcing participants to answer with firm predictions, and assigning actual probability values (e.g., there's a 60 percent chance Assad will remain president), you ended up getting predictions that could be proven categorically true or false. The more that people do this, the more people can be held accountable for getting things wrong — which forces them to adjust the way they think about the world, hopefully for the better.
4) You have to know at least a little. It helps to know some background in order to make good predictions. According to a study Tetlock et al. published in the Journal of Experimental Psychology, people who knew more about world politics and were given some training in probability theory tended to do better in the tournament. Smarter people, as measured by IQ tests, also tended to do better.
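The "objective scoring" behind these lessons depends on forecasts being stated as probabilities, as in lesson 3. As a rough illustration, here is the Brier score in its common binary form: the average squared gap between each forecast probability and what actually happened (1 if the event occurred, 0 if not). On this raw scale, 0 is perfect and the score is never negative, so the earlier remark that negative scores are better presumably refers to a rescaled version that this sketch does not attempt. The forecasts and outcomes below are invented for the example.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between probabilities and 0/1 outcomes.

    Lower is better; 0.0 means every forecast was certain and correct.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast, and at least one forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Three made-up yes/no questions: the first two happened, the third didn't.
    outcomes = [1, 1, 0]

    # A hypothetical forecaster who commits to specific probabilities.
    committed = [0.6, 0.9, 0.2]
    # A hypothetical forecaster who always says "50-50".
    hedger = [0.5, 0.5, 0.5]

    print("committed:", round(brier_score(committed, outcomes), 3))
    print("hedger:   ", round(brier_score(hedger, outcomes), 3))
```

The committed forecaster scores better (lower) here because the firm probabilities leaned the right way each time. Had they leaned the wrong way, the same formula would have punished them more than the fence-sitter, which is what makes the accountability real.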
This research is all very new, so it's hard to say whether these tips and traits can be harnessed to keep producing accurate forecasts over decades.
But the possibility that we're starting to develop a better system for anticipating future events, by putting the right kinds of people in the right kinds of settings, is an exciting one. It's important not to get too carried away with the implications at this point; being able to better forecast, say, the coming UK parliamentary elections is a lot different from foreseeing the rise of ISIS. But this is a promising step toward better understanding what's coming in the world, and thus knowing how to prepare for it.