Forecasting in Policymaking: Beyond Cassandra

By: Dan Spokojny | January 26, 2022

Cassandra by Evelyn De Morgan (1898, London).

All policy decisions are built on assumptions about the future.

Consider, for instance, President Biden’s December 21, 2021 remarks on Ukraine: “The American people know that if we stand by in the face of such blatant attacks on liberty and democracy and the core principles of sovereignty and territorial integrity, the world would surely face worse consequences.”

Yet many of the assumptions underlying policy pronouncements remain ambiguous and under-evaluated. Formalizing these assumptions can allow us to test the quality of our policy process and potentially improve its effectiveness.

Modern forecasting techniques could meaningfully improve the quality of our foreign policy decision-making process. These techniques have been rigorously studied and demonstrated to be more accurate at predicting the future than traditional analytical methods. In a four-year forecasting tournament hosted in collaboration with the Director of National Intelligence (DNI), a team of trained forecasters outperformed professional intelligence analysts by 30% and beat the control group by 60% (though some replication data was classified).
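Accuracy claims like these are typically grounded in Brier scoring, the standard metric in forecasting tournaments. The sketch below illustrates the idea; all forecasts and outcomes are hypothetical and do not come from the DNI tournament.

```python
# A minimal sketch of Brier scoring, the standard accuracy metric in
# forecasting tournaments. All numbers here are hypothetical illustrations.

def brier_score(forecasts, outcomes):
    """Mean squared error between probabilistic forecasts and binary outcomes.

    0.0 is a perfect score; a constant 50% forecast earns 0.25; lower is better.
    """
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

outcomes = [1, 0, 1, 1]              # what actually happened (1 = event occurred)
trained  = [0.85, 0.10, 0.70, 0.90]  # a confident, well-calibrated forecaster
hedger   = [0.60, 0.40, 0.50, 0.55]  # a vague forecaster who hugs 50%

print(brier_score(trained, outcomes))  # ≈ 0.033
print(brier_score(hedger, outcomes))   # ≈ 0.193
```

The metric rewards both calibration and decisiveness: the hedger avoids being badly wrong but never earns the low scores that confident, correct forecasts produce.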

But advocates of forecasting must heed the warning of the Greek myth of Cassandra. She was granted a powerful ability to see the future, but cursed to be ignored by anyone she warned of impending disaster. A recent study of DNI’s tournament and another similar effort found that forecasts were essentially ignored by the policy process. As far as I’m aware, forecasting methods are not currently being used anywhere in the foreign policy process.

It is possible that forecasting is a dead end; more of a parlor game than a viable method of improving policymaking. Despite the promise of forecasting, we simply don’t have the research or experience to know whether these techniques can be integrated into the policy process.

It would be a shame to simply ignore Cassandras in our ranks, however. We must test how the power of foresight could be embedded in the decision-making process. This concept note lays out some methods by which forecasting might be effectively integrated into the policymaking process.

Note: This article does not explain forecasting techniques in detail, or discuss the research supporting their claims. If you want to learn more, check out the book Superforecasting, this collection of academic articles, or these training resources.

Four Models for Integrating Forecasting into Policymaking

There is little publicly available guidance about how to integrate forecasting into the policy process. Some proponents view it as merely an analytical tool, like intelligence analysis. For example, Phil Tetlock, the scholar most responsible for advancing forecasting in recent years, suggested in Foreign Affairs that forecasting “should be seen as a complement to expert analysis, not a substitute for it.”

The implicit theory for many in the forecasting community is supply-side: if accurate forecasts are generated in public, policymakers will consume them and improve their understanding of the world. Other forecasting advocates suggest that all policy decisions can theoretically be treated as forecasts.

The first step for testing the promise of forecasting is to develop models for how it might integrate into the actual foreign policy decision-making process. Here I posit four models for how this might be done, and speculate on the strengths and weaknesses of each model.

Figure 1: Strengths and weaknesses of the four models for integrating forecasting in policymaking.

1. The analytical model.

A wide variety of forecasts would be made available to policymakers, but no process changes would be imposed. The hope is that policymakers look to forecasting for wisdom. This is the Field of Dreams theory of change: “if you build it, he will come.” The development of this model is already maturing. For example, the forecasting questions in Figure 2 below are from the popular forecasting platform, Metaculus.

Strength: This is the least disruptive for existing organizational models because forecasting (much of which could be conducted externally) is just another input into a policymaker’s existing decision process. It exposes the logic of forecasting to policymakers in a low-stakes environment. Forecasts here would likely be most impactful when they draw attention to counter-intuitive emergent trends. 

Weakness: I’m skeptical about the efficacy of this model. Policymakers using existing mental models and decision-making heuristics are unlikely to have their beliefs meaningfully changed by the added precision that good forecasting provides. One could imagine a policymaker cherry-picking only the forecasts that support their beliefs and ignoring those that challenge their priors. Further, policymakers would not be involved in the learning process at all in this model.

Figure 2: Metaculus platform example questions

2. The early warning model.

In this model, forecasts could be embedded into early warning systems focused on discrete, high-priority issue areas. Early warning systems for mass atrocities (Figure 3), political instability, and nuclear war already exist and provide examples of this approach, in which forecasters proactively scan the globe for warning signs. Forecasting in this model remains an analytical product intended to focus the attention of a policymaker without dictating a particular response. Other relevant issue areas might include future pandemics, environmental conflict, arms races, or trade wars.

Strength: Embedding forecasting within discrete issue areas makes bureaucratic sense. One could easily imagine various bureaus within the State Department hosting their own forecasting teams to build expertise and attention for their issue area. This approach might offer a useful foothold within the bureaucracy from which to expand if the techniques are found to be useful.

Weakness: This model would suffer some of the same weaknesses as the first model. If policymakers aren’t involved in the generation of forecasts, it may have little impact on their thinking. Further, constraining forecasting to discrete issue areas limits its applicability.

Figure 3: Estimated risk of new mass killing, 2022–23. From the Early Warning Project.

3. The policy evaluation model.

This model asks forecasters to evaluate the possibility of success for discrete policy options in order to highlight the policy interventions most likely to succeed. For instance, three potential interventions could be proposed by decision-makers, and forecasters would estimate the likelihood of success for each option. The options could also be compared with baseline forecasts of the status quo.

All policy decisions can be viewed as acts of prediction: the decision-maker is betting that their chosen intervention will change the status quo in a way that will benefit their goals. These are often called conditional forecasts, which can be phrased as if/then statements: if the United States conducts intervention X, then we predict that the world will diverge from the status quo outcome Y and we will instead observe outcome Z.
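These if/then comparisons can be made concrete. The sketch below ranks policy options by how far their conditional forecasts diverge from a status quo baseline; the option names and all probabilities are purely illustrative placeholders, not real estimates.

```python
# A minimal sketch of the conditional-forecast comparison described above.
# All probabilities and option names are illustrative placeholders.

baseline = 0.20  # forecasters' estimate that the goal is reached under the status quo

conditional_forecasts = {
    "intervention A": 0.35,  # P(goal achieved | US conducts intervention A)
    "intervention B": 0.25,
    "intervention C": 0.15,  # an option forecast to do worse than doing nothing
}

# Rank options by how much each is expected to move the odds versus the status quo.
ranked = sorted(conditional_forecasts.items(),
                key=lambda kv: kv[1] - baseline, reverse=True)

for option, p in ranked:
    print(f"{option}: {p:.2f} (lift over status quo: {p - baseline:+.2f})")
```

Even this toy version surfaces a useful fact: an intervention can be worth rejecting not because it is unlikely to be followed by success, but because it is forecast to do worse than the status quo.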

Strength: This seems like a plausible approach for integrating forecasting into decision-making to improve the quality of policymaking. While policymakers would not be required to comply with forecaster recommendations, generating discrete probabilities about the likely success of certain policy tools would incentivize decision-makers to engage with the logic underlying relevant forecasts. Conditional forecasting would also require policymakers to identify discrete and falsifiable goals of their policies, which would already be a major process improvement. Such rigorous approaches might shift the center of gravity of the policy debate toward evidence and away from ideology and turf. It would also provide the foundations for more active learning in the policymaking sphere: policymakers would be able to improve their policymaking skills by studying which of their interventions succeeded and failed.

Weakness: Decision-makers will likely resist the meddling of forecasters on their sacred turf. This intervention would insert forecasting into highly politicized spaces, jeopardize the objectivity of forecasting, and potentially threaten the authority of policymakers. It would also likely require more resources and slow down the decision-making process. Finally, while conditional forecasting follows the same logic as standard forecasting, its efficacy has not been rigorously studied.

4. The decision-making model.

This model would represent a major break from the status quo: forecasting methods would supplant the existing decision-making process altogether. All policy decisions would be presented as testable forecasting questions requiring analysis. Ethical and political judgment would still have a prominent role in the decision-making process, but all empirical claims about likely policy outcomes would be subjected to probabilistic estimates.

Strength: Integrating forecasting at every level of the policy process would create the conditions for a much more intense focus on the impact of foreign policy interventions. It could also drive a much-needed discussion about the merit of individual decision-makers, which is difficult to assess without an understanding of the effectiveness of their policy judgment. Such an approach has the potential to transform the quality of foreign policy.

Weakness: The resource requirements to implement forecasting at this scale, and the depth of the organizational changes on which it would depend, make this proposal unviable for the foreseeable future. Much more research and policy experience would be needed to develop the evidence needed to advance such a grand vision.

Alternative Perspectives on Decision-Making

I want to expand on this last “decision-making” model. While it is the least viable of the four, it deserves further exploration.

The best way to think about integrating forecasting and decision-making is to break every policy proposal into causal mechanisms and then treat each component as a forecasting question. This requires carefully considering each action/change/step that would need to occur for the policy to achieve success.

Let’s say the policy solution under consideration is “the US will send weapons to Ukraine in order to push back against the Russian invasion.” One could break this strategy down into components: a) the US will need to send the weapons to Ukraine; b) Ukrainians will need to get effectively trained on the weapons; c) the new weapons will need to be deployed in battle at a sufficient rate; d) the new weapons will have to achieve measurable improvements in battlefield effectiveness; e) the effectiveness of the Russian military strategy will be meaningfully undermined; and f) Ukraine will win the war.

Such an approach encourages precise thinking from policymakers, because each step of the policy proposal must be framed as a forecasting question:

  • How many weapons will the US actually be able to send to Ukraine (given financial, production, and transportation challenges)?

  • How many Ukrainian units will get trained to use American weapons?

  • How many battles will prominently feature US weapons? How many enemy forces will be killed with US weapons?

  • Given the above… How likely is it that Russia will withdraw its military from Ukrainian territory by the end of 2023?

  • Given the above… How likely is it that Russia will escalate, including the use of nuclear weapons?

  • Given status quo conditions… How likely is it that Russia will retreat? How likely is it that Russia will use nuclear weapons?
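One useful property of this decomposition is that it exposes how a chain of individually likely steps can yield an unlikely conjunction. The sketch below uses purely illustrative probabilities for the component steps and naively assumes they are independent; in reality the steps are correlated, so this is only a rough sanity check, not a real estimate.

```python
from math import prod

# Illustrative placeholder probabilities for the component forecasts a)-e)
# described above. Independence is a deliberate simplification: the steps
# are correlated in reality, so treat this as a rough sanity check only.
steps = {
    "weapons delivered":           0.90,
    "units trained":               0.85,
    "weapons deployed in battle":  0.80,
    "battlefield effect achieved": 0.70,
    "Russian strategy undermined": 0.60,
}

p_chain = prod(steps.values())
print(f"Joint probability if steps were independent: {p_chain:.2f}")
# Each step looks likely on its own, yet the conjunction falls well under 50%.
```

This is exactly the kind of arithmetic that vague policy language obscures: a strategy in which every step sounds probable can still be an underdog overall.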

The policymaking team might proceed according to the following script:

  1. Clearly identify the problem statement or opportunity (e.g. Russia invades Ukraine)

  2. Clearly describe goal(s). Goals must be presented as falsifiable end-states (e.g. A complete withdrawal of Russian troops from Ukraine, avoidance of nuclear escalation, etc.). If there are multiple or competing goals, they need to be presented in priority order.

  3. Generate an array of strategies/policies that will achieve the goal(s). (e.g. send NATO troops to Ukraine).

  4. Break these strategies down into their component parts.

  5. Each claim generates its own forecasting question that can be passed to a prediction market or forecasting team.
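The five-step script above could be represented as a simple data structure that feeds a question pipeline. The field names and questions below are hypothetical, drawn from the Ukraine example; a real system would need a much richer schema.

```python
# A hypothetical sketch of the five-step script as a data structure.
# Field names and questions are illustrative, not an actual system design.
policy_process = {
    "problem": "Russia invades Ukraine",
    "goals": [  # falsifiable end-states, listed in priority order
        "Complete withdrawal of Russian troops from Ukraine",
        "Avoidance of nuclear escalation",
    ],
    "strategies": {
        "Send weapons to Ukraine": [
            # each component claim becomes its own forecasting question
            "How many weapons will the US be able to send to Ukraine?",
            "How many Ukrainian units will be trained on US weapons?",
            "How likely is a Russian withdrawal by the end of 2023?",
        ],
    },
}

# Every component claim can then be posted as a standalone question.
questions = [q for qs in policy_process["strategies"].values() for q in qs]
print(len(questions))  # → 3
```

Structuring decisions this way forces the falsifiable-goals discipline the script demands: a goal that cannot be written as an end-state cannot be scored later.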

I make no claim that supplanting the existing decision-making process will be superior to the existing method. Instead, I merely suggest this model deserves consideration and study.

Conclusion

Thinking critically about the ways in which we make decisions will help us improve the quality and accuracy of our foreign policy. Excellent research is accumulating on methods to achieve this goal. These forecasting methods draw on decades of research in cognitive psychology and decision science, which helps us understand the deeply ingrained human biases that have long afflicted policymaking.

Forecasts generated outside of policymaking spaces are unlikely to meaningfully impact decision-making. New models need to be developed and tested if we hope to capitalize on new research into decision-making and advance our policy process.

The goal is to improve the quality of policymaking and make the world a safer place.
