Measuring War: Cognitive Effects in the Age of AI
Editor’s Note: This article was submitted in response to the call for ideas issued by the co-chairs of the National Security Commission on Artificial Intelligence, Eric Schmidt and Robert Work. It addresses the first question (part a.) on the character of war, and the third question (part d.) on the types of data that would be most useful for developing applications.
Billy Beane, general manager of the struggling Oakland Athletics baseball team, faced a problem in the early 2000s. He needed to field a competitive team with one of the league’s smallest budgets. Beane turned to what became known as a “moneyball” approach — a new analytical method that valued a player’s ability to get on base over traditional statistics like batting average and home runs. His approach was revolutionary not because it was analytical — numbers have always been part of baseball. It was revolutionary because it demonstrated that baseball analytics measured the wrong things.
As the Department of Defense positions itself to capitalize on artificial intelligence, it should — like a good baseball general manager — analyze what is important and avoid what is trivial. The only outcome of military action that ultimately matters occurs at a cognitive level — at the level where adversaries perceive and give meaning to actions taken against them. More specifically, the cognitive impact of a military action is the extent to which the action alters an adversary’s belief that he can maximize his own interests through military means. Cognitive effectiveness is not the same thing as psychological warfare. Psychological warfare refers to a body of largely non-violent operations focused on information dissemination. Cognitive effects, on the other hand, follow from any military action, violent or non-violent.
Artificial intelligence cannot observe these effects. Humans can estimate them, however, by assessing the extent to which an action makes an adversary’s key decision-makers less (or more) likely to rely on force to maximize their interests. If the military were to make these assessments objectively and store them in a database alongside other inputs related to military activity, AI could lay the foundations for predictions of cognitive effectiveness. Building this measure would be far more than a task for computer science. It would require significant organizational changes to how the military collects data and assesses its own effectiveness.
First, data. At its most basic level, AI detects relationships between input and output variables. A video streaming service, for example, might employ AI to take one’s previous viewing history (input) and measure the extent to which that history predicts the likelihood of watching an action film (output). The “intelligence” in artificial intelligence only works at this level of relationship detection among variables. It does not work at the higher level of determining which variables should be examined in the first place — that is a human task. Human choices (and biases and limitations) in providing variables constrain what AI can eventually do. AI is only as good as the data it is given.
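The division of labor described above can be illustrated with a minimal sketch. It assumes a handful of invented viewing records and a hypothetical helper named `watch_rates`; it is not how any real streaming service works. A person chooses the input variable (most-watched genre) and the output variable (whether the viewer picked an action film). The code only measures the relationship between them.

```python
from collections import defaultdict

# Hypothetical records: (genre watched most in the past, chose an action film next).
# Every value here is invented for illustration.
history = [
    ("action", True), ("action", True), ("action", False),
    ("drama", False), ("drama", False), ("drama", True),
    ("comedy", False), ("comedy", False),
]

def watch_rates(records):
    """Estimate P(chooses action film | most-watched genre) by simple counting."""
    counts = defaultdict(lambda: [0, 0])  # genre -> [action choices, total viewers]
    for genre, chose_action in records:
        counts[genre][0] += int(chose_action)
        counts[genre][1] += 1
    return {genre: hits / total for genre, (hits, total) in counts.items()}

rates = watch_rates(history)
print(rates)
```

Nothing in the code can discover that some unrecorded variable, such as time of day, might predict the output better. It works only with the variables a person supplied.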
For an AI-aided effectiveness measure to be feasible, the military would need to expand its conceptualizations of both the inputs and outputs of military actions. On the input side, military thinking would need to expand beyond readily measurable means of employing force: number of troops, ships, aircraft, rounds, or bombs. It would need to grow to include variables that help paint a picture of how an adversary views the world: an adversary’s own strategic goals, level of desperation, likely emotional reaction, and views on the strategic efficacy of force, to name a few. It is an adversary’s interpretation of an action that determines the action’s cognitive effect. AI could not predict an interpretation without first characterizing how the adversary views the world.
On the outcome side, current data is again biased towards force and often conflated with inputs. Leaders who are asked to account for progress in their campaigns often avoid the cognitive effectiveness question by reporting how much physical force they have employed — campaigns are effective because they have conducted lots of missions or dropped lots of bombs. However, the appropriate database would require that the outcome of a military action be reported as an estimate of the adversary’s change in judgment. In addition to reporting the number of missions or bombs, the military would need to assess the extent to which its specific actions made an adversary’s key decision-makers less likely to rely on force.
This assessment would invite biases of self-serving optimism. Those responsible for making it would be prone to look for positive, confirmatory information that suggested a substantive change in an adversary’s judgment, even when such a change never occurred. A database filled with optimistic assessments would lead to overly rosy AI-generated estimates. The eventual result would be campaigns that involve plenty of action but little strategic progress. One means of mitigating overly optimistic assessments, though far from a complete solution, would be to separate the assessment of cognitive effectiveness from the execution of military action. Different people, accountable to different bosses, should perform those two activities, unlike the current process in which those who carry out missions are responsible for performing battle damage assessment on those missions. A further step, one that would draw on the science of how best to aggregate judgments involving uncertainty, would be to average the assessments of multiple, independent sources.
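The averaging step can be sketched in a few lines. The scale is an assumption made for illustration: each score runs from -1 (adversary much less likely to rely on force) to +1 (much more likely). The function name `aggregate_assessments` and all scores are hypothetical.

```python
from statistics import mean

def aggregate_assessments(assessments):
    """Average independent estimates of the change in an adversary's reliance on force.

    Assumed scale: -1 (much less likely to rely on force) to +1 (much more likely).
    """
    if not assessments:
        raise ValueError("at least one assessment is required")
    return mean(assessments)

# Hypothetical scores from three assessors who did not take part in the action.
independent = [-0.1, 0.0, -0.3]

# Hypothetical self-assessment from the unit that carried out the action.
# It is deliberately left out of the pooled estimate.
executing_unit = -0.8

pooled = aggregate_assessments(independent)
print(round(pooled, 3), executing_unit)
```

A simple mean is only one way to pool judgments. The sketch uses it because the text names averaging, not because it is necessarily the best aggregation rule.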
Properly building a cognitive effectiveness database would be labor-intensive and tedious, but it would be possible. The military already collects, in some form, the data described here. Operators in the field create situation reports that describe the physical parameters of their actions — the number of troops, the actions they took, and the way they took them. Intelligence analysts examine the worldview of adversaries and estimate the effect that different actions will have. What would be new is the combination of these sources at the level of single military actions into a database on which AI could learn.
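As a rough sketch of that combination, the record below joins three sources on a shared identifier for a single action. Every field name, scale, and value is a hypothetical placeholder, not a description of any existing military data format.

```python
from dataclasses import dataclass

@dataclass
class ActionRecord:
    """One row per military action; all fields are illustrative assumptions."""
    action_id: str
    troops: int                   # from the situation report
    action_type: str              # from the situation report
    adversary_goal: str           # from intelligence analysis
    adversary_desperation: float  # from intelligence analysis, assumed 0..1 scale
    assessed_change: float        # independent assessment, assumed -1..+1 scale

def build_records(sitreps, intel, assessments):
    """Keep only actions that appear in all three sources, joined on action_id."""
    shared = sitreps.keys() & intel.keys() & assessments.keys()
    return [
        ActionRecord(action_id=a, **sitreps[a], **intel[a],
                     assessed_change=assessments[a])
        for a in sorted(shared)
    ]

# Invented example inputs keyed by a hypothetical action identifier.
sitreps = {"A-001": {"troops": 40, "action_type": "raid"},
           "A-002": {"troops": 12, "action_type": "patrol"}}
intel = {"A-001": {"adversary_goal": "hold territory",
                   "adversary_desperation": 0.7}}
assessments = {"A-001": -0.2, "A-002": 0.1}

records = build_records(sitreps, intel, assessments)
print(records)
```

Action A-002 is dropped because no intelligence estimate accompanies it. In this sketch, a record is usable for learning only when the physical inputs, the adversary's worldview, and the assessed outcome are all present.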
It is not new to say that the ultimate purpose of military activity is to alter an adversary’s judgment. This idea is an enduring truth about the nature of war, one captured by Clausewitz’s famous phrase that “war is the continuation of politics by different means.” Somewhere between the strategizing of Clausewitz and the actual prosecution of war, however, measures of effectiveness shift from cognitive to physical, making physical effects ends in themselves. This shift occurs, perhaps, because physical effects are much easier to observe than cognitive effects, and because connecting the dots between a tactical action and a strategic decision-maker’s judgment is too complex a task for human intelligence. AI could help tackle this complexity. In doing so, it could bridge the ever-changing character of war with the medium that has always defined the nature of war: human judgment.
If AI is to bridge this divide, it will be because human choices made it possible. Whether the application is moneyball or military effectiveness, people, more so than computing power or algorithms, will determine whether AI is useful. In the context of war, AI should focus not just on the application of force but on the effects of that force on an adversary’s judgment. Gathering and maintaining the right data is a first step. Most importantly, leaders should demand evidence-based decision-making. If the military is willing to focus on the right data, a new measure of war will be possible.
Brad DeWees is a Major in the U.S. Air Force and a tactical air control party officer at the 13th Air Support Operations Squadron at Fort Carson, Colorado. An alum of the Air Force Chief of Staff’s Strategic PhD fellowship, he holds a doctorate in decision science from Harvard University. The views expressed here are his alone and do not necessarily reflect those of the U.S. government or any part thereof.