It’s Time for the Pentagon to Take Data Principles More Seriously
Since 1996, China’s military has beaten countless U.S. forecasts for how long it would take to develop and field new weapon systems. By comparison, the U.S. military has lagged in its modernization even as that effort receives increasing attention and resources. In this context, Vice Chairman Gen. John Hyten’s comment that the Defense Department will take the next 10 years to manage its data effectively should disturb U.S. policymakers and mobilize them to act.
While this timeline may seem forgivable given the magnitude of the Defense Department’s data management problem, decades’ worth of data debt, and dearth of data expertise, it’s nonetheless woefully inadequate. It also helps explain why the U.S. military’s advantage relative to China’s forces is eroding.
As two people intimately engaged in these problems in the private sector, we understand the potential for data to enable better and faster national security decisions on its own and especially when augmented by artificial intelligence (AI) systems. It informs — or should inform — nearly every decision that the Defense Department makes, from business decisions that govern its ~$700 billion annual budget, to intelligence decisions that seek to understand the environment in which the U.S. military competes, to operational decisions that keep the peace and, if necessary, win the nation’s wars.
Yet the Defense Department’s data remains predominantly siloed, messy, and unused. While defense leaders are rightly seized with the importance of data, the department should move faster. Defense leaders should create the policies, processes, and programs to turn data into useful information quickly and accurately, thereby enabling more effective decision-making.
Some progress is underway. The Defense Department has a formal objective to treat data as a strategic asset, considers data an essential warfighting resource and the currency of future warfare, and is mandating that data be used to improve decision-making to better execute the National Defense Strategy. It recognizes that it is in a high-stakes race to harness the power of data and is actively working on creating a culture of data-centric decision-making. This burst of enthusiasm for exploiting data is all to the good, even if belated.
Within just the past few years, the Pentagon has appointed chief data officers and issued numerous digital strategies that reflect commercial sector best practices in data management and that call for IT modernization to bring in the tools and talent necessary to unlock the value of data. It also has embarked on a range of critical initiatives to better leverage data for decision-making, such as the Joint Common Foundation, which brings relevant data together to enable AI development and adoption, and Advana, which serves as a central hub for audit and business data analytics. Moreover, the Pentagon plans to release a new data strategy that emphasizes the importance of data collection and management for warfighting and enterprise management. This presents a timely and appropriate vehicle for driving change.
Overall, these efforts should be applauded and encouraged. While not all of them will bear fruit — or be realized on a sensible timeline — they are rightly fighting to change an entrenched culture that’s anchored more on hardware and platforms than software. They place the Department of Defense in a much better position to implement the next National Defense Strategy, whenever it may come, than it was to implement the last. Yet these efforts reflect just the embryonic stage of a digital Defense Department. There is much more work to come for this transformation to yield game-changing results on and off the battlefield, and much more that can be done to accelerate the Defense Department’s timeline to achieve effective data management.
The Pentagon should treat a 10-year timeline to mastering digitization and use of data as a proposition that is dead on arrival. Instead, it should use the upcoming data strategy to provide a clear framework for modernization efforts, leveraging the following principles. Taken together, these principles help illuminate a rapid path to data primacy in the Department of Defense and, ultimately, to improvement in the quality and timeliness of its decision-making.
Use Raw Data
As Lt. Gen. Jack Shanahan, the first director of the Pentagon’s Joint Artificial Intelligence Center, observed, the “data is the new oil” analogy mischaracterizes the state of most data. He remarked, “I treat it as mineral ore: There’s a lot of crap. You have to filter out the impurities from the raw material to get the gold nuggets.”
As the largest employer in the world, the Defense Department generates a lot of data across its business, intelligence, and operational functions. But much of this data is incomplete, inaccurate, and not collected with systematic use in mind. Many of the Defense Department’s initiatives, therefore, aim to collect and cleanse its data to make it usable for analysis and decision-makers. But the reason the data is bad isn’t for lack of processing — it’s for lack of use. By using data in its nascent, unrefined state, the Defense Department can learn what can and cannot be done with it, and then how additional hygiene can improve its outputs. Doing so also allows the Defense Department to extract value from its data earlier, before massive IT modernization investments create a more purified process.
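The idea of learning from data in its raw, unrefined state can be made concrete with a small profiling sketch. The records, field names, and values below are invented for illustration — they are not real Defense Department data — but the exercise shows how even uncleansed data reveals what it can and cannot support before any investment in cleansing.

```python
from collections import Counter

# Hypothetical raw maintenance records pulled straight from a legacy system.
raw_records = [
    {"tail_no": "A-101", "status": "FMC", "hours": "312"},
    {"tail_no": "A-102", "status": "", "hours": "n/a"},
    {"tail_no": "", "status": "NMC", "hours": "87"},
    {"tail_no": "A-103", "status": "fmc", "hours": "155"},
]

def profile(records):
    """Report per-field completeness and value spread for raw, uncleansed data."""
    fields = {key for record in records for key in record}
    report = {}
    for field in sorted(fields):
        values = [record.get(field, "") for record in records]
        # Treat blanks and obvious placeholders as missing.
        non_empty = [v for v in values if v.strip() and v.lower() != "n/a"]
        report[field] = {
            "completeness": len(non_empty) / len(records),
            # Case-folded counts expose inconsistencies like "FMC" vs. "fmc".
            "distinct": Counter(v.lower() for v in non_empty),
        }
    return report

report = profile(raw_records)
# A profile like this tells analysts which fields are usable today ("status",
# despite inconsistent casing) and which need hygiene first ("hours").
```

A few dozen lines of profiling like this, run against data as it exists today, tells the department where cleansing effort would actually pay off.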
Expand Data Sources
A common thread among the Defense Department’s data initiatives is their inward focus. They call for better collection, processing, and sharing of the military’s own datasets. This is right and good, but woefully limited in scope. At the extreme, it can be counterproductive by creating a “tyranny of authoritative data” — a severely limiting lens through which to process information. Exclusively relying on authoritative datasets makes sense in certain contexts, such as in an audit. But decisions in many others need to find patterns and extract meaning from data outside of officially sanctioned, dogmatically managed, and inherently biased “authoritative systems.” This mindset is equivalent to only looking through whatever eye is dominant, blinding yourself on the misguided principle that the other eye has nothing to offer.
There is a sea of data outside the Defense Department’s networks that can be brought to bear on defense problems. Datasets external to the U.S. military — from commercial providers, academia, non-governmental organizations, other federal departments, and even local governments — provide a diverse view of global markets. They can help the Defense Department stay abreast of technological developments, industrial base dynamics, and supply chain risk — information critical to strategy development and execution. External data, including canonical data models, can also serve as a “Rosetta Stone” that translates between disparate datasets so that they can “speak” with one another, increasing the quality and utility of internal datasets. Furthermore, the Defense Department should access external data simply to keep pace with China, which is aggressively collecting and utilizing data available in commercial markets.
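The “Rosetta Stone” role of a canonical data model can be sketched in a few lines. The source systems, field names, and mappings below are hypothetical; the point is that a shared canonical vocabulary lets records from disparate systems be joined without forcing either system to change.

```python
# Hypothetical field mappings from two source schemas into one canonical model.
CANONICAL_MAP = {
    "logistics_db": {"acft_tail": "tail_number", "loc": "base"},
    "intel_feed":   {"platform_id": "tail_number", "sighted_at": "base"},
}

def to_canonical(source, record):
    """Translate a source-specific record into the canonical vocabulary."""
    mapping = CANONICAL_MAP[source]
    # Fields without a mapping pass through unchanged.
    return {mapping.get(key, key): value for key, value in record.items()}

a = to_canonical("logistics_db", {"acft_tail": "A-101", "loc": "Kadena"})
b = to_canonical("intel_feed", {"platform_id": "A-101", "sighted_at": "Kadena"})
# Both records now "speak" the same language and can be joined on tail_number.
```

The translation layer, not the source systems, carries the burden of agreement — which is exactly what makes external canonical models valuable.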
Selectively Scale Training Data
The current era is marked by a proliferation of narrowly focused algorithms that work only within a tightly circumscribed context. AI routinely demands massive amounts of training and testing data to become serviceable. However, the data required is highly context-specific and should be appropriately labeled to enable the algorithm to identify key features and relevant patterns. Data is not a fungible good that works well with any algorithm pulled off the shelf or created in the test kitchen.
For the Pentagon to develop training data for AI, the crucial step is to treat data as a means to an end, not an end in its own right. The Defense Department ought to avoid aggregating wide swaths of data into a data lake and expecting that if they build it the AI will come — the very logic that underlies many defense-wide initiatives. It should instead focus on the desired outcomes of a digital environment and acquire, store, and assemble data at scale that is targeted to the priority algorithms it is developing and decisions it aims to inform.
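Outcome-driven data assembly can be illustrated with a short sketch: rather than pooling everything into a lake, keep and label only the records relevant to one priority model. The records, relevance rule, and labeling function below are all invented for illustration.

```python
def assemble_training_set(records, is_relevant, label_fn):
    """Filter to task-relevant records and attach the labels the model needs."""
    return [{**r, "label": label_fn(r)} for r in records if is_relevant(r)]

# Hypothetical sensor reports from several sources and theaters.
records = [
    {"sensor": "eo", "region": "pacific", "detections": 3},
    {"sensor": "sar", "region": "pacific", "detections": 0},
    {"sensor": "eo", "region": "arctic", "detections": 1},
]

# Notional priority algorithm: an electro-optical classifier for one theater.
train = assemble_training_set(
    records,
    is_relevant=lambda r: r["sensor"] == "eo" and r["region"] == "pacific",
    label_fn=lambda r: int(r["detections"] > 0),
)
# Only the data that serves the priority model is stored and labeled.
```

Everything outside the relevance rule is simply not collected at scale — the inverse of the build-a-lake-and-hope approach.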
Design Data for Interoperability
A crucial lesson from 9/11 is that data silos can lead to tragic failures. The reckoning that followed the 9/11 Commission’s report, which detailed that observation, kicked off more than a decade of countless federal IT programs to establish data lakes, data warehouses, and other centralized data repositories. While centralization is a reasonable place to start, aggregated but unlinked data provides little improvement. Data should be integrated and linked to connect the full range of disparate activities that inform complex decision-making. In fact, much of the government’s effort in this space to advance data warehousing or consolidation has limited technical relevance to data interoperability. Even if data sits in different systems, intelligent mapping of federated data elements means they can be combined at will. It is only through this type of advanced data integration that important — and non-obvious — relationships among the data are revealed.
The key to facilitating data integration is data interoperability, not uniformity. The U.S. defense enterprise is too large, with too many disparate datasets developed for specific purposes to expect adherence to centrally determined data schemas. Such uniformity may be valuable for select high-value kill chains — such as one trying to destroy North Korean intercontinental missiles before they launch — where there is little margin for error between the sensor, command, and shooter. It would, however, effectively limit space for creativity and ingenuity, as well as incorporation of novel technologies from the private sector. A manufacturer of a new sensor, whose revenue is driven by commercial markets but who is open to selling to the department, would quickly conclude not to do so if the sensors needed to adhere to a set of unique data standards. Rather than striving for all the data to speak in the same language via a uniform data schema, the Pentagon should ensure the data are sufficiently intelligible to allow translations between them. This calls for the pedestrian but essential work of data documentation and for spending the extra effort in designing a system to enable those outside of its inner workings to quickly interpret the data when necessary.
Eliminate Bureaucratic Data Ownership
There is a bureaucratic “ownership” culture that persists among government-owned systems within the Defense Department. It limits data sharing and use, often for parochial reasons that run counter to the interests of the department as a whole.
Instead, the department should view all of its internal data as a corporate asset. Some progress has been made on this front, with the Fiscal Year 2019 National Defense Authorization Act, § 911 calling for common enterprise data in specific domains to be considered corporate assets. However, implementing this principle in practice is an ongoing effort, and one that should be expanded to all datasets. The elimination of bureaucratic “ownership” in data further encourages a culture of sharing data within the department and underpins the above efforts for improving interoperability. Even corporately managed data within the Defense Department often exist in functional stovepipes or “lines of business,” effectively transferring “ownership” from the disparate collection sources to a well-intentioned group of “line of business owners.” This construct, while improved, is still siloed, as sharing and interoperability are promoted within a specific line of business, but not yet across lines of business. Breaking down these artificial management barriers is essential to use data across functional lines and solve complex, interdisciplinary problems.
Manage Data Transparency
The Pentagon should work cooperatively with the nation’s innovative centers across the private sector, academia, and the defense laboratories to become a digital organization. It simply cannot do so on its own. This requires the Defense Department to be transparent with its own data. The problem is that the more transparent the department’s data is, the more exposed it is to Chinese state efforts to exploit that data to their advantage. Complicating matters, China will never make its data transparent to the United States, giving it an asymmetrical advantage in data access — an advantage that’s entrenched by U.S. prohibitions on government and civilian hacking of Chinese entities for commercial benefit.
The Defense Department should therefore make careful and deliberate data disclosure decisions, as a whole enterprise, to reap the maximal benefits from external collaboration while limiting risk of Chinese exploitation. It should, for example, explore setting different standards for data transparency depending on the level of trust it has with certain companies and academic centers through mechanisms such as a trusted data consortium.
Judging the quality and timeliness of the Defense Department’s decisions is difficult. The national security business lacks the rapid and objective feedback of the market. See, for example, the department’s top priority as called for by the National Defense Strategy: Deter great-power war. Proving a negative — whether the department’s decisions are what lead to effective deterrence — is not possible. Unless, of course, that deterrence fails. And judging the efficacy of the department’s decisions once war is being waged on the battlefield comes too late to afford much course correction without significant consequence, measured in national blood or treasure.
Lacking regular feedback on its performance, the Pentagon ought to be intrinsically motivated to improve its decision-making. It has good reason to feel a sense of urgency. China presents a great-power competitor the likes of which the nation has never seen, given its combination of economic might and effective authoritarian rule.
The Defense Department recognizes that the use of data to inform its business, intelligence, and operational decisions will be instrumental to success. That realization is a valuable and necessary first step. Next, it should pair that fact with a strong sense of urgency and seriousness in leveling up its data management proficiency. The foregoing principles outline the rungs up to the next performance level. It’s time to start climbing.
Bob Work is the chairman of the board of Govini and the former deputy secretary of defense.
Tara Murphy Dougherty is the chief executive officer of Govini — a decision science company whose mission is to advance U.S. competitiveness through dynamic data and machine learning.