Editor’s Note: This is the second in a series exploring key AI policy choices faced by the Department of Defense and Congress. Please also read the first: “Four AI Policy Choices Policymakers Can’t Afford to Get Wrong.”
If you ask almost anyone who has ever tried to get a new capability into the hands of the U.S. military what they think of the acquisition process, you’ll get a consistent answer: It’s broken.
It’s not broken because the people inside it are malicious, stupid, or incompetent. And it’s certainly not because it’s a system that lacks rigor. It’s broken because it’s slow, cumbersome, over-engineered, and structurally misaligned with the people who actually have to use the tools and weapons it produces. Most, if not all, acquisition and procurement programs — from personal tablets to tactical bombers — are optimized for exquisite requirements, long production timelines, risk aversion, and bureaucratic survival. What they’re not optimized for is the battlefield.
None of these observations are new; commissions and blue-ribbon panels have leveled the same critiques for decades. But if the Department of Defense, enabled by new authorities from Congress, doesn’t start making structural changes to this process now, a new set of commissions and blue-ribbon panels will be leveling those same critiques at AI diffusion in the U.S. military a decade from now. Defense AI will become just another entry in a long series of programs that followed the well-worn path: exquisite but unused systems that took years of development, consumed billions of dollars in defense appropriations, and delivered minimal operational impact.
President Donald Trump’s recent AI Action Plan tasked the Department of Defense with developing “responsible AI.” Nearly everyone agrees that the military needs “trustworthy AI.” But almost no one has seriously confronted the critical question: Who gets to define what “trustworthy” actually means? The answer to that question will determine whether AI becomes a force multiplier or another cautionary tale buried under a mountain of compliance paperwork.
Three Definitions, Three Futures
If the definition of trustworthiness is set by computer scientists and program managers in the Office of the Undersecretary of Defense for Research and Engineering, it will almost certainly mean an emphasis on technical assurance. Under this definition, the U.S. military is likely to drive toward AI that is resilient against data poisoning and other forms of adversary tampering. This future results in investments in new verification and validation pipelines, formal lab-based testing, and a secure machine learning operations infrastructure that will enable joint testing with the commercial actors providing the models, and independent verification by Department of Defense engineers. Assuming engineers at the frontier AI labs are even able to build such AI systems, that version of the story probably results in Department of Defense AI systems that are elegant, secure… and brittle. If such technical assurance proves elusive, this definition probably leads to massive investments into systems that can never be fully fielded.
If operators set the definition, then trustworthiness becomes measured by adoption and diffusion. Warfighters usually require convincing that new equipment is reliable, easy to use, and resilient on the battlefield. Investments under this definition would emphasize user interfaces and user experiences, live and iterative field testing, robust performance in degraded environments, and acquiring troves of data to support tactical missions. This version of trust is the hardest to achieve, but only because of the limitations in the current acquisition system. This version of trust is also arguably the only one that matters in combat, when the United States is asking its warfighters to place their lives into the proverbial hands of an algorithm.
Finally, if lawyers, political appointees, or compliance officers throughout the Office of the Secretary of Defense define trust, it is likely to be built around explainability, some version of human-in-the-loop assurance, and an emphasis on coalition technical interoperability. These are all important goals, but the investments they would drive are more likely to ensure that someone can be reasonably blamed when something goes wrong than to ensure that America’s warfighters are able to win the nation’s future wars. In this future, AI adoption and diffusion in the Department of Defense is likely to remain low as policymakers become enamored with legal frameworks rather than operational value.
These definitions are not mutually exclusive, but they do compete with one another. When competing imperatives collide at the Pentagon, bureaucratic gravity pulls toward what is easiest to measure and defend in a program review, not toward what matters in combat.
An Old Problem with a Clear Solution
The gap between operational relevance and acquisition outputs is not a quirk of the AI era. This movie is an old one, and the trailers are familiar: precision targeting systems that were too complex to deploy at scale; logistics software that worked on paper but collapsed when deployed into an operational theater; and “next gen” radios that passed every test in the lab but overheated in the field, rendering them unusable.
Each of those programs was optimized for byzantine contracting systems, not the trust of warfighters. And when it comes to AI and warfare, trust can’t be defined by policy memos or test reports. The definition has to be in the hands of the operator who has to decide, in the moment, whether to rely on a new tool that shortens the kill chain or to fall back on laminated checklists. If the Department of Defense wants to break this cycle, it should elevate operator-defined trust as the decisive standard for its AI systems.
The other definitions of trust should be treated as enabling, rather than primary, standards for the Department of Defense’s AI systems. Those definitions are more concrete in the current acquisition and procurement system, but making operator-defined trust the center of gravity doesn’t have to rest on pure “vibes.” It can, and should, be defined and measured. A practical series of questions provides a template: Does the AI system perform in disrupted, degraded, intermittent, and low-bandwidth environments? Does the AI reduce cognitive load instead of adding to it? Can the AI be integrated into other command post systems without an engineering degree? When the AI system inevitably fails, does it do so gracefully or catastrophically?
If the answers to these questions aren’t the right ones, then an AI system isn’t trustworthy, no matter how many accreditation stamps it has collected on the way to a final decision to move to production.
Making Operator-Centered Trust Real
The Department of Defense doesn’t need to adopt operator-centered trust as part of some new responsible AI framework. It needs legislative and policy action to radically change a system that isn’t built to take operator trust into account. Among other things, this change should include the appointment of a joint “trust arbiter” with real teeth, a mandatory operator trust gate built into the acquisition process, and expanded authorities for operator-led field testing with direct feedback to commercial vendors of AI systems.
First, the secretary of defense should appoint the chief digital and artificial intelligence officer as the new joint AI trust arbiter with directive authority to mediate and prioritize trade-offs between technical, legal, and operational trust definitions as systems are being developed. One of the primary challenges to placing operator-defined trust at the center of the development of AI systems is that no one inside the Department of Defense owns trust at the enterprise level. The chief digital and artificial intelligence officer drafts principles and frameworks, the Office of the Undersecretary of Defense for Research and Engineering tests and validates code, the military services field capabilities, and lawyers throughout the entire process worry about compliance. It’s a bureaucratic hydra, with no center of gravity. The chief digital and artificial intelligence officer should be reestablished as a direct report to the secretary of defense, and that office should be empowered to provide centralized direction on tough trade-off calls related to varying versions of trust in the department’s AI systems. These powers should include the ability to define operator trust metrics that are rooted in mission realities, align and integrate definitions of AI trustworthiness among relevant actors, ensure that operator perspectives are embedded early in the acquisition cycle, and gate AI program advancement based on demonstrated operator confidence in the system.
Second, Congress should codify an operator trust assessment in the next National Defense Authorization Act as a formal milestone requirement for AI systems entering production or operational deployment. Right now, AI programs at the Department of Defense, like all other programs, can make it past most major decision review gates without any meaningful operator input. That’s institutional malpractice. An operator trust assessment should be weighed alongside other criteria (e.g., cybersecurity or operational test and evaluation) and include both early-stage and iterative operator assessments of AI systems. Whether operators trust a system enough to use it under pressure should start driving the Department of Defense’s acquisition decisions. If Congress wants AI systems to avoid the malaise of most other weapons systems, it should act to ensure the Department of Defense alters its systemic behavior.
Third, Congress should provide flexible acquisition authorities to the military services and combatant commands that give operational units the funding and autonomy to pilot, adapt, and iterate AI capabilities in operational theaters and in field environments. There are already a few oil-spot examples of this going extraordinarily well, and Congress should move to broadly reward operational innovation in the field. Operator trust can’t be fabricated in conference rooms. It’s earned in exercises, deployments, and dirty real-world fighting conditions. Without looser contracting mechanisms and the opportunity to create direct feedback loops with vendors, operators will only ever receive AI systems from the Department of Defense rather than actively shape them.
If We Don’t Break the Cycle, the Cycle Will Break Us
The problems identified here are not unique to AI. But, as the Department of Defense starts answering key questions about how it’s going to approach the acquisition and procurement of AI systems, it has a clean slate to solve the very same problems that have plagued the defense acquisition system for decades: diffuse accountability, risk aversion masquerading as rigor, and the triumph of process over purpose.
If the department doesn’t purposefully reorder the hierarchy of trust to place operators at the center of its AI acquisition strategy, then military AI systems will follow the same tragic arc as countless other systems. They will be over-promised, over-engineered, and under-employed. But this time, the cost won’t just be wasted taxpayer dollars and bureaucratic frustration. This time, the impact will be strategic. If the U.S. military can’t operationalize AI at scale because its warfighters don’t trust the systems they’ve been given, then adversaries who accept more risk by building for the “bleeding edge,” rather than for review boards, will end up seizing the initiative.
However, if the Pentagon seizes this rare opportunity to re-design for operator trust — if investments start to tilt toward field experimentation, iterative development, and mission-tailored model architectures — it could unlock the full potential of AI-enabled warfare for the United States for decades to come. That shift would reshape the entire AI innovation process: requirements writing, budget justification, vendor engagement, milestone reviews, and, ultimately, operational integration. In short, the Department of Defense could finally give its warfighters the acquisition system they deserve.
Morgan C. Plummer is currently a senior policy director at Americans for Responsible Innovation, a non-profit public advocacy group based in Washington. He previously served as a professor of practice at the U.S. Air Force Academy, a defense and security expert at Boston Consulting Group, and a senior defense official at the U.S. Department of Defense. Morgan also served as a U.S. Army officer in various command, staff, and Pentagon assignments and deployed multiple times to Iraq. He can be reached at morgan@ari.us.
**Please note, as a matter of house style, War on the Rocks will not use a different name for the U.S. Department of Defense until and unless the name is changed by statute by the U.S. Congress.
Image: Senior Airman Johnny Diaz via DVIDS.