What’s Wrong with Wanting a “Human in the Loop”?
At the recent conference on the ethics of AI-enabled weapons systems at the U.S. Naval Academy, well over half the talks discussed meaningful human control of AI to some extent. If you work among the AI ethics community, and especially among those working on AI ethics and governance for the military, you are hard-pressed to find an article or enter a room without stumbling on someone literally or metaphorically slamming their fist on the table while exalting the importance of human control over AI and especially AI-enabled weapons. No meeting or paper on the ethics of AI-enabled weapons is complete without stressing the importance of having a human in the loop, whether in the now outdated sense of meaningful human control, or in the recently more popular sense of appropriate human judgment. It often seems like everyone agrees that having human control over AI weaponry is a good thing. But I am not so sure that “meaningful human control over AI” is the panacea everyone seems to make of it.
Arguments in favor of meaningful human control of AI-enabled weapon systems usually focus on safety, precision, responsibility, and dignity. Centrally, proponents of human control over AI-enabled weapons systems don’t think that lethal targeting decisions should be left to AI. This is why the examples used to stress the importance of meaningful human control often focus on weapons systems that use AI for targeting decisions — systems like Collaborative Operations in Denied Environments (CODE) or HARPY. According to publicly available information, and Paul Scharre’s description of it in the book Army of None, CODE’s purpose is to develop “collaborative autonomy — the capability of a group of unmanned aircraft systems to work together under a single person’s supervisory control.” This control can take several forms depending on whether the system is operating in a contested electromagnetic environment (more contested means more reliance on autonomous features). Usually, the human operator gives high-level commands like “orbit here,” “follow this route,” or “search and destroy within this area.” In cases of search-and-destroy missions, once the airborne vehicles find enemy targets, “they cue up their recommended classification to the human for confirmation,” Scharre reports. In addition, after target confirmation, the system asks for authorization to fire. This means that there are at least three places where a human exerts control over the system: first when drawing the box around the area the drones should search for targets, next when confirming the target, and finally when accepting the plan of attack. Proponents of meaningful human control see this as a great example of leveraging all that is good about AI, while assuring human control — thus minimizing accidents (assuring safety) and thus also identifying who to hold responsible when things go wrong (assuring responsibility assignment).
On the other end of the spectrum, those who question human control as a solution to potential problems with AI often do so by pointing to the speed or complexity of data processing that justifies the use of AI in the first place and that makes meaningful human oversight impossible. In other words, one of the most significant benefits of AI is that it can process information and act faster than humans, and that it can extract information from patterns that we humans cannot recognize. In many such cases, suggesting that a human can provide meaningful oversight seems problematic, since the AI is doing these things precisely because humans cannot. But, of course, that doesn’t mean (proponents of human control would say) that we can’t have appropriate human judgment somewhere in the life cycle of AI, at least in the development and testing stage, or in the deployment or fielding stage. And that seems reasonable to me. Having meaningful human control or appropriate human judgment need not be about ensuring that there is a person pressing a button in the last stage of decision-making (as with CODE). It is about ensuring that we use our incredible human cognition and moral judgment to secure safety and accuracy, and it is about ensuring that someone is responsible when things go wrong.
Consider the Aegis weapon system, which has been around since the 1970s. Aegis uses a high-powered radar to search, track, and engage targets, and it can do so for over 100 targets simultaneously. Control of Aegis takes the form of the commander picking and choosing various “doctrines” for Aegis, mixing and matching different control types for different anticipated threats. Since Aegis can operate at four different levels of autonomy, a commander might choose one level of autonomy for one type of possible threat and a more human-in-the-loop setting for another, depending on factors such as the geographic area the ship is operating in. This “translates” the commander’s intent into Aegis’ behavior without the commander having to make every single decision. Simply put, appropriate human judgment can take different shapes depending on the AI we are trying to govern and why we want such human control in the first place (e.g., safety, ability to assign responsibility, and dignity).
Even though I am sympathetic to the claim that meaningful human control or appropriate human judgment can take many forms, ultimately, we must acknowledge that there will be times when such control is not possible, or when such control is an illusion that distracts us from other solutions to genuine worries about the use of AI for life-or-death decisions. I consider three arguments meant to jointly illustrate that human control of AI is not the holy grail of safe AI.
To start, there will be times when having a human in the loop, as a matter of empirical fact, works less well than not having one. Consider, for example, AI-enabled weapon systems that are meant to respond or engage at superhuman speeds, like ship self-defense systems (when operating in autonomous mode). Or consider cases of cooperative or collaborative autonomy, like CODE, which are hard for a human to interpret because hundreds of drones are sharing information and changing their behavior in real time based on information coming in from all these sources. There is a fact of the matter about whether these systems work more effectively with a human in the loop or without one. If an AI-enabled weapon system works better without a human in the loop, we are going to have a very hard time justifying the decision to keep a human in the loop for safety’s sake (which tends to be the primary reason given by those who want a human in the loop). Thus, the insistence on humans picking or approving targets in systems like CODE or DARPA’s Target Recognition and Adaption in Contested Environments (TRACE) might at times be misplaced if it is supposedly aimed at safety and accuracy.
But these issues of speed, explainability (i.e., the ability of the operator to understand why the system made the decision it did), and interpretability are not the only worries when it comes to positioning meaningful human control and appropriate human judgment as solutions to the problems that ail AI. A potentially more serious problem is that in many cases such human judgment is a figment of our imagination. In fact, even when there is sufficient explainability and time, it is possible that human oversight is illusory. Consider CODE once again. CODE, at least in theory, has three places where human control can be exerted in search-and-destroy missions: the box drawing, the target confirmation, and the acceptance of the attack plan. But consider how something like CODE gets fielded, tested, and evaluated: with trained and certified operators. Systems that are meant to have meaningful human control or a human in the loop are tested and evaluated (and rightly so) as socio-technical systems (with operators in place, obviously). When that socio-technical system doesn’t respond with the right level of safety or accuracy, something must be changed, and that something is often the user interface or the way data is presented to the operator. As we fine-tune an AI-enabled weapon system to get the right level of accuracy, we are fine-tuning not just the code, but also the way data gets presented to and taken up by the operator. Whether this is done through changes in the user interface or through training, the fact is that developers will continue to make changes to the system until data is presented in a way that maximizes operator compliance with the “right outcome.” That in turn raises significant questions about how much control the human is “meaningfully” exerting over the system when the system has been fine-tuned to make it easier for the human to do less and to accept the machine’s judgment.
To further illustrate this, imagine a targeting system that identifies objects in the field and provides the tactical action officers and the commanding officer with the likelihood that a certain object is a legitimate target. (This would thus be both an object recognition system and an automated decision support system, because it would be advising tactical action officers whether the object is a legitimate target.) This algorithm could be an ordinary supervised learning model trained on thousands or (better yet) hundreds of thousands of images and other sources of data in various contexts, resulting in superior-to-human identification of objects as legitimate targets. Now imagine that during testing of this algorithm the testing staff consistently fails to trust the algorithm in certain contexts, for example, when the object recognized as a “legitimate target” is next to another object that looks like a school bus (but is not), or whenever there are flashing lights in the right-hand corner of the screen. Now further imagine that when we remove the information that seems to be misleading the testing staff, they become much more likely to rely on the object recognition algorithm, and to do so consistently and (as judged post hoc) correctly. This casts doubt on the extent to which one can exert “meaningful” control over AI so tested.
The point I am making here is rather simple: When algorithms fail in the field they sometimes fail for technical reasons (e.g., not enough data or poor fit), but more often they fail because of human-machine interaction problems. When that happens, we identify why the interaction is problematic, why the human is not trusting the machine, or why the way the data is presented is being misunderstood, and then we change those things. Such changes should make us question to what extent human oversight of algorithms is truly meaningful.
Finally, consider the primary focus of most discussions about meaningful human control: to guard against full autonomy (thus ensuring, at the least, that human dignity is respected and that it remains possible to assign responsibility). So what is this autonomy we are trying to guard against? While there is a lot of debate about what makes an AI system autonomous, one key worry is a system that could pick out its own targets. In cases when the operator picks out a target and chooses to engage it, that is not autonomy of the worrisome kind. Closer to “worrisome autonomy,” but not quite there, are weapons systems like the Long Range Anti-Ship Missile. This missile is capable of autonomously avoiding incoming threats (it can change course in response to a threat), and it can choose to continue to the target in the last few moments before the attack if the connection to command is lost. But the kind of autonomy that proponents of meaningful human control are most worried about is the kind where the weapon can pick out its own target, like HARPY. HARPY is an Israeli “fire-and-forget” weapon that is programmed before launch to loiter in a predetermined area and to search for and destroy radiating targets (radar). HARPY does not target humans (although there are some worries about its inability to engage in collateral harm assessments). But now imagine a system similar to HARPY, a “fire-and-forget” system that targets enemy combatants in an area or enemy ships (to use an example that minimizes the collateral risk). Proponents of meaningful human control do not believe it should ever be left up to a machine to target humans.
I think there are good reasons to question this. Specifically, I am not persuaded that there is a significant moral difference between an operator identifying a single human target based on some data and a human operator drawing a box and defining targets within that box based on equally good data. Consider targeting person X based on information that they are in known enemy territory, are carrying a rocket launcher, are approaching a friendly base, and are wearing the enemy uniform. Now consider doing that same thing with an autonomous base defense system using AI to identify potential threats meeting those same conditions. Again, if the system is better or safer as a matter of empirical fact with a human in the loop (i.e., a human confirming each target), then obviously we ought to have a human in the loop. The main question I pose here is whether, in cases when the system works better or as well without a human operator, we have moral reasons to place a human in the loop. I am not persuaded that there is a moral difference between picking out a single target using certain information and picking out multiple targets by using the same type and quality of information to provide definite descriptions of those targets. Any moral difference arises, in my opinion, from empirical facts about what works better, not from how far removed the operator is from the final decision.
Let me be clear: We need to strive for exactly the things that proponents of meaningful human control strive for: safety and minimizing unnecessary deaths and harm to civilians. My worry is that we will end up with pretend human control that will not in fact solve the problems we are trying to solve. Meaningful human control often doesn’t get us safety, dignity, or oversight, but only an appearance of those things. So we have to consider whether there are better ways to govern AI-enabled weapon systems, or whether there are cases when we should simply not use AI (e.g., killer robots using facial recognition). The illusion that we must have, and by implication can have, meaningful human oversight is dangerous to our ability to ethically assess AI-enabled weapons. We should be very careful in asserting which problems meaningful human control can solve, because in overstating the extent to which meaningful human judgment is the solution to what ails AI and AI-enabled weapons, we are underselling alternative solutions to genuinely serious problems.
Jovana Davidovic is an associate professor of philosophy at the University of Iowa, where she also holds a secondary appointment at the Center for Human Rights and the Law School. Davidovic is a Senior Research Fellow at the Stockdale Center for Ethical Leadership at the United States Naval Academy and the Chief Ethics Officer for BABL AI, an algorithmic auditing and algorithm impact assessment consultancy.