More than Math: Toward a Better Strategy for Advanced Analytics
If, like the Army’s XVIII Airborne Corps, you plan to use AI when dropping 1,000-pound bombs, you should be sure you are using it correctly. This is why the Department of Defense has been working to incorporate critical thinking, problem-solving, and ethics into AI and machine learning. Following the lead of the commercial sector, the 2020 Department of Defense Artificial Intelligence Education Strategy emphasizes the identification, understanding, and mitigation of bias in AI, as well as the limitations of advanced analytics in the operational environment and the ethical implications of human-machine teaming.
With these goals in mind, our data literacy program at the Joint Special Operations Command’s Intelligence Brigade has identified four critical elements for the effective and responsible use of AI. In our experience, effective AI education and talent management include developing individuals steeped in critical thinking and problem-solving skills, building data science teams rather than pursuing individual talent, using a process that emphasizes letting the problem guide the solution, and understanding the importance of complementary skillsets and roles as they contribute to a rich data science ecosystem.
Elements of Responsible AI
Advanced analytics requires more than coding skills, math, or fluency in specific software. Our data literacy program leverages social science research and commercial best practices to achieve a more holistic approach to AI in the following ways:
First, humans will always be more important than either hardware or software. The professional data science community has rebranded critical thinking, problem-solving, and ethics — traditionally described as “soft skills” — as the new “power skills” necessary to translate data science aspiration into real, measurable competitive advantage. As such, our curriculum is designed to fill the analytic “utility belt” with vital critical-thinking tools. One of the tools we emphasize is identifying biased algorithms. Bias can be compelling, nuanced, and even seductive, especially in very complex or “black box” algorithms when it seems to confirm conventional wisdom. But biased algorithms are wrong, and often backfire as well. Analytically competitive organizations identify and mitigate bias not only because it is the right thing to do, but also because it gives them a competitive advantage.
While examples of algorithmic bias occur with concerning regularity, facial recognition and matching algorithms have received special focus given their extremely poor performance in identifying women and people of color. Researchers recognized that the problem with these models was that they were built using training sets predominantly filled with light-skinned males. Thus, when the models are used to classify a person outside that demographic, their accuracy drops significantly. To ensure that these capabilities are used responsibly and effectively, it is necessary to educate data scientists, methodologists, and end users alike in how to understand model performance and assess the implications of errors.
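One way to make "understanding model performance" concrete is to break accuracy out by subgroup instead of reporting a single overall number. The Python below is a minimal sketch of that idea, not a description of any fielded system: the function name `accuracy_by_group`, the group tags "A" and "B", and the toy data are all hypothetical and made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions, and group tags."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        if truth == pred:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up toy data: group "A" dominates the sample, group "B" is scarce.
y_true = [1, 1, 0, 0, 1, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 1, 1]
groups = ["A", "A", "A", "A", "A", "B", "B", "B"]

# A single overall score hides how unevenly the errors are distributed.
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_group(y_true, y_pred, groups)

print(f"overall accuracy: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group}: {acc:.2f}")
```

In this toy case, the overall score looks respectable while every error falls on the under-represented group, which is the pattern an analyst or end user should be taught to look for.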
Transparent, interpretable AI is all the more important in the operational environment. It might be possible to convince an eager sales associate that your black box model will help them reach their annual sales quota. But it is not acceptable to tell a teammate about to put his or her life into the digital hands of your highly predictive but unintelligible model, “Trust me, model lift over previous iterations was amazing and overall performance of the algorithm looked great, but I have absolutely no idea how it works.” Analysts and users do not need to master the details of backpropagation to work with neural networks, but they should at least be generally aware of what is going on under the hood. Taking a cue from financial services and other regulated industries, the real-world translation of “interpretable AI” is simple: If I cannot understand the model, I cannot explain it. If I cannot explain it to the intended end user, then they are not going to trust or use it.
Playing as a Team
Second, in a competitive organization, data science is approached as a full-contact team sport in which no one rides the bench. Increasing data literacy across an organization builds overall analytic maturity and provides broader lift to enterprise-wide efforts. Conversely, organizations that corral their exquisitely qualified data science “unicorns” in an isolated service center miss opportunities for meaningful end user engagement and participation. Moreover, like the mythical unicorn, individual performers fully skilled in all aspects of data science may not even exist. Commercial experience demonstrates the benefits of embracing this reality and training a workforce that possesses a varied yet complementary set of data science skills. This approach expands organizational breadth, depth, and sustainable capacity. It also mitigates single points of failure associated with a limited supply of exceptionally talented individuals who are notoriously difficult to recruit, hire, and retain.
The U.S. Air Force is addressing this challenge directly through its innovative TRON program. TRON effectively leverages commercially available training programs in software development and data science. This training is reinforced and enriched through structured internships that enable students to apply their newly acquired skills to real-world problems in a semi-supervised environment. Our data literacy program takes a similar approach to training the military and civilian members of the Joint Special Operations Command Intelligence Brigade to ensure that every member of the team has the foundation-level data acumen necessary to not only support but actively participate in enterprise-wide data science efforts. Like TRON, our program also makes additional education available to students demonstrating aptitude and interest as a means to deepen internal data science capacity.
Building a foundation of core data science knowledge across an organization creates an environment for data science talent to grow organically by teaching the workforce to speak a common data science “language.” Data literacy education, as opposed to merely training in specific capabilities, creates a workforce able to seamlessly, rapidly, and meaningfully integrate novel data sources, methods, and technology, including those currently over the horizon. While we can train for the known, we should educate for the unknown. As described in the Department of Defense AI strategy, foundational concepts that are standardized yet flexible set the conditions necessary for successful innovation, collaboration, implementation, and responsible use of advanced analytics.
Have a Problem-Focused Process
Third, analytically mature organizations know that technical proficiency in specific software tools, or “buttonology,” is necessary but not sufficient. Reliance exclusively on training in specific coding languages or technology platforms threatens to create inflexible, brittle capacity that is unable to grow or evolve in response to changing conditions and may snap when stressed. If the only thing you know how to use is a hammer, everything will look like a nail. A better alternative, as outlined in the 2020 Artificial Intelligence Education Strategy, involves developing workflows and processes that let the problem guide the solution rather than a specific analytic technology or tool.
The Cross-Industry Standard Process for Data Mining is the most widely accepted methodology for solving problems within the data science community and serves as the framework for our basic course. Similar to the scientific method, which describes a standard research process and workflow, the Cross-Industry Standard Process for Data Mining operationalizes analytic process best practices. Also like the scientific method, it is not limited to specific sources, methods, or technology. Rather, it provides an effective analytic process model that can be used to answer any question that can be solved by data. We elected to use the Cross-Industry Standard Process for Data Mining as a foundational workflow given its ability to support seamless and effective incorporation of novel capabilities as they become available.
Moreover, a standard process model or checklist can also include “nudges” that prompt analysts to pause, think critically, and check for bias. To return to the previous example, the facial recognition algorithms developed on white males “worked” until they did not. Even technologically savvy end users continued to use these models until it became apparent that they did not perform well against individuals outside the narrow range of training data. Analytic process models or checklists like the Cross-Industry Standard Process for Data Mining that include explicit nudges to evaluate model performance reinforce the fact that all models have errors, particularly as real-world application extends beyond the training cases. Developing a solid foundation in critical thinking and reproducible methods, in combination with a stable analytic workflow, will not catch everything. However, by building in checks for bias and errors, this approach will at least cue critical thinking in support of error identification, mitigation, and consequence management. In our experience, using a common data science process model and language builds capacity that can transcend service branch, individual role, and even intelligence discipline, setting conditions for mission success in the joint operating environment.
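A nudge of this kind can be as simple as an automated reminder inside the workflow. The Python below is a hedged sketch under stated assumptions, not part of the Cross-Industry Standard Process for Data Mining itself: the function name `coverage_nudges`, the group tags, and the 10 percent `min_share` threshold are hypothetical choices made only for illustration.

```python
from collections import Counter


def coverage_nudges(train_groups, deploy_groups, min_share=0.10):
    """Return reminder strings for groups seen in deployment but scarce in training.

    min_share is an assumed, illustrative threshold, not an established standard.
    """
    counts = Counter(train_groups)
    n_train = len(train_groups)
    nudges = []
    for group in sorted(set(deploy_groups)):
        share = counts[group] / n_train if n_train else 0.0
        if share < min_share:
            nudges.append(
                f"Pause: group '{group}' is {share:.0%} of training data; "
                "review model performance before relying on results."
            )
    return nudges


# Made-up example: training data is dominated by group "A".
train_groups = ["A"] * 19 + ["B"]
deploy_groups = ["A", "B", "C"]

nudges = coverage_nudges(train_groups, deploy_groups)
for message in nudges:
    print(message)
```

The check does not fix anything on its own. It only cues the analyst to stop and ask whether the model is being applied beyond the cases it was trained on.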
Create an Ecosystem
Finally, every member of our teams should be educated as an informed consumer. While putting fingers to keyboard to write code is not for everyone, the Department of Defense’s AI strategy includes nontechnical and less technical roles, or “AI Workforce Archetypes,” who are increasingly required to procure, manage, field, and adopt progressively sophisticated analytic capabilities. In addition to these archetypes, our program accounts for the increasing significance of the “analytic translator” role. Like a data science “utility player,” the translator trades depth of knowledge for breadth and associated domain expertise, which allows them to competently fill multiple roles across the organization and provide continuity with data science efforts. This enables them to serve as important bridge-builders who can identify the mathematical “word problem” embedded in a thorny business challenge and translate it into actionable data science requirements. They can also ensure the results meet the end user needs and are operationally relevant and actionable.
Again, data science is a team sport. Not only does everyone have a role, but even those in supporting positions need to understand the playbook. Everyone on our team needs the data literacy necessary to ask the right questions in order to effectively, responsibly, and ethically use advanced analytic capabilities. General understanding of model creation, validation, and associated assumptions can inform responsible use, including checks for bias as well as identification and mitigation of errors. The Army’s XVIII Airborne Corps has taken this concept to the next level with its innovation challenge, the Dragon’s Lair. Like the popular television program “Shark Tank,” the Dragon’s Lair provides a forum where individuals can identify real-world problems and pitch solutions. Teaching the workforce data literacy promotes this type of innovation by developing the ability to identify and describe the “word problem,” generate actionable requirements, and provide meaningful feedback on proposed solutions.
Ultimately, data science is, at its foundation, math, not magic. But math can still be difficult to use effectively, responsibly, and ethically. The 2020 Department of Defense AI Education Strategy can serve as a starting point for doing so. In particular, its inclusion of and emphasis on “non-math” skills provides access to the nontechnical and less technical members of the team and enables the military to realize the promise of advanced analytics. This involves setting the conditions for novel insight and understanding in support of meaningful solutions to some of the hardest problems our nation faces.
Lt. Col. James “Mike” Blue is a career Army intelligence officer with multiple deployments in Iraq and Afghanistan. He earned his bachelor’s in history from George Mason University, his master’s in intelligence studies from American Military University, and a master’s in strategic intelligence from the National Intelligence University.
Lt. Col. Anthony Smith is an operations research systems analyst in the Army. He deployed multiple times to Iraq and Afghanistan as a commander with conventional and Ranger units and was also a data analyst at Army Futures Command. He earned a master’s in operations research from the Naval Postgraduate School and a bachelor’s in management from the U.S. Military Academy.
Colleen McCue, Ph.D., supports the special missions community. She is a principal data scientist with CACI International and a CACI Fellow. She earned her Ph.D. in psychology from Dartmouth College and completed a five-year postdoctoral fellowship at the Medical College of Virginia, Virginia Commonwealth University.