How can we trust decisions made by AI?
AI has found itself at odds with the human need for reassurance. Trusting decisions made by AI and machine learning (ML) algorithms can be difficult when those decisions occur in a “black box,” an inscrutable process that takes place without human supervision. As AI-ML becomes more widespread, pressure is growing to make sure these decisions are better explained and understood. But seeking this reassurance is about much more than warm feelings. Many who wish to adopt AI-ML cannot afford to be wrong, notably when it comes to matters of national security. How can we trust decisions made by AI-ML? To learn more, we welcome Dana Moore, author, lecturer, and Principal Engineer at Leidos.
Q: What’s the big challenge when it comes to trusting AI decisions?
Moore: We’re on the cusp of a new age of AI applications, from vehicular autonomy to guided surgeries. Machine learning and deep learning are the critical technologies in this explosion of AI-guided applications, but these models can be opaque at best—tough for people to understand. We want to understand the logic path these models follow to make decisions. We want to trust them and know why errors occur in order to understand how to correct them. We also want models that are provable. If you can demonstrate that a model is not wholly provable, then you can most likely demonstrate it is not valid, or that its security parameters have been breached.
Q: Why is it so important to trust decisions made by AI?
Moore: Trust is important to those who can’t afford to be wrong. If you have a model that plays a board game against a human opponent, for example, you can afford to be wrong. But if you’re working with self-driving vehicles, you simply can’t. Trust and validity are important in many other domains as well, including medicine and national security. It’s very concerning when the penalty for being wrong might be undue loss of human life or unlawful detention or prosecution, or in any case where poor recommendations might have dire consequences. We get strange recommendations from Amazon all the time, even with years of purchasing data in those models, but these have no discernible effects on life or liberty. Apply the same faulty recommendation engine to a “person of interest” scenario, for example, and lives and livelihoods are at stake.
Q: Trust and validity seem to be weak links right now in AI-ML. Why is this the case?
Moore: Provably correct code is not easily attained. Many AI-ML systems are simply too complex to demonstrate that a given outcome is provable. The best we can do in many cases is make a decent effort at demonstrating suitability to task and robustness against failure. But there are constraints even to this. It takes time and money to validate AI decisions, which means less-than-perfect (and generally unprovable) systems get deployed.
In traditional computer programming, you can trace outcomes and get a fairly good idea of the inner workings of those programs. This came from a healthy skepticism in software engineering. Developers took cues from other engineering practices like structural engineering, and developed the ability to trace through code. Stack tracing is now ubiquitous, so that when you have a programming error, you can trace it back to its point of origin. Those kinds of practices grew up in software engineering, but they’re not yet part of AI-ML. A more realistic goal is the idea of “explainable AI,” or the ability of AI-ML systems to explain their decisions to humans.
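For readers unfamiliar with stack tracing, here is a minimal Python sketch of what it means to trace an error back to its point of origin. The function names (parse_reading, average, trace_failure) and the sample data are hypothetical, chosen only for illustration; they do not come from the interview.

```python
import traceback


def parse_reading(raw):
    # Hypothetical helper: raises ValueError when raw is not numeric.
    return float(raw)


def average(readings):
    # Hypothetical caller: converts each reading, then averages them.
    values = []
    for raw in readings:
        values.append(parse_reading(raw))
    return sum(values) / len(values)


def trace_failure(readings):
    """Return the function names on the call path to a failure, outermost first.

    Returns an empty list when no ValueError is raised.
    """
    try:
        average(readings)
    except ValueError as exc:
        # extract_tb walks the traceback from the catching frame
        # down to the frame where the exception was raised.
        frames = traceback.extract_tb(exc.__traceback__)
        return [frame.name for frame in frames]
    return []


path = trace_failure(["1.5", "2.5", "oops"])
print(" -> ".join(path))
```

The last name in the path is the function where the error originated. A trained model generally offers no comparable record of how it arrived at a given output, which is the gap explainable AI tries to narrow.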
Q: What’s the holy grail of explainable AI?
Moore: First, I understand why. Second, I understand why not. Third, I know when it will succeed. Fourth, I know when it will fail. Fifth, I know when to trust it. Finally, I know why it erred. These may turn out to be necessary but not sufficient constraints. That is, there may also be other elements that support explainability and model validation. Consider that even though these requirements may be met, there may still exist reasons why models or predictions are poor.
Q: If an AI decision can’t be fully provable, what does “provable enough” look like?
Moore: Mathematical equations are provable, but machine learning and deep learning models aren’t necessarily provable. There are certain output metrics that indicate precision and accuracy. What you really want are measures of performance and effectiveness from a model so that you can have confidence in its validity. If an AI system is well constituted and trained, has algorithms for prediction evaluation, and demonstrably produces reasonably high-quality, true-positive results, then that model may be suited for its purpose. But there’s still a vital role for curation and feedback with regard to training data selection, feature selection, and reasoning algorithms.
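As a rough illustration of the output metrics mentioned above, the following Python sketch computes accuracy, precision, and recall for a binary classifier from its true and predicted labels. The function names and the toy labels are assumptions made for this example only, and which metrics count as "provable enough" depends on the application.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif not pred and truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return accuracy, precision, and recall; 0.0 when a ratio is undefined."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of positive predictions that were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data, invented for illustration: 3 actual positives, 5 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = summarize(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is higher than precision, a reminder that a single headline number can hide the false positives that matter most in high-stakes settings.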
Q: Going forward, what will it take to counter the growing discomfort about our ability to trust AI decisions?
Moore: The big secret about AI, machine learning, and deep learning is that 80 to 90 percent of the work is in the curation and preparation of the data that goes into a model. The better the job you do in that regard, the better the models and predictions you come up with. As a consequence, systems begin to build a “track record” of trustworthiness, sensible decisions, and apparent consistency. The good news is that explainable AI is a hot topic of pursuit in both academia and industry. There’s near-universal agreement that explainable AI and allaying mistrust are of paramount importance.
Q: What has led to this consensus?
Moore: Trusting AI has become a bit of a sensitive subject in an age when we hear so much about AI taking on human-like decision-making roles in an ever-widening range of activities. In many cases, like operating a motor vehicle, we have always restricted these activities to the human domain. But the truth is that very few decisions are ever made by the seat of our pants in the modern era. For the most part, elaborate combinations of hardware and software systems support our every decision. Despite flaws and imperfections, we may have no alternative to relying on AI in the decision-making process. This, in turn, will lead to serious efforts to make sure we cover all the bases in building trust in AI-ML decision making.