In the realm of AI, understanding is the new currency. The AI space is on the brink of disruption, but not the kind we normally hear about. New technologies, frontier AI and foundation models are undoubtedly on the horizon. Now, however, a disruption of a different kind is arriving: regulation.

The regulation revolution 

In recent months, from the Five Eyes meeting in Silicon Valley to the safety summit at Bletchley Park, we have seen world leaders start to respond to the AI revolution and take steps towards regulation at a global level. The Bletchley Declaration was signed by all 28 governments in attendance. It states that “AI should be designed, developed, deployed, and used, in a manner that is safe, … human-centric, trustworthy and responsible” and, in listing some of the key risks, says “explainability, fairness, accountability, regulation, … privacy and data protection need to be addressed.” This call for transparency has propelled explainable AI to the forefront of discussion.

AI is also only as good as the humans who use it. Misinterpretation of model outputs can have serious consequences: consider healthcare or the justice system. The limits of AI are dictated by the people who interact with it, not just at the development stage but at every level. For AI to be used responsibly and trusted, people need to be educated to understand these limits and to carefully review and question its outputs.

Decoding explainability 

Explainable AI involves techniques for ensuring that outputs and AI-made decisions can be understood. It broadly consists of three main methods: decision understanding, prediction accuracy, and traceability. Related concepts, such as interpretability, observability and monitoring, also underpin discussions around explainable AI. These ideas are intrinsically linked, with considerable overlap in their applications and implications; discussion of any one invariably touches on aspects of the others.

Decision understanding comes from educating those interacting with AI, teaching them how to comprehend the reasoning process of a model. Models can be designed to facilitate this: a decision tree, for example, can output the sequence of decisions that leads to its final prediction. Prediction accuracy measures how well a model’s predictions match true outcomes. Traceability can be achieved through transparency about how models weight certain parameters relative to others, or by intentionally building models that, by design, can be audited by people.
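To make this concrete, the sketch below shows decision understanding and traceability for a simple model. It assumes the scikit-learn library and its bundled iris dataset, chosen purely for illustration: the tree prints the sequence of splits behind its predictions, and the feature importances show how heavily the model weights each input.

```python
# A minimal sketch, assuming scikit-learn and its bundled iris dataset.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(iris.data, iris.target)

# Decision understanding: the human-readable sequence of splits the tree
# follows to reach each prediction.
print(export_text(tree, feature_names=list(iris.feature_names)))

# Traceability: how heavily the model weights each input feature.
for name, importance in zip(iris.feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.2f}")
```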

Balancing precision and interpretability 

The relationship between prediction accuracy and explainability is complex, and hinges on the quality of training. Deep neural networks can model intricate patterns thanks to their many degrees of freedom, but if poorly trained they are inaccurate. Their complexity has led to their widespread characterisation as “black boxes”: while capable of making accurate predictions, they shed little light on how decisions are arrived at. More interpretable models, such as linear regression or decision trees, are less capable of modelling complex patterns with high accuracy, but they offer more explainability. In some cases, sacrificing some accuracy for greater explainability may be the best option, especially in areas like healthcare where understanding decision-making is crucial. This trade-off is entirely context-specific, so companies requiring AI solutions should seek bespoke strategies that align with their particular needs.
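As a rough illustration of this trade-off, the sketch below compares an interpretable logistic regression, whose coefficients can be read directly as feature effects, against a small neural network that may score higher but offers no comparable explanation. It assumes scikit-learn and a synthetic dataset, with all parameter choices hypothetical.

```python
# A minimal sketch of the accuracy/interpretability trade-off, assuming
# scikit-learn and a synthetic dataset; all parameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Interpretable model: each coefficient can be read as a feature's effect.
linear = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# More flexible but opaque model: accuracy may improve, explanations do not.
mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                    random_state=0).fit(X_train, y_train)

print("logistic regression:", accuracy_score(y_test, linear.predict(X_test)))
print("neural network:     ", accuracy_score(y_test, mlp.predict(X_test)))
```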

Many AI systems are inherently difficult to explain. Deep learning models, comprising many layers and the transformations between them, are not easily traceable. The exact way decisions are arrived at can be elusive, buried in their nonlinear nature and the interplay of weights and biases. Large language models (LLMs), which underpin popular AI tools such as chatbots, are trained on immense amounts of data. Whilst this delivers an astonishing capacity to mimic human language, their sheer scale makes them virtually impenetrable. Understanding the influence of each individual data point within these colossal models is akin to finding a needle in a cosmic haystack. At least for the moment, they are not interpretable in any sustainable way.

People all the way down 

A recent case study from Google DeepMind finds that LLMs struggle to self-correct their responses without external feedback. The success of self-correction is strongly linked to the nature of the task at hand, and self-correction techniques typically succeed only with the addition of external sources such as human feedback. The study underlines the importance of leveraging human feedback, and emphasises that AI is people all the way down.

This is why techniques for transparent and trustworthy AI must be applied at every level of interaction: design, building, training, deployment, and post-deployment. On top of this, systems need to be continuously updated to keep them safe, relevant and useful. Explainability techniques need to be an ongoing support service, not a one-off event. Robust and repeatable tests, combined with evaluation suites that track model shifts, can ensure new data doesn’t introduce bias, inconsistency, or inaccuracy.
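A minimal sketch of one such repeatable check is given below. It assumes numpy arrays, a scikit-learn-style model, and hypothetical thresholds; the point is simply that the same evaluation suite is re-run after every model update so that shifts are caught early.

```python
# A minimal sketch of a repeatable evaluation check, assuming numpy arrays,
# a scikit-learn-style model, and hypothetical thresholds.
import numpy as np
from sklearn.metrics import accuracy_score

ACCURACY_FLOOR = 0.90   # hypothetical minimum acceptable overall accuracy
MAX_GROUP_GAP = 0.05    # hypothetical limit on the accuracy gap between subgroups

def evaluation_suite(model, X_eval, y_eval, groups):
    """Run the same checks after every retrain so model shifts are caught early."""
    overall = accuracy_score(y_eval, model.predict(X_eval))
    per_group = {}
    for g in np.unique(groups):
        mask = groups == g
        per_group[g] = accuracy_score(y_eval[mask], model.predict(X_eval[mask]))
    gap = max(per_group.values()) - min(per_group.values())
    assert overall >= ACCURACY_FLOOR, f"overall accuracy regressed to {overall:.3f}"
    assert gap <= MAX_GROUP_GAP, f"subgroup accuracy gap widened to {gap:.3f}"
    return overall, per_group
```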

Smith Institute empowers people to understand AI through explainability and AI assurance. We can assess, critique, and provide recommendations on the use of AI systems. Our strength across maths, statistics, machine learning and data science makes us ideally positioned to offer an objective view of how systems work, whilst also shining a light on their limitations.

We work with clients to evaluate AI decision-making agents, helping them understand which interpretability techniques can be used to explain decision-making processes. Beyond this, our award-winning IV&V procedures check that systems meet requirements and specifications and fulfil their intended purpose. We intelligently stress-test algorithms so that clients can rely on them to function with accuracy and equity. We also audit data-driven algorithms, probing their behaviours with novel data to check for overfitting or bias.
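As one illustration of the kind of overfitting probe described above (a sketch, not our actual audit procedure), the snippet below compares a model’s performance on its own training data with its performance on genuinely novel data; a large gap suggests memorisation rather than generalisation. The function name and gap threshold are hypothetical.

```python
# A hypothetical overfitting probe: compare training performance with
# performance on novel data. Assumes a scikit-learn-style model.
from sklearn.metrics import accuracy_score

def overfitting_gap(model, X_train, y_train, X_novel, y_novel, max_gap=0.05):
    """Return the train-novel accuracy gap and whether it breaches the threshold."""
    train_acc = accuracy_score(y_train, model.predict(X_train))
    novel_acc = accuracy_score(y_novel, model.predict(X_novel))
    gap = train_acc - novel_acc
    # A large gap suggests the model has memorised its training data rather
    # than learned patterns that generalise to unseen inputs.
    return gap, gap > max_gap
```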

People use what they understand and trust. Companies that provide clarity about the origins and logic behind their AI will not only earn the trust of users within and outside their organisation, but also gain support from regulators, consumers, and stakeholders. The escalating demand for openness will enhance credibility and pave the way for advancements in this space to continue safely.