INSIGHTS

A Call for Explainability: The Path to Trustworthy AI

By Dr Madeleine Hall

In the realm of AI, understanding is the new currency. The AI space is on the brink of disruption, but not the kind we normally hear about. New technologies, frontier AI and foundation models are undoubtedly on the horizon. Now, though, a disruption of a different kind has arrived: regulation.

The regulation revolution

2024 has been a landmark year for AI regulation.  

The EU AI Act has been in force since August. Member states must have identified their bodies responsible for fundamental rights protection by November 2024. In February 2025, prohibitions on certain AI systems begin. Providers of high-risk AI face substantial obligations, from risk management to human oversight.  

Across the Channel, the UK joined other signatories to the Council of Europe's Framework Convention on AI last month. This framework demands comprehensive measures to identify, assess, and mitigate AI risks. It emphasises oversight, like the EU Act, as well as digital literacy and public consultation.  

Meanwhile, on the other side of the Atlantic, California's Governor Newsom vetoed a landmark AI safety bill. With 32 of the world's top 50 AI companies calling California home, Newsom is advocating for a nuanced approach. He argues for context-based regulation that considers risks from all AI models, not just large-scale ones. Newsom warns that regulation can create a false sense of security and stifle innovation, and calls for adaptable oversight as the technology rapidly evolves.

While regulatory approaches differ across the globe, they all underscore a crucial point: AI's impact on society is too significant to ignore. Whether through comprehensive legislation or nuanced, context-based regulation, policymakers want to harness potential while mitigating risks. But regulation alone is not enough. It's increasingly clear that understanding and responsibly using AI is not just a task for developers and policymakers, but for society as a whole. 

AI is also only as good as the humans who use it. Misinterpretations of model outputs can have serious consequences – consider healthcare or the justice system. The limits of AI are dictated by the people who interact with it, not just at the development stage, but at every single level. For AI to be used responsibly and trusted, people need to be educated to understand these limits and to carefully review and question its outputs.

Decoding explainability

Explainable AI involves techniques for ensuring that outputs and AI-made decisions can be understood. It broadly consists of three main elements: decision understanding, prediction accuracy, and traceability. Related concepts, such as interpretability, observability and monitoring, also underpin discussions around explainable AI. These ideas are intrinsically linked, with overlapping applications and implications; discussion of any one invariably touches upon aspects of the others.

Decision understanding comes from educating those interacting with AI, teaching them how to comprehend the reasoning process of a model. Models can be designed to facilitate this: a decision tree, for example, can output the sequence of decisions that leads to its final prediction. Prediction accuracy measures how well a model’s predictions match true outcomes. Traceability can be achieved through transparency on how models weight certain parameters relative to others, or by intentionally building models that, by design, can be audited by people.
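
To make this concrete, here is a minimal sketch of the decision tree example above, using scikit-learn and its built-in Iris dataset (both illustrative choices on our part, not tools named in this article). It prints the full set of rules the tree has learned, and then the exact path of decisions behind one prediction.

    # Minimal sketch: a decision tree that reports the sequence of decisions
    # behind each prediction. Library and dataset are illustrative choices.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    X, y = iris.data, iris.target

    # A shallow tree keeps the learned rules short enough for a person to read.
    clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

    # Global view: the full set of if/then rules the model has learned.
    print(export_text(clf, feature_names=list(iris.feature_names)))

    # Local view: the path of nodes a single sample passes through.
    sample = X[[0]]
    path = clf.decision_path(sample)
    leaf = clf.apply(sample)[0]
    for node in path.indices:
        if node == leaf:
            predicted = iris.target_names[clf.predict(sample)[0]]
            print(f"leaf {node}: predict '{predicted}'")
        else:
            feat = clf.tree_.feature[node]
            threshold = clf.tree_.threshold[node]
            value = sample[0, feat]
            comparison = "<=" if value <= threshold else ">"
            print(f"node {node}: {iris.feature_names[feat]} = {value:.2f} {comparison} {threshold:.2f}")

Traces like this are exactly the kind of output that supports decision understanding: a domain expert can check each step against their own reasoning.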


Balancing precision and interpretability

The relationship between prediction accuracy and explainability is complex, and hinges on the quality of training. Deep neural networks can model intricate patterns due to their many degrees of freedom, but if poorly trained they are inaccurate. Their complexity has led to their widespread characterisation as “black boxes.” While capable of making accurate predictions, they shed little light on how decisions are arrived at. More interpretable models like linear regression or decision trees are less capable of modelling complex patterns with high accuracy, but they offer more explainability. In some cases, sacrificing some accuracy for greater explainability may be the best option, especially in areas like healthcare where understanding decision-making is crucial. This trade-off is entirely context specific, and thus companies requiring AI solutions should seek bespoke strategies which align with their specific needs.  
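
As an illustrative sketch of this trade-off (using scikit-learn and its breast cancer dataset, which are assumptions of ours rather than examples from this article), the snippet below compares a linear model whose coefficients can be read directly with a small neural network that offers no comparable window into its reasoning.

    # Illustrative sketch of the accuracy/interpretability trade-off.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    data = load_breast_cancer()
    X, y = data.data, data.target

    linear = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    neural = make_pipeline(StandardScaler(),
                           MLPClassifier(hidden_layer_sizes=(64, 64),
                                         max_iter=2000, random_state=0))

    # Compare predictive performance under cross-validation.
    for name, model in [("logistic regression", linear), ("neural network", neural)]:
        score = cross_val_score(model, X, y, cv=5).mean()
        print(f"{name}: mean cross-validated accuracy = {score:.3f}")

    # The linear model's coefficients give a direct, global view of which
    # (standardised) features push predictions towards each class; the
    # neural network exposes nothing comparable.
    linear.fit(X, y)
    coefficients = linear.named_steps["logisticregression"].coef_[0]
    ranked = sorted(zip(data.feature_names, coefficients),
                    key=lambda pair: abs(pair[1]), reverse=True)
    for feature, weight in ranked[:5]:
        print(f"{feature}: {weight:+.2f}")

Which side of this trade-off to favour is, as above, a question for the specific context and the cost of an unexplained error.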

Many AI systems are inherently difficult to explain. Deep learning models, comprising many layers and the transformations between them, are not easily traceable. The exact way decisions are arrived at can be elusive, buried in their nonlinear nature and the interplay of weights and biases. Large language models (LLMs), which underpin popular AI tools such as chatbots, are trained on immense amounts of data. Whilst this delivers an astonishing capacity to mimic human language, their sheer scale makes them virtually impenetrable. The task of understanding the influence of each individual data point within these colossal models is akin to finding a needle in a cosmic haystack. At least for the moment, they are not interpretable in any meaningful way.

People all the way down

A recent case study from Google DeepMind finds that LLMs struggle to self-correct their responses without external feedback. The success of self-correction is strongly linked to the nature of the task at hand. Self-correction techniques typically only succeeded with the addition of external sources, such as human feedback. This study underlines the importance of leveraging human feedback, and emphasises the fact that AI is people all the way down.  

This is why techniques for transparent and trustworthy AI must be applied at every level of interaction: from design and building, through training and deployment, to post-deployment monitoring. On top of this, systems need to be continuously updated to keep them safe, relevant and useful. Explainability needs to be an ongoing support service, not a one-time event. Robust and repeatable tests, combined with evaluation suites to track model shifts, can ensure new data doesn’t introduce bias, inconsistency, or inaccuracy.
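
A minimal sketch of what one such repeatable check might look like is given below; the thresholds, metrics and function names are illustrative assumptions, not a process prescribed in this article.

    # Sketch of a repeatable evaluation check, run whenever a model is
    # retrained or new data arrives. Thresholds and metrics are illustrative.
    import numpy as np
    from sklearn.metrics import accuracy_score

    BASELINE_ACCURACY = 0.92   # recorded when the current model was approved (hypothetical)
    MAX_ACCURACY_DROP = 0.02   # tolerated degradation before triggering a review
    MAX_GROUP_GAP = 0.05       # tolerated accuracy gap between subgroups

    def evaluate_candidate(model, X_holdout, y_holdout, group_labels):
        """Compare a retrained model against the approved baseline on a fixed holdout set."""
        y_holdout = np.asarray(y_holdout)
        predictions = np.asarray(model.predict(X_holdout))
        accuracy = accuracy_score(y_holdout, predictions)

        # Per-group accuracy as a simple consistency/bias check.
        group_labels = np.asarray(group_labels)
        group_accuracy = {}
        for group in np.unique(group_labels):
            mask = group_labels == group
            group_accuracy[group] = accuracy_score(y_holdout[mask], predictions[mask])
        gap = max(group_accuracy.values()) - min(group_accuracy.values())

        return {
            "accuracy": accuracy,
            "accuracy_ok": accuracy >= BASELINE_ACCURACY - MAX_ACCURACY_DROP,
            "group_gap": gap,
            "group_gap_ok": gap <= MAX_GROUP_GAP,
        }

Run against a fixed holdout set every time the model or its data changes, a check of this kind turns explainability and safety from a one-time sign-off into a routine part of operating the system.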

Smith Institute empowers people to understand AI through explainability and AI assurance. We can assess, critique, and provide recommendations on the use of AI systems. Our strength across maths, statistics, machine learning and data science makes us ideally positioned to offer an objective view of how systems work, whilst also shining light on their limitations. 

We work with clients to evaluate AI decision-making agents, helping them understand which interpretability techniques can be used to explain decision-making processes. Beyond this, our award-winning IV&V procedures check that systems meet requirements and specifications, and fulfil their intended purpose. We intelligently stress-test algorithms so that clients can rely on them to function with accuracy and equity. We also audit data-driven algorithms, probing their behaviours with novel data to check for overfitting or bias.

People use what they understand and trust. Companies that provide clarity into the origins and logic behind their AI will not only earn the trust of users within and outside their organisation, but also gain support from regulators, consumers, and stakeholders. Meeting the escalating demand for openness will enhance credibility and pave the way for the safe continuation of advances in this space.
