Machine learning models are found in all shapes and sizes under the umbrella term ‘AI’, so finding what works for you can be a challenge. Even deployed solutions can be improved, but how, and is it worth it? For any machine learning implementation, new or ongoing, there are key questions you should be asking to interrogate your solution and ensure the rigour of your AI deployment.
What does success look like to you?
Today, a vast catalogue of machine learning techniques is available. Within this catalogue, however, the best approach for your problem will often be unique to what you want to achieve. Finding this approach demands careful choice of key performance indicators (KPIs) to assess how well different models succeed at the intended task. To seek a solution you will be satisfied with, these KPIs should reflect the specific goals of your project.
Suppose you’re designing a diagnostic test for an infectious disease. You might want your test to minimise the chance of false negatives, which means aiming for high sensitivity. On the other hand, if you were designing a department store shoplifting scanner, you might prefer it to be highly selective about whom it raises an alarm for, lest innocent customers take their custom elsewhere. In this case you might want to minimise the chance of false positives, which means aiming for high specificity.
Sensitivity and specificity are just two of many different types of KPI used in AI, and almost no real-world model can excel across all of them. Some domains have well-accepted KPIs, but choosing the ones most appropriate for your application lets you make the most informed model selection.
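As a rough illustration of how these two KPIs are calculated, here is a minimal Python sketch. The function name and the ten example results are made up for illustration; sensitivity is the share of true positives that the model catches, and specificity is the share of true negatives that it correctly leaves alone.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical results: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this made-up case the model misses one of four positives and raises two false alarms among six negatives. Whether that trade-off is acceptable depends on which kind of error costs you more.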
Having decided on your KPIs, the goal is then to find the model which performs best. But the time it takes to train and assess a model makes optimisation very slow, and often one must settle for having only partially explored the options. Whatever you arrive at can usually be improved to some degree by continued fine-tuning, but eventually you will need to compromise. Is an extra week of investment worth an extra 0.1% performance? What about two weeks? Or 0.01%? The value added is determined by the impact the difference in performance will have. In high-cost risk assessments such as mortgage lending, for example, the value of slight model improvements is magnified by the scale of gains and losses at stake. By establishing performance targets early, you can avoid an endless improvement cycle with diminishing returns.
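One simple way to frame the "is another week worth it?" question is as a break-even comparison. The sketch below is only an illustration: the function name is hypothetical and every figure is invented, so substitute your own estimates of what a percentage point of performance is worth and what further tuning costs.

```python
def extra_tuning_pays_off(gain_points, value_per_point, tuning_cost):
    """Return True if the expected value of the gain exceeds the cost of achieving it.

    gain_points     -- expected improvement, in percentage points of your KPI
    value_per_point -- estimated value of one percentage point (assumed figure)
    tuning_cost     -- cost of the extra tuning effort, in the same currency
    """
    return gain_points * value_per_point > tuning_cost


# Invented figures: a week of tuning costs 8,000 and should gain 0.1 points.
small_scale = extra_tuning_pays_off(0.1, value_per_point=50_000, tuning_cost=8_000)
large_scale = extra_tuning_pays_off(0.1, value_per_point=2_000_000, tuning_cost=8_000)

print("Worth it at small scale:", small_scale)
print("Worth it at large scale:", large_scale)
```

The same 0.1 point gain fails the test when little is at stake and passes it easily when each point is worth millions, which is why the answer depends on your application rather than on the model alone.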
How far can your data take you?
Machine learning is at its most powerful when applied to data-rich problems. Complex models like neural networks can produce astonishing results, but only if the data is available to support them. Training a complex model on a small data set is likely to result in overfitting, where your model can appear to perform exceptionally during training but will make unreliable or nonsensical predictions when used in practice. You can avoid this by using a smaller, simpler model or, for more predictive power, by combining machine learning with domain expertise and systems analysis. In any case, your choice of AI model should be checked against the quality and quantity of data you have available to train it.
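The toy Python sketch below shows what overfitting can look like. Everything in it is an assumption made for illustration: the data is hand-built (ten training points, two of them deliberately mislabelled, and test points chosen to make the effect visible), the "complex" model is a nearest-neighbour rule that simply memorises the training set, and the "simple" model is a single cut-off value.

```python
def accuracy(predict, data):
    """Fraction of (x, label) pairs that the predict function gets right."""
    return sum(predict(x) == y for x, y in data) / len(data)


def nearest_neighbour(train):
    """Flexible model: copy the label of the closest training point."""
    return lambda x: min(train, key=lambda point: abs(point[0] - x))[1]


def best_threshold(train):
    """Simple model: predict 1 when x >= t, with t chosen on the training data."""
    candidates = [x for x, _ in train]
    t = max(candidates, key=lambda c: accuracy(lambda x: int(x >= c), train))
    return lambda x: int(x >= t)


# Hand-built toy data. The underlying rule is "label 1 when x >= 5",
# but the points at x = 3 and x = 7 carry deliberately wrong labels (noise).
train = [(1, 0), (2, 0), (3, 1), (4, 0), (4.5, 0),
         (5.5, 1), (6, 1), (7, 0), (8, 1), (9, 1)]
# Held-out points labelled by the underlying rule, without noise.
test = [(2.9, 0), (3.2, 0), (6.8, 1), (7.3, 1),
        (1.5, 0), (8.5, 1), (4.4, 0), (5.6, 1)]

flexible = nearest_neighbour(train)
simple = best_threshold(train)

flexible_scores = (accuracy(flexible, train), accuracy(flexible, test))
simple_scores = (accuracy(simple, train), accuracy(simple, test))

print("flexible model (train, test):", flexible_scores)
print("simple model   (train, test):", simple_scores)
```

The memorising model scores perfectly on the data it was trained on because it has also learnt the noise, and that is exactly what lets it down on unseen points. The simpler model looks worse in training but holds up better on the held-out data.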
It remains a challenge in AI to design models that generalise well to tasks they haven’t seen and stay relevant over time. Even well tested and verified AI models yield unexpected and sometimes problematic results when asked to predict outside their limited training window. We are witnessing this first-hand as AIs struggle in the wake of COVID-19 to adapt to never-before-seen consumer habits. You can only have confidence in your model’s predictions when they are made in the same data realm it has been tested in. So while the prospect of a flexible AI is attractive, a restricted prediction domain will keep your solution reliable.
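One lightweight safeguard is to check each new input against the range of data the model was tested on before trusting its prediction. The sketch below assumes a hypothetical model with two numeric features (weekly spend and store visits); the function names and values are invented, and a per-feature range check is only a crude proxy for "the same data realm".

```python
def fit_ranges(rows):
    """Record the (min, max) of each feature across the training rows."""
    columns = zip(*rows)
    return [(min(column), max(column)) for column in columns]


def in_tested_domain(row, ranges):
    """Return True only if every feature lies inside its training range."""
    return all(low <= value <= high for value, (low, high) in zip(row, ranges))


# Hypothetical training data: (weekly spend, store visits per week).
training_rows = [(20.0, 1), (55.0, 3), (80.0, 2)]
ranges = fit_ranges(training_rows)

typical_customer = (60.0, 2)   # resembles the training data
unusual_customer = (400.0, 0)  # bulk-buying with no store visits: never seen

typical_ok = in_tested_domain(typical_customer, ranges)
unusual_ok = in_tested_domain(unusual_customer, ranges)

print("typical customer in domain:", typical_ok)
print("unusual customer in domain:", unusual_ok)
```

Inputs that fail the check can be flagged for human review rather than passed silently to the model.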
Can you explain your data-driven solution?
The human approach to model building often involves steps of abstraction, from system to model, which can be followed, or at least believed, by other humans. The same cannot be said for many AIs, which use their superior data processing power to derive models that are less clearly linked to the underlying system. This ability is part of what brings AI to the forefront of modelling in business, but is also a source of major criticism. If you cannot explain how your model arrives at predictions, how can you defend them? How can you inspect the results for bias? For these reasons so-called “black box” machine learning models can lack credibility, regardless of their performance. Transparency is particularly valued in industries such as healthcare, where the use of AI raises concerns over patient data privacy and has a significant impact on human wellbeing. However, some models are more transparent than others: you may be able to sacrifice some predictive power in exchange for a model which can be more easily explained, manipulated, and audited.
Conceptually speaking, AI development is not a closed topic to non-specialists. Despite becoming increasingly sophisticated, machine learning techniques are typically built from simple premises, and with the abundance of resources available from the data science community, becoming familiar with important AI concepts requires very little time investment. This means that informed conversations about AI are possible without specialist knowledge, and you can maintain transparency as a customer or consultant.
Do you have the right infrastructure?
Fitting machine learning models is computationally expensive, so it is worth investing in the right infrastructure early in the project. Even for data sets of no more than a few gigabytes, large models like neural networks can take hours to fit on an average laptop CPU, and searching for the best such model, according to your KPIs, could take weeks. To save time, and avoid locking up your local devices, cloud computing can offer a good balance of value and efficiency.
Maintaining an AI can also become expensive over its lifetime if your processes are inefficient, as the costs of updates, modifications, and day-to-day usage add up. Where possible, learning from updated data should not require re-training the model from scratch, and if your circumstances change you need a model that can be rapidly adapted. Keeping your systems streamlined and well documented will pay off significantly in the long run.
In summary
AI has achieved great successes where human pattern recognition fails. Without due care, however, an AI can quickly become costly, unhelpful, or downright misleading. You can avoid these pitfalls by establishing appropriate and achievable goals for your needs, by ensuring your data can answer your questions, and by setting up suitable infrastructure early.
It is hard to build trust in AI systems, even those you’ve designed yourself, as they appear to achieve feats of increasingly complex reasoning in ways most humans cannot comprehend. The power of explainable AI should not be underestimated. If your solution is justified, and stands up to scrutiny from questions like these, you can be confident that AI is providing a good return on your investment.