AI for effective solutions, step 4: online evaluation or monitoring

9/14/2021

Adapting to change to last over time: the true Machine Learning challenge

Any ML model designed and developed through the steps presented in the previous posts, using the suggested tips and tools, should be accurate and - most importantly - effective. But for how long? The world changes constantly, and the data that describe it change with it. This is why any Machine Learning system is exposed to performance degradation over time: it has to deal with real data, which keep changing. Monitoring - or online evaluation - is therefore essential to let the model adapt to the world’s changes and keep its accuracy and consistency.


The test bench, AKA monitoring the model’s performance on real data

As mentioned above, monitoring an ML model’s performance on real data is the last step of the process we have designed at Aptus.AI to create effective AI solutions. For those who missed the earlier phases, here is a recap of the previous episodes.

Intro: what actually happens when Machine Learning models have to face real market needs.
Problem definition (step 1): why success always depends on clearly identifying the problem to be solved, and what is needed to define it.
Data preparation (step 2): why arranging specific datasets for training is essential to create an effective model.
Model design and offline evaluation (step 3): tips for optimizing the most standardised - but at the same time delicate - phase in the development of ML-based products.

Once a satisfactory model is built, the focus needs to move to the degradation of performance caused by the unavoidable change of data and concepts over time, especially in a business context. As already observed, academia is an exception, as it often works on predefined - therefore static and normalized - datasets in order to compare how different models perform on the same data. In a commercial context, on the other hand, data and concepts are constantly evolving, so they need to be carefully monitored to preserve the model’s performance over time - and we are on our own in assessing this.
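To make the idea of online evaluation more concrete, here is a minimal sketch of a rolling accuracy monitor. It is an illustration, not a description of how monitoring is implemented at Aptus.AI: the class name, the window size, the baseline accuracy and the tolerance are all hypothetical, and it assumes that the true label for each prediction eventually becomes available.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track accuracy over the most recent labelled predictions.

    All defaults are illustrative assumptions, not recommended values.
    """

    def __init__(self, window_size=100, baseline_accuracy=0.90, tolerance=0.05):
        # Only the latest `window_size` outcomes are kept.
        self.window = deque(maxlen=window_size)
        # Accuracy measured during offline evaluation (assumed to be known).
        self.baseline_accuracy = baseline_accuracy
        # How far below the baseline we accept before raising a flag.
        self.tolerance = tolerance

    def record(self, predicted, actual):
        """Store whether a production prediction matched the true label."""
        self.window.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def is_degraded(self):
        """Flag degradation only once the window is full, to limit noise."""
        if len(self.window) < self.window.maxlen:
            return False
        return self.accuracy() < self.baseline_accuracy - self.tolerance
```

In practice, `record` would be called whenever a true label arrives for a past prediction, and a positive `is_degraded` would trigger a closer look at the data or a retraining.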

Catch the drift: how to adapt to change by calibrating data and concepts

The term used to express the evolution of data and concepts over time is drift. This word is very close to slip or shift, but - more simply - what is at stake here is change. An ML model cannot be effective if it doesn’t take into account data and concept drift, as correctly highlighted in this interesting Medium article on Towards Data Science. Starting from data drift, the distribution of data can change both with respect to inputs, following changes in the real world (for example, the technological evolution of a service on which the ML system makes predictions), and with respect to outputs, namely the results delivered by the system itself. In both cases, the distribution of the data changes and, consequently, so does the performance of the Machine Learning system. Concepts evolve over time as well. Changes in the meaning of the elements used by the model need to be monitored on a semantic level, in order to avoid interpretation errors. For example, while developing Daitomic - our AI platform created for the RegTech market - we needed to update the legal texts repository which feeds the model and, consequently, the concepts it contains. Hence, for both data and concepts, the answer to change is to monitor their drift and to adapt the model accordingly, in order to keep it consistent.
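As an illustration of how data drift on a single numeric input could be measured, here is a minimal sketch based on the Population Stability Index (PSI), a common way to compare a reference sample (for example, the training data) with recent production data. This is a swapped-in example technique, not the method used for Daitomic; the function name, the number of bins and the smoothing constant are assumptions. It does not cover concept drift, which generally requires labelled examples or human review.

```python
import math


def population_stability_index(reference, live, n_bins=10, eps=1e-6):
    """Compare two samples of one numeric feature; 0.0 means identical binned shares.

    Bins are equal-width and derived from the reference sample only.
    """
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if every reference value is the same.
    width = (hi - lo) / n_bins or 1.0

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range go to the first or last bin.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Replace empty shares with a small constant so the log is defined.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    live_shares = bin_shares(live)
    return sum(
        (l - r) * math.log(l / r) for r, l in zip(ref_shares, live_shares)
    )


# Hypothetical data: the live sample is shifted upwards by 50.
reference_sample = [float(x) for x in range(100)]
shifted_sample = [x + 50.0 for x in reference_sample]
psi = population_stability_index(reference_sample, shifted_sample)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but any threshold is a convention that should be validated on the data at hand.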

The success of machines is grounded on human feedback: the case of Daitomic

At Aptus.AI we work constantly to optimize our monitoring operations and to enhance the performance of our AI solutions on real data. To do that, we also use the most powerful tool of all: human skills. Human feedback is surely the most reliable, especially when it comes to concepts, and therefore semantics. Besides, as already reported in the previous episode of this series, HLP (Human-Level Performance) is always the best possible reference. Finally, the so-called human-in-the-loop approach - which we used in the development of Daitomic, and which we are still using to make it better every day - is necessary to evaluate the real effectiveness of an ML-based product, as final users are always human beings. Those who have followed this whole blog series on how we work at Aptus.AI should now have no doubt about the quality of our AI solutions. And those who work in the financial compliance sector just have to meet Daitomic to see first-hand how Artificial Intelligence can revolutionize the typical activities of this market.