Beyond the Black Box: Creating Clear and Effective Analysis Tools

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”  

– Sherlock Holmes, ‘A Scandal in Bohemia’ by Arthur Conan Doyle 

Introduction

At the recent UK Leakage conference, Jeremy Heath, Innovation Manager at SES Water, enthusiastically proclaimed from the stage that ‘now is absolutely the most exciting time to work in the leakage industry’ and he was spot on. As with other industries, the explosion in the amount of available data has rapidly been transformed into a profusion of information. In turn, this information is facilitating significant advances in the type of actionable insights that will both preserve our environment and deliver better value for the consumer. 

These insights are driving more efficient approaches to managing water networks. From pursuing just-in-time (JIT) maintenance strategies for critical assets such as pressure reducing valves (by monitoring their performance in real time), to narrowing down the location of hard-to-find bursts with correlating acoustic loggers, water companies can increasingly ensure they have the right teams in the right places. Such advances, when set against challenging targets formed in the face of climate change and the regulator’s requirements, have set the stage for a particularly dynamic phase for the industry. 

This exciting new phase does raise some interesting questions that are more widely applicable, however. As we move further into the 21st century, these informational leaps will increasingly be underpinned by machine learning algorithms and ultimately, artificial intelligence. The question this blog will address is: how can these developing technologies best support analysts and what are the dangers associated with ‘black box solutions’? 

On the perils of black box solutions

Let’s first review what we mean by black box solutions. A black box solution is one whose decision-making process is neither known nor accessible to the user. The machine learning algorithms at its heart are by nature complex, so such a system exposes only inputs and outputs, leaving the user in the dark about its ‘thinking process’. In the context of the data collected from water networks, we can imagine that a future ‘Leakage AI’ black box might absorb the vast amounts of data being generated by sensors of all descriptions and spit out a command that can be actioned by ground teams.  

Is this desirable?  

Well, let’s discuss the risks and then see what can be done to mitigate them.  

  1. Complexity: The first and most obvious risk is one of understanding. Where the internal processes of an algorithm are unknown to the user, the outputs may well be hard to explain and to make a business case for. In parallel, a user who doesn’t understand the algorithm won’t be able to spot and correct for biases baked into it, and may not even be aware that such biases exist.  
  2. Computer says ‘no’: A risk closely linked to understanding is over-reliance. Water companies spend time and effort building teams with both technical and local knowledge, which allows for intuitive responses to outbreaks of leakage or suspected data errors. By imbuing an opaque model with the power to make decisions about what should be done in an area, we would lose the benefit of this intuition. 
  3. The human element: The relationships built up between teams are valuable to a company for a multitude of reasons. If ‘teamwork makes the dream work’, it should be recognised that no such working relationship can be built with an AI designed simply to spit out instructions. 

SME Water’s approach

So, what is the best way to implement the benefits of machine learning in the industry without sacrificing human intuition and sidelining the importance of full comprehension of the issue at hand?

This is a question that SME Water has considered as part of the Paradigm project. While we respect the results that can be achieved with machine learning approaches, we have taken the view that our products should not attempt to bypass the need for skilled analysts. Instead, we aim to support leakage teams by providing tools that incorporate machine learning to deliver actionable insights. In short, there is a balance to be struck that benefits both the analyst and the speed and accuracy with which they are able to work.

In the case of Paradigm, we have provided a system that incorporates partial automation: it demonstrates an expected profile based on known demand components and, in doing so, provides a yardstick against which DMA-level flows can be measured. The transparent method we used to build these models is explained through the Paradigm Academy, as we look to build analysts’ understanding of both the Paradigm system and the water networks they oversee.
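The idea of an expected profile as a yardstick can be illustrated with a minimal sketch. This is not the Paradigm implementation; the component names, figures, and tolerance below are purely hypothetical, and a real system would work with time-stamped flow data and far richer demand modelling. The point is simply that summing known demand components gives a transparent baseline, and intervals where measured DMA flow sits well above it can be flagged for an analyst to review.

```python
# Illustrative sketch only (not the Paradigm implementation): build an
# expected flow profile from known demand components and flag intervals
# where the measured DMA flow exceeds it by more than a tolerance.
# All names and numbers are hypothetical.

def expected_profile(components):
    """Sum per-interval demand components (litres/sec) into one expected profile."""
    n = len(next(iter(components.values())))
    return [sum(series[i] for series in components.values()) for i in range(n)]

def flag_excess(measured, expected, tolerance=0.5):
    """Return the indices of intervals where measured flow exceeds expected by more than `tolerance`."""
    return [i for i, (m, e) in enumerate(zip(measured, expected)) if m - e > tolerance]

# Hypothetical day of 4-hour intervals for one DMA (litres/sec)
components = {
    "household":          [2.0, 1.5, 3.5, 4.0, 3.0, 2.5],
    "commercial":         [0.5, 0.2, 1.0, 1.5, 1.2, 0.8],
    "background_leakage": [0.4, 0.4, 0.4, 0.4, 0.4, 0.4],
}
measured = [2.9, 2.1, 5.1, 6.8, 4.7, 3.7]

expected = expected_profile(components)
anomalies = flag_excess(measured, expected, tolerance=0.5)
print(anomalies)  # intervals where flow sits suspiciously above expectation
```

Because every number in the baseline traces back to a named demand component, an analyst can see *why* an interval was flagged, which is exactly the transparency a black box denies them.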

Conclusion

Informed decision making is essential for any company and, further, using machine learning to support decision making rather than replace it entirely should be a central consideration. Collaboration between teams of trained analysts and ground teams remains critically important to the efficient resolution of issues – with proposed solutions being explainable in both directions. 

From a supplier perspective it is important to build trust and credibility by ‘showing the working’ of a product. Without this, how would we assure clients that a questionable result is in fact correct, or demonstrate the necessary improvements to the model? The knock-on effect of being unable to do so would be reduced faith in the product. 

It is not an uncommonly held view that the advent of artificial intelligence may well prove to be to this century what the industrial revolution was to the 19th. However, introducing the efficiencies offered by machine learning should go hand in hand with encouraging continuous learning and understanding of the available data. By building better tools that expand users’ understanding we, as an industry, collectively raise our game to meet the challenges that have been set ahead of 2050. 