Trust over hype: Building analytics that inspire confidence

Introduction

Water companies are collecting more data than ever before. Network sensors, smart meters, telemetry systems, and cloud platforms are generating a continuous stream of operational insight. Much of this data exists to meet regulatory commitments and improve performance, particularly around leakage.

But data alone doesn’t drive decisions. People do.

For analytics to influence operational priorities or long-term investment planning, teams must have confidence in the results. They must trust that analytical outputs reflect how their networks behave, that models respond logically to real-world events, and that results can be understood, explained and even challenged.

Through our collaborative club projects with UK water companies, we have explored how different analytical approaches influence both performance and adoption. Our experience has consistently shown that the most effective analytics rarely sits at the extremes of simplicity or complexity. Instead, it often sits in the middle — combining data-driven learning with transparent, explainable modelling that supports confident decision-making.

The most important decisions are made by people, not models

Over the past two decades, the water industry has undergone a quiet but significant digital transformation. Investment in data platforms, telemetry, and analytics capability has expanded rapidly as organisations recognise the potential for data to improve performance and support regulatory reporting.

With this investment has come expectation. Regulators, executive teams, and operational managers and analysts increasingly look to analytics to support decisions at every level of the organisation — from tactical network interventions through to long-term infrastructure planning.

As analytics becomes more influential, the definition of success changes. Accuracy alone is not enough; the real challenge becomes confidence. Confidence that models represent network behaviour realistically. Confidence that changes in analytical outputs correspond to operational events. Confidence that teams can explain results to colleagues, regulators, and other stakeholders.

In critical infrastructure, decisions are never made by algorithms alone. They are made by experienced professionals who must justify and stand behind those decisions. As analytics becomes embedded across the industry, building confidence in models is becoming just as important as building the models themselves.

The false binary choice: simple rules or black-box AI

We often see analytics discussions framed as a choice between two very different approaches.

On one side are traditional statistical or rules-based methods. These approaches are widely trusted because they are comparatively transparent and easy to interpret. Engineers and analysts can usually see how results are generated, challenge assumptions, and explain outputs clearly. However, these methods can struggle to adapt to complex or evolving network behaviour and may oversimplify inherently dynamic problems.

On the other side is the rapid rise of artificial intelligence and machine learning. These techniques can identify subtle patterns within large datasets and often deliver accurate predictions. As data volumes increase, AI is frequently put forward as the natural progression for improving network insight.

However, many AI approaches operate as ‘black boxes’. While they may produce accurate predictions, it can be difficult to explain why results occur or how outputs relate to actual network behaviour. When analytics influences operational or strategic decisions, this lack of transparency can create barriers to adoption, particularly in environments where accountability and regulatory scrutiny are high.

In our experience, decision makers are rarely choosing between transparency and sophistication. The challenges they face demand both.

Thinking in spectrums, not categories

Rather than viewing analytical approaches as opposing camps, we find it more helpful to view them as sitting along a spectrum.

At one end are traditional ‘white box’ methods, where assumptions are explicit and outputs are easily explained. At the other end are highly data-driven black-box AI models that prioritise predictive performance but often compromise on interpretability.

Between these extremes lies a powerful middle ground. Here, models are designed to learn from data while still reflecting known characteristics of water networks. These approaches combine statistical learning with engineering understanding, allowing models to remain transparent while adapting to complex and changing conditions.

Crucially, these models do not attempt to learn everything from data alone. Instead, data is used to refine structured representations of established and well understood system behaviours. This makes results easier to interpret, easier to explain, and naturally easier to trust.
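As a purely illustrative sketch of this middle ground (not any specific product's method), consider a diurnal demand profile whose structure encodes a known behaviour — the 24-hour cycle — while least squares tunes its coefficients from observed flow data. Every assumption is visible: the harmonic terms, the fitted coefficients, and the residuals used to flag unusual behaviour. The function names and synthetic data below are illustrative assumptions.

```python
import numpy as np

def fit_diurnal_profile(times_h, flows, n_harmonics=2):
    """Fit an interpretable diurnal demand profile.

    The model structure encodes known behaviour (a 24-hour cycle);
    the data only tunes the coefficients, via ordinary least squares.
    """
    cols = [np.ones_like(times_h)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * times_h / 24))
        cols.append(np.cos(2 * np.pi * k * times_h / 24))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, flows, rcond=None)
    return coef, X @ coef

# Synthetic example: a flow series with a daily cycle plus sensor noise
rng = np.random.default_rng(0)
t = np.arange(0, 24, 0.25)                       # 15-minute intervals
true = 10 + 4 * np.sin(2 * np.pi * (t - 7) / 24) # known daily pattern
obs = true + rng.normal(0, 0.3, t.size)

coef, fitted = fit_diurnal_profile(t, obs)
residuals = obs - fitted

# Points far from the expected profile can be flagged for human review,
# e.g. as a potential anomaly or data issue -- the model informs, people decide
anomalies = np.abs(residuals) > 3 * residuals.std()
```

Because the fitted coefficients map directly onto the daily cycle, an analyst can inspect, explain, and challenge each one — unlike a black-box predictor, where the same fit would be buried in opaque weights.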

For many challenges within the industry — where systems are complex, only partially observable, and operational decisions carry significant consequences — we consistently see this balance delivering the greatest value.

What ‘white box’ really means

The term ‘white box’ is sometimes interpreted as meaning simplistic or rigid modelling. In practice, we see white box modelling as being defined by transparency and accountability rather than simplicity.

White box models make assumptions visible. They allow users to understand how outputs are generated, what factors influence results, and how models respond when conditions change. This visibility allows engineers and analysts to challenge outputs, explore alternative explanations, and confirm that results align with operational experience.

Transparency does not prevent sophistication. Many modern modelling techniques can capture complex system behaviour while remaining interpretable. The key difference is that these models are designed to reflect known network behaviour rather than relying entirely on data to uncover relationships without context.

By contrast, black box models often prioritise predictive performance above interpretability. While they can uncover subtle patterns, they may struggle to explain why outputs change or how results relate to operational events. When analytical models influence network interventions, leakage strategies or investment decisions, explainability becomes as important as accuracy.

Figure 1: Striking a balance between black-box and white-box solutions

Lessons from collaborative club projects

Our collaborative club projects with UK water companies have provided valuable insight into how analytics is successfully developed and adopted in operational environments. These projects are built on a simple principle: the most effective analytics is created in partnership with the teams who rely on it.

Working across multiple organisations allows challenges to be examined from different regulatory, operational, and cultural perspectives. It reinforces our belief that analytics rarely succeeds in isolation. Instead, it evolves through collaboration, shared learning, and open challenge.

One consistent lesson has been that technical performance alone does not guarantee adoption. Models must behave in ways that practitioners recognise as realistic. When outputs align with engineering intuition and respond logically to real-world events, confidence grows rapidly. When they do not, even technically impressive models can struggle to gain traction.

The Paradigm club project demonstrated this clearly. By modelling expected customer demand behaviour within District Metered Areas (DMAs), Paradigm helped shift leakage analysis away from reliance on historic norms and minimum night flow indicators. Instead, it introduced a structured understanding of how each DMA should perform across a full 24-hour cycle throughout the year.

One of the most interesting observations during Paradigm’s adoption has been how analysts with different levels of experience interact with analytical models.

Among experienced analysts, Paradigm is often used as a tool to challenge their own understanding of network behaviour. These analysts bring deep operational knowledge and use the model as a second perspective. In many cases, this leads to valuable discussions between analysts and field teams — sometimes validating traditional understanding but occasionally highlighting where long-held assumptions about network behaviour may not fully reflect reality. In these situations, the model strengthens engineering judgement rather than replacing it.

At the other end of the spectrum, we have seen less experienced analysts become highly reliant on model outputs. While this demonstrates strong confidence in the analytics, it can sometimes reduce critical challenge. When models are followed without question, opportunities to identify data issues or model limitations can be missed. Ironically, this can slow the improvement process that makes these tools more effective.

This contrast has reinforced an important principle for us: the best analytical tools are not those that replace human expertise, but those that encourage constructive challenge. Models improve when they are questioned. Organisations gain the greatest value when analytics and operational experience strengthen each other.

Building on these lessons, the Dynamo club project is applying the same collaborative philosophy to pressure management. As pressure control schemes and monitoring infrastructure increase in scale and complexity, distinguishing genuine network opportunities from data or asset issues becomes both more important and more difficult. Dynamo will focus on providing structured insight that supports both area-level optimisation and strategic pressure management decisions.

Across all of our collaborative work, two enabling themes consistently emerge: trusted data foundations and investment in skills development. Reliable analytics depends on validated, connected datasets that provide a consistent view of network performance. Equally, models deliver the greatest value when organisations invest in shared understanding and training that supports confident and consistent use.

Why the middle ground works best in practice

Our experience consistently shows that analytical approaches that combine structured network understanding with data-driven learning deliver the most sustainable value.

These approaches embed engineering knowledge directly into modelling frameworks while remaining flexible enough to adapt as new data becomes available. This balance improves interpretability while maintaining analytical performance.

Confidence in these models, however, depends heavily on the quality and consistency of supporting data. Without robust validation and integration, operational datasets can produce conflicting conclusions that undermine trust in analytical outputs. This is why we place such importance on the data pipelines that feed our analytics platforms, and it is the challenge that Cadence (our data ingestion and storage product) seeks to address.

Equally important is the development of analytical capability within organisations. Supporting analysts through structured training, shared methodologies, and best practice ensures model outputs are used consistently and confidently across teams. This is what underpins the content produced in our Academy (our training and learning hub).

The most valuable analytics is rarely the most technically complex. It is the analytics that organisations feel confident using, challenging, and evolving over time.

This isn’t anti-AI — it’s pro-appropriate AI

Discussing the limitations of black-box modelling can sometimes be interpreted as resistance to artificial intelligence. In reality, we see AI and machine learning as offering significant opportunities to our industry.

Different analytical approaches are suited to different problems. Where large, labelled datasets exist or objectives are purely predictive, black box AI can deliver exceptional value.

However, when analytics directly influences operational or strategic decisions, additional requirements emerge. Decision-makers need to understand how results are generated, how models respond to change, and how uncertainty is represented. In these situations, transparency becomes a core component of responsible analytics.

Rather than viewing modelling approaches as competing technologies, we see them as complementary tools. Modern data science increasingly focuses on selecting and combining techniques that reflect operational risk and decision accountability.

Looking ahead: evolving without losing confidence

As data and analytics capability continues to expand across the water industry, analytical models will inevitably become more sophisticated. Increasing data availability and improved computational techniques will unlock new opportunities to optimise network performance.

However, this growing sophistication cannot come at the expense of confidence. Trust cannot be added to analytics retrospectively — it must be designed into solutions from the outset.

Future analytics will increasingly rely on hybrid modelling approaches that combine data-driven learning with structured system understanding. Alongside technical innovation, our club members are recognising the importance of developing sustainable analytical capability within organisations through shared data environments and structured training.

Confidence in analytics develops gradually. It grows when models behave consistently, respond logically to operational events, and support informed discussion between field and office-based teams.

As water companies continue their digital transformation journeys, long-term success will depend not simply on adopting new innovative technologies, but on embedding analytics into organisational culture in ways that people understand, trust, and confidently use to guide decisions.