In this study, we aim to estimate chlorine concentrations (CC) at every junction in water distribution networks (WDNs) by leveraging CC data from a subset of junctions (i.e., sensors), under steady-state conditions and a single-species model. To attain this goal, we introduce two distinct GNN models: i) a static prediction (SP) model and ii) a dynamic prediction (DP) model. As illustrated in Fig. 1, the development of these models encompasses i) a pre-processing stage and ii) a GNN training and testing stage. This section elaborates on these stages, along with the employed GNN model parameters and the case study.
GNN Model Formulation
The primary objective of a GNN model is to capture latent node representations of a given graph. These representations are intended to comprehensively encapsulate the underlying information and complex relationships present in the graph, enabling proficient graph- or node-level predictions. In the context of this research, the WDN is represented as a graph \(\:G=\left(V,E,X,Z\right)\), where \(\:V=\left\{{v}_{1},\:{v}_{2},\:{v}_{3},\:\dots\:,{v}_{N}\right\}\) is the set of the \(\:N\) nodes (i.e., junctions), \(\:E=\left\{{e}_{uv}|\:\forall\:\:v\in\:V,\:u\in\:N(v)\right\}\) is the set of \(\:M\) edges (i.e., pipes), \(\:X\in\:{R}^{N\times\:D}\) is the set of node features with \(\:D\) dimensions (e.g., junction demand, CC), and \(\:Z\in\:{R}^{M\times\:L}\) is the set of edge features with \(\:L\) dimensions (e.g., pipe length, diameter). Since water can flow through pipes in both directions, WDNs are represented as undirected graphs, in which the junctions’ connectivity is captured by a bidirectional (i.e., symmetric) adjacency matrix (\(\:A\)). In this study, the GNN performs node-level predictions of CCs, employing the features of neighboring junctions while utilizing the junctions’ connectivity.
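As a concrete illustration, the symmetric adjacency matrix of a small hypothetical network can be assembled directly from its pipe list; the layout below is invented for the example and is not the case-study network:

```python
import numpy as np

# Toy 5-junction network; each pair is an undirected pipe (edge).
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
N = 5

# Bidirectional (symmetric) adjacency matrix A for the undirected WDN graph.
A = np.zeros((N, N), dtype=int)
for u, v in edges:
    A[u, v] = 1
    A[v, u] = 1

assert (A == A.T).all()  # undirected graph => symmetric matrix
```

Each undirected pipe contributes two entries, so the matrix remains symmetric regardless of the order in which pipes are listed.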
GNN model variants
Two GNN models are proposed in this study: the Static Prediction (SP) model and the Dynamic Prediction (DP) model. In the SP model, the GNN is utilized to predict CCs considering a fixed set of sensors, where the junctions with available CC information (i.e., sensors) are constant across all events (i.e., graphs). In this model, the user defines the IDs of the sensors, while the CCs of all other junctions are considered unknown and are predicted by the GNN model. In comparison, the DP model is trained to make predictions based on measurements collected from any set of sensors, such as the case of mobile water quality sensors. This is done by employing a random set of sensors to train the GNN model. Furthermore, the number and locations of the sensors used in the prediction are dynamic. This is achieved by training the model on a range of sensor-to-non-sensor ratios defined for the model. This ratio is then randomly converted by the model into a specific sensor design (a particular number and set of sensor locations) on an event basis, allowing for a different sensor design for each training event. In addition, the DP model allows the user to designate a specific set of junctions as either sensor or non-sensor junctions, which provides flexibility to account for fixed sensors.
Dataset Generation
The training and testing of the GNN models necessitate a dataset comprising a large number of events (i.e., graphs); vital to this process is the incorporation of a diverse array of events in this dataset, empowering the model to predict unforeseen events accurately. In this study, two key aspects distinguish each individual event within the dataset: i) junctions’ demands \(\:\left(d\left(n\right)\right)\), controlled by the demand parameters, which define the total demand (\(\:D\)) and the range of the number of junctions with non-zero demand \(\:\left({z}_{min},{z}_{max}\right)\); and ii) chlorine injection rates (\(\:I\left(n\right)\)), controlled by the injection parameters, which define the injection locations (\(\:m\)) and the range of injection rates at these locations \(\:\left({i}_{min},{i}_{max}\right)\). The number, locations, and demand allocation of the demand junctions, as well as the injection rates at the injection locations, follow a random selection process according to the user-defined parameters, as shown in Eq. (1) and Eq. (2).
$$d\left(n\right)=\frac{r_{n}}{\sum_{n\in L}r_{n}}\times D,\qquad r_{n}=\begin{cases}U\left(0,1\right), & n\in L\\0, & n\notin L\end{cases},\qquad L\subset\left\{1,2,\dots,N\right\},\;\left|L\right|=U\left(z_{min},z_{max}\right)$$
1
$$I\left(n\right)=\begin{cases}U\left(i_{min},i_{max}\right), & n\in m\\0, & n\notin m\end{cases},\qquad m\subset\left\{1,2,\dots,N\right\}$$
2
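The sampling in Eq. (1) and Eq. (2) can be sketched as follows. The function names and this NumPy-based implementation are our own illustration of the described procedure, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_demands(N, D_total, z_min, z_max):
    """Random demand allocation per Eq. (1): a random subset L of junctions
    receives demand; weights r_n ~ U(0,1) are normalized to sum to D_total."""
    n_demand = rng.integers(z_min, z_max + 1)      # |L| ~ U(z_min, z_max)
    L = rng.choice(N, size=n_demand, replace=False)
    r = rng.uniform(0.0, 1.0, size=n_demand)
    d = np.zeros(N)
    d[L] = r / r.sum() * D_total                   # d(n) = r_n / sum(r_n) * D
    return d

def sample_injections(N, m_locations, i_min, i_max):
    """Random injection rates per Eq. (2): rates ~ U(i_min, i_max) at the
    injection locations m, zero elsewhere."""
    I = np.zeros(N)
    I[m_locations] = rng.uniform(i_min, i_max, size=len(m_locations))
    return I

d = sample_demands(N=20, D_total=100.0, z_min=5, z_max=15)
I = sample_injections(N=20, m_locations=[0, 7], i_min=0.5, i_max=1.5)
```

By construction, the sampled demands sum to the user-defined total \(D\), and only the randomly chosen subset of junctions carries demand.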
Once \(\:d\left(n\right)\) and \(\:I\left(n\right)\) are defined for all events, the Python interface of EPANET (WNTR) is employed to perform the hydraulic and water quality simulation of these events and to extract CCs at all network junctions (Klise et al., 2017). Subsequently, a masking process is implemented to remove the CC values of certain junctions. In the SP model, all CC values are masked in all events except at the sensor junctions defined by the user. In the DP model, on the other hand, a different set of junctions is masked in each event; this masking is performed randomly based on the minimum and maximum masking ratios (e.g., 50% of junctions are masked at a 0.5 masking ratio) defined by the user. In addition, the user can enforce a set of junctions to be sensors or non-sensors across all events in the DP model.
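The two masking schemes can be sketched as below. This is a simplified illustration of the described procedure, with our own function and parameter names; the SP mode masks everything except a fixed sensor set, while the DP mode draws a fresh random sensor set per event from a masking-ratio range:

```python
import numpy as np

rng = np.random.default_rng(1)

def mask_events(cc, sensor_ids=None, ratio_range=None, forced_sensors=()):
    """Mask CC values per event.
    SP mode: pass a fixed `sensor_ids` list, used for every event.
    DP mode: pass `ratio_range` = (min, max) masking ratio; a random set of
    junctions is hidden in each event, never touching `forced_sensors`."""
    G, N = cc.shape
    masked = cc.copy()
    indicator = np.zeros((G, N), dtype=int)  # 1 = sensor (CC known)
    for g in range(G):
        if sensor_ids is not None:                       # SP: fixed sensors
            sensors = set(sensor_ids)
        else:                                            # DP: random per event
            ratio = rng.uniform(*ratio_range)
            n_masked = int(round(ratio * N))
            candidates = [n for n in range(N) if n not in forced_sensors]
            hidden = rng.choice(candidates,
                                size=min(n_masked, len(candidates)),
                                replace=False)
            sensors = set(range(N)) - set(hidden.tolist())
        for n in sensors:
            indicator[g, n] = 1
        masked[g, indicator[g] == 0] = 0.0               # zero out unknown CCs
    return masked, indicator

cc = rng.uniform(0.2, 1.5, size=(4, 10))                 # 4 events x 10 junctions
sp_masked, sp_ind = mask_events(cc, sensor_ids=[0, 3, 7])
dp_masked, dp_ind = mask_events(cc, ratio_range=(0.4, 0.6))
```

In the SP call every event keeps the same three sensors, whereas each DP event retains a different random subset of roughly half the junctions.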
GNN model architecture
In this study, the Topology Adaptive Graph Convolutional Network (TAGCN) was used as the GNN model architecture (Du et al., 2017). TAGCN was chosen due to its advantages over other GNN architectures such as the Graph Convolutional Network (GCN) and the Graph Attention Network (GAT). Unlike GCN, TAGCN incorporates filters that act as attention coefficients, tailoring the contributions of neighboring nodes during the aggregation process to effectively capture distinctive node features and prevent over-smoothing. Although GAT was introduced to address the same GCN limitation, it is not well-suited for large-scale graphs (Du et al., 2017). In contrast, TAGCN has been shown to scale well while allowing for dynamic adjustment of aggregation parameters on a local scale within each graph region. TAGCN effectively models complex relationships within diverse graphs by learning \(\:K\) graph filters as in Eq. (3); it then uses these filters, along with a learnable bias, to perform predictions as in Eq. (4). The general GNN model architecture used in this study is presented in Fig. 2.
$$\:{\text{G}}_{f,x}^{\left(\mathcal{l}\right)}=\sum\:_{k=0}^{K}\:{g}_{f,x,k}^{\left(\mathcal{l}\right)}\:{\mathbf{A}}^{k}$$
3
$$\:{\text{Y}}_{f}^{\left(\mathcal{l}\right)}=\sigma\:\left(\sum\:_{x=1}^{X}\:{\text{G}}_{f,x}^{\left(\mathcal{l}\right)}\:{\text{x}}_{x}^{\left(\mathcal{l}\right)}+{b}_{f}{1}_{N}\right)$$
4
where \(\:{\text{G}}_{f,x}^{\left(\mathcal{l}\right)}\) is the \(\:{f}^{th}\) graph filter applied to the \(\:{x}^{th}\) feature in the \(\:{\mathcal{l}}^{th}\) layer; \(\:k\) and \(\:K\) are the filter index and the total number of filters; \(\:{g}_{f,x,k}^{\left(\mathcal{l}\right)}\) are the graph filter polynomial coefficients; \(\:{\mathbf{A}}^{k}\) is the \(\:{k}^{th}\) power of the normalized adjacency matrix of the graph; \(\:{\text{Y}}_{f}^{\left(\mathcal{l}\right)}\) is the output of the \(\:{\mathcal{l}}^{th}\) layer; \(\:\sigma\:\) denotes a rectified linear unit (\(\:ReLU\)); \(\:{\text{x}}_{x}^{\left(\mathcal{l}\right)}\) is the input data of the \(\:{\mathcal{l}}^{th}\) layer for all nodes for the \(\:{x}^{th}\) feature; \(\:{b}_{f}\) is a learnable bias; and \(\:{1}_{N}\) is a unity vector of size \(\:N\). Further details of the TAGCN mathematical model can be found in Du et al. (2017).
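Written out explicitly, one TAGCN layer per Eq. (3)-(4) can be sketched in NumPy as follows. This is a didactic re-implementation to make the filter-polynomial structure concrete, not the library code used in the study:

```python
import numpy as np

def tagcn_layer(A_norm, X, g, b):
    """One TAGCN layer per Eq. (3)-(4).
    A_norm : (N, N) normalized adjacency matrix
    X      : (N, F_in) node features
    g      : (F_out, F_in, K+1) filter polynomial coefficients g_{f,x,k}
    b      : (F_out,) learnable bias
    """
    N = A_norm.shape[0]
    F_out, F_in, K1 = g.shape
    # Precompute powers A^0 ... A^K of the normalized adjacency matrix
    powers = [np.eye(N)]
    for _ in range(K1 - 1):
        powers.append(powers[-1] @ A_norm)
    Y = np.zeros((N, F_out))
    for f in range(F_out):
        for x in range(F_in):
            # Eq. (3): graph filter G_{f,x} = sum_k g_{f,x,k} A^k
            G_fx = sum(g[f, x, k] * powers[k] for k in range(K1))
            # Eq. (4): accumulate filtered features over input dimensions
            Y[:, f] += G_fx @ X[:, x]
        Y[:, f] += b[f]                 # learnable bias b_f 1_N
    return np.maximum(Y, 0.0)           # sigma = ReLU

# Tiny example: 4-node path graph with symmetric normalization
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
deg = A.sum(axis=1)
A_norm = A / np.sqrt(np.outer(deg, deg))
X = np.random.default_rng(0).normal(size=(4, 3))
g = np.random.default_rng(1).normal(size=(2, 3, 3))   # F_out=2, F_in=3, K=2
Y = tagcn_layer(A_norm, X, g, np.zeros(2))
```

With \(K=0\) the layer degenerates to a per-node linear map through the identity \(\mathbf{A}^{0}\), which is a useful sanity check on the implementation.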
The primary input of a TAGCN model is the adjacency matrix (\(\:A\)), which is used to calculate the normalized adjacency matrix powers (\(\:{\mathbf{A}}^{k}\)) in Eq. (3). This matrix represents the network’s topology and is equal to \(\:{\left[{a}_{ij}\right]}^{N\times\:N\:}\), with \(\:{a}_{ij}\) equal to 1 if node \(\:i\) is connected to node \(\:j\), and 0 otherwise. Additionally, a TAGCN model incorporates node features, which in this study comprised flows, CCs, and junction indicators. Positive flow values represent junction demands, whereas negative values indicate supply from source junctions (e.g., reservoirs). For CCs, the CC values are used wherever this information is available (i.e., at sensors), and 0 is used elsewhere. For the junction indicator, a value of 1 was used for junctions with known CC (i.e., sensors), and 0 for junctions with unknown CC. The resulting junction input vector is \(\:\left[Q,\:\:CC,\:\:{J}_{indicator\:}\right]\). Although the TAGCN model does not incorporate edge features, their inclusion would have added little value because these features are constant across events.
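Assembling the junction input vector \(\left[Q,\:CC,\:{J}_{indicator}\right]\) for a single event might look like this; all numeric values are invented for the illustration:

```python
import numpy as np

# Flows: positive = junction demand, negative = supply from a source junction
Q = np.array([2.0, 1.5, -3.5, 0.0])
# Simulated CCs at all junctions (in practice extracted from the WNTR run)
cc_true = np.array([0.8, 0.6, 1.2, 0.4])
sensors = {2}                              # junctions with known CC

# Indicator: 1 where CC is known (sensor), 0 otherwise
indicator = np.array([1 if n in sensors else 0 for n in range(4)])
cc_input = cc_true * indicator             # keep CC at sensors, 0 elsewhere

# Node feature matrix [Q, CC, J_indicator], shape (N, 3)
X_event = np.column_stack([Q, cc_input, indicator])
```

Each row of the resulting matrix is the three-dimensional feature vector of one junction, fed to the GNN together with the adjacency matrix.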
During the training process, the GNN model minimizes a loss function that quantifies the discrepancy between predicted and actual CCs. In this study, this loss function was defined as the normalized root mean square error \(\:\left(nRMSE\right)\), as presented in Eq. (5). The \(\:nRMSE\) effectively encapsulates the overall CC prediction performance across all junctions and events. However, an additional error metric is necessary to assess individual junction prediction errors across events. Hence, the percentage error (\(\:PE\)) and mean absolute percentage error (\(\:MAPE\)), presented in Eq. (6) and Eq. (7), were used to evaluate these prediction errors.
$$nRMSE=\frac{\left[\frac{1}{GN}\sum_{g=1}^{G}\sum_{n=1}^{N}\left(CC_{n}^{act}-CC_{n}^{pre}\right)^{2}\right]^{1/2}}{\frac{1}{GN}\sum_{g=1}^{G}\sum_{n=1}^{N}CC_{n}^{act}}$$

5

$$PE_{n}=\frac{CC_{n}^{act}-CC_{n}^{pre}}{CC_{n}^{act}}\times 100,\qquad CC_{n}^{act}>0\;\left(PE_{n}\text{ is undefined otherwise}\right)$$

6

$$MAPE_{n}=\frac{1}{G}\sum_{g=1}^{G}\left|PE_{n}\right|_{g}$$

7
where \(\:nRMSE\) is the normalized root mean square error; \(\:{CC}^{act}\) and \(\:{CC}^{pre}\) are the actual and predicted chlorine concentrations; \(\:n\) is the junction index out of the \(\:N\) total number of junctions; \(\:g\) is the graph index out of \(\:G\) total number of graphs.
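A minimal sketch of these metrics, assuming the actual and predicted CCs are stored as arrays of shape (\(G\) events × \(N\) junctions); the handling of zero actual concentrations follows the condition stated for Eq. (6):

```python
import numpy as np

def nrmse(cc_act, cc_pre):
    """Normalized RMSE over all G events and N junctions, per Eq. (5)."""
    return np.sqrt(np.mean((cc_act - cc_pre) ** 2)) / np.mean(cc_act)

def mape_per_junction(cc_act, cc_pre):
    """MAPE per junction across events, per Eq. (6)-(7). PE is only defined
    where the actual CC is positive, so other events are skipped."""
    G, N = cc_act.shape
    out = np.full(N, np.nan)                 # NaN if PE is never defined
    for n in range(N):
        valid = cc_act[:, n] > 0
        if valid.any():
            pe = (cc_act[valid, n] - cc_pre[valid, n]) / cc_act[valid, n] * 100
            out[n] = np.mean(np.abs(pe))
    return out
```

The nRMSE collapses all junctions and events into a single training loss, while the per-junction MAPE exposes which junctions are systematically harder to predict.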