In this study, we aim to estimate chlorine concentrations (CC) at every junction in water distribution networks (WDNs) by leveraging CC data from a subset of junctions (i.e., sensors), under steady-state conditions and a single-species model. To attain this goal, we introduce two distinct GNN models: i) a static prediction (SP) model and ii) a dynamic prediction (DP) model. As illustrated in Fig. 1, the development of these models encompasses i) a pre-processing stage and ii) a GNN training and testing stage. This section elaborates on these stages, along with the employed GNN model parameters and the case study.
GNN Model Formulation
The primary objective of a GNN model is to capture latent node representations of a given graph. These representations are intended to comprehensively encapsulate the underlying information and complex relationships present in the graph, enabling proficient graph- or node-level predictions. In the context of this research, the WDN is represented as a graph \(\:G=\left(V,E,X,Z\right)\), where \(\:V=\left\{{v}_{1},\:{v}_{2},\:{v}_{3},\:\dots\:,{v}_{N}\right\}\) is the set of the \(\:N\) nodes (i.e., junctions), \(\:E=\left\{{e}_{uv}|\:\forall\:\:v\in\:V,\:u\in\:N(v)\right\}\) is the set of \(\:M\) edges (i.e., pipes), \(\:X\in\:{R}^{N\times\:D}\) is the set of node features with \(\:D\) dimensions (e.g., junction demand, CC), and \(\:Z\in\:{R}^{M\times\:L}\) is the set of edge features with \(\:L\) dimensions (e.g., pipe length, diameter). Since water can flow through pipes in both directions, WDNs are represented as undirected graphs, in which the junctions’ connectivity is captured by a bidirectional (i.e., symmetric) adjacency matrix (\(\:A\)). In this study, the GNN performs node-level predictions of CCs, employing the features of neighboring junctions while utilizing the junctions’ connectivity.
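As a concrete illustration, the symmetric adjacency matrix of a small hypothetical network can be assembled directly from its pipe list; the layout below is invented for the example and is not the case-study network:

```python
import numpy as np

# Toy 5-junction network; each pair is an undirected pipe (edge).
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
N = 5

# Bidirectional (symmetric) adjacency matrix A for the undirected WDN graph.
A = np.zeros((N, N), dtype=int)
for u, v in edges:
    A[u, v] = 1
    A[v, u] = 1

assert (A == A.T).all()  # undirected graph => symmetric matrix
```

Each undirected pipe contributes two entries, so the matrix remains symmetric regardless of the order in which pipes are listed.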
GNN model variants
Two GNN models are proposed in this study: the Static Prediction (SP) model and the Dynamic Prediction (DP) model. In the SP model, the GNN is utilized to predict CCs considering a fixed set of sensors, where the junctions with available CC information (i.e., sensors) are constant across all events (i.e., graphs). In this model, the user defines the IDs of the sensors, while the CCs of all other junctions are considered unknown and are predicted by the GNN model. In comparison, the DP model is trained to make predictions based on measurements collected from any set of sensors, such as the case of mobile water quality sensors. This is done by employing a random set of sensors to train the GNN model. Furthermore, the number and locations of the sensors used in the prediction are dynamic. This is achieved by training the model on a range of sensor-to-non-sensor ratios defined for the model. This ratio is then randomly converted by the model into a specific sensor design (a particular number and set of sensor locations) on an event basis, allowing for a different sensor design for each training event. In addition, the DP model allows the user to designate a specific set of junctions as either sensor or non-sensor junctions, which provides flexibility to account for fixed sensors.
Dataset Generation
The training and testing of the GNN models necessitate a dataset comprising a large number of events (i.e., graphs); vital to this process is the incorporation of a diverse array of events in this dataset, empowering the model to predict unforeseen events accurately. In this study, two key aspects distinguish each individual event within the dataset: i) junctions’ demands \(\:\left(d\left(n\right)\right)\), controlled by the demand parameters, which define the total demand (\(\:D\)) and the range of the number of junctions with non-zero demand \(\:\left({z}_{min},{z}_{max}\right)\); and ii) chlorine injection rates (\(\:I\left(n\right)\)), controlled by the injection parameters, which define the injection locations (\(\:m\)) and the range of injection rates at these locations \(\:\left({i}_{min},{i}_{max}\right)\). The number, locations, and demand allocation of the demand junctions, as well as the injection rates at the injection locations, follow a random selection process according to the user-defined parameters, as shown in Eq. (1) and Eq. (2).
$$d\left(n\right)=\frac{r_{n}}{\sum_{n\in L}r_{n}}\times D,\qquad r_{n}=\begin{cases}U\left(0,1\right), & n\in L\\0, & n\notin L\end{cases},\qquad L\subset\left\{1,2,\dots,N\right\},\;\left|L\right|=U\left(z_{min},z_{max}\right)$$
1
$$I\left(n\right)=\begin{cases}U\left(i_{min},i_{max}\right), & n\in m\\0, & n\notin m\end{cases},\qquad m\subset\left\{1,2,\dots,N\right\}$$
2
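The sampling in Eq. (1) and Eq. (2) can be sketched as follows. The function names and this NumPy-based implementation are our own illustration of the described procedure, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_demands(N, D_total, z_min, z_max):
    """Random demand allocation per Eq. (1): a random subset L of junctions
    receives demand; weights r_n ~ U(0,1) are normalized to sum to D_total."""
    n_demand = rng.integers(z_min, z_max + 1)      # |L| ~ U(z_min, z_max)
    L = rng.choice(N, size=n_demand, replace=False)
    r = rng.uniform(0.0, 1.0, size=n_demand)
    d = np.zeros(N)
    d[L] = r / r.sum() * D_total                   # d(n) = r_n / sum(r_n) * D
    return d

def sample_injections(N, m_locations, i_min, i_max):
    """Random injection rates per Eq. (2): rates ~ U(i_min, i_max) at the
    injection locations m, zero elsewhere."""
    I = np.zeros(N)
    I[m_locations] = rng.uniform(i_min, i_max, size=len(m_locations))
    return I

d = sample_demands(N=20, D_total=100.0, z_min=5, z_max=15)
I = sample_injections(N=20, m_locations=[0, 7], i_min=0.5, i_max=1.5)
```

By construction, the sampled demands sum to the user-defined total \(D\), and only the randomly chosen subset of junctions carries demand.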
Once \(\:d\left(n\right)\) and \(\:I\left(n\right)\) are defined for all events, the Python interface of EPANET (WNTR) is employed to perform the hydraulic and water quality simulation of these events and to extract CCs at all network junctions (Klise et al., 2017). Subsequently, a masking process is implemented to remove the CC values of certain junctions. In the SP model, all CC values are masked in all events except at the sensor junctions defined by the user. In the DP model, on the other hand, a different set of junctions is masked in each event; this masking is performed randomly based on the minimum and maximum masking ratios (e.g., 50% of junctions are masked at a 0.5 masking ratio) defined by the user. In addition, the user can enforce a set of junctions to be sensors or non-sensors across all events in the DP model.
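The two masking schemes can be sketched as below. This is a simplified illustration of the described procedure, with our own function and parameter names; the SP mode masks everything except a fixed sensor set, while the DP mode draws a fresh random sensor set per event from a masking-ratio range:

```python
import numpy as np

rng = np.random.default_rng(1)

def mask_events(cc, sensor_ids=None, ratio_range=None, forced_sensors=()):
    """Mask CC values per event.
    SP mode: pass a fixed `sensor_ids` list, used for every event.
    DP mode: pass `ratio_range` = (min, max) masking ratio; a random set of
    junctions is hidden in each event, never touching `forced_sensors`."""
    G, N = cc.shape
    masked = cc.copy()
    indicator = np.zeros((G, N), dtype=int)  # 1 = sensor (CC known)
    for g in range(G):
        if sensor_ids is not None:                       # SP: fixed sensors
            sensors = set(sensor_ids)
        else:                                            # DP: random per event
            ratio = rng.uniform(*ratio_range)
            n_masked = int(round(ratio * N))
            candidates = [n for n in range(N) if n not in forced_sensors]
            hidden = rng.choice(candidates,
                                size=min(n_masked, len(candidates)),
                                replace=False)
            sensors = set(range(N)) - set(hidden.tolist())
        for n in sensors:
            indicator[g, n] = 1
        masked[g, indicator[g] == 0] = 0.0               # zero out unknown CCs
    return masked, indicator

cc = rng.uniform(0.2, 1.5, size=(4, 10))                 # 4 events x 10 junctions
sp_masked, sp_ind = mask_events(cc, sensor_ids=[0, 3, 7])
dp_masked, dp_ind = mask_events(cc, ratio_range=(0.4, 0.6))
```

In the SP call every event keeps the same three sensors, whereas each DP event retains a different random subset of roughly half the junctions.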
GNN model architecture
In this study, the Topology Adaptive Graph Convolutional Network (TAGCN) was used as the GNN model architecture (Du et al., 2017). TAGCN was chosen due to its advantages over other GNN architectures such as the Graph Convolutional Network (GCN) and the Graph Attention Network (GAT). Unlike GCN, TAGCN incorporates filters that act as attention coefficients, tailoring the contributions of neighboring nodes during the aggregation process to effectively capture distinctive node features and prevent over-smoothing. Although GAT was introduced to address the same GCN limitation, it is not well-suited for large-scale graphs (Du et al., 2017). In contrast, TAGCN has been shown to scale well while allowing for dynamic adjustment of aggregation parameters on a local scale within each graph region. TAGCN effectively models complex relationships within diverse graphs by learning \(\:K\) graph filters as in Eq. (3); it then uses these filters, along with a learnable bias, to perform predictions as in Eq. (4). The general GNN model architecture used in this study is presented in Fig. 2.
$$\:{\text{G}}_{f,x}^{\left(\mathcal{l}\right)}=\sum\:_{k=0}^{K}\:{g}_{f,x,k}^{\left(\mathcal{l}\right)}\:{\mathbf{A}}^{k}$$
3
$$\:{\text{Y}}_{f}^{\left(\mathcal{l}\right)}=\sigma\:\left(\sum\:_{x=1}^{X}\:{\text{G}}_{f,x}^{\left(\mathcal{l}\right)}\:{\text{x}}_{x}^{\left(\mathcal{l}\right)}+{b}_{f}{1}_{N}\right)$$
4
where \(\:{\text{G}}_{f,x}^{\left(\mathcal{l}\right)}\) is the \(\:{f}^{th}\) graph filter applied to the \(\:{x}^{th}\) feature in the \(\:{\mathcal{l}}^{th}\) layer; \(\:k\) and \(\:K\) are the filter index and the total number of filters; \(\:{g}_{f,x,k}^{\left(\mathcal{l}\right)}\) are the graph filter polynomial coefficients; \(\:{\mathbf{A}}^{k}\) is the \(\:{k}^{th}\) power of the normalized adjacency matrix of the graph; \(\:{\text{Y}}_{f}^{\left(\mathcal{l}\right)}\) is the output of the \(\:{\mathcal{l}}^{th}\) layer; \(\:\sigma\:\) denotes a rectified linear unit (\(\:ReLU\)); \(\:{\text{x}}_{x}^{\left(\mathcal{l}\right)}\) is the input data of the \(\:{\mathcal{l}}^{th}\) layer for all nodes for the \(\:{x}^{th}\) feature; \(\:{b}_{f}\) is a learnable bias; and \(\:{1}_{N}\) is a unity vector of size \(\:N\). Further details of the TAGCN mathematical model can be found in Du et al. (2017).
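Written out explicitly, one TAGCN layer per Eq. (3)-(4) can be sketched in NumPy as follows. This is a didactic re-implementation to make the filter-polynomial structure concrete, not the library code used in the study:

```python
import numpy as np

def tagcn_layer(A_norm, X, g, b):
    """One TAGCN layer per Eq. (3)-(4).
    A_norm : (N, N) normalized adjacency matrix
    X      : (N, F_in) node features
    g      : (F_out, F_in, K+1) filter polynomial coefficients g_{f,x,k}
    b      : (F_out,) learnable bias
    """
    N = A_norm.shape[0]
    F_out, F_in, K1 = g.shape
    # Precompute powers A^0 ... A^K of the normalized adjacency matrix
    powers = [np.eye(N)]
    for _ in range(K1 - 1):
        powers.append(powers[-1] @ A_norm)
    Y = np.zeros((N, F_out))
    for f in range(F_out):
        for x in range(F_in):
            # Eq. (3): graph filter G_{f,x} = sum_k g_{f,x,k} A^k
            G_fx = sum(g[f, x, k] * powers[k] for k in range(K1))
            # Eq. (4): accumulate filtered features over input dimensions
            Y[:, f] += G_fx @ X[:, x]
        Y[:, f] += b[f]                 # learnable bias b_f 1_N
    return np.maximum(Y, 0.0)           # sigma = ReLU

# Tiny example: 4-node path graph with symmetric normalization
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
deg = A.sum(axis=1)
A_norm = A / np.sqrt(np.outer(deg, deg))
X = np.random.default_rng(0).normal(size=(4, 3))
g = np.random.default_rng(1).normal(size=(2, 3, 3))   # F_out=2, F_in=3, K=2
Y = tagcn_layer(A_norm, X, g, np.zeros(2))
```

With \(K=0\) the layer degenerates to a per-node linear map through the identity \(\mathbf{A}^{0}\), which is a useful sanity check on the implementation.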
The primary input of a TAGCN model is the adjacency matrix (\(\:A\)), which is used to calculate the normalized adjacency matrix powers (\(\:{\mathbf{A}}^{k}\)) in Eq. (3). This matrix represents the network’s topology and is equal to \(\:{\left[{a}_{ij}\right]}^{N\times\:N\:}\), with \(\:{a}_{ij}\) equal to 1 if node \(\:i\) is connected to node \(\:j\), and 0 otherwise. Additionally, a TAGCN model incorporates node features, which in this study comprised flows, CCs, and junction indicators. Positive flow values represent junction demands, whereas negative values indicate supply from source junctions (e.g., reservoirs). For CCs, the CC values are used wherever this information is available (i.e., at sensors), and 0 is used elsewhere. For the junction indicator, a value of 1 was used for junctions with known CC (i.e., sensors), and 0 for junctions with unknown CC. The resulting junction input vector is \(\:\left[Q,\:\:CC,\:\:{J}_{indicator\:}\right]\). Although the TAGCN model does not incorporate edge features, their inclusion would have added little value because these features are constant across events.
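Assembling the junction input vector \(\left[Q,\:CC,\:{J}_{indicator}\right]\) for a single event might look like this; all numeric values are invented for the illustration:

```python
import numpy as np

# Flows: positive = junction demand, negative = supply from a source junction
Q = np.array([2.0, 1.5, -3.5, 0.0])
# Simulated CCs at all junctions (in practice extracted from the WNTR run)
cc_true = np.array([0.8, 0.6, 1.2, 0.4])
sensors = {2}                              # junctions with known CC

# Indicator: 1 where CC is known (sensor), 0 otherwise
indicator = np.array([1 if n in sensors else 0 for n in range(4)])
cc_input = cc_true * indicator             # keep CC at sensors, 0 elsewhere

# Node feature matrix [Q, CC, J_indicator], shape (N, 3)
X_event = np.column_stack([Q, cc_input, indicator])
```

Each row of the resulting matrix is the three-dimensional feature vector of one junction, fed to the GNN together with the adjacency matrix.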
During the training process, the GNN model minimizes a loss function that quantifies the discrepancy between predicted and actual CCs. In this study, this loss function was defined as the normalized root mean square error \(\:\left(nRMSE\right)\), as presented in Eq. (5). The \(\:nRMSE\) effectively encapsulates the overall CC prediction performance across all junctions and events. However, an additional error metric is necessary to assess individual junction prediction errors across events. Hence, the percentage error (\(\:PE\)) and mean absolute percentage error (\(\:MAPE\)), presented in Eq. (6) and Eq. (7), were used to evaluate these prediction errors.
$$nRMSE=\frac{\left[\frac{1}{GN}\sum_{g=1}^{G}\sum_{n=1}^{N}\left(CC_{n}^{act}-CC_{n}^{pre}\right)^{2}\right]^{1/2}}{\frac{1}{GN}\sum_{g=1}^{G}\sum_{n=1}^{N}CC_{n}^{act}}$$

5

$$PE_{n}=\frac{CC_{n}^{act}-CC_{n}^{pre}}{CC_{n}^{act}}\times 100,\qquad CC_{n}^{act}>0\;\left(PE_{n}\text{ is undefined otherwise}\right)$$

6

$$MAPE_{n}=\frac{1}{G}\sum_{g=1}^{G}\left|PE_{n}\right|_{g}$$

7
where \(\:nRMSE\) is the normalized root mean square error; \(\:{CC}^{act}\) and \(\:{CC}^{pre}\) are the actual and predicted chlorine concentrations; \(\:n\) is the junction index out of the \(\:N\) total number of junctions; \(\:g\) is the graph index out of \(\:G\) total number of graphs.
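A minimal sketch of these metrics, assuming the actual and predicted CCs are stored as arrays of shape (\(G\) events × \(N\) junctions); the handling of zero actual concentrations follows the condition stated for Eq. (6):

```python
import numpy as np

def nrmse(cc_act, cc_pre):
    """Normalized RMSE over all G events and N junctions, per Eq. (5)."""
    return np.sqrt(np.mean((cc_act - cc_pre) ** 2)) / np.mean(cc_act)

def mape_per_junction(cc_act, cc_pre):
    """MAPE per junction across events, per Eq. (6)-(7). PE is only defined
    where the actual CC is positive, so other events are skipped."""
    G, N = cc_act.shape
    out = np.full(N, np.nan)                 # NaN if PE is never defined
    for n in range(N):
        valid = cc_act[:, n] > 0
        if valid.any():
            pe = (cc_act[valid, n] - cc_pre[valid, n]) / cc_act[valid, n] * 100
            out[n] = np.mean(np.abs(pe))
    return out
```

The nRMSE collapses all junctions and events into a single training loss, while the per-junction MAPE exposes which junctions are systematically harder to predict.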