2.1 Automated simulation framework for flood risk analysis
In order to meet the requirements of real-time, city-scale flood forecasting with due consideration of climate change and urbanization, the proposed automated simulation framework for flood risk analysis comprises four modules, as shown in Fig. 1. More details are described below:
(1) Data acquisition and preprocessing: Because it captures richer information on color and building facades, UAV-based oblique photography is performed in this study to acquire the point cloud data needed for flood modeling. This method provides a more efficient and convenient way to collect up-to-date 3D data of a target area than traditional GIS approaches to flood risk analysis.
(2) Point cloud segmentation based on deep learning: Flood simulation requires extracting the areas of ground, vegetation, and buildings so that different runoff parameters can be assigned to them. Therefore, the first step is point cloud segmentation based on RandLA-Net (Hu et al. 2019), a deep learning-based encoder-decoder network. RandLA-Net is an efficient and lightweight network that can directly infer point-by-point semantics for large-scale point clouds; it markedly reduces the computational workload while enhancing local feature capture. Generally, the point clouds are split into several classes: ground, residential land, farmland, roads, shrubs, and concrete. More details are given in Section 2.2.
(3) Point cloud filtering and DEM reconstruction: A high-precision DEM is the basis of flood simulation, and an improved technical scheme for high-precision DEM reconstruction is proposed herein. First, the original point clouds are pre-filtered based on the segmentation results, removing the points of residential buildings, shrubs, and concrete structures. Then, a hierarchical smoothing filtering algorithm is proposed to extract the ground points. As the skeleton of the terrain, topographic feature lines are extracted by the planar surface fitting and intersecting method to describe the abrupt changes of mountainous terrain more accurately. Finally, the high-precision DEM is reconstructed by integrating the feature lines, and the points in potential flood routing areas are densified by the inverse distance weighting method. More details are given in Section 2.3.
(4) Flood simulation based on hydrodynamics: The high-precision DEM in TIN format is imported into the MIKE 21 software. Grids are generated by partitions using its automatic mesh function and are assigned different Manning coefficients according to the point cloud segmentation results. Generally, the model settings are implemented by an automated script.
2.2 Point cloud segmentation based on deep learning
Generally, RandLA-Net uses an encoder-decoder architecture with skip connections, and its structure is illustrated in Fig. 2, in which (N, D) represents the number of points and the number of feature dimensions, respectively. As shown in Fig. 2, the input point cloud (N, 3) is first fed into a fully connected layer, and then four encoding and four decoding layers are employed to learn the features of each point. Finally, three fully connected layers with a dropout layer are utilized to predict the semantic label of each point. The random down-sampling technique used in RandLA-Net is a key facilitator for processing massive point clouds since it requires less computational time and cost than other sampling techniques. Additionally, the introduced local feature aggregation (LFA) module preserves complicated local structures by gradually increasing the receptive field for each 3D point (Hu et al. 2019).
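To make this layout concrete, the following PyTorch sketch reproduces only the overall shape flow of Fig. 2 (a fully connected input layer, four encoding and four decoding stages, and a three-layer prediction head with dropout). It is a minimal illustration rather than the published implementation: the shared linear layers merely stand in for the LFA blocks, random down-sampling and skip connections are omitted, and the feature dimensions are assumptions.

```python
import torch
import torch.nn as nn

class RandLANetSkeleton(nn.Module):
    """Minimal sketch of the encoder-decoder layout of Fig. 2 (illustrative only)."""

    def __init__(self, n_class, dims=(8, 32, 128, 256, 512)):
        super().__init__()
        self.fc_in = nn.Linear(3, dims[0])                      # (N, 3) -> (N, 8)
        # Four encoding stages: feature dimension grows while resolution would drop 4x.
        self.encoders = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(4)
        )
        # Four decoding stages mirror the encoders (skip connections omitted here).
        self.decoders = nn.ModuleList(
            nn.Linear(dims[i + 1], dims[i]) for i in reversed(range(4))
        )
        # Three fully connected layers with a dropout layer predict per-point labels.
        self.head = nn.Sequential(
            nn.Linear(dims[0], 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(32, n_class),
        )

    def forward(self, xyz):                                     # xyz: (N, 3)
        f = self.fc_in(xyz)
        for enc in self.encoders:                               # stands in for LFA + random sampling
            f = torch.relu(enc(f))
        for dec in self.decoders:                               # stands in for NN-interpolation up-sampling
            f = torch.relu(dec(f))
        return self.head(f)                                     # (N, n_class) semantic logits
```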
As shown in Fig. 3, the LFA module consists of three neural units (i.e., local spatial encoding, attentive pooling, and dilated residual block). Among them, the coordinates of all neighboring points are embedded into the local spatial encoding unit to explicitly capture the local geometric patterns, which eventually helps the entire network learn complex local features effectively. Specifically, the local spatial encoding unit performs the following steps: (1) Finding neighboring points. For the \({i}^{th}\) point, the simple K-nearest neighbors (KNN) algorithm is first used to gather its neighboring points based on the point-wise Euclidean distances. (2) Relative point position encoding. For each nearest point \(\{{p}_{i}^{1}\cdots {p}_{i}^{k}\cdots {p}_{i}^{K}\}\), the relative position with respect to the center point \({p}_{i}\) is explicitly encoded by Eq. (1). (3) Point feature augmentation. For each nearest point \({p}_{i}^{k}\), the encoded relative point position \({r}_{i}^{k}\) is concatenated with its corresponding point feature \({f}_{i}^{k}\), yielding an augmented feature vector \({\widehat{f}}_{i}^{k}\). Eventually, the output of the local spatial encoding unit is a new set of neighboring features \({\widehat{F}}_{i}=\{{\widehat{f}}_{i}^{1}\cdots {\widehat{f}}_{i}^{k}\cdots {\widehat{f}}_{i}^{K}\}\) that explicitly encodes the local geometric features for the center point \({p}_{i}\). Then, the attentive pooling unit aggregates the set of neighboring point features \({\widehat{F}}_{i}\) using a shared function that learns a unique attention score for each feature. Finally, the LFA module stacks multiple local spatial encoding and attentive pooling units with a skip connection to form a dilated residual block.
$${r}_{i}^{k}=\mathrm{MLP}\left({p}_{i}\oplus {p}_{i}^{k}\oplus \left({p}_{i}-{p}_{i}^{k}\right)\oplus \left\|{p}_{i}-{p}_{i}^{k}\right\|\right)\tag{1}$$
where \({p}_{i}\) and \({p}_{i}^{k}\) are the positions of the center point and its \({k}^{th}\) neighboring point, respectively; \(\oplus\) is the concatenation operation; \(\left\|\cdot\right\|\) calculates the Euclidean distance between the neighboring and center points; and \({r}_{i}^{k}\) represents the encoded relative point position.
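As a concrete illustration of step (1) and Eq. (1), the short Python sketch below gathers the K nearest neighbors of every point and assembles the concatenated vector \({p}_{i}\oplus {p}_{i}^{k}\oplus ({p}_{i}-{p}_{i}^{k})\oplus \|{p}_{i}-{p}_{i}^{k}\|\). The shared MLP that maps this 10-dimensional vector to the feature dimension is omitted, and K = 16 is an assumed value.

```python
import numpy as np
from scipy.spatial import cKDTree

def relative_position_encoding(points, k=16):
    """Build the concatenated input of Eq. (1) for every point (MLP omitted)."""
    tree = cKDTree(points)                                  # points: (N, 3)
    _, idx = tree.query(points, k=k)                        # (N, K) neighbour indices (includes the point itself)
    neighbours = points[idx]                                # (N, K, 3) positions p_i^k
    centers = np.repeat(points[:, None, :], k, axis=1)      # (N, K, 3) center positions p_i
    offsets = centers - neighbours                          # (N, K, 3) p_i - p_i^k
    dists = np.linalg.norm(offsets, axis=-1, keepdims=True) # (N, K, 1) Euclidean distances
    # Concatenate along the feature axis -> (N, K, 10); the shared MLP would be applied next.
    return np.concatenate([centers, neighbours, offsets, dists], axis=-1)

# Example: encode a random cloud of 1,000 points.
encoded = relative_position_encoding(np.random.rand(1000, 3))
print(encoded.shape)  # (1000, 16, 10)
```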
In the network of this study, four encoding layers are adopted to progressively reduce the resolution of the point cloud. Each encoding layer down-samples the point cloud by a factor of four while increasing the feature dimension of each point to retain more information, using random sampling as the down-sampling strategy. Since random sampling discards points non-selectively, each encoding layer also contains a local feature aggregation (LFA) module that learns features of the point cloud without losing important information (Hu et al. 2019). Four decoding layers follow the four encoding layers. For each decoding layer, the K-nearest neighbors (KNN) algorithm is used to find the nearest neighbor of each point, whose features are then up-sampled through nearest-neighbor interpolation. These up-sampled features are concatenated with the intermediate feature maps generated by the encoding layers via skip connections and then fed into a shared MLP layer. The final output of RandLA-Net is the semantic prediction label for each point, denoted as \(N\times {n}_{class}\), where \({n}_{class}\) represents the number of classes in the input point cloud.
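The two resolution-changing operations described above can be sketched as follows. This is an illustrative, stand-alone example: the LFA module, skip connections, and shared MLPs are not included, the 4x decimation ratio is taken from the text, and the toy data shapes are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def random_downsample(points, features, ratio=4):
    """Encoder-side random sampling: keep N/ratio randomly chosen points and their features."""
    n_keep = max(1, points.shape[0] // ratio)
    idx = np.random.choice(points.shape[0], n_keep, replace=False)
    return points[idx], features[idx]

def nearest_neighbour_upsample(coarse_points, coarse_features, fine_points):
    """Decoder-side up-sampling: each fine point copies the features of its nearest coarse point."""
    _, nn_idx = cKDTree(coarse_points).query(fine_points, k=1)
    return coarse_features[nn_idx]

# Toy round trip: 4x down-sampling followed by up-sampling back to full resolution.
pts = np.random.rand(4000, 3)
feats = np.random.rand(4000, 8)
coarse_pts, coarse_feats = random_downsample(pts, feats, ratio=4)     # (1000, 3), (1000, 8)
restored = nearest_neighbour_upsample(coarse_pts, coarse_feats, pts)  # (4000, 8)
```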
2.3 Point cloud filtering and DEM reconstruction
Most common filtering algorithms distinguish ground points from non-ground points using mature digital image processing theory or height differences among points, judging whether the difference between pixels or points exceeds a pre-determined threshold. To ensure the accuracy and resolution of DEM reconstruction, this study proposes an improved algorithm based on the hierarchical smoothing method (Xu and Wan 2010), as described below:
Step 1: Reference surface construction. To improve the efficiency of data processing, regular grids are used to re-sample the point clouds, and a reference dataset is then formed by mature imagery filtering theory. Note that the generated reference dataset is only taken as a threshold surface to avoid the accuracy loss of the original point clouds; the subsequent filtering process is still conducted on the original point clouds.
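A minimal sketch of this re-sampling step is given below, assuming the cell value is the mean height of the points falling into each regular grid cell (as used in Step 2). The subsequent imagery-based filtering of the reference dataset is not shown, and the cell size is treated as a free parameter.

```python
import numpy as np

def build_reference_grid(points, cell_size):
    """Re-sample a point cloud (N, 3) onto a regular grid of mean heights; empty cells become NaN."""
    xy_min = points[:, :2].min(axis=0)
    cols, rows = (np.ceil((points[:, :2].max(axis=0) - xy_min) / cell_size).astype(int) + 1)
    grid_sum = np.zeros((rows, cols))
    grid_cnt = np.zeros((rows, cols))
    j, i = ((points[:, :2] - xy_min) // cell_size).astype(int).T   # column (x) and row (y) indices
    np.add.at(grid_sum, (i, j), points[:, 2])                      # accumulate heights per cell
    np.add.at(grid_cnt, (i, j), 1)                                 # count points per cell
    with np.errstate(invalid="ignore"):
        return np.where(grid_cnt > 0, grid_sum / grid_cnt, np.nan)
```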
Step 2: Hierarchical smoothing. For each grid cell of the reference dataset, the average vertical coordinate (height) of all points in the cell is taken as the cell value. Then, the reference dataset is smoothed by the hierarchical smoothing method illustrated in Eqs. (2) and (3). If the height difference between the center cell value and the minimum value of the grid cells in four directions is larger than the threshold, the center cell value is replaced by the predicted value; otherwise, the center cell keeps its initial value.
$${Z}_{\left[i,j\right]}=\begin{cases}\sum_{k=1}^{n}{Z}_{{P}_{k}}/n, & \text{if } \varDelta h\le {T}_{h}\\ \min\left\{{Z}_{1},{Z}_{2},{Z}_{3},{Z}_{4}\right\}, & \text{if } \varDelta h>{T}_{h}\end{cases}\tag{2}$$
$${T}_{h}=\begin{cases}{h}_{0}, & \text{if } {S}_{w}\le 1\\ {h}_{0}+{S}_{w}\epsilon, & \text{if } {S}_{w}>1\end{cases}\tag{3}$$
where \({Z}_{\left[i,j\right]}\) is the value of grid cell \(\left[i,j\right]\); \({P}_{k}\) is a point within cell \(\left[i,j\right]\), and the total number of points in the cell is denoted by \(n\); \({Z}_{1}, {Z}_{2},{Z}_{3},{Z}_{4}\) are the predicted values of the grid cells in the four directions; \(\varDelta h\) and \({T}_{h}\) denote the height difference and the threshold, respectively; \({h}_{0}\) represents the minimum height difference; \({S}_{w}\) is the resolution of the dataset; and \(\epsilon\) represents a scale factor ranging from 0 to 1.
Subsequently, a new dataset with higher resolution is generated by integrating the original point clouds and the smoothed reference dataset. If the height difference between the new dataset and the smoothed reference dataset is larger than the threshold, the new cell value is replaced by the corresponding value of the smoothed reference dataset. The above steps are repeated until the final dataset resolution is less than the threshold. During this process, the dataset resolution varies with the threshold, and a higher resolution requires a stricter threshold.
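The per-cell decision of Eqs. (2) and (3) can be sketched as a single smoothing pass over the reference grid, as below. This is an illustrative simplification: the four directional prediction values \({Z}_{1},\ldots,{Z}_{4}\) are approximated by the four neighboring cell values, and \({h}_{0}\), the grid resolution \({S}_{w}\) (taken here as the cell size), and \(\epsilon=0.5\) are assumed inputs.

```python
import numpy as np

def smooth_reference_grid(grid, h0, cell_size, eps=0.5):
    """One hierarchical-smoothing pass following Eqs. (2) and (3) (illustrative sketch)."""
    t_h = h0 if cell_size <= 1 else h0 + cell_size * eps          # threshold T_h, Eq. (3)
    out = grid.copy()
    for i in range(1, grid.shape[0] - 1):
        for j in range(1, grid.shape[1] - 1):
            # Neighbouring cell values stand in for the directional predictions Z_1..Z_4.
            four_dirs = np.array([grid[i - 1, j], grid[i + 1, j],
                                  grid[i, j - 1], grid[i, j + 1]])
            z_min = np.nanmin(four_dirs)
            if grid[i, j] - z_min > t_h:                          # height difference exceeds T_h
                out[i, j] = z_min                                 # replace by the predicted value, Eq. (2)
    return out
```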
Step 3: DEM reconstruction with embedded features. To improve the DEM reconstruction accuracy in mountainous areas, this study first extracts the topographic feature lines by the planar surface fitting and intersecting method (Xu et al. 2014; Zhao and Wang 2022), dividing the local area into two parts. Then, ground points around the topographic feature lines are densified by the inverse distance weighting method (Huang et al. 2023; Zhao and Wang 2022). The densification can be expressed as Eq. (4).
$$\begin{cases}{w}_{i}={d}_{i}^{-2}\Big/\sum_{i=1}^{S}{d}_{i}^{-2}\\ {h}_{j}=\sum_{i=1}^{S}{w}_{i}{h}_{i}\end{cases}\tag{4}$$
where \({w}_{i}\) denotes the weight of each known ground point in the semicircle; \({d}_{i}\) is the distance between the current position to be densified and the known ground point; \({h}_{i}\) and \({h}_{j}\) represent the elevation of each known ground point in the semicircle and the elevation of the current position to be densified, respectively.
The points on the topographic feature lines are used as the centers of search semicircles with radius R, and a search semicircle is generated on one side of the feature lines for point cloud densification. In this study, the number of known ground points (\(S\)) used for densification is 12 by default. If the number of known ground points in the search semicircle does not reach \(S\), the radius of the search semicircle is dynamically enlarged until it does. In the same way, ground point densification is conducted on the other side of the topographic feature lines, as well as in potential flood routing areas and in the holes in the point clouds induced by the removal of non-ground points.
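A simplified sketch of the densification rule of Eq. (4) for a single position is given below. It keeps the default \(S=12\) from the text, but the initial search radius and the doubling rule for the dynamic adjustment are assumptions, and it searches a full circle rather than the one-sided semicircle described above.

```python
import numpy as np
from scipy.spatial import cKDTree

def idw_elevation(query_xy, ground_xy, ground_z, s=12, radius=5.0):
    """Inverse-distance-weighted elevation of one position to be densified (Eq. (4) sketch).

    Assumes at least `s` known ground points exist; radius and its doubling are illustrative."""
    tree = cKDTree(ground_xy)
    idx = tree.query_ball_point(query_xy, r=radius)
    while len(idx) < s:                                   # dynamically enlarge the search area
        radius *= 2.0
        idx = tree.query_ball_point(query_xy, r=radius)
    d = np.linalg.norm(ground_xy[idx] - query_xy, axis=1)
    d = np.maximum(d, 1e-6)                               # avoid division by zero at coincident points
    order = np.argsort(d)[:s]                             # keep the S nearest known ground points
    d, z = d[order], ground_z[np.asarray(idx)[order]]
    w = d**-2 / np.sum(d**-2)                             # weights w_i of Eq. (4)
    return float(np.sum(w * z))                           # elevation h_j of Eq. (4)
```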
Finally, the high-precision DEM is reconstructed, meeting the requirement of the Delaunay criterion. A point-by-point insertion and growth strategy is applied to connect the discrete point clouds into a triangulated irregular network based on the Delaunay criterion. For the topographic feature lines and the boundary lines of different land covers (such as buildings, farmland, and residential land), the elevation of each inserted node is first calculated by interpolating the elevations of points on both sides of the lines.
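For illustration, the following sketch connects a set of discrete ground points into a TIN satisfying the Delaunay criterion using SciPy's Delaunay triangulation of the planar coordinates; the constrained insertion of feature lines and land-cover boundary lines described above is not enforced in this minimal example.

```python
import numpy as np
from scipy.spatial import Delaunay

def build_tin(ground_points):
    """Triangulate ground points (N, 3) into a TIN via 2-D Delaunay triangulation of x-y."""
    tri = Delaunay(ground_points[:, :2])       # Delaunay criterion on the planar coordinates
    return ground_points, tri.simplices        # vertices and triangle index triplets

# Example: triangulate a synthetic set of ground points.
pts = np.random.rand(500, 3) * [100.0, 100.0, 10.0]
vertices, triangles = build_tin(pts)
print(triangles.shape)                         # (n_triangles, 3)
```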