Description of study area:
This research is conducted in Upper Indrawati Watershed (85o33’N-85o44’N; 27o49’E-28o07’E) of Sindhupalchok district, located in ~ 40 km north-east of Kathmandu, Nepal (Fig. 1: map of the study area with Fault line). We did not analyze the area with the altitude more than 4000 m because of the lack of geological data (no detailed data available in Department of Mines and Geology, Nepal), and unavailability of qualitative satellite images because of cloudy weather condition and snow-capped areas. Therefore, the altitude of the area we analyzed ranges from 796 m to 4000 m with the coverage of 364 km2 area. The slope gradient ranges from 0o to 73o with the mean slope of 31.6o.
The study area belongs to higher Himalayan zone (Pre-Cambrian) geology dominated by Sermanthang formation and Dhad Khola gneiss formation covering 35% and 24% area respectively according to the data from Department of Mines and Geology, Nepal. Sermanthang formation mostly covers the high-altitude region of the study area which includes lithology of interbedded feldspathic schist, augen gneiss, quartzite, and biotic-feldspathic schist. However, Dhad Khola gneiss covers the lower altitudinal region including porphyroblastic gneiss, augen gneiss with a thin band of quartzite and schist, and migmatitic gneiss lithology. The study area also composed of main central thrust and two fault lines (southern edge and middle of the western boundary of the study area; Fig. 1).
The study area experienced many tremors after the Gorkha earthquake which originated in main Himalayan thrust followed by hundreds of aftershocks. These tremors increased the earthquake severity considerably in the study area (Goda et al., 2015) which amplified likely occurrences of landslides.
Landslide distribution:
Gorkha earthquake followed by hundreds of aftershocks (Parameswaran et al., 2015) triggered many landslides in the study area. We performed field observation, on-screen detection of an aerial photographs, visual interpretation, and on screen as well as automatic detection of high-resolution satellite images following the Keefer et al. (2002) and Harp et al. (2011) for landslide detection (Xu et al., 2015). Use of high-resolution satellite image interpretation coupled with the field observation is quite useful and appropriate for coseismic landslides inventory (Xu et al., 2015). This is because the number, area, and volume of coseismic landslides is extensive (e.g., Keefer, 1994; Marc et al., 2016), and the location of landslides is spatially distributed over larger areas (Dai et al., 2011). Thus, we conducted onscreen digitization of post-earthquake high resolution, Geo Eye satellite imageries available on Google Earth in polygon format (Harp et al., 2011; Xu et al., 2014) and currently is appropriate alternative among the variety of methods available (Corominas et al., 2014). Even though its time consuming, it is more accurate than point based on susceptibility mapping (Xu et al., 2012). Furthermore, it was aided by field observation from Nov. 15 to Dec. 5, 2015.
Landslide distribution was expressed by landslide relative density (LRD) (Ayalew and Yamagishi, 2005), landslide number abundance (LNA) (Keefer, 2000; Xu et al., 2010; Xu et al., 2015) and landslide area abundance (LAA) (Dai et al., 2011; Xu et al., 2014; Xu et al., 2015) which helps to understand the effects of causal factors on landslide occurrences. LRD is the ratio of the frequency ratio value (FRV) of a class to total FRV of that causal factor and is calculated as suggested by Ayalew and Yamagishi (2005). LNA is the number of landslides per square kilometer, and LAA is the percentage of area affected by landslides (Xu et al., 2015).
Landslide causal factors:
The occurrence of the landslides depends upon the multiple causal factors. Even though seismic shaking is the driving force for the earthquake triggered landslides, local (natural and anthropogenic) factors play a dominant role (Kamp et al., 2008). Selection of the appropriate causal factors is essential for susceptibility mapping (Xu et al., 2013a). Casual factors should be measurable, varies spatially, available to cover the entire study area (Ayalew and Yamagishi, 2005) and have a certain degree of compatibility with the dependent variable (Garcia-Rodriguez et al., 2008). Hence, we selected nine factors for mapping which includes slope angle, aspect, elevation, curvature, distance to the fault and river, geological formation, seismic intensity, and land cover. The reason for the selection of causal factors is described in Table 1.
Table 1
Causal factors and its significance for landslide occurrence with the references
Causal Factors | Significance | Data Source |
Slope Angle | Slope gradient, a predominant factor for mass wasting (Garcia-Rodriguez et al., 2008; Corominas et al., 2014). Steeper the slope gradient higher the chance of landslides (Kamp et al., 2008). | ASTER gDEM1 |
Aspect | Related to the weather condition, weathering and land cover thereby affect the occurrence of landslides (Garcia-Rodriguez et al., 2008, Kamp et al., 2008; Corominas et al., 2014). | ASTER gDEM |
Elevation | Slope instability could be the result of changes in elevation (Corominas et al., 2014). The region having higher elevation mostly dominated by rocky slopes where the frequency of earthquake-induced landslides is higher (Owen et al., 2008). | ASTER gDEM |
Curvature (total) | Curvature, a topographic factor, crucially important for rock fall and highly important for shallow landslides (Hasegawa et al., 2009; Corominas et al., 2014). | ASTER gDEM |
Seismic intensity (PGA in %) | Earthquake shaking act as an additional driving force on the slope which favors the landslides (Duncan and Wright, 2005; Corominas et al., 2014): higher the energy of shaking, higher the risk of slope failure (Keefer, 2002; Delgado et al., 2011). We considered the peak ground acceleration (PGA) to understand the ground shaking. | USGS2 |
Geological formation | The strength and permeability of the slope depending on the lithology of the area thereby, is crucial conditioning factor for landslide occurrences (Dai and Lee, 2002; Corominas et al., 2014). We used geological formation according to the data availability as lithology varied according to the formation in the study area. | DMGN3 |
Distance to the fault | Distance to fault is highly relevant as both landslide conditioning and triggering factor (Corominas et al., 2014). | DMGN |
Distance to the river | Landslide distribution is related to the distance to the river (Mancini et al., 2010; Corominas et al., 2014). | ASTER gDEM |
Land cover | Mechanical anchoring of the land depends upon the land cover (Meusburger and Alewall, 2008) and is highly important conditioning factor for landslide occurrences (Montgomery et al., 2000; Garcia-Rodriguez et al., 2008; Corominas et al., 2014). | ICIMOD4 |
Note: 1Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) global digital elevation model (gDEM), available on https://earthexplorer.usgs.gov; 2United States Geological Survey (USGS), 3Department of Mines and Geology, Nepal; 4International Centre for Integrated Mountain Development (ICIMOD) (https://rds.icimod.org/Home/DataDetail?metadataId=9224) |
All the causal factors were categorized from four to nine classes to understand their effect for landslide occurrences due to earthquake. Slope angle, elevation was classified using equal interval method in ArcGIS 10.3; natural break method was used for curvature to categorize continuous data in six discrete classes. We preserved the classes of the causal factor aspect (nine classes), seismic intensity (four classes), geological formation (six classes) and land cover (seven classes). Buffering was performed in ArcGIS for distance to the fault (three km buffer provided seven classes) and distance to the river (four classes). The categorical map of the study area is presented in Fig. 2 (a-i).
The categorical factors (land cover, geological formation, and PGA) were dealt by creating dummy variables whereas the non-categorical factors with continuous data were dealt as they are (Nefeslioglu et al. 2008) for the purpose of susceptibility mapping using LR in this study. As the scales of the input variables (casual factors) are different, the input data was normalized from 0 and 1 in order to increase the speed and accuracy of data processing, using Eq. 1 following the study of Nefeslioglu et al. (2008).
where, Xnorm is normalized value of Xi input data, Xi is the input data that should be normalized, Xmax is the maximum value of the input data, and Xmin is the minimum value of the input data.
Preparation of training and validation landslide:
All the inventoried landslides were divided as training landslide and validation landslide for LR modelling and LSM validating respectively (Aditian et al., 2018; Ba et al., 2018). As we only considered the earthquake induced landslide which were triggered by an event, the landslides of one part of the study area were used as the training landslide and another part as a validation landslide. We considered the Indrawati river (flowing from north to south) as the border: landslides occurred in the left bank of the river were used for LR modeling (training landslide) while landslides occurred in the right bank of the river were used for validation of the model (see Fig. 3).
In total, we mapped 402 earthquake-induced landslides in polygon format which covers 2.748 km2. All the mapped landslides were rasterized in GIS platform in 30 × 30 m resolution for analysis, resembling the causal factors resolution as determined by digital elevation model (DEM), which covers a total of 3018 pixels. Out of which training landslide (left bank's landslides) consists of 309 landslide polygons with 2063 pixels whereas, validation landslide consists of 93 landslide polygons with 955 pixels. The equal number of non-landslides pixel for both training and validation landslide were selected for modelling and validating of LSM.
Landslide susceptibility mapping:
We used LR, a multivariate statistical approach for LSM in the study area (eg., Ayalew, and Yamagishi, 2005; Garcia-Rodriguez et al., 2008; Bai et al., 2010; Mancini et al., 2010; Xu et al., 2013a; Abedini et al., 2017; Aditian et al., 2018). LR is also called as a generalized linear model (GLM) for binary response variables (Hosmer and Lemeshow 2000). LR allows describing the effects of all the independent variables on dependent variables in the form of linear regression equation (Atkinson and Masari, 1998). LR has a distinct advantage in which the independent variables can be either continuous or (and) discrete as if the link function is added to usual linear regression model (Bai et al., 2010), and not necessarily normally distributed (Garcia-Rodriguez, 2008; Bai et al., 2010). LR analysis is based on the analysis of dependency of a binary dependent variable having possible outcome of either 1 (landslide) or 0 (no landslide) to independent variables (landslide causal factors) (Mancini et al., 2010; Budimir et al., 2015). It calculates the probability of specific event occurrences, landslide occurrence in this case as in the study by Ayalew and Yamagishi (2005) which can be expressed as in Eq. (2).
Where P is the probability of landslide occurrence, e is the exponential function and Z is the logit value which is expressed by a linear equation as;
Where, b0 is the intercept, b1, b2,……bn are the coefficient of landslide causal factors x1, x2,……xn respectively, n is the number of causal factors (9 in this study). The linear model is logistic regression and represents the presence and absence of landslides on independent variables (Bai et al., 2010).
The data were arranged in such a way that the scaled values of the causal factors were represented by the binary dependent variables (0 and 1), and LR was performed. The "glm" package in R 3.5.2 was used to perform GLM in the study area. Provided results were the coefficient of causal factors by the process of maximum likelihood criterion (Mancini et al., 2010).
Susceptibility scale of the region is relative (Fell et al., 2008) hence, obtained probability value were categorized in five discrete classes as susceptibility categories (extremely low, very low, low, medium and high) using the natural breaks (Jenks) method in ArcGIS (E.g., Lee and Sambath, 2006; Pradhan, 2010; Chalkias et al., 2014; and Aditian et al., 2018).
Map validation:
The overall performance of the mapping approach is assessed by identifying the correctly classified landslides under the susceptibility categories of LSM (Xu et al., 2013b). Area under the curve (AUC) was calculated to predict the accuracy of the model quantitatively (Akgun et al., 2012; Abedini et al., 2017; Ba et al., 2018). The AUC suggest the model's quality for reliable prediction of existence and non-existence of landslides (Aditian et al., 2018). The larger the AUC values, the higher the accuracy of model performance (Corominas et al., 2014). Additionally, the values close to 1.0 indicating perfect fit is the ideal condition whereas, close to 0.5 indicates the random fit (Carvalho et al., 2014).
We compared the LSM with both training and validation landslides. The curve obtained by overlaying susceptibility map with training landslide suggests the model's capability of classifying the area (Mancini et al., 2010; Xu et al., 2012) and also called as success rate curve (Chung and Fabri, 2003; Aditian et al., 2018; Ba et al., 2018); and with validation landslide it is prediction rate curve (Lee and Pradhan 2007; Ba et al., 2018) suggesting the model's ability of landslide prediction (Chung and Fabri, 2003; Aditian et al., 2018). The AUC value generated from the success rate curves also indicates the correlation between dependent and independent variables in LR analysis (Mancini et al., 2010).
Further, the susceptibility map was validated using the seed cell area index (SCAI) as in Abedini et al. (2017), and Nicu and Asandulesei (2018) as suggested by Suzen and Doyuran (2004). The SCAI illustrates the landslide density on susceptibility class (Suzen and Doyuran, 2004) and is the ratio of the percentage of the area in susceptibility category to the percentage area of landslides occurred in that category (Abedini et al., 2017). The SCAI reflect the accuracy of the mapping approach qualitatively (Abedini et al., 2017). In ideal situation, the SCAI value is higher in low and very low susceptibility classes, whereas it is smaller in high and very high susceptibility classes (Kincal et al., 2009; Sdao et al., 2013; Chen et al., 2016; Abedini et al., 2017; Nicu and Asandulesei, 2018).