Study area and animals
Our study focused on the four main islands of Japan (Hokkaido, Honshu, Shikoku and Kyushu), excluding smaller islands. The primary vegetation zones in Honshu, Shikoku, and Kyushu are cool temperate and warm temperate vegetation, while Hokkaido, the northernmost island, mostly has boreal and cold temperate vegetation40. Approximately 80% of Japan's land area is rugged and mountainous terrain, covered with forests that account for about 67%40 (Fig. 4 and Supplementary Fig. 1). The human population in Japan peaked at 128 million in 2008 and has since begun to decline29. This downward trend is expected to continue due to the aging population (over 25%) and lower total fertility rates (1.26–1.54 in the past two decades)29.
Sika deer inhabits all four main islands in Japan. The population of sika deer in Hokkaido was distinguished from those of the other three islands from the statistical analysis due to differences in vegetation zones. Wild boar, Japanese serow, and Japanese macaque inhabit Honshu, Shikoku, and Kyushu. Asiatic black bear inhabits Honshu and Shikoku, but their populations on Shikoku are endangered; hence, Shikoku populations were excluded from the statistical analysis. Brown bear inhabits only Hokkaido.
Distribution range
We used distribution data for six species compiled by the Ministry of Environment12–15,41. These data were recorded as either presence or absence of each species in each grid cell measuring 3′45′′× 2′30′′ (approximately 5×5 km at Japan’s latitude; hereafter referred to as 5 × 5 km grid cell). We defined a grid cell where species presence was recorded as an 'occupied cell' and a cell where species absence was recorded as an 'unoccupied cell'. In 1978 and 2003, the distribution of each of the six species was surveyed, respectively, through field surveys, interviews, and questionnaires12,13. In 2014, the distribution of sika deer and wild boar was surveyed in each area where the presence of species had not been identified until 2003, respectively41. In 2017, the distribution of Asiatic black bears, brown bears, Japanese serows, and Japanese macaques across the country was surveyed again using the same method as in 1978 and 200314,15.
Predictor variables
To test our hypotheses, we examined the effects of six predictor variables that are expected to affect the range expansion of large terrestrial mammals in different periods: 1978–2003 for all six species; 2003–2014 for sika deer and wild boar; 2003–2017 for Asiatic black bear, brown bear, Japanese serow and Japanese macaque, respectively. Because these species are terrestrial, areas of oceans or lakes within each cell were excluded from the calculation of predictor variables.
To evaluate the impact of habitat quality, we used five environmental variables (forest cover, cultivation cover, terrain ruggedness index, artificial light at night, maximum snow depth, and area of abandoned agricultural lands). For forest and agricultural land cover, we used the land use data in 2006 and 2014 for expansions in 1978–2003 and after 2003 (i.e., 2003–2014 for sika deer and wild boar; 2003–2017 for Asiatic black bear, brown bear, Japanese serow and Japanese macaque), respectively, obtained from the Digital National Land Information provided by the Ministry of Land, Infrastructure, Transport, and Tourism (https://nlftp.mlit.go.jp/ksj/index.html; accessed July 2022). The mean proportion of cultivation cover as well as forest cover were calculated for each 5 × 5 km grid cell, respectively. We used terrain ruggedness index as topographical variable. We developed a terrain ruggedness index using a 30-m resolution digital elevation model derived from the USGS (https://earthexplorer.usgs.gov; accessed July 2022), and calculated the mean terrain ruggedness index for each 5 × 5 km grid cell. We also used the level of artificial light at night in the final year of each expansion period as a measure of human activities. For example, to analyze the expansion of sika deer in 1978–2003, we used data on artificial light at night in 2003. For artificial light at night in 2003, we extracted data from the global nighttime light composite images (Nighttime Lights Time Series Version 4) of satellite number F15 from the Defense Meteorological Satellite Program’s Operational Linescan System (DMSP-OLS) (https://ngdc.noaa.gov/eog/dmsp/downloadV4composites.html; accessed July 2022). We used the stable_lights data, which contains the lights from cities, towns, and other sites with persistent lighting, including gas flares, while discarding ephemeral events such as fires. For artificial light at night in 2014 and 2017, we extracted data from the NASA/NOAA Visible Infrared Imaging Radiometer Suite (VIIRS) (https://eogdata.mines.edu/products/vnl; accessed July 2022). We used the average-masked data of V.2, which had background, biomass burning, and aurora values zeroed out. We calculated the mean artificial light at night of each year for each 5 × 5 km grid cell. We used the maximum snow depth during the previous year of the final year of each expansion period to minimize the inclusion of snow depth data from the winter following the compilation of species distribution data. For example, to analyze the expansion of wild boar from 2003 to 2014, we used data on the maximum snow depth in 2013. We obtained maximum snow depth data, which recorded in 1 km resolution, from Agro-Meteorological Grid Square Data System operated by the National Agriculture and Food Research Organization42. We, subsequently, calculated the mean maximum snow depth for each 5 × 5 km grid cell. All the data were processed using QGIS (version 3.30.2, https://www.qgis.org). For the area of abandoned agricultural lands, we used area of abandoned cultivation, including farmlands, paddy fields, and orchards, of 2005 and 2015 to use for the expansions in 1978–2003 and after 2003, respectively. We acquired the area of abandoned cultivation data collected in the 2005 and 2015 from the Census of Agriculture and Forestry, a questionnaire-based nationwide census conducted every five years by Ministry of Agriculture, Forestry and Fisheries, under a permission, and then calculated the total area of abandoned cultivation for each 5 × 5 km grid cell. To assess the effect of propagule pressure, we used the number of adjacent source cells (See statistical analysis section; Supplementary Fig. 2).
Statistical analysis
A statistical comparison of samples of used and available space, referred to as the "used-available" or "case-control" framework, is an effective approach in animal ecology to quantify the disproportionate space use by animals43. To assess the factors driving range expansion, we compared the newly occupied cells to unoccupied cells, representing used and available areas, respectively, for the range expansion of each large terrestrial mammal species in two different periods i.e., 1978–2003 and after 2003 (2003–2014 for sika deer and wild boar; 2003–2017 for Asiatic black bear, brown bear, Japanese serow, and Japanese macaque). Because of the broad geographical range of our dataset across the country, to minimize potential regional variation in the effect of variables, we first estimated the sources of range expansion and defined available cells as those adjacent to the source cells based on the following assumptions (Supplementary Fig. 2): The expansion of large terrestrial mammals occur sequentially, spreading from cells occupied in the previous period (i.e., originally occupied cells) to cells that are both adjacent and connected by land, based on the reports that the range expansion of large terrestrial mammals typically extends from the areas originally occupied to the surrounding areas at a broad scale44,45. The originally occupied cell adjacent to newly occupied cells were classified as the source of expansion (hereafter referred to as source cell). Newly occupied cells adjacent to these source cells were then classified as first-stage expansion cells. Next, newly occupied cells adjacent to the first-stage expansion cells but not yet categorized were classified as second-stage expansion cells. The first-stage expansion cells adjacent to the second-stage expansion cells were subsequently classified as the source cells for the second stage of expansion. This process was repeated, with each new stage of expansion identifying source cells from the previous stage, until no further cells could be classified as newly occupied. Among the cells adjacent to each stage’s source cell, those that were newly occupied were defined as “used”, while those that remained unoccupied were defined as “available”.
Due to the uncertainty in estimating source cells, newly occupied cells that were not adjacent to any occupied cells from the previous period were excluded from further analyses. These cells accounted for an average of 4.6% (0.2%-10.0%) of the newly occupied cells in each period. We labelled the used cells as 1 and the available cells as 0, and subsequently contrasted them using conditional logistic regression (via the function clogit in the R package survival46) by considering the ID of the source cell as s strata. We excluded grid cells where the area of the ocean and/or lake within the cell exceeded half the total area, to avoid potential overestimation or underestimation for each variable. To improve the convergence of the fitting models and make comparisons of effect sizes easier, all predictor variables were standardised47. Next, we tested for multicollinearity among the variables using variance inflation factor (VIF) via the function vif in the R package car48. We found that the VIF values above 4, indicating a multicollinearity issue49, in all models for each species and period, and that the mean proportion of forest cover had the highest VIF value among all the variables. Therefore, we excluded the mean proportion of forest cover from all models. After excluding the mean proportion of forest cover, all VIF values for all models were less than 4. All models generally had acceptable concordance, ranging from 0.683 to 0.813 (Supplementary Table 2). All analyses were performed with R (version 4.3.3, https://www.r-project.org).
To identify the landscape changes associated with the range expansion of large terrestrial mammals across the country, we calculated the average area of forests, agricultural lands, and human developments, as well as the average elevation within the grid cells occupied by each species in 1978, and those newly occupied by 2003 and beyond.