The data from sightings suggest a causal correspondence between the distribution of survey effort and the distribution of whales in the Gulf of California

doi:10.21203/rs.3.rs-4178292/v1


Research Article


This work is licensed under a CC BY 4.0 License


Data on the distribution of most species are often collected using non-standardized sampling protocols, resulting in biased data due to preferential selection of certain environmental conditions. This study aimed to assess the distribution of survey effort for whale monitoring in the Gulf of California, México, and to estimate its correlation with environmental variables at different resolutions. The database used compiles navigation details and species observations from 1982 to 2018. The number of navigation routes for whale monitoring in the Gulf of California was calculated, and the 10% and 5% best-surveyed cells were located at five different resolutions. Generalized Linear Models were employed to estimate the explanatory capacity of eight environmental variables in the distribution of the survey effort. Only approximately 3%-10% of the entire area can be considered well-surveyed. Collection effort was highest in areas with cold waters and high levels of particulate organic carbon and phytoplankton, irrespective of resolution. However, regardless of environmental conditions, the distribution of survey effort correlated with available data on the distribution of whales. These results suggest that the knowledge and prolonged interaction between data collectors and the whale population mainly influence the heterogeneous distribution of survey effort. Understanding biases and associated factors in survey effort distribution may provide insights for future monitoring programs. This knowledge can inform effective conservation strategies for whales in the Gulf of California and beyond.

Effort

distribution

environmental conditions

cetaceans

whales monitoring

Biodiversity databases suffer from biases and deficiencies, which may be taxonomic, geographic, or environmental (Daru and Rodríguez 2023; García-Roselló et al. 2023). These patterns have their origin in the unequal distribution of the labor force in taxonomy and faunistics, differential interest and resources in some organisms over others, and preferences for specific places or environments (e.g., Reddy and Dávalos 2003; Reese et al. 2005; Boakes et al. 2010; Embling et al. 2015; Tyne et al. 2016; Tang et al. 2021; Davis et al. 2022). When mapped, the consequence of these heterogeneities in collection and study effort is that localities with few or insufficient records are considered poorly prospected and in need of additional survey efforts (“data gaps”). This situation is likely very common in most groups and regions, especially in biodiversity-rich areas and the initial phases of biodiversity inventorying (Sastre and Lobo 2009). When this occurs, it is convenient to examine the environmental spectrum covered by the localities considered as well-surveyed (Austin and Heyligers 1989; Rocchini et al. 2011) to delimit the number and location of additional localities that should be surveyed and generate a representative sample useful in predictive modeling (Hortal and Lobo 2005; Hortal et al. 2007; Guralnick et al. 2007).

Cetacean data often face spatial biases in completeness, primarily attributed to the extensive range of their distributions and the methodological challenges associated with detecting specimens (Redfern et al. 2006; Higby et al. 2012; Derville et al. 2018). Consequently, unsystematic records in cetacean databases are significantly influenced by the heterogeneous distribution of survey efforts. This distribution, in turn, is affected by the difficulty in selecting locations that maximize the likelihood of animal presence and/or provide suitable observations (Corkeron et al. 2011). In this study, we analyze surveys conducted by various researchers in the Gulf of California, Mexico, that focused on collecting whale records. Our objectives are to i) examine the spatial distribution of available data on whale sightings, ii) distinguish locations where higher or lower survey efforts are concentrated, and iii) estimate the environmental variables that likely influence the spatial distribution of survey efforts. The exploration of these goals aims to determine whether the intensity of whale sightings aligns with a "data gap" scenario and whether the distribution of survey efforts is environmentally determined. We hope that the results obtained will offer valuable insights, contributing to future assessments of the role of sampling effort in making reliable predictions regarding whale distributions in the selected region.

Survey effort surrogate

We developed a map to illustrate the extensive survey efforts dedicated to monitoring whales in the Gulf of California (GC). This map is derived from navigation routes followed by tracking and research teams. Our primary data source for this endeavor is the PRIMMA-UABCS database, accessible at https://sites.google.com/uabcs.mx/primma/p%C3%A1gina-principal?authuser. This comprehensive database compiles navigation details and species observations from 1982 to 2018. In addition to the PRIMMA-UABCS database, we integrated data from various reputable sources, including the Centro Interdisciplinario de Ciencias Marinas (CICIMAR-IPN), the Centro de Investigación Científica y Estudios Superiores de Ensenada (CICESE-CONACHyT), the Prescott College Research Center, and the Programa de Observación de Cetáceos (PROCETUS). This collaborative effort enhances the depth and breadth of our dataset, providing a more comprehensive overview of the survey efforts involved in monitoring whales in the Gulf of California.

To analyze the survey effort, we partitioned the study area into cells of varying resolutions (see Fig. 1): 4 x 4 km (n = 13,865 cells), 10 x 10 km (n = 2,281 cells), 20 x 20 km (n = 727 cells), 30 x 30 km (n = 229 cells) and 50 x 50 km (n = 91 cells). We quantified the number of navigation routes passing through each cell, using it as a surrogate for the survey effort. Subsequently, we examined the frequency distribution of routes within cells at each resolution. Only the top 10% and 5% of cells with the highest number of navigation routes were selected, identifying those cells that underwent relatively thorough surveying (hereinafter “well-surveyed”).
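The cell-based effort surrogate described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the GIS workflow used in the study: the function names (`cells_crossed`, `effort_per_cell`, `well_surveyed`), the use of projected kilometre coordinates, and the tie-handling rule (take the lowest route threshold whose selected cells do not exceed the target fraction of all cells) are assumptions made for the example.

```python
from collections import Counter
import math

def cells_crossed(route, cell_km):
    """Return the set of grid cells (col, row) touched by one route.

    `route` is a list of (x_km, y_km) points in a projected coordinate
    system. The sketch assumes points are dense enough that consecutive
    points never skip over a cell.
    """
    return {(math.floor(x / cell_km), math.floor(y / cell_km)) for x, y in route}

def effort_per_cell(routes, cell_km):
    """Count how many distinct routes pass through each cell."""
    counts = Counter()
    for route in routes:
        counts.update(cells_crossed(route, cell_km))  # one count per route per cell
    return counts

def well_surveyed(counts, n_cells_total, fraction):
    """Lowest route threshold whose cells make up at most `fraction` of all cells."""
    target = n_cells_total * fraction
    for threshold in sorted(set(counts.values())):
        selected = {c for c, n in counts.items() if n >= threshold}
        if len(selected) <= target:
            return threshold, selected
    return None, set()

# Hypothetical routes as (x_km, y_km) points on a 10-cell study area.
routes = [
    [(1, 1), (5, 1), (9, 1)],
    [(1, 1), (5, 1)],
    [(1, 1), (1, 5)],
]
counts = effort_per_cell(routes, cell_km=4)
threshold, cells = well_surveyed(counts, n_cells_total=10, fraction=0.2)
print(threshold, sorted(cells))  # 2 [(0, 0), (1, 0)]
```

Because many cells share the same route count, a rule of this kind selects at most the requested fraction of cells, and often fewer.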

To illustrate the declining trend in the number of cells as survey effort increases, we employed CurveExpert 1.4 software (www.curveexpert.net) to fit the frequency distribution of routes to a negative exponential function. This analysis allowed us to depict the relationship visually. As we explored various resolutions, our goal was to investigate how biases in survey effort intensify with decreasing cell resolution. Additionally, we aimed to estimate the relative explanatory capacity of different predictors and their variation across the resolutions used.
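As a rough illustration of fitting a negative exponential to a frequency distribution of routes, the sketch below fits y = a·exp(−b·x) by ordinary least squares on ln(y). This is an assumption made for simplicity: it is not the algorithm used by CurveExpert, and the input values are synthetic. The function name `fit_negative_exponential` is hypothetical.

```python
import math

def fit_negative_exponential(x, y):
    """Fit y = a * exp(-b * x) by least squares on ln(y).

    Returns (a, b, r), where r is the Pearson correlation between x and
    ln(y); r is negative when y declines with x. All y must be positive.
    """
    ly = [math.log(v) for v in y]
    n = len(x)
    mx, my = sum(x) / n, sum(ly) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in ly)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, ly))
    slope = sxy / sxx
    a = math.exp(my - slope * mx)
    return a, -slope, sxy / math.sqrt(sxx * syy)

# Synthetic distribution: number of routes per cell vs. number of cells.
n_routes = [1, 2, 3, 4, 5]
n_cells = [100 * math.exp(-0.5 * k) for k in n_routes]  # exactly exponential
a, b, r = fit_negative_exponential(n_routes, n_cells)
```

With real counts the fit will not be exact, and cells with zero frequency would need to be dropped or handled separately before taking logarithms.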

Origin of the explanatory variables

For our study area, we compiled geographic and environmental variables from various sources (refer to Table 1). To obtain digital cartography at 4 km x 4 km resolution, we utilized NASA's OceanColor Web (https://oceancolor.gsfc.nasa.gov/). From this platform, we extracted monthly average values of sea surface temperature (SST), particulate organic carbon (POC), and chlorophyll a (CHLA) concentration spanning from June 2002 to April 2018. Additionally, we accessed the Bio-ORACLE platform (see http://www.bio-oracle.org/), known for providing digital marine cartography suitable for explaining the distribution of marine species (Tyberghein et al. 2012; Jueterbock et al. 2013; Stuart-Smith et al. 2013; Assis et al. 2017). From this source, we acquired eleven additional variables: calcite water content (CAL), diffuse attenuation (DA), dissolved molecular oxygen (DMO), nitrate (NIT), pH (PH), phosphate (PHO), phytoplankton (PHY), primary productivity (PP), salinity (SAL) and cloud cover (CC). Furthermore, we obtained bathymetric data (BAT) and slope (SLO) from the General Bathymetric Chart of the Oceans (GEBCO; https://www.gebco.net/). We acquired these data at a resolution of 0.0334 minutes, approximately 3.7 x 3.7 km. In addition to these environmental variables, we included the distance to the nearest population (DP) as a predictor, sourced from digital cartography of populated localities obtained from the Instituto Nacional de Estadística y Geografía de México (INEGI; https://www.inegi.org.mx/). Finally, to ensure consistency and comparability, all seventeen variables, including the environmental variables and DP (listed in Table 1), were resampled at the five spatial resolutions considered in our study. We performed this resampling process using ArcGIS software, specifically version 10.5.1.

Statistical analysis

Correlation among environmental variables can present challenges in estimating the regression coefficients of individual predictors and may even lead to the exclusion of significant predictors during the model-building process (Graham 2003). To address this issue, we employed the Variance Inflation Factor (VIF) to identify predictors demonstrating “true” independence (Booth et al. 1994). For this purpose, we utilized ModestR v4.0 software (García-Roselló et al. 2013; see www.ipez.es/modestr), systematically eliminating variables with a VIF exceeding 10 (Booth et al. 1994; Dormann et al. 2013). As a result of this variable selection, seven explanatory variables were excluded, as indicated in Table 1. The remaining predictors that exhibited sufficient independence and were considered suitable for further analysis include BAT, CHLA, DMO, DP, POC, PH, PHY, PP, SST, and SLO.
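The VIF screening can be illustrated with a small, dependency-free Python sketch. It uses the standard identity that the VIF of each predictor equals the corresponding diagonal element of the inverse correlation matrix. The stepwise rule in `drop_collinear` (remove the predictor with the largest VIF, then recompute) is an assumption for illustration; the exact procedure implemented in ModestR may differ, and all function names here are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        m = sum(col) / n
        d = [v - m for v in col]
        norm = math.sqrt(sum(v * v for v in d))
        unit.append([v / norm for v in d])
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in unit] for ci in unit]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF of each predictor: the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

def drop_collinear(data, limit=10.0):
    """Drop the predictor with the largest VIF until none exceeds `limit`."""
    kept = dict(data)
    while len(kept) > 1:
        values = dict(zip(kept, vif(list(kept.values()))))
        worst = max(values, key=values.get)
        if values[worst] <= limit:
            break
        del kept[worst]
    return list(kept)
```

A predictor with a VIF of exactly 10 is retained under this rule, since only values exceeding the limit are removed.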

To assess the influence of the selected explanatory variables on the variation in survey effort, we employed Generalized Linear Models (GLMs). We assumed a Poisson error distribution for the response variable (survey effort), which was related to the predictor variables via a logarithmic link function (see Crawley 1993). In estimating the effects of individual variables while accounting for the influence of other predictors, we utilized a type III sum of squares approach. This method allowed us to evaluate each variable's partial or individual effects while controlling for the impact of the remaining predictors. We standardized all predictors to mean = 0 and standard deviation = 1 to ensure comparability and eliminate the effect of measurement scale differences.
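To make the model structure concrete, the following sketch standardizes a predictor and fits a Poisson GLM with a log link, log(μ) = b0 + b1·z, by Newton-Raphson. It is a deliberately reduced illustration: it assumes a single predictor so the 2 x 2 update can be written out by hand, whereas the models in this study included several predictors and were fitted in Statistica. The data values and function names are hypothetical.

```python
import math

def standardize(values):
    """Rescale to mean 0 and standard deviation 1 (sample SD, n - 1)."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]

def fit_poisson(x, y, iterations=50):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson (Fisher scoring)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only model
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2 x 2 information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical cells: sea surface temperature (ºC) and number of routes.
sst = [18.0, 20.0, 22.0, 24.0, 26.0]
n_routes = [9, 6, 4, 2, 1]
intercept, slope = fit_poisson(standardize(sst), n_routes)
```

Because the predictor is standardized, `slope` is the change in log expected effort per one standard deviation of the predictor, which is what makes coefficients comparable across variables measured in different units.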

Table 1

Climatic and geographic variables obtained, calculated and selected (*) according to a VIF not exceeding 10. We obtained mean values for each variable in the surface layer from Bio-ORACLE. We acquired the annual mean from OCEANCOLOR and averaged it over 2002 to 2018
Layer	Acronym	Units	Source	Resolution	VIF
Dependent variable
Effort	EFF	n°.cell^−1	PRIMMA-UABCS	-	-
Explanatory variables
*Bathymetry	BAT	m	GEBCO	15 arc-second	2.77
*Chlorophyll a	CHLA	mg.m^−3	OCEANCOLOR	4 km	1.62
*Dissolved molecular oxygen	DMO	mol.m^−3	Bio-ORACLE	5 arcmin	2.70
*Distance to population	DP	m	Processed data from INEGI ^a	-	2.74
*Particulate Organic Carbon	POC	mg.m^−3	OCEANCOLOR	4 km	2.08
*pH	PH	-	Bio-ORACLE	5 arcmin	5.00
*Phytoplankton	PHY	µmol.m^−3	Bio-ORACLE	5 arcmin	4.23
*Primary productivity	PP	g.m^−3.day^−1	Bio-ORACLE	5 arcmin	3.59
*Sea Surface Temperature	SST	ºC	OCEANCOLOR	4 km	10.00
*Slope	SLO	%	Processed data from GEBCO ^b	-	1.12

We measured the goodness-of-fit of the model by the deviance statistic (Dev), and we assessed the statistical significance of the obtained parameters using the Wald statistic. We considered only variables with a significance level of ≤ 0.01 in the final model. Additionally, we fitted saturated models, encompassing all previously mentioned variables simultaneously. This approach allowed us to estimate the complete variability accounted for by the full set of explanatory variables (DevSat). We performed all statistical analyses using the Statistica 12 package (StatSoft Inc 2014).
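A percentage of deviance accounted for, such as Dev and DevSat, is commonly computed by comparing the residual deviance of the fitted model with that of an intercept-only (null) model. The sketch below assumes that definition and the usual Poisson deviance formula; it is not taken from the Statistica output, and the function names are hypothetical.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)).

    The y * ln(y / mu) term is taken as 0 when y == 0.
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi > 0:
            total += yi * math.log(yi / mi)
        total -= yi - mi
    return 2 * total

def percent_deviance_explained(y, mu_fitted):
    """100 * (null deviance - residual deviance) / null deviance."""
    null_mu = [sum(y) / len(y)] * len(y)  # intercept-only model predicts the mean
    d_null = poisson_deviance(y, null_mu)
    d_res = poisson_deviance(y, mu_fitted)
    return 100 * (d_null - d_res) / d_null
```

Here `y` would hold the observed number of routes per cell and `mu_fitted` the expected values predicted by the model for the same cells.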

We examined model residuals to detect autocorrelation and determine if a structured spatial pattern remained in the data beyond what could be explained by the explanatory variables (Diniz-Filho et al. 2003). To achieve this, we computed “local indicators of spatial association” (LISA), enabling the identification of statistically significant local Moran’s I values for each cell across all models (Anselin 1988). We calculated correlograms for the first 10 lag distance intervals, with the first lag selected to account for the eight cardinal directions (Carl and Kühn 2007). We carried out the LISA analysis using the SAM software (Spatial Analysis in Macroecology) developed by Rangel et al. (2010), accessible at www.ecoevol.ufg.br/sam. Additionally, we assessed the normality and homoscedasticity assumptions of the model residuals (Crawley 1993).
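For readers unfamiliar with local Moran's I, the sketch below computes it for every cell of a small rectangular grid of residuals, using the up-to-eight adjacent cells as neighbours with row-standardized weights. It is a minimal illustration under those assumptions and omits the significance testing that SAM provides; the function name `local_morans_i` and the example grid are hypothetical.

```python
def local_morans_i(grid):
    """Local Moran's I for each cell of a 2-D grid of residuals.

    I_i = z_i * (mean of neighbouring z_j) / m2, where z is the deviation
    from the overall mean and m2 is the mean squared deviation.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Deviations of the adjacent cells (edges and corners have fewer).
            neigh = [grid[i][j] - mean
                     for i in range(max(0, r - 1), min(rows, r + 2))
                     for j in range(max(0, c - 1), min(cols, c + 2))
                     if (i, j) != (r, c)]
            lag = sum(neigh) / len(neigh)
            out_row.append((grid[r][c] - mean) * lag / m2)
        result.append(out_row)
    return result

# Hypothetical residuals: a cluster of high values in one corner.
residuals = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
lisa = local_morans_i(residuals)
```

Positive values mark cells surrounded by similar residuals (clusters of under- or overprediction), while negative values mark cells that contrast with their neighbours.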

The negative exponential function effectively illustrated the decrease in the number of grid cells as survey effort increased (Fig. 2), with correlation values ranging from −0.86 to −0.99 (Table 2). Regardless of the resolution, only a modest fraction of the overall cells could be considered relatively well-surveyed. Adopting the most stringent criterion of choosing the top 5% of cells with the highest numbers of navigation routes revealed that approximately 3–4% of the total cells met this criterion. Alternatively, when opting for the top 10% of cells, this proportion increased to around 6–10% of the total cells (Table 2).

Table 2

Total number of cells in the Gulf of California (Tcells) at different resolutions (Res). The frequency distribution of navigation routes per cell was fitted to a negative exponential function (Fig. 2), where r denotes the correlation coefficient for this relationship. C10 and C5 are the numbers of cells selected under the top 10% and top 5% criteria, respectively, that is, the cells with the highest number of navigation routes. Nr10 and Nr5 indicate the minimum number of navigation routes required for a cell to be classified within the top 10% or 5% of those with the most navigation routes
Res	Tcells	r	C10	C5	Nr10	Nr5
4 x 4 km	13865	-0.99	816	488	≥ 3	≥ 4
10 x 10 km	2281	-0.98	189	90	≥ 4	≥ 6
20 x 20 km	727	-0.96	50	31	≥ 4	≥ 5
30 x 30 km	229	-0.93	23	10	≥ 6	≥ 9
50 x 50 km	91	-0.86	9	3	≥ 9	≥ 14

Table 3

Significant predictor variables related to survey effort in the Gulf of California (navigation routes from 1987 to 2018) at the five considered resolutions. The Wald test measured statistical significance (*p < 0.01, **p < 0.001, ***p < 0.0001). The signs in brackets indicate the direction of the relationship (negative or positive). Dev represents the amount of deviance (in %) accounted for by the significant variables, and DevSat indicates the total deviance explained by all predictors included simultaneously (saturated models). Variable acronyms: Chlorophyll a (CHLA), Dissolved Molecular Oxygen (DMO), Nearest Population (DP), Slope (SLO), Phytoplankton (PHY), Particulate Organic Carbon (POC), Primary Productivity (PP), Sea Surface Temperature (SST), Bathymetry (BAT), and pH (PH)
Resolution	CHLA	DMO	DP	SLO	PHY	POC	PP	SST	BAT	pH	Dev	DevSat
50 x 50 km	-	-	-	-	10.04* (+)	-	9.15* (-)	8.62* (-)	9.42* (-)	-	58.82	75.38
30 x 30 km	-	9.23* (+)	-	-	-	7.09* (+)	-	29.94*** (-)	13.83** (-)	-	61.45	66.57
20 x 20 km	-	47.73*** (+)	-	-	32.32*** (+)	88.14*** (+)	32.32*** (-)	20.67*** (-)	18.42*** (-)	-	47.77	48.65
10 x 10 km	20.80*** (+)	-	99.55*** (-)	-	99.55*** (+)	11.16*** (+)	99.55*** (-)	130.90*** (-)	224.20*** (-)	224.20*** (-)	50.36	50.38
4 x 4 km	7.12* (+)	47.21*** (+)	308.53*** (-)	-	263.50*** (+)	51.67*** (+)	214.13*** (-)	290.66*** (-)	-	39.02*** (-)	30.77	30.79

The maps of predicted and residual values (Fig. 3B and 3C) revealed an underestimation of survey effort in the cells with the most navigation routes. Our analysis revealed a positive and significant correlation between the number of navigation routes in 10 km x 10 km cells and the residual values of a saturated model containing all the environmental predictors used (Pearson r = 0.797; p < 0.001). This pattern also became evident when examining statistically significant autocorrelated residual values (Fig. 3D). Specifically, areas surrounding Angel de la Guarda Island, Tiburon Island, and Concepcion Bay appeared to have been underpredicted, suggesting that the survey effort in these regions may be higher than what the models suggest. Conversely, the southern part of La Paz Bay, the area between the southern coast of Sonora and the coast of Sinaloa, and the Grandes Islas in front of Bahia de Los Angeles were overpredicted, indicating that the survey effort in these areas was lower than expected (Fig. 3C and D).

In the present study, we examined the navigation routes used by various researchers during whale monitoring in the Gulf of California. Our findings highlight that only a tiny proportion of spatial units can be considered well-surveyed, and this biased pattern persists consistently across different resolutions. We observed an imbalanced distribution of survey effort, where fewer than ten navigation routes are sufficient for a cell to be among the top 5% with the highest number of routes. This result aligns with common patterns seen in the distribution of collection effort across various biological groups (e.g., Boakes et al. 2010; Lobo et al. 2018; García-Roselló et al. 2023), emphasizing the necessity for systematic monitoring to accurately estimate the abundance and distribution of both terrestrial and marine organisms (Tyne et al. 2016; Mannocci et al. 2018). Frequently, data collection occurs opportunistically and relies on prior knowledge of areas with high animal occurrence (Kot et al. 2010; Embling et al. 2015; Tyne et al. 2016). The prevalent research vessels in the Gulf of California are often small (length < 24 m) and restricted to navigating close to the shore. Sampling in open waters, far from the coast, is feasible only with a few large vessels or by air routes, albeit at a significantly higher cost. Utilizing data on whale distribution from online opportunistic platforms substantially reduces research costs, particularly for animals with a high capacity for movement that spend a significant portion of their time diving (Evans and Hammond 2004). Through our analysis of navigation routes at various resolutions, we identified evidence of historical selectivity in recording whales, with only 3–10% of cells considered well-surveyed.

Nevertheless, a crucial question arises: Does the substantial spatial heterogeneity in survey effort stem from environmental and socio-economic factors, or is it merely a result of directing more efforts towards areas with a higher likelihood of whale sightings? The first scenario suggests that improving survey coverage would facilitate more accurate estimation of the density and distribution of various whale species. Conversely, the latter possibility implies that the distribution of survey efforts is associated with the distribution of whales, reflecting the knowledge and prolonged interaction between data collectors and the whale population. Gaps in biodiversity information may arise through a poorly considered mechanism: the existence of previously published or unpublished knowledge might suggest that a particular locality lacks biological interest, thus discouraging additional collection efforts (Dennis and Thomas 2000). We propose referring to these seeming data gaps as “knowledge gaps”. Such gaps are likely to emerge once a certain threshold is surpassed in the inventory process of a region, especially in regions that undergo a relatively extensive collection effort for their biodiversity. This happens when collectors avoid sampling areas expected to be of little interest, and collection efforts increasingly concentrate on locations recognized for their quality and ease in obtaining data on the target organisms. In this study, we suspect that these “knowledge gaps” are likely the primary cause of the relationship between the distribution of survey effort and whale spatial preferences. The spatial distribution of model residuals, along with their significantly autocorrelated values, indicated that cells with the most navigation routes exhibited a much greater survey effort than expected based on environmental variables.
These over-surveyed areas have recently been identified through satellite data as regions with a high occurrence of fin whales (Balaenoptera physalus) (Jiménes López et al. 2019). These areas predominantly include the route between Loreto and La Paz in the southwestern part of the Gulf and the waters between Bahía de Los Angeles and Angel de la Guarda Island in the north (Jiménes López et al. 2019). Moreover, the northern part of the Gulf of California has been recognized as a specific area where fin whales forage on daytime surface swarms of euphausiids (Ladrón de Guevara et al. 2008). Bahía de La Paz was another over-surveyed area where many cetacean species were also observed (Salvadeo et al. 2011; Pardo et al. 2013; Antichi et al. 2022).

Our results indicate that the distribution of survey effort is determined only minimally by environmental conditions, as these variables only partially influence it. This partial influence may be attributed, in part, to surface whale records being influenced by factors not considered in this study, such as those affecting the dive behavior of individuals (Higby et al. 2012) or intermediate-depth conditions (Pardo et al. 2013; Dransfield et al. 2014). Be that as it may, as spatial resolution increases, the number of statistically significant variables appears to rise, albeit with an overall decrease in explanatory power. Regardless of resolution, our findings suggest that areas with the greatest effort are characterized by shallow and cold waters with high levels of particulate organic carbon and phytoplankton. These environmental conditions align with the abundance and distribution patterns of whales in the Gulf of California (Pardo et al. 2013; García-Morales et al. 2017) and other regions worldwide (Scales et al. 2017; Meynecke et al. 2021), results that further support the existence of “knowledge gaps”. However, it is important to note that these environmental variables often exhibit limited predictive capacity (Higby et al. 2012; Chavez-Rosales et al. 2022).

Previous studies have underscored the limitations of correlative statistical models when extrapolating suitability or probability values in areas with limited or no sampling effort, especially when using opportunistically collected data (Elith and Leathwick 2009; Chavez-Rosales et al. 2022). Nevertheless, even in the presence of opportunistic data, critical areas of occurrence can still be identified when considering sampling effort and incorporating key predictive variables (Higby et al. 2012; García-Roselló et al. 2015; Tang et al. 2021). In our study, the alignment of whale distribution with survey effort, coupled with the explanatory capacity of environmental variables, suggests that the distribution of whales can primarily drive the observed heterogeneous distribution of survey effort. Consequently, to avoid potential misinterpretations, we propose that future analyses aiming to predict any marine or terrestrial vertebrate distribution should consider the possible effect of the organism’s distribution on the distribution of the collection effort, even incorporating survey effort as an explanatory variable when necessary.

Collecting systematic data on marine or terrestrial vertebrates can be challenging, particularly for migrant animals that are difficult to detect. However, it is crucial to consider survey effort when analyzing opportunistic biological data to determine the existence of predetermined sites for data collection. On some occasions, the distribution and abundance of the organisms themselves can be the primary criterion explaining the preferential selection of certain localities. This preference may be influenced by the existence of previously published or unpublished knowledge and the prolonged interaction between data collectors and animals. We recommend exploring survey effort data before attempting to estimate suitability or favorable conditions for the species of interest. By doing so, potential errors in interpretation can be minimized, enhancing the accuracy of the findings.

Acknowledgments

We extend our gratitude to MSc Victoria González Cascón of the Computer Biogeography Laboratory at the Museo Nacional de Ciencias Naturales (CSIC), Madrid, Spain, for her invaluable assistance in analyzing georeferenced data. Special thanks to MSc Simone Antichi from the Universidad Autonoma de Baja California Sur and to PhD Angela Cuervo from the Instituto de Biologia, Universidad Nacional Autonoma de Mexico, for their language checks and insightful comments. We also appreciate the valuable input from the editor and reviewers, whose comments and suggestions significantly enhanced this work.

Author contributions

All authors contributed to the conception and design of the study. Conceptualization, methodology, formal analysis, investigation, data curation, writing, and visualization were conducted by OGC. JML participated in conceptualization, methodology, formal analysis, writing, and supervision. Resources, writing, and methodology were carried out by JUR and AGGU. LPG participated in process, writing, and visualization. The initial manuscript draft was written by OGC, AGGU and JML, and all authors provided comments on previous versions of the manuscript. All authors have read and approved the final manuscript.

Funding

Since 1986, the PRIMMA-UABCS has received funding from various institutions. This study utilizes navigation routes traced by these research endeavors. Omar's work was made possible through the support of CONACHYT Scholarship No. 462068 and Registration No. 618716. We employed software applications under licenses granted by the Departamento de Biogeografía y Cambio Global, Museo Nacional de Ciencias Naturales (CSIC), Madrid, Spain.

Data availability

The data underpinning this study were gathered from diverse sources, encompassing our proprietary data as well as data generously provided by other research groups, such as CICIMAR-IPN and CICESE, under permission. Requests for the data can be made to the corresponding author, subject to consent from the collaborating research groups.

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethics approval

This study was performed under permits issued by the Secretaría de Medio Ambiente y Recursos Naturales de México. The study was fully observational, following the “Guidelines for the treatment of marine mammals in field research” supported by the Society for Marine Mammalogy.

Anselin L (1988) Spatial Econometrics: Methods and Models. Kluwer Academic Publishers 1899–1925 pp. https://doi.org/10.1111/j.1468-0262.2004.00558.x
Antichi S, Jaramillo-Legorreta AM, Urbán J, Martínez-Aguilar S, Viloria-Gómora L (2022) Small vessel impact on the whistle parameters of two ecotypes of common bottlenose dolphin (Tursiops truncatus) in La Paz Bay, Mexico. Diversity 14(9):712. https://doi.org/10.3390/d14090712
Assis J, Tyberghein L, Bosch S, Verbruggen H, Serrão EA, De Clerck O (2017) Bio-ORACLE v2.0: Extending marine data layers for bioclimatic modelling. Glob Ecol Biogeogr 27(3):277–284. https://doi.org/10.1111/geb.12693
Austin MP, Heyligers PC (1989) Vegetation survey design for conservation: gradsect sampling of forest in North-eastern New South Wales. Biol Conserv 50:13–32
Boakes EH, McGowan PJK, Fuller RA, Chang-Qing D, Clark NE, O’Connor K, Mace GM (2010) Distorted views of biodiversity: Spatial and temporal bias in species occurrence data. PLoS Biol 8(6):e1000385. https://doi.org/10.1371/journal.pbio.1000385
Booth G, Niccolucci M, Schuster E (1994) Identifying proxy sets in multiple linear regression: an aid to better coefficient interpretation. US Dept of Agriculture, Forest Service, Intermountain Research Station.
Carl G, Kühn I (2007) Analyzing spatial autocorrelation in species distributions using Gaussian and logit models. Ecol Modell 207(2–4):159–170. https://doi.org/10.1016/j.ecolmodel.2007.04.024
Chavez-Rosales S, Josephson E, Palka D, Garrison L (2022) Detection of habitat shifts of cetacean species: A comparison between 2010 and 2017 habitat suitability conditions in the northwest Atlantic ocean. Front Mar Sci 9:877580. https://doi.org/10.3389/fmars.2022.877580
Corkeron PJ, Minton G, Collins T, Findlay K, Willson A, Baldwin R (2011) Spatial models of sparse data to inform cetacean conservation planning: An example from Oman. Endanger Species Res 15:39–52. https://doi.org/10.3354/esr00367
Crawley MJ (1993) GLIM for ecologists. Blackwell Scientific Publications, 379 pp.
Daru BH, Rodríguez J (2023) Mass production of unvouchered records fails to represent global biodiversity patterns. Nat Ecol Evol 7:816–831. https://doi.org/10.1038/s41559-023-02047-3
Davis CL, Guralnick RP, Zipkin EF (2022) Challenges and opportunities for using natural history collections to estimate insect population trends. J Anim Ecol. https://doi.org/10.1111/1365-2656.13763
Dennis RLH, Thomas CD (2000) Bias in butterfly distribution maps: The influence of hot spots and recorder’s home range. J Insect Conserv 4(2):73–77. https://doi.org/10.1023/A:1009690919835
Derville S, Torres LG, Iovan C, Garrigue C (2018) Finding the right fit: Comparative cetacean distribution models using multiple data sources and statistical approaches. Divers Distrib 24(11):1657–1673. https://doi.org/10.1111/ddi.12782
Diniz-Filho JAF, Bini LM, Hawkins BA (2003) Spatial autocorrelation and red herrings in geographical ecology. Glob Ecol Biogeogr 12(1):53–64. https://doi.org/10.1046/j.1466-822X.2003.00322.x
Dormann CF, Elith J, Bacher S et al (2013) Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography 36(1):27–46. https://doi.org/10.1111/j.1600-0587.2012.07348.x
Dransfield A, Hines E, McGowan J et al (2014) Where the whales are: Using habitat modeling to support changes in shipping regulations within national marine sanctuaries in central California. Endanger Species Res 26(1):39–57. https://doi.org/10.3354/esr00627
Elith J, Leathwick JR (2009) Species distribution models: Ecological explanation and prediction across space and time. Annu Rev Ecol Evol Syst 40:667–697. https://doi.org/10.1146/annurev.ecolsys.110308.120159
Embling CB, Walters AEM, Dolman SJ (2015) How much effort is enough? The power of citizen science to monitor trends in coastal cetacean species. Glob Ecol Conserv 3:867–877. https://doi.org/10.1016/j.gecco.2015.04.003
Evans PGH, Hammond PS (2004) Monitoring cetaceans in European waters. Mamm Rev 34(1‐2):131–156
García-Morales R, Pérez-Lezama EL, Shirasago-Germán B (2017) Influence of environmental variability on distribution and relative abundance of baleen whales (suborder Mysticeti) in the Gulf of California. Mar Ecol 38(6):e12479. https://doi.org/10.1111/MAEC.12479
García-Roselló E, Guisande C, González-Dacosta J et al (2013) ModestR: A software tool for managing and analyzing species distribution map databases. Ecography 36(11):1202–1207. https://doi.org/10.1111/j.1600-0587.2013.00374.x
García-Roselló E, Guisande C, Manjarrés-Hernández A et al (2015) Can we derive macroecological patterns from primary Global Biodiversity Information Facility data? Glob Ecol Biogeogr 24:335–347. https://doi.org/10.1111/geb.12260
García-Roselló E, González-Dacosta J, Lobo JM (2023) The biased distribution of existing information on biodiversity hinders its use in conservation, and we need an integrative approach to act urgently. Biol Conserv 283: 110118. https://doi.org/10.1016/j.biocon.2023.110118
Graham MH (2003) Confronting multicollinearity in ecological multiple regression. Ecology 84(11):2809-2815
Guralnick RP, Hill AW, Lane M (2007) Towards a collaborative, global infrastructure for biodiversity assessment. Ecol Lett 10 663–672
Higby LK, Stafford R, Bertulli CG (2012) An evaluation of ad hoc presence-only data in explaining patterns of distribution: Cetacean sightings from whale-watching vessels. Int J Zool. https://doi.org/10.1155/2012/428752
Hortal J, Lobo JM (2005) An ED-based protocol for optimal sampling of biodiversity. Biodivers Conserv 14:2913–2947
Hortal J, Lobo JM, Jiménez-Valverde A (2007) Limitations of biodiversity databases: case study on seed-plant diversity in Tenerife (Canary Islands). Conserv Biol 21:853–863
Jiménez López MEJ, Palacios MD, Jaramillo Legorreta A et al (2019) Fin whale movements in the Gulf of California, Mexico, from satellite telemetry. PLoS ONE 14(1). https://doi.org/10.1371/journal.pone.0209324
Jueterbock A, Tyberghein L, Verbruggen H et al (2013) Climate change impact on seaweed meadow distribution in the North Atlantic rocky intertidal. Ecol Evol 3(5):1356–1373. https://doi.org/10.1002/ece3.541
Kot CY, Fujioka E, Hazen LJ et al (2010) Spatio-temporal gap analysis of OBIS-SEAMAP project data: Assessment and way forward. PLoS ONE 5(9) https://doi.org/10.1371/journal.pone.0012990
Ladrón de Guevara PP, Lavaniegos BE, Heckel G (2008) Fin whales (Balaenoptera physalus) foraging on daytime surface swarms of the euphausiid Nyctiphanes simplex in Ballenas Channel, Gulf of California, Mexico. J Mammal 89(3):559–566. https://doi.org/10.1644/07-MAMM-A-067R2.1
Lobo JM, Hortal J, Yela JL et al (2018) KnowBR: An application to map the geographical variation of survey effort and identify well-surveyed areas from biodiversity databases. Ecol Indic 91(3):241–248. https://doi.org/10.1016/j.ecolind.2018.03.077
Mannocci L, Roberts JJ, Halpin PN et al (2018) Assessing cetacean surveys throughout the Mediterranean Sea: A gap analysis in environmental space. Sci Rep 8(1):1–14. https://doi.org/10.1038/s41598-018-19842-9
Meynecke JO, de Bie J, Barraqueta JLM et al (2021) The role of environmental drivers in Humpback Whale distribution, movement and behavior: A review. Front Mar Sci 8:720774. https://doi.org/10.3389/fmars.2021.720774
Pardo MA, Silverberg N, Gendron DE et al (2013) Role of environmental seasonality in the turnover of a cetacean community in the southwestern Gulf of California. Mar Ecol Prog Ser 487:245–260. https://doi.org/10.3354/meps10217
Rangel TF, Diniz-Filho JAF, Bini LM (2010) SAM: A comprehensive application for Spatial Analysis in Macroecology. Ecography 33(1):46–50. https://doi.org/10.1111/j.1600-0587.2009.06299.x
Reddy S, Dávalos LM (2003) Geographical sampling bias and its implications for conservation priorities in Africa. J Biogeogr 30(11):1719–1727. https://doi.org/10.1046/j.1365-2699.2003.00946.x
Redfern J, Ferguson M, Becker E et al (2006) Techniques for cetacean–habitat modeling. Mar Ecol Prog Ser 310:271–295. https://doi.org/10.3354/meps310271
Reese GC, Wilson KR, Hoeting JA, Flather C (2005) Factors affecting species distribution predictions: a simulation modeling experiment. Ecol Appl 15(2):554–564. https://doi.org/10.1890/03-5374
Rocchini D, Hortal J, Lengyel S et al (2011) Accounting for uncertainty when mapping species distributions: The need for maps of ignorance. Prog Phys Geogr 35(2):211–226. https://doi.org/10.1177/0309133311399491
Salvadeo CJ, Flores-Ramírez S, Gómez-Gallardo AU et al (2011) El rorcual de bryde (Balaenoptera edeni) en el suroeste del Golfo de California: Su relación con la variabilidad de ENOS y disponibilidad de presas. Cienc Mar 37(2):215–225. https://doi.org/10.7773/cm.v37i2.1840
Sastre P, Lobo JM (2009) Taxonomist survey biases and the unveiling of biodiversity patterns. Biol Conserv 142(2):462–467. https://doi.org/10.1016/j.biocon.2008.11.002
Scales KL, Schorr GS, Hazen EL et al (2017) Should I stay or should I go? Modelling year‐round habitat suitability for whales in the California Current. Divers Distrib 23:1204–1215. https://doi.org/10.1111/ddi.12611
StatSoft Inc. (2014) Statistica (data analysis software system).
Stuart-Smith RD, Bates AE, Lefcheck JS et al (2013) Integrating abundance and functional traits reveals new global hotspots of fish diversity. Nature 501(7468):539–542. https://doi.org/10.1038/nature12529
Tang B, Clark JS, Gelfand AE (2021) Modeling spatially biased citizen science effort through the eBird database. Environ Ecol Stat 28(3):609–630. https://doi.org/10.1007/s10651-021-00508-1
Tyberghein L, Verbruggen H, Pauly K (2012) Bio-ORACLE: A global environmental dataset for marine species distribution modelling. Glob Ecol Biogeogr 21(2):272–281. https://doi.org/10.1111/j.1466-8238.2011.00656.x
Tyne JA, Loneragan NR, Johnston DW et al (2016) Evaluating monitoring methods for cetaceans. Biol Conserv 201:252–260. https://doi.org/10.1016/j.biocon.2016.07.024

Reviewers agreed at journal
11 Apr, 2024
Reviewers invited by journal
02 Apr, 2024
Editor assigned by journal
28 Mar, 2024
First submitted to journal
27 Mar, 2024
