Our results confirm that N-mixture models can provide ecologically reasonable abundance estimates for some reptile species, but that choice of model variant can have important implications for estimates of abundance. We found that overall congruence between model variants was rare, although there were exceptions (e.g., TTD-RN and Poisson-ZIP variant pairs were generally congruent in their abundance estimates). Our results show that N-mixture model performance across variants increased significantly with species detectability, although models performed very poorly for the two most conspicuous species (i.e., highest p). Importantly, we demonstrate that individual aspects of model performance varied between model variants, such that blanket application of a single model variant to a suite of ecologically diverse species has the potential to introduce artefactual variation in abundance estimates.
The five N-mixture model variants we tested performed best for widespread, fairly easy-to-find species, being well-fitted and providing congruent abundance estimates. For species with lower occupancy and detection probability, we tended to observe considerable variation in the performance of each model variant, suggesting that specific variants may provide substantially better abundance estimates for these species. However, the relative performance of individual model variants was inconsistent between ecologically similar species. Moreover, some models that ranked highly according to our performance metric were either poorly fitted or provided erroneous abundance estimates (e.g., excessive presence estimates of habitat specialists). We attribute this to rare instances in which presence estimates were higher than expected but models were well-fitted and generated reasonable abundance estimates, and thus were not penalised. As such, a priori matching of species and N-mixture model variants based on individual performance criteria is difficult, highlighting the risks of using a single variant to estimate abundance. Nonetheless, we observed a number of interactions between model variant, aspects of model performance, and species detectability that are important for users to consider.
Within variants, models selected by AICc ranking were generally well-fitted and included ecologically relevant site covariates. However, parsimony did not necessarily imply that the models provided ecologically reasonable abundance estimates. Due to underlying differences in model variant formulation, we cannot comment on differences in parsimony between TTD, RN, and count variants. Within the count variants, which are directly comparable by AICc, the NB variant often provided unreasonably high abundance estimates despite frequently ranking as the most parsimonious count variant. Even so, the NB variant performed well for some species with moderate to high occupancy and detection probabilities (Fig. X), providing reasonable abundance estimates for G. flavigularis, L. capensis, and T. damarana. However, the tendency for the NB variant to severely overestimate abundance, especially for rare and inconspicuous species and with few survey replicates, has been widely observed (Joseph et al 2009; Couturier et al 2013; Dennis et al 2015; Kéry 2018). Thus, we reaffirm that additional measures, such as interrogation of the ecological feasibility of estimates, should be incorporated in N-mixture model selection and interpretation (cf. Joseph et al 2009; Koetke et al 2024).
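The distinction between parsimony and ecological feasibility can be illustrated with a minimal sketch. It uses the standard small-sample formula AICc = -2 log L + 2k + 2k(k + 1)/(n - k - 1) and adds a simple feasibility flag. The function names, the log-likelihoods, the parameter counts, the maximum estimates, the upper limit, and the use of the number of sites as n are all hypothetical assumptions chosen for illustration; they are not values from this study.

```python
def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k parameters and sample size n."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_count_variants(fits, n_sites, upper_limit):
    """Rank count variants by AICc and flag ecologically infeasible estimates.

    fits maps a variant name to a dict with keys 'log_lik', 'k' and
    'max_estimate' (largest site-level abundance estimate).
    """
    rows = []
    for name, fit in fits.items():
        rows.append({
            "variant": name,
            "aicc": aicc(fit["log_lik"], fit["k"], n_sites),
            # Feasible only if the largest estimate stays within the expected range.
            "feasible": fit["max_estimate"] <= upper_limit,
        })
    best = min(row["aicc"] for row in rows)
    for row in rows:
        row["delta"] = row["aicc"] - best
    return sorted(rows, key=lambda row: row["aicc"])


# Hypothetical fits for one species (illustrative numbers only).
fits = {
    "Poisson": {"log_lik": -210.0, "k": 3, "max_estimate": 40},
    "ZIP": {"log_lik": -208.5, "k": 4, "max_estimate": 42},
    "NB": {"log_lik": -203.0, "k": 4, "max_estimate": 900},
}
ranked = rank_count_variants(fits, n_sites=50, upper_limit=80)
for row in ranked:
    print(row["variant"], round(row["delta"], 2), row["feasible"])
```

In this invented example the NB variant ranks first by AICc yet is the only variant flagged as infeasible, which is the pattern described above: the AICc ranking and the feasibility check answer different questions and are best reported side by side.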
The nine lizard species we modelled occur widely in Zimbabwe and their ecology is comparatively well known (Branch 1998; Howard and Hailey 1999; Jacobsen and Broadley 2000; Pietersen et al 2021; Stander 2023). However, literature on their respective population densities is sparse (Meiri 2024). While we were able to deduce an informed range of ecologically feasible abundance estimates, the upper limit of this range is disputable. As such, congruence between multiple N-mixture model variants suggested that our ranges were valid, but well-fitted models with abundance estimates above our expected upper limit were still informative. For example, we observed the best performance scores in the models for P. maculicollis, an inconspicuous but widespread species. The TTD, Poisson, ZIP, and NB model variants predicted an upper abundance limit of approximately 71–77 individuals per hectare, with most sites predicted to have 13–14 individuals. This constitutes a higher-than-expected maximum, but one which may be ecologically reasonable, given the habitat heterogeneity within the study area. Had we only selected the RN variant (which was well-fitted but scored lowest according to our performance metric) to estimate abundance of this species, we would have concluded that most sites host around three individuals, which is unlikely given the diminutive body size of this species. This demonstrates how a comparison of multiple N-mixture model variants can improve our confidence in abundance estimates when prior knowledge of species density is lacking, as is typical for reptiles.
Comparing the outputs of multiple variants also aids in determining whether or not the study species are amenable to N-mixture modelling in the first place. Estimating the abundance of conspicuous habitat specialists (P. intermedius rhodesianus and T. margaritifera) proved to be problematic for all of the N-mixture model variants we tested. These species were encountered in every survey at sites where they were present and thus showed essentially no heterogeneity in detection probability. This situation renders the RN variant useless, as there is no heterogeneity from which to generate an abundance estimate, apart from the dichotomy of occupied versus unoccupied sites. In our study, the RN variant provided few reasonable estimates for these species, and even estimated abundances of zero at sites where the species are known to be present. Detection times were consistently short enough to cause a similar effect in the TTD variant as well. While the three count variants produced some reasonable abundance estimates for these species, they demonstrated lack of fit (ĉ = 0 or ĉ >> 1) and erroneously predicted that the species were present at most sites. As such, it appears that there are critical natural parameter values at which N-mixture models tend to become unreliable. A traditional capture-mark-recapture (CMR) design may be a more appropriate method of obtaining abundance estimates of reliably detectable species. For such species, we recommend that efforts be directed at increasing capture efficiency, rather than computing the influence of variation in detection probability (which may be marginal).
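The difficulty with the RN variant can be made concrete with a minimal sketch of the Royle-Nichols link, in which the probability of detecting a species at a site is p = 1 - (1 - r)^N, where r is the per-individual detection probability and N is local abundance. The function names, the value r = 0.5, and the four-survey detection history below are hypothetical and chosen only for illustration.

```python
def rn_site_detection(r, n):
    """Royle-Nichols link: probability of detecting the species at a site
    holding n individuals, each detected independently with probability r."""
    return 1.0 - (1.0 - r) ** n


def history_likelihood(history, r, n):
    """Likelihood of a 0/1 detection history at one site, given r and n."""
    p = rn_site_detection(r, n)
    lik = 1.0
    for detected in history:
        lik *= p if detected else (1.0 - p)
    return lik


# A site where the species was detected in every one of four surveys.
always_detected = [1, 1, 1, 1]

# Profile the likelihood over candidate abundances for a fixed, assumed r.
profile = [history_likelihood(always_detected, r=0.5, n=n) for n in range(0, 11)]
for n, lik in enumerate(profile):
    print(n, round(lik, 4))
```

For a fixed r, the likelihood of an all-detections history only increases with N and flattens as p approaches one, so the surveys alone cannot place an upper bound on abundance at such a site. Any estimate is then driven largely by the assumed abundance distribution and the covariates rather than by the detection data.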
We share the experience of other herpetologists (e.g., Steen 2010; Steen et al 2012) in that we were unable to fit reliable models for reptiles with very low detection probabilities (p < ~0.25). For the seven species which were not included in the model comparison exercise, we attribute poor models either to low detection rates (i.e., insufficient to fit a non-null model) or limitations in accurately measuring relevant covariates. The latter issue arose for enigmatic species (e.g., Mochlus sundevallii) but may also explain the poor performance of models for conspicuous habitat specialists, where sites appeared structurally heterogeneous although we observed little heterogeneity in detectability. Additionally, while we observed four snake species in our surveys, only one (Psammophis subtaeniatus) was encountered more than once. As this species is a highly mobile habitat generalist, we were unable to construct acceptable models for it. We reaffirm that obtaining sufficient detections and capturing relevant environmental data are central challenges in snake ecology.
The selection and accurate measurement of ecologically relevant covariates is both critical and a common challenge in N-mixture modelling (Angeli et al 2018; Ficetola et al 2018b). The fact that observational covariates were rarely included in the best performing models suggests either that remote sensing of the climatic variables was inadequate, or that climatic variation within the survey period was insufficient to explain heterogeneity in detection probability. It is also possible that moving cover and inspecting rock cracks during our surveys reduced the significance of these observational covariates, as some reptiles were detected when they were inactive (notably, nocturnal species such as A. transvaalica and H. tasmani). As the ecology of many reptile species is poorly known (Tingley et al 2016; Meiri 2024), dependencies on particular habitat variables may not be obvious, and it is unclear what measurement scale is required to generate reliable abundance estimates. Within a reptile community, species and individuals are likely to have very different body sizes and dispersal capabilities and are thus affected by environmental conditions at different scales (Doherty et al 2019). This creates a ‘Catch-22’, as N-mixture models have been cited as a promising tool for studying secretive species hampered by low detection probabilities, yet we lack the prior knowledge required to select valid covariates. Parameter estimates from N-mixture models are further limited by how accurately we can capture heterogeneity in detection across sites and between surveys (Barker et al 2017; Goldstein and de Valpine 2022), but we may have no notion of whether these estimates are realistic or not.
How, then, can we best apply N-mixture models in reptile community ecology? While goodness-of-fit is an empirically crucial measure of model performance (Knape et al 2018), our results show that reliance on other, individual aspects of N-mixture model performance (parsimony, precision and ecological feasibility of estimates, and perceived relevance of covariates) may have serious implications in determining the acceptance or rejection of abundance estimates. Comparing multiple model variants eases our reliance on these factors in interpreting model outputs. It is unlikely that a single methodological framework will ever be appropriate for all members of a reptile community (Foster et al 2012), and the same applies to N-mixture models (Ficetola et al 2018b). When data on behavioural ecology and population density are lacking, model comparison may also allow us to infer reasonable site-level abundance estimates from a range of possible values. In such a situation, incongruence between model estimates is informative rather than problematic.
While different N-mixture model variants may indeed provide substantially incongruent abundance estimates, this does not invalidate their value in ecology. Rather, our results indicate that individual model variants are likely suited to different species and different datasets, but this suitability is not inherently obvious. As the field of N-mixture modelling is growing in popularity and complexity, we recommend that researchers continue to compare multiple model variants to account for differences in model performance associated with species detectability. This, in turn, may inform future studies on ecologically similar species by providing context that is relevant to model selection and interpretation. In our experience, this is achievable at little extra cost to standard site occupancy surveys, simply requiring additional time spent on data processing and analysis. Comparing multiple N-mixture model variants may also indicate whether abundance estimates are sufficiently consistent to be used in population monitoring. Even if absolute abundance is not estimable by N-mixture models, relative abundance may still be a viable tool for interpreting ecological processes governing the distribution of species (Barker et al 2017; Goldstein and de Valpine 2022). If a single model variant is chosen for this purpose, then its congruence with other variants can be quantified and reported as demonstrated in this study. For enigmatic species, this comparative approach brings us a step closer to identifying and understanding ecologically relevant covariates and converging on reasonable abundance estimates.
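As a minimal sketch of how congruence in relative abundance between two variants could be summarised, the Spearman rank correlation of site-level estimates is one option. This is an illustrative choice and not necessarily the congruence measure applied in this study; the function names and the site-level estimates below are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical site-level abundance estimates from two variants.
poisson_estimates = [13.0, 14.0, 40.0, 75.0, 9.0]
rn_estimates = [3.0, 4.0, 8.0, 20.0, 2.0]
print(spearman(poisson_estimates, rn_estimates))
```

A rank-based measure is used here because two variants can disagree strongly on absolute abundance while still ordering sites in the same way, which is the situation in which relative abundance remains interpretable.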
Finally, comparing the outputs of multiple model variants aids in determining whether an N-mixture approach is appropriate for studying the species at hand. Given that N-mixture models are cost-effective in terms of data collection, an alternative methodology, such as CMR, could be applied in the same system. We acknowledge the limitations of our data and predict that improved knowledge of species detectability and ecology, particularly with regard to appropriate covariate selection and measurement, may indeed allow users to match species with appropriate N-mixture model variants or alternative analyses a priori.