- Nov 2016
-
journals.plos.org
-
My thoughts on Climatic Associations of British Species Distributions Show Good Transferability in Time but Low Predictive Accuracy for Range Change by Rapacciuolo et al. (2012).
-
Whilst the consensus method we used provided the best predictions under AUC assessment – seemingly confirming its potential for reducing model-based uncertainty in SDM predictions [58], [59] – its accuracy to predict changes in occupancy was lower than most single models. As a result, we advocate great care when selecting the ensemble of models from which to derive consensus predictions; as previously discussed by Araújo et al. [21], models should be chosen based on aspects of their individual performance pertinent to the research question being addressed, and not on the assumption that more models are better.
It's interesting that the ensembles perform best overall but worse than most single models at predicting changes in occupancy. It seems possible that averaging across multiple methods effectively produces a more static prediction, i.e., something closer to a naive no-change baseline.
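A toy sketch of why this might happen (my own illustrative setup, not the paper's analysis): if each model predicts a small shared change plus its own idiosyncratic error, an unweighted mean (analogous in spirit to the paper's Mn(PA) consensus) cancels the idiosyncratic parts and shrinks the predicted changes toward zero, i.e., toward a static baseline.

```python
import random
import statistics

random.seed(0)

# Hypothetical predicted change in suitability for 100 grid cells from five
# SDMs: each model sees a small shared signal plus its own larger error.
true_change = [random.gauss(0.0, 0.05) for _ in range(100)]
models = [[t + random.gauss(0.0, 0.2) for t in true_change] for _ in range(5)]

# Unweighted consensus prediction: the per-cell mean across models.
ensemble = [statistics.mean(col) for col in zip(*models)]

# Averaging cancels the idiosyncratic errors, so the ensemble predicts much
# smaller changes than any single model -- closer to "no change".
print(statistics.stdev(models[0]), statistics.stdev(ensemble))
```

With these (assumed) noise levels the spread of the ensemble's predicted changes is roughly half that of any single model, even though both contain the same shared signal.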
-
Finally, by assuming the non-detection of a species to indicate absence from a given grid cell, we introduced an extra level of error into our models. This error depends on the probability of false absence given imperfect detection (i.e., the probability that a species was present but remained undetected in a given grid cell [73]): the higher this probability, the higher the risk of incorrectly quantifying species-climate relationships [73].
This will be an ongoing challenge for species distribution modeling, because most data suitable for these purposes are not collected in a way that allows the straightforward application of standard detection-probability/occupancy models. One potential solution is to develop models of detection probability based on species and habitat type, built on smaller (or different) datasets that do include the repeat-visit data needed to estimate detectability.
-
an average 87% of grid squares maintaining the same occupancy status; similarly, all climatic variables were also highly correlated between time periods (ρ>0.85, p<0.001 for all variables). As a result, models providing a good fit to early distribution records can be expected to return a reasonable fit to more recent records (and vice versa), regardless of whether relevant predictors of range shift have actually been captured. Previous studies have warned against taking strong model performance on calibration data to indicate high predictive accuracy to a different time period [20], [24]–[26]; our results indicate that strong model performance in a different time period, as measured by widespread metrics, may not indicate high predictive accuracy either.
This highlights the importance of comparing forecasts to baseline predictions, to separate genuine forecast skill from the basic temporal stability of the pattern.
-
Most variation in the prediction accuracy of SDMs – as measured by AUC, sensitivity, CCRstable, CCRchanged – was among species within a higher taxon, whilst the choice of modelling framework was as important a factor in explaining variation in specificity (Table 4 and Table S4). The effect of major taxonomic group on the accuracy of forecasts was relatively small.
This suggests that it will be difficult to know in advance whether a forecast for a particular species will be reliable, unless models can be developed that predict which species will have what forecast qualities.
-
The correct classification rate of grid squares that remained occupied or remained unoccupied (CCRstable) was fairly high (mean±s.d. = 0.75±0.15), and did not covary with species’ observed proportional change in range size (Figure 3B). In contrast, the CCR of grid squares whose occupancy status changed between time periods (CCRchanged) was very low overall (0.51±0.14; guessing randomly would be expected to produce a mean of 0.5), with range expansions being slightly better predicted than range contractions (0.55±0.15 and 0.48±0.12, respectively; Figure 3C).
This is a really important result and my favorite figure in this ms. For cells that changed occupancy status (e.g., a cell that was occupied at t_1 and unoccupied at t_2), most models had about a 50% chance of getting the change right (i.e., a coin flip).
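As a rough sketch of how these two metrics could be computed (my own variable names, not the paper's code; per-cell observed occupancy at t1 and t2 plus a predicted t2 occupancy, all coded 0/1):

```python
def ccr(obs_t1, obs_t2, pred_t2):
    """Correct classification rate, split into cells whose occupancy
    status stayed the same (CCRstable) vs. changed (CCRchanged)."""
    stable_correct = stable_total = 0
    changed_correct = changed_total = 0
    for o1, o2, p2 in zip(obs_t1, obs_t2, pred_t2):
        if o1 == o2:          # occupancy status unchanged between periods
            stable_total += 1
            stable_correct += (p2 == o2)
        else:                 # cell gained or lost the species
            changed_total += 1
            changed_correct += (p2 == o2)
    return (stable_correct / stable_total if stable_total else None,
            changed_correct / changed_total if changed_total else None)

# Tiny hypothetical example: 6 grid cells, of which cells 2 and 4 (0-indexed
# cells 1 and 3) changed occupancy status between the two periods.
obs_t1  = [1, 1, 0, 0, 1, 0]
obs_t2  = [1, 0, 0, 1, 1, 0]
pred_t2 = [1, 1, 0, 1, 1, 1]
print(ccr(obs_t1, obs_t2, pred_t2))  # (0.75, 0.5)
```

Splitting the score this way is exactly what exposes the coin-flip behavior: an overall CCR would blend the easy stable cells with the hard changed ones.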
-
The consensus method Mn(PA) produced the highest validation AUC values (Figure 1), generating good to excellent forecasts (AUC ≥0.80) for 60% of the 1823 species modelled.
Simple unweighted ensembles performed best in this comparison of forecasts from SDMs for 1823 species.
-
Quantifying the temporal transferability of SDMs by comparing the agreement between model predictions and observations for the predicted period using common metrics is not a sufficient test of whether models have actually captured relevant predictors of change. A single range-wide measure of prediction accuracy conflates accurately predicting species expansions and contractions to new areas with accurately predicting large parts of the distribution that have remained unchanged in time. Thus, to assess how well SDMs capture drivers of change in species distributions, we measured the agreement between observations and model predictions of each species’ (a) geographic range size in period t2, (b) overall change in geographic range size between time periods, and (c) grid square-level changes in occupancy status between time periods.
This is arguably the single most important point in this paper. It is equivalent to comparing forecasts to simple baseline forecasts, as is typically done in weather forecasting. In weather forecasting it is standard to talk about the "skill" of a forecast, which is how much better it does than a simple baseline. In this case the baseline is a species range that doesn't move at all. This is equivalent to a "naive" forecast in traditional time-series analysis, since we only have a single previous point in time and the baseline is simply the prediction that this value does not change.
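A minimal illustration of that skill calculation against a naive no-change baseline (toy data and plain classification accuracy used purely for simplicity; any of the paper's metrics could be substituted as the score):

```python
def accuracy(pred, obs):
    """Fraction of grid cells whose predicted occupancy matches observation."""
    return sum(p == o for p, o in zip(pred, obs)) / len(obs)

# Hypothetical occupancy for 8 grid cells.
obs_t1 = [1, 1, 0, 0, 1, 0, 1, 0]   # observed occupancy in period t1
obs_t2 = [1, 0, 0, 1, 1, 0, 1, 0]   # cell 1 lost the species; cell 3 gained it
sdm_t2 = [1, 1, 0, 1, 1, 0, 1, 0]   # SDM forecast: catches the gain, misses the loss

# Naive baseline: the range does not move at all between periods.
naive_t2 = obs_t1

# Skill score: fraction of the baseline's remaining error the model removes.
# Positive = better than "no change"; zero = no better; negative = worse.
skill = (accuracy(sdm_t2, obs_t2) - accuracy(naive_t2, obs_t2)) / \
        (1.0 - accuracy(naive_t2, obs_t2))
print(skill)  # 0.5: the SDM closes half the gap left by the naive forecast
```

(The denominator is zero when the naive forecast is already perfect, i.e., when nothing changed at all; in that degenerate case there is no skill to measure.)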
-
Although it is common knowledge that some of the modelling techniques we used (e.g., CTA, SRE) generally perform less well than others [32], [33], we believe that their transferability in time is not as well-established; therefore, we decided to include them in our analysis to test the hypothesis that simpler statistical models may have higher transferability in time than more complex ones.
The point that providing better/worse fits on held-out spatial training data is not the same as providing better forecasts is important, especially given the argument that simpler models may have better transferability in time.
-
We also considered including additional environmental predictors of ecological relevance to our models. First, although changes in land use have been identified as fundamental drivers of change for many British species [48]–[52], we were unable to account for them in our models – like most other published accounts of temporal transferability of SDMs [20], [21], [24], [25] – due to the lack of data documenting habitat use in the earlier t1 period; detailed digitised maps of land use for the whole of Britain are not available until the UK Land Cover Map in 1990 [53].
The lack of dynamic land-cover data is a challenge for most SDMs, and certainly for SDM validation using historical data. It would be interesting to know, in general, how much better modern SDMs perform on held-out data when land cover is included.
-
Great Britain is an island with its own separate history of environmental change; environmental drivers of distribution size and change in British populations are thus likely to differ somewhat from those of continental populations of the same species. For this reason, we only used records at the British extent to predict distribution change across Great Britain.
This restriction to Great Britain for model building is a meaningful limitation, since Great Britain will typically represent a small fraction of the total range for many of the species involved. However, this is a common issue for SDMs, so I think it's a perfectly reasonable choice given the data availability. It would be nice to see this analysis repeated using alternative data sources that cover spatial extents closer to the full species ranges; that would help determine how well these results generalize to models built at larger scales.
-