Old data, new tools: Using random forest modelling to reveal multi-species habitat associations from spoor data

In their new study, Searle, Kaszta, and co-authors from Botswana, Zimbabwe, Germany, the UK, and the US discuss how machine learning can be used to disentangle multi-species habitat relationships and inform conservation planning over large areas.

The importance of policy and governance in preserving wildlife areas has historically meant that conservation has been restricted to efforts within country borders. This approach is at odds with how the world is used by wildlife, and has resulted in a number of ecosystems and dispersal corridors being severed by political boundaries.

Transfrontier conservation areas (TFCAs) are a model that emerged in the late 20^th century to address this mismatch, by enabling coordinated management of mixed-use wildlife areas that straddle the boundaries of multiple countries. While the scale of TFCAs is their strength, it also presents a unique challenge for PA managers and policymakers. Not only must these stakeholders consider multi-species perspectives – as healthy wildlife communities are critical for functional ecosystems – but they must do so across very large areas.

The Kavango-Zambezi (KAZA) TFCA in southern Africa spans five countries, and encompasses a huge variety of habitats and land management types. At a whopping 520,000km² – more than twice the size of the UK – KAZA is the world’s largest terrestrial TFCA. Although KAZA has been in place for nearly twenty years, the area’s management authorities and policymakers require up-to-date information on how wildlife populations are using the TFCA, to feed into conservation and management plans.

Picture1 - Searle — The study area was located in the southern KAZA Transfrontier Conservation Area, across northern Botswana and western Zimbabwe

We knew that we wanted to use this dataset to model habitat associations for multiple mammal species, including both carnivores and herbivores. However, given the size of the study area (the same size as Belgium!) and diversity of study species, we needed a modelling technique that wouldn’t struggle with complex or unintuitive relationships. Luckily for us, recent years have seen huge advances in machine learning approaches, which have emerged as a powerful tool to model complex systems.

Random forest is a powerful machine-learning algorithm which has been used in remote sensing, restoration ecology, habitat suitability & connectivity analyses, and climate change predictions. In simple terms, random forest models consist of multiple decision trees that operate as an ensemble. Each of these individual decision trees makes a class prediction, and the random forest selects the class with the most predictions across all its constituent decision trees. The decision trees that make up a random forest are built with random subsets of a complete dataset, selected via a bootstrap aggregation (“bagging”) procedure; the remaining “out-of-the-bag” data are then used to evaluate the model.

We used random forest to model our large-scale spoor dataset in a multi-scale framework, to investigate habitat associations for 16 mammal species of high conservation importance. In doing so, we revealed some emerging trends that may impact KAZA’s mammal community in future years.

Rainfall was consistently important in predicting species distributions. Given that rainfall is predicted to decrease across southern Africa under multiple climate change scenarios, with significant drying projected over parts of Botswana, this highlights the significant impacts that climate change could have on wildlife distributions in the ecosystem.

Picture2 - Searle — Rainfall patterns are an important determinant of wildlife distributions, and are expected to change considerably under climate change. Photo: Charlotte E. Searle.

Cattle density was also negatively associated with detections of our study species. This hints at another concerning trend for KAZA’s wildlife populations, as food demand for livestock products in sub-Saharan Africa is projected to nearly double from 2010 to 2050. The intensification of livestock production required to meet this demand is likely to increase competition faced by wild herbivores, contribute to habitat degradation, and require the conversion of wild habitat to support grazing.

Picture3 - Searle — Livestock numbers in sub-Saharan Africa are expected to increase substantially in the coming years to meet growing demand, which will contribute to habitat degradation and conversion. Photo: Charlotte E. Searle.

These emerging threats are not just limited to the study landscape: similar shifts in rainfall are anticipated across much of the world, including many of the world’s most biodiverse regions, and global livestock numbers are expected to increase substantially in the coming years. In light of this, we recommend that wildlife management authorities should use modelling exercises and adaptive management to ensure protected area networks remain fit for purpose under anticipated changes in rainfall under climate change, and explore initiatives that promote coexistence of wildlife and livestock.

This study illustrates how new tools can be applied to existing data to improve our understanding of how wildlife communities use vast, mixed-use conservation landscapes like TFCAs. In doing so, we can unlock new insights that help identify threats and guide conservation actions.

Read the full paper Random forest modelling of multi-scale, multi-species habitat associations within KAZA transfrontier conservation area using spoor data in Journal of Applied Ecology.