Samuel Fischer: How can invasive species management benefit from smartphone data?

Shortlisted for the 2023 Southwood Prize


Samuel Fischer discusses how he and colleagues used data from an angler smartphone app to build a stochastic model of angler traffic in the Canadian province of Alberta. Anglers facilitate the spread of whirling disease, a parasite-induced fish disease, and the model demonstrates how important the individual-specific behaviour of vectors is for propagule transport.

From smartphone data to invasive species management

Animal diseases and invasive species threaten ecosystems all over the world, often spread via human traffic and trade. Ironically, it can be those who love nature the most who put it at risk:

  • campers (carrying non-native insects along with their firewood)
  • recreational boaters (transporting invasive mussels along with their boats)
  • anglers (carrying fish parasites along with their gear).

Once a pathogen or invasive species has been introduced at a site, the success of eradicating or containing it depends heavily on how quickly the infestation is spotted. Managers therefore need risk estimates to decide when and where early detection and rapid response measures should be employed.

Infestation risk estimates require information on how many people travel from infested to uninfested sites. Gathering such data for recreational traffic can be challenging, as it varies widely according to the personal preferences and decisions of the travelling individuals. As a result, large (and often expensive) surveys were needed to estimate recreational traffic in the past.

Recreational fishing © Pixabay

However, as smartphones have found their way into almost all areas of our lives, including recreation, data collected via smartphone apps may offer new opportunities for estimating recreational traffic. For example, smartphone apps are used by anglers to record their fishing sites and to share these locations with one another. What if this information could also be used to stop the expansion of angler-spread fish diseases?

Some intricacies of smartphone data

In theory, data from mobile apps has many advantages: it can be gathered at relatively low cost from users in many different locations, and it may contain precisely georeferenced information with exact time stamps. Though great for scientists, this wealth of information can also be abused, making privacy a major concern.

Hence, researchers rely on voluntarily provided data. This is awesome, but it also poses major challenges when estimating recreational traffic, because app users may not record all the recreational sites they visit. So even if a record shows that a user visited site A and site B, they may not have travelled directly from A to B but may rather have gone from site A to another site C (not recorded) before finally reaching site B.

Hence, they may have transported propagules or pathogens from A to C and from C to B, not from A to B. This makes it difficult to estimate trip counts for individual pairs of sites – the unrecorded trips leave us with an infinite number of possibilities to deal with.

So, what do we do?

Fortunately, there is a field where considering infinite possibilities is not an unsolvable issue: mathematics. Of course, before any problem can be tackled with mathematical tools, it must be expressed in mathematical terms. We did this by creating a mathematical model for recreational traffic, combining app data with socio-economic and geographical data, via basic assumptions about the behaviour of recreationists.
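To give a flavour of how an infinite number of possibilities can be handled, here is a minimal sketch in Python. It is not the model from our paper. It assumes, purely for illustration, that an angler moves between three sites according to a made-up transition table `P` and records each visit in the app with a made-up probability `r`. The probability that the next recorded site is B, given that the last recorded site was A, then sums over zero, one, two, ... unrecorded stops in between. That infinite sum satisfies the simple equation N = R + Q·N, which can be solved by repeated substitution.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def next_recorded_site(P, r, tol=1e-12, max_iter=10_000):
    """Probability that the next *recorded* site is j, given last recorded site i.

    P[i][j]: hypothetical probability of travelling from site i to site j.
    r:       hypothetical probability that a visit is recorded in the app.
    Sums over all numbers of unrecorded intermediate stops by iterating
    N = R + Q N, where Q = (1 - r) P (move, not recorded) and
    R = r P (move, recorded). The iteration converges for any r > 0.
    """
    n = len(P)
    Q = [[(1 - r) * p for p in row] for row in P]
    R = [[r * p for p in row] for row in P]
    N = [row[:] for row in R]
    for _ in range(max_iter):
        QN = matmul(Q, N)
        new = [[R[i][j] + QN[i][j] for j in range(n)] for i in range(n)]
        diff = max(abs(new[i][j] - N[i][j])
                   for i in range(n) for j in range(n))
        N = new
        if diff < tol:
            break
    return N


# Illustrative numbers only: sites A, B, C (indices 0, 1, 2).
P = [[0.0, 0.5, 0.5],
     [0.2, 0.0, 0.8],
     [0.6, 0.4, 0.0]]
N = next_recorded_site(P, r=0.3)
N_full = next_recorded_site(P, r=1.0)  # every visit recorded
```

With these made-up numbers, `N[0][0]` is positive even though a direct trip from A back to A is impossible under `P`: the record "A, then A" hides at least one unrecorded stop. If every visit were recorded (`r = 1`), the recorded pairs would simply reproduce `P`.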

Summary of study © Fischer et al, 2023

The behaviour of two individuals can differ strongly, as their choices may be driven by personal preferences and past experiences. This is a challenge for modelling, as these preferences and experiences are typically unknown. However, we found that it could also offer a unique opportunity: based on personal preferences and repeating behaviour, we might be able to draw inferences about unrecorded trips.

To deal with our missing knowledge of individual preferences, we took a ‘brute force’ approach: for all app users, we considered all potential preferences they could have (according to our model) and weighted them with a corresponding likelihood. That way, we were able to infer ‘mean’ expected traffic flows between sites despite missing trip records and missing information on personal preferences.
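The weighting idea can be illustrated with a small Python sketch. It is a strongly simplified stand-in, not our actual model: it assumes that an angler has one of only two hypothetical preference profiles (each a made-up probability of visiting sites A, B and C), that both profiles are equally likely a priori, and that recorded visits are independent draws from the angler's profile. Each candidate profile is weighted by how well it explains the recorded visits, and the expected behaviour is the weighted average over all candidates.

```python
import math


def posterior_weights(profiles, prior, observed):
    """Weight each candidate preference profile by its likelihood.

    profiles: list of dicts mapping site -> visit probability (hypothetical).
    prior:    list of prior weights, one per profile.
    observed: list of recorded site names for one app user.
    """
    log_post = [
        math.log(w) + sum(math.log(profile[site]) for site in observed)
        for profile, w in zip(profiles, prior)
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(lp - top) for lp in log_post]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def expected_site_probs(profiles, weights):
    """Average the profiles' visit probabilities using the given weights."""
    return {
        site: sum(w * profile[site] for profile, w in zip(profiles, weights))
        for site in profiles[0]
    }


# Illustrative numbers only.
profiles = [
    {"A": 0.7, "B": 0.2, "C": 0.1},  # hypothetical "prefers site A" angler
    {"A": 0.1, "B": 0.2, "C": 0.7},  # hypothetical "prefers site C" angler
]
prior = [0.5, 0.5]
observed = ["A", "A", "B"]

weights = posterior_weights(profiles, prior, observed)
expected = expected_site_probs(profiles, weights)
```

In this toy example, the records "A, A, B" put almost all the weight on the first profile, yet the second profile still contributes a little to `expected`. Nothing is decided about the angler's true preferences; every possibility is kept and counted in proportion to its plausibility.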

Estimated number of consecutive trips to subbasin pairs. Only subbasin pairs with more than 100 trips per year are shown © Fischer et al, 2023

We applied our approach to estimate angler traffic in the Canadian province of Alberta, where anglers are at risk of spreading whirling disease, a parasite-induced fish disease. To our own surprise, we were not only able to estimate the total angler traffic between each subbasin pair in the province but also found that in 64% of their trips, anglers revisited their previous fishing location, making these trips irrelevant for spreading the disease. Furthermore, in about half their trips, they visited sites in spatially contained areas of their personal preference. These results showed that models ignoring individual preferences and repeating behaviour may significantly overestimate long-distance traffic, which can be the driving force of disease and invasive species spread.

What comes next?

We hope that as more data becomes available, propagule transport models may become increasingly precise. Novel machine learning techniques could help us to achieve this goal. For example, machine learning may allow us to precisely infer missing data. This, of course, requires some complete data sets for training and validation – which were not available in our study.

But even without such data, machine learning could help us to determine where recreationists are most active or to assess the attractiveness of recreational sites. Such information is key for traffic estimators such as ours. The new data sources and techniques may then give us more effective tools to fight the spread of diseases and invasive species, helping us to preserve the biodiversity on our planet.

About the author

I am a theoretical/computational ecologist currently working as a post-doc at the Helmholtz Centre for Environmental Research in Leipzig, Germany. My route into the biological sciences was not direct. As a young student, I was not particularly interested in ecology, which I primarily associated with ‘learning a lot of stuff by heart’. However, I loved modelling, the process of looking at a system, trying to identify the core processes governing its behaviour and expressing these in mathematical/computational terms.

The author © Samuel Fischer

I was intrigued to see different processes being driven by similar underlying mechanics and simple interactions leading to wildly complex patterns and behaviour. Quickly, I discovered that this applies in particular to ecological systems, which are often too complex to be fully understood in every detail, but exhibit an order and intrinsic behaviour hinting at wonderfully mysterious underlying mechanisms. This got me hooked and made me increasingly passionate about ecology.

Following this excitement, I spent my PhD at the University of Alberta developing tools to estimate and control the spread of aquatic invasive diseases. This branch of research also led to the study presented here. After completing this paper, I switched gears to forest modelling, an equally exciting field that is currently experiencing a significant boost through the advent of new remote sensing data sources and computational techniques.

I believe that, to yield their full potential, models need to be combined with empirical data. This makes it necessary to jointly consider ecological domain knowledge, statistical methodology and computational performance requirements even during model design. Therefore, joint innovations in multiple disciplines are needed. This drives my interest in methodological aspects and the integration of mathematical modelling, algorithm design, and statistics. I hope to continue contributing to this field; the future will tell whether this can be in the context of science or elsewhere.

Read the full article “Boosting propagule transport models with individual-specific data from mobile apps” in Journal of Applied Ecology.

Find the other early career researchers and their articles that have been shortlisted for the 2023 Southwood Prize here!
