Photo © Andreas Frisch
In this Q&A, we ask author Tanja Petersen about her team’s research on the GBIF database, and find out a little bit more about the author herself. This article is part of the BES cross-journal special feature on Citizen Science.
What’s your article about?
The article is about skews and biases in GBIF data. Specifically, it looks at how the origin of the different datasets affects how the records are skewed geographically, taxonomically and between groups of conservation concern. For example, some habitat types have many more or fewer records than expected by chance; some species groups (especially birds) are much better covered than others; and datasets with mostly citizen science records seem to report red-listed species much more frequently than we would expect by chance.
What is the background behind your article?
The number of species occurrence records freely available through online portals has increased tremendously in recent years, as has the use of such records in research. This is mainly due to technological advances: essentially, most of us walk around with a decent GPS and camera in our pocket, constantly online. However, we already know that this kind of data comes with inherent biases, and if we want to use the data reliably, we need to understand what these biases are.
How did you come up with the idea for it?
The idea for this study came as a kind of side-track from the original scope of my PhD. Halfway through the project, my supervisors and I realised that the GBIF data we originally wanted to work with were not as well suited as we had hoped when we planned the overarching project. So this article came as an attempt to dive a little deeper into the data and really tease out which issues we face when trying to use data from compiled databases for studies on urban ecology.
Why is it important?
As data from these online portals are used more and more (as they should be, since they are an invaluable source of data!), it is increasingly important that we are aware of their strengths and weaknesses.
What are the key messages of your research?
Species occurrence records in open databases are not evenly distributed among different habitat types, and the biases differ even more when we focus on red-listed or alien species. Since the biases in GBIF data depend on the origin and characteristics of the included datasets, how we account for these taxonomic and geographic biases needs to be a dynamic process, tailored to the individual datasets. This is especially true as the proportion of citizen science records has increased immensely over time, potentially skewing these biases even further.
The bigger picture
In your opinion, what are the strengths and weaknesses of citizen science?
I will probably just repeat what has been stated by so many researchers before me, but repetition does not make it any less true. The most obvious strength is, in my opinion, sheer numbers. The amount of data available because of citizen science would be unimaginable otherwise. I think another great strength is more indirect: the motivation and engagement of non-professionals in research can spark interest in conservation and nature protection on a completely new level, which I think is needed if we are to achieve anything.
The main weakness is probably how unevenly distributed citizen science records can be (here I am mostly talking about the fully opportunistic kind). As they are heavily skewed in space, in time and across species groups, there is a lot of “cleaning up” to do.
What is the next step in this field going to be and what would you like to do next?
I think the next step would be to develop methods and tools to account for these biases. A part of me would like to be involved in these next steps, but I think this might be a job for someone with a stronger statistical background than I have to offer.
What is your favourite citizen science project or what would your ideal citizen science project be?
There are so many to choose from! I struggle to find a favourite. I think I am most excited about efforts to use AI in apps that help citizen scientists identify species. A friend and colleague of mine is involved in developing this for a Norwegian app, and I am still highly entertained by the fact that in the early stages of training, he was himself identified as a tadpole fish.
Who should read your paper?
Researchers and students who want to use GBIF (and similar) data, especially if they are aiming for conservation- and/or management-related research.
About The Author
How did you get involved in ecology?
When I started my bachelor’s in biology, I was originally convinced that I wanted to work with molecular stuff and genetics – I abandoned this idea within the first two years of studying! After a short detour through palaeontology, I finally did a few courses on Conservation Biology and Macroecology early in my Master’s program, and well… the rest is history.
What are you currently working on?
I am currently working on the presentation and defence of my PhD thesis: “Biodiversity dynamics in urban areas under changing land uses”, and I am still involved in a side-project on functional biogeography in Norway.
What is the best and worst thing about being an ecologist?
The best part of being an ecologist (and to some degree, a conservationist) is knowing that what you work on is important and can have a great impact on the planet. The worst part is understanding what a dire state the world’s ecosystems are in, and having to come to terms with not being able to do more to conserve them.
What do you do in your spare time?
I really like cooking while listening to audiobooks. Outside of lockdown periods, I do kickboxing.
One piece of advice for someone in your field…
Say yes to side-projects, whether they are other scientific projects, communication, or something completely unrelated – it is a great way of expanding your network and getting other kinds of experiences and qualifications.
Read the full research: “Species data for understanding biodiversity dynamics: The what, where and when of species occurrence data collection” in Issue 2:1 of Ecological Solutions and Evidence.
This article is part of the BES cross-journal special feature on Citizen Science and you can read the full collection here.