From: David Clarance <dclarance@gmail.com>
Date: 2019-05-07 11:15
Subject: Kenya Bird Map data

Dear all,

Over the last 1.5 years I've been somewhat involved in trying to access and
analyze data from the Kenya Bird Map. While the methodology is sound and
the efforts have galvanized a community around it, I can't help but see
issues that go against the very ethos of open and citizen-led science. I
hope this note is taken in a positive manner - as a way to push for the
rights of Kenyan citizen scientists and to strengthen our approach to
conservation.

First, a note on how the data is collected and stored. Data is collected
via BirdLasser, an app from an independently owned private firm based in
South Africa. Some of the data is transferred, in a very specific format,
to the University of Cape Town servers, which host the data and provide the
back-end for the various bird maps (Kenya Bird Map, Nigeria Bird Map etc).

Here are a few points I would love for the more senior and influential
birders and researchers to think about:

*1. The data is owned by a private firm.*

This to me is the biggest issue. The folks at BirdLasser are wonderful and
no doubt committed to conservation and science as their constitution
<https://www.birdlasser.com/docs/birdLasser_constitution.pdf> declares.
However, their Terms and Conditions
<https://www.birdlasser.com/terms-and-conditions> are a bit more nuanced
and they are not exactly committed to sharing data with researchers. Let me
start with an example: A few researchers and I were interested in using the
GPS coordinates to build species distribution maps. We approached the KBM
to learn that the servers at UCT do not actually have that data. We went to
BirdLasser and were told that the user agreement states that they cannot
share data without the explicit consent of the user. This means that it is
virtually impossible to get all the data, since you would need every user
to give consent. There are, however, two ways to request data:

(a) Register a cause: Users sign in and give consent. Again, this is really
difficult to do because you need every user to sign up manually (go to
causes > sign up etc.).
(b) Ask for raw data: Only the GPS points, date and time, and species are
available in a one-time data request. *After this, you would need to pay
~KSH 5000 for every future data request.* Note that this is for EVERY
future request after the first one. I get that it's expensive to maintain
servers and the app, but there are two major red flags for me here:

(i) Citizen scientists collect data for free using their own resources and
time. Surely they should not be charged for using the data they generated?
(ii) Can a Masters student in Kenya actually afford that amount? I
certainly would not be able to if I were a student.


*2. Getting data has been incredibly difficult*

I work as a data scientist with a background in experiment design. When I
learned about this incredible data source, I was REALLY excited and was
very surprised by how little it has been used. Colin Jackson and I tried
very, very hard to get data for over a year and only last month were able
to get access to one API call. The researchers at UCT (who I'm sure are
busy) have been very unresponsive to requests for data. Further, the data
we got was in a very specific format that was designed to produce reporting
rate curves. I used the API call to create a library in R for researchers
to use. You can find the repository here
<https://github.com/davidclarance/africabirdmap>.

As you can imagine, a process that takes a year to yield summarized data
would virtually derail a PhD student's thesis. I was fortunate enough to
know Colin and others who pushed strongly for it, but most students do not
have that kind of support. I think it's extremely unfair that researchers
at UCT have access to data collected in Kenya but Kenyan researchers do not.

*3. Valuable data is missing*

The various bird atlas projects were designed before BirdLasser came into
play. This means that when the atlas was shifting over to use BirdLasser as
an input source, they chose to stick with the old tables and not update
them to include valuable information such as breeding records or the
various other options that BirdLasser provides.

What does this mean? Most of the additional information that you input into
the app, such as breeding information, counts and all the species info, is
actually not captured by the Atlas. This makes sense to a degree because
the atlas is meant to capture records in a square, but pause for a bit and
really think about it. *We are losing an incredible amount of information,
especially breeding records, that is captured by users but not used in the
map.* If you refer to point 1, this means you would potentially have to
start a cause or pay to get all this additional information.

I think BirdLasser/BirdAtlas is a valuable tool to maintain a bird map, but
if we want to go deeper and think past presence/absence it gets murky. I think
most citizen scientists in KE are under the impression (as was I) that all
of this data is available to researchers but it really is not (at least not
easily).


*So here's my challenge:*

(a) Is the process of data collection and storage through BirdLasser really
citizen science? Citizen science comes with free availability of all the
data produced by citizen scientists.

(b) Is Kenya's bird data in its rawest form safe? As someone who has
experience in data engineering, I'm really not sure. In my opinion, public
data like this should *always* live with public institutions, be it an open
repository like GBIF or a university like UCT. We need raw data to be
safely stored.


*Where do we go from here?*

These are my suggestions on the way forward:

1. I think the BirdLasser agreement needs to be rethought: It makes me
uncomfortable to have a private firm with no institutional or university
backing own all the data that birders collect. BirdLasser is wonderful and
has provided a brilliant service, but perhaps we could set up an
institution that gets all of the BirdLasser data and keeps a backup --
perhaps in GBIF or at the museum? I know some of the pentad data already
exists in GBIF. It would be great to get raw data in there too.

2. Raise funds for the proper development of the KBM website and backend: