From: Paula Kahumbu <pkahumbu@gmail.com>
Date: 2019-08-24 12:30
Subject: Re: [KENYABIRDSNET] Kenya Bird Map data

Wow, we saw this with the IUCN African elephant database and MIKE data, which had similar terms, except you had to ask governments to give you permission for the data on elephants. The data was held by a few individuals who have been publishing articles in their own names, none of them Africans... sadly this is the con in conservation


On Tue, May 7, 2019 at 11:18 AM David Clarance dclarance@gmail.com [kenyabirdsnet] <kenyabirdsnet-noreply@yahoogroups.com> wrote:
 

Dear all,

Over the last 1.5 years I've been somewhat involved in trying to access and analyze data from the Kenya Bird Map. While the methodology is sound and the efforts have galvanized a community around it, I can't help but see issues that go against the very ethos of open, citizen-led science. I hope this note is taken in a positive manner, as a way to push for the rights of Kenyan citizen scientists and to strengthen our approach to conservation.

First, a note on how the data is collected and stored. Data is collected via an app called BirdLasser, which is run by an independently owned private firm based in South Africa. Some of the data is transferred, in a very specific format, to the University of Cape Town servers, which host the data and provide the back-end for the various bird maps (Kenya Bird Map, Nigeria Bird Map, etc.).

Here are a few points I would love for the more senior and influential birders and researchers to think about:

1. The data is owned by a private firm. 

This to me is the biggest issue. The folks at BirdLasser are wonderful and no doubt committed to conservation and science, as their constitution declares. However, their Terms and Conditions are a bit more nuanced, and they are not exactly committed to sharing data with researchers. Let me start with an example: a few researchers and I were interested in using the GPS coordinates to build species distribution maps. We approached the KBM, only to learn that the servers at UCT do not actually have that data. We went to BirdLasser and were told that the user agreement states that they cannot share data without the explicit consent of the user. This means that it is virtually impossible to get all the data, since you would need every user to give consent. That leaves two ways to request data:

(a) Register a cause: Users sign in and give consent. Again, this is really difficult to do because you need every user to sign up manually. Go to causes > sign up etc.
(b) Ask for raw data: Only the GPS points, date and time, and species are available in a one-time data request. After that, you would need to pay ~KSH 5000 for every future data request. Note that this applies to EVERY request after the first one. I get that it's expensive to maintain servers and the app, but there are two major red flags for me here:

(i) Citizen scientists collect data for free using their own resources and time. Surely they should not be charged for using the data they generated?
(ii) Can a Masters student in Kenya actually afford that amount? I certainly would not be able to if I were a student.


2. Getting data has been incredibly difficult

I work as a data scientist with a background in experiment design. When I learned about this incredible data source, I was REALLY excited and was very surprised by how little it has been used. Colin Jackson and I tried very, very hard to get data for over a year and only last month were able to get access to one API call. The researchers at UCT (who I'm sure are busy) have been very unresponsive to requests for data. Further, the data we got was in a very specific format that was designed to produce reporting rate curves. I used the API call to create a library in R for researchers to use. You can find the repository here

As you can imagine, taking a year to get summarized data would virtually derail a PhD student's thesis. I was fortunate enough to know Colin and others who pushed strongly for it, but most students do not have this. I think it's extremely unfair that researchers at UCT have access to data collected in Kenya but Kenyan researchers do not.

3. Valuable data is missing

The various bird atlas projects were designed before BirdLasser came into play. This means that when the atlas was shifting over to using BirdLasser as an input source, they chose to stick with the old tables and not update them to include valuable information such as breeding or the various other options that BirdLasser provides.

What does this mean? Most of the additional information that you input into the app, such as breeding information, counts, and all the species info, is actually not captured by the Atlas. This makes sense to a degree because the atlas is meant to capture records in a square, but pause for a bit and really think about it. We are losing an incredible amount of information, especially breeding records, that is captured by users but not used in the map. If you refer to point 1, this means you would potentially have to start a cause or pay to get all this additional information.

I think BirdLasser/BirdAtlas is a valuable tool to maintain a bird map, but if we want to go deeper and think past presence/absence, it gets murky. I think most citizen scientists in KE are under the impression (as was I) that all of this data is available to researchers, but it really is not (at least not easily).


So here's my challenge: 

(a) Is the process of data collection and storage through BirdLasser really citizen science? Citizen science comes with free availability of all the data produced by citizen scientists. 

(b) Is Kenya's bird data in its rawest form safe? As someone who has experience in data engineering, I'm really not sure. In my opinion, public data like this should always live with public institutions, be it an open repository like GBIF or a university like UCT. We need raw data to be safely stored.


Where do we go from here?

These are my suggestions on the way forward:

1. I think the BirdLasser agreement needs to be rethought: It makes me uncomfortable to have a private firm with no institutional or university backing own all the data that birders collect. BirdLasser is wonderful and has provided a brilliant service, but perhaps we could set up an institution that gets all of the BirdLasser data and keeps a backup -- perhaps in GBIF or at the museum? I know some of the pentad data already exists in GBIF. It would be great to get raw data in there too.

2. Raise funds for the proper development of the KBM website and backend: From what I gathered, the project seems to be understaffed at UCT and is in desperate need of funds to build out a team. 

3. Have some sort of accountability mechanism: This is standard practice in any professional work environment where there are deadlines and targets that are agreed upon and then evaluated. The atlas currently does not have that. We have very little visibility on any of the upcoming features or plans. Perhaps the Bird Committee at Nature Kenya can demand accountability and transparency on what's happening with our data and why it's so hard to get raw data from UCT?

As I wrote earlier, I really hope this email will spark a productive conversation towards ensuring better quality and open data. I hope some day Masters or PhD students in Kenya and all over Africa and the world will be able to access complete data easily and use it to enable conservation efforts.

Best wishes,
David