Published on: 04 April 2016
Written by: Jetske Adams

CognAC Project Synopsis

A Data-Driven Approach to Selecting Marine Protected Areas

After the exciting welcome presentations in the evening, which explained the set-up of the datathon and briefly introduced us to the topic of sustainable fishing, we were brought to what would become our coding headquarters, at a location separate from the other teams. This meant fewer distractions, but also less interaction with the rest of the participants. We spent our first couple of hours on domain research, not only because we had little prior knowledge of sustainable fishing, but also because the assignment was left very open. For a while, we went through a repeating cycle of coming up with an interesting research question, collecting data, and then having to discard the question because of shortcomings in the data.

At some point in the early morning, we decided to base our project on marine protected areas (MPAs). These are areas of the sea where certain activities, such as fishing, are prohibited, so that fish have a place where they are left undisturbed, which helps maintain fish populations in a region. We found that the locations of MPAs are generally selected based on expert opinion, with little justification from data [GW Allison, J Lubchenco, MH Carr (1998)]. We then found research on methods of selecting MPAs based on data [RB Cabral, SS Mamauag, PM Aliño (2015)]. That paper focuses only on a small region in the Philippines, and seems to involve manual assessment of local data. We therefore decided to try to apply this research on a larger scale, both geographically and in terms of data. But first we took a nap.

After going over to the main building for breakfast, we started on the necessary work. We gathered data on the locations of phenomena relevant to MPA suitability (coral reefs, mangroves, river estuaries, and more), as well as other types of data, such as sea temperature and chlorophyll levels. We wrote scripts to convert the location data into distance data, for example computing the distance from a given point to the nearest coral reef while taking landmasses into account. We then combined all of the data in a simple program. The program shows a map of the world, with colors indicating the MPA suitability of each location. Above the map are sliders for setting the importance of the different factors (e.g. distance to reefs) in determining the MPA suitability of a location. Moving a slider immediately updates the map to show the effect. The program is designed so that new factors can be added.
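To make the two ideas above concrete, here is a minimal sketch in Python. It is an illustration only, not the code written at the datathon. It assumes the world is a small rectangular grid of cells, measures distance in grid steps with a breadth-first search that never crosses land, and combines factor layers with a weighted average standing in for the slider positions. The names `sea_distance` and `suitability`, the four-neighbour grid, and the `1 / (1 + distance)` closeness score are all assumptions made for this example; a real map would also have to deal with cell sizes in kilometres and with wrap-around at the date line.

```python
from collections import deque

import numpy as np


def sea_distance(is_land, is_feature):
    """Distance in grid steps from each sea cell to the nearest feature cell.

    Paths may only move through sea cells (up, down, left, right), so a
    landmass between a point and a reef makes the distance longer.
    Land cells and unreachable sea cells keep the value infinity.
    """
    is_land = np.asarray(is_land, dtype=bool)
    is_feature = np.asarray(is_feature, dtype=bool)
    rows, cols = is_land.shape
    dist = np.full((rows, cols), np.inf)
    queue = deque()
    # Start the search from every feature cell that lies in the sea.
    for r, c in zip(*np.nonzero(is_feature & ~is_land)):
        dist[r, c] = 0.0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not is_land[nr, nc] and np.isinf(dist[nr, nc]):
                dist[nr, nc] = dist[r, c] + 1
                queue.append((nr, nc))
    return dist


def suitability(factors, weights):
    """Weighted average of factor layers that are already scaled to [0, 1].

    `weights` plays the role of the slider positions (hypothetical names).
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    score = sum(weights[name] * factors[name] for name in weights)
    return score / total


# Tiny example: a strip of land (1) separates the reef from the right column.
land = np.array([[0, 1, 0],
                 [0, 1, 0],
                 [0, 0, 0]])
reef = np.zeros_like(land)
reef[0, 0] = 1

reef_dist = sea_distance(land, reef)
# Turn distance into a score where closer is better; infinity becomes 0.
reef_closeness = 1.0 / (1.0 + reef_dist)
# A made-up second layer, purely to show how weights combine factors.
other_layer = np.full(land.shape, 0.5)

score = suitability(
    {"reef": reef_closeness, "other": other_layer},
    {"reef": 3.0, "other": 1.0},
)
print(reef_dist)
print(score)
```

In this toy grid the top-right cell is two steps from the reef as the crow flies, but six steps by sea, which is the effect of taking landmasses into account. Changing the values in the weights dictionary corresponds to moving the sliders.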

In parallel with this program, we worked on related ideas, such as a project using machine learning to predict new MPA locations based on existing MPAs and their environment and effectiveness. However, we did not manage to finish these projects in time. Indeed, we only just managed to finish our main program and presentation. In the last hour before dinner, a lot of separate pieces of work came together quite suddenly. Because of the harsh time constraints, there had been little time to plan infrastructural aspects of the project, such as file formats and code architecture. However, each of us had kept our inputs and outputs simple, so the interfaces between our pieces of work were smooth.

After dinner we proudly presented our results along with the other teams. It was great to see the amazing work everyone had managed to cram into those 24 hours. We learned a lot from our datathon experience and above all we had a lot of fun!