Challenge provided by Cascais Municipality

Avencas Marine Protected Area: Predict the future of the local ecosystem and its species

The research findings suggest a correlation between ocean acidification and lower biodiversity in the AMPA. This connection can be used to raise awareness about climate change and its potential impact on marine biodiversity among locals and tourists.

The Avencas Marine Protected Area (AMPA) is a Biophysical Interest Zone in Cascais, Portugal. It has been under close observation since 2010, with regular biodiversity sampling, and was the subject of a case study by Ferreira et al. (2017).

Figure 1 - Location of the sampling areas in the AMPA.

To help protect this unique marine ecosystem, measures were taken to reduce human interference, but the system did not recover as well as expected. This is why the municipality of Cascais is looking for help with a long-term analysis of changes in species abundance in the AMPA.

The two main focus areas are:

  1. Examine potential factors that influence the abundance of species in the AMPA and could cause the lack of recovery of biodiversity. The city of Cascais is looking for out-of-the-box ideas of potential correlating factors.
  2. Determine the trajectory that species in the AMPA (especially endangered and invasive ones) are on. A positive development would provide evidence that the current measures are useful; a negative one would support advocating for more protection.


Identify variables that potentially impact the marine ecosystem of the Avencas Marine Protected Area and predict further developments with a special focus on endangered and invasive species.

United Nations SDG 

GOAL 14: Life below water

  • Target 14.2: By 2020, sustainably manage and protect marine and coastal ecosystems to avoid significant adverse impacts, including by strengthening their resilience and taking action for their restoration in order to achieve healthy and productive oceans.
  • Target 14.3: Minimize and address the impacts of ocean acidification, including through enhanced scientific cooperation at all levels.


The following datasets were provided to the participants:

  • Percentage coverage with sessile species in samples taken between 2011 and 2020 in the AMPA
  • Number of mobile species in samples taken between 2011 and 2020 in the AMPA
  • A reference list of which species are considered invasive and which are considered endangered according to the IUCN. (Note that for some species, the Portugal-specific conservation status assigned by the IUCN is given)
  • Bathymetric data of the AMPA
  • Sampling area shapefiles


Cascais’ challenge centered on identifying new variables that might be impacting the marine ecosystem of the AMPA. Due to the exploratory nature of the challenge, teams used many additional data sources. The following list is a selection.

One team suggested setting up a measurement station close to the AMPA to gather data on chemical and physical properties directly in the area of interest.

Methods and Techniques

The teams' approaches to this challenge included SARIMA, LASSO, gradient boosting regressor, XGBoost, and random forest regressor models to determine feature importance. For predictive modeling, several teams relied on time-series models such as SARIMA and VAR.
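To illustrate how a LASSO model can rank features, the following is a minimal sketch of LASSO fitted by coordinate descent in plain Python. It is not any team's implementation: the function names, the toy data, and the penalty value are assumptions chosen for illustration, and the inputs are assumed to be centered so that no intercept is needed. The point is that the L1 penalty drives the coefficient of an uninformative feature to exactly zero.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 for centred data."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    for _ in range(n_iter):
        for j in range(n_features):
            rho = 0.0   # covariance of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                # Prediction using every feature except j.
                prediction_without_j = sum(
                    X[i][k] * w[k] for k in range(n_features) if k != j
                )
                rho += X[i][j] * (y[i] - prediction_without_j)
                norm += X[i][j] ** 2
            if norm == 0:
                w[j] = 0.0  # an all-zero column carries no information
                continue
            w[j] = soft_threshold(rho / n, alpha) / (norm / n)
    return w


# Hypothetical toy data: the target depends only on the first feature.
X = [[-2, 1], [-1, -2], [0, 0], [1, 2], [2, -1]]
y = [-4, -2, 0, 2, 4]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
print(weights)  # [1.75, 0.0]
```

The first coefficient is shrunk from its least-squares value of 2 to 1.75, while the second is removed entirely. In practice a library implementation such as scikit-learn's `Lasso` would normally be used instead of a hand-written loop.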

For data preprocessing, one team used PCA to reduce the dimensionality of the environmental data.

One team developed SARIMA models to predict three quantities: the number of invasive species, the number of endangered species, and the Shannon-Wiener index. The index was derived from the mobile abundance data and a converted version of the sessile data, following the methodology of Deepananda and Macusi (2013), with the formula:


$$H = -\sum_{i=1}^{S} p_i \ln p_i$$

where H is the Shannon diversity index, p_i is the proportion of individuals of the i-th species in the population, and S is the number of species.
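As a minimal sketch, the Shannon index H = -sum(p_i * ln(p_i)) can be computed directly from per-species counts. The function name and the sample counts are hypothetical, and the natural logarithm is an assumption; the source does not state which logarithm base the team used.

```python
import math


def shannon_index(counts):
    """Return H = -sum(p_i * ln(p_i)) for a list of per-species counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Species with zero individuals contribute nothing (p * ln p -> 0 as p -> 0).
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical sample: four equally common species, so H equals ln(4).
even_sample = [10, 10, 10, 10]
print(round(shannon_index(even_sample), 4))  # 1.3863
```

A sample dominated by one species yields a lower value than an even sample with the same number of species, which is why the index is used as a biodiversity measure.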

Fitting a SARIMA model to the derived Shannon index allowed the team to analyze the residuals to determine feature importance. For their product, the team chose a SARIMAX model incorporating the identified features, which achieved an RMSE of 0.20 and an MAE of 0.17.
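For readers unfamiliar with the two error metrics reported here, the sketch below shows how RMSE and MAE are computed from observed and forecast values. The numbers are made up for illustration and are unrelated to the team's results.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more strongly."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual, predicted):
    """Mean absolute error: the average size of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and forecast index values with one large miss.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 2.0]
print(rmse(observed, forecast))  # 1.0
print(mae(observed, forecast))   # 0.5
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few forecasts are far off.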

Another team measured diversity using the Hill-Simpson metric, inspired by Roswell et al. (2021). After testing several models, the team selected a LASSO model to determine feature importance because of its strong regularization. They noted that none of the many weather-based variables they analyzed was a strong predictor of species diversity, but there was a trend: lower water temperature, humidity, precipitation, and water vapor pressure deficit, together with higher cloud cover, were associated with more biodiversity.
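The Hill-Simpson metric is the inverse of the Simpson concentration, 1 / sum(p_i^2), and can be read as an effective number of equally common species. The following is a minimal sketch with a hypothetical function name and made-up counts, not the team's code.

```python
def hill_simpson(counts):
    """Return the Hill-Simpson diversity 1 / sum(p_i^2) for per-species counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return 1.0 / sum((c / total) ** 2 for c in counts)


# Hypothetical samples: an even community and one dominated by a single species.
even_sample = [10, 10, 10, 10]
dominated_sample = [97, 1, 1, 1]
print(hill_simpson(even_sample))  # 4.0
print(hill_simpson(dominated_sample) < 1.1)  # True
```

Both samples contain four species, but the dominated one behaves almost like a single-species community under this metric.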

Main Insights from Data

One team noticed how well biodiversity in the AMPA correlated with Ocean health indexes for all of Portugal, and postulated that the slower-than-expected recovery of species in the AMPA might be driven by global rather than local factors. Invasive species were correlated with higher chlorine levels, and endangered species with temperature-based features. In general, biodiversity seemed stable over the years, which matched the information from the domain experts that they did not see the recovery expected from their interventions.

Figure 2 - One team's analysis of the monthly trend of the Shannon index, showing stationary biodiversity with strong month-to-month differences.

Several teams noted that the vast majority of species occurred rarely, with only a few species common across many samples, which made modeling difficult.
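One simple way to quantify this sparsity is to compute, for each species, the fraction of samples in which it appears. The sketch below assumes each sample is a mapping from species name to count; the species names, the counts, and the 25% threshold are hypothetical.

```python
def occurrence_frequency(samples):
    """Map each species to the fraction of samples in which it appears."""
    species = {name for sample in samples for name in sample}
    n_samples = len(samples)
    return {
        name: sum(1 for sample in samples if sample.get(name, 0) > 0) / n_samples
        for name in species
    }


def rare_species(samples, max_frequency=0.25):
    """Return species present in at most max_frequency of the samples."""
    frequencies = occurrence_frequency(samples)
    return sorted(name for name, f in frequencies.items() if f <= max_frequency)


# Hypothetical samples: one common species and two that each appear once.
samples = [
    {"species_a": 12, "species_b": 1},
    {"species_a": 8},
    {"species_a": 15, "species_c": 2},
    {"species_a": 9},
]
print(rare_species(samples))  # ['species_b', 'species_c']
```

Species flagged this way have mostly zero counts over time, which gives a per-species forecasting model very little signal to learn from.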

Figure 3 - Feature selection based on time series model residuals by The Bayes Bunch

After testing conventional models, another team added a more complex approach: an LSTM followed by fully connected final layers, trained both with and without five additional features (tide, weather condition, water temperature, season, and moon phase), to predict the abundance of Cladophora sp. Smooth, a green alga (Figure 4). They noted that adding these five features improved training and validation performance but did not meaningfully improve performance on the test set.

Figure 4 - Predictions of an LSTM model trained with 5 extra features: tide, weather condition, water temperature, season and moon phase on the abundance of Cladophora sp.


One team incorporated both their feature selection and their forecasting work on ocean pH into a dashboard built with Streamlit (Figure 5). Additionally, this team created an open-source Python package, beautiful-sea, aimed at scaling their findings to other marine ecosystems.

Figure 5 - Streamlit App showing forecasting of the Shannon Index based on ocean pH.

Another team created a dashboard showing many of their findings, such as feature importances derived from a CatBoost decision tree algorithm, abundance data for individual species, and their conservation status.

Figure 6 - Dashboard showing the abundance of the threatened Diplodus sp.

A third team, after noticing that invasive species seem to thrive when the ocean gets warmer, looked into existing technology that could lower the sea temperature and proposed using shade balls. Shade balls are controversial in their original purpose of saving water, because manufacturing them requires a lot of water, but this team proposed them for an alternative use in the interest of biodiversity. In any case, the focus on sea surface temperature is extremely relevant: as of writing, this metric has reached unprecedented values, with as yet unknown impact on biodiversity and on marine and other ecosystems.

Figure 7 - Global sea surface temperatures showing unprecedented values in 2023.

Social Impact

Using the insights from the teams, researchers and other interested parties can examine the biological plausibility of features for which modeling showed a high correlation with biodiversity measurements. In a second step, this work can serve as a basis for specific interventions aimed at protecting the AMPA and for raising public awareness of marine biodiversity, especially among the local population and tourists visiting the area.

One core finding was the correlation between ocean acidification and a lower Shannon diversity index, indicating less biodiversity. This is especially pertinent because climate change is causing the ocean to acidify, which suggests a potential direct link between the climate crisis and biodiversity in a local and well-studied marine area. Highlighting this connection to tourists and locals could increase climate change awareness and action by making the local impact tangible.

Open-source code
