From policy to practice – Opening research data on phenology

31.10.2017 Olli-Pekka Mattila, Kristin Böttcher, Kaisu Harju
The group works at SYKE’s Data and Information Centre. Researcher Kristin Böttcher developed the methodology for remote sensing of vegetation phenology. GIS expert Kaisu Harju created the web map application and took care of open data delivery. Olli-Pekka Mattila is managing the development of tools for sharing research data in the Envibase project, including the interface for creating the metadata for the showcased data.

SYKE released a new research data policy at the end of September. The aim of such a policy is to tell researchers what the institute expects to be done with the datasets produced in research and development projects. The policy also gives researchers guidelines on how the data can be further distributed, and it helps the institute communicate its data handling practices to partners and other stakeholders. One of the key principles is that datasets from research and development projects should be openly available. This enables the results of research to be used further by fellow researchers and anybody else interested. The data also needs a story of how it was conceived. This brings up another key principle of the data policy: the need for systematic metadata describing the available datasets.

Recently, new datasets were added to SYKE’s Open Data services. Among them was a 16-year time series of the start of the photosynthetically active period for coniferous forests and deciduous vegetation. The methodology and datasets are the result of two EC Life+ projects, Snowcarbo (LIFE07ENV/FIN/000133) and Monimet (LIFE12ENV/FI/000409). Although generating the data was only one part of these projects, it indicates that significant resources have been allocated to producing the data. What is the data good for, then?

Initially, in the Snowcarbo project, the methodology was developed to create an indicator dataset for comparison with carbon balance modelling, for which spatially extensive validation could not be obtained from in situ observations alone. Remote sensing provided a means to make an indicative comparison with the timing of the start of photosynthetic activity. Later, the data was also used as a predictor for modelling the peak flight period of moths. The question arises: what else could be done with the dataset? Who else would be able and willing to use the data if it were easily accessible and well documented?

The open data movement is spreading rapidly from government institutes and other public authorities to the research community. The idea of sharing research data openly is not new, but it has only recently matured, both in attitudes and technically. There are still many open questions about the most efficient and useful ways of providing data to fellow researchers and other users. The likely answer is that there are different kinds of users and therefore different needs. With an abundance of research data resources opening up, we might also want to market and showcase our own data to attract users and new collaboration.

The vegetation phenology datasets were published through an open standard interface, the OGC Web Map Service (WMS). This means that the data can be shown in any web map application or GIS software that supports the standard interface. The data is not downloaded but displayed directly from the source. A simple web map application was created for viewing and browsing the data. The application also provides links to relevant publications, the metadata and the site for downloading the data. For further analysis of vegetation phenology, the datasets can be downloaded from SYKE’s open data web service as GeoTIFF files. In addition, metadata descriptions are available for both the datasets and the web map application, providing further information that supports the further use of the datasets.
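To illustrate what "shown directly from the source" means in practice, the following is a minimal sketch of how a client could compose a WMS GetMap request for a map image. The query parameters are those defined by the WMS 1.3.0 standard; the endpoint URL and the layer name are hypothetical placeholders and not the actual addresses of SYKE's service, which should be taken from the metadata.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=512, height=512,
                     image_format="image/png"):
    """Compose a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees; CRS:84 is
    used because it has longitude/latitude axis order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
        "TRANSPARENT": "TRUE",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
EXAMPLE_ENDPOINT = "https://example.org/wms"
EXAMPLE_LAYER = "phenology_start_2016"

# Rough bounding box around Finland (lon/lat degrees).
url = build_getmap_url(EXAMPLE_ENDPOINT, EXAMPLE_LAYER, (19.0, 59.0, 32.0, 71.0))
print(url)
```

Opening such a URL in a browser, or adding the service address to GIS software, would return a rendered image rather than the underlying raster values, which is why the GeoTIFF download is the route for further analysis.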

In addition to the existing tools for displaying and sharing research data, such as ArcGIS Online, which was used for the web map application here, the Envibase project is setting up new tools for storing, sharing and describing research datasets. The opening of the vegetation phenology dataset is a good example of implementing SYKE’s new data policy on datasets created in research and development projects. Now we’ll just have to wait and see whether showcasing the data and making it easily accessible attracts a new audience and interested researchers. The life cycle of research data should not end with the first peer-reviewed publication; rather, that publication should be the starting point for making full use of the time and effort invested in creating the data.

Vegetation Phenology 2001-2016

Screenshot from the web map application created for the phenology dataset.

Opinions of blog contributors do not necessarily reflect the official views and opinions of the Finnish Environment Institute.

