
Scientific Data Management

Scientific data management involves collecting, storing, managing, and sharing data from scientific research. Therefore, it is essential that the process is properly planned and accompanied by appropriate tools.

Following a worldwide trend, Brazilian funding agencies are making it mandatory for project submissions seeking funding to include a scientific Data Management Plan (DMP), and for the data generated by funded projects to be made publicly available.

The Office of Provost for Research, with operational support from the Information Technology Superintendence (STI), is providing infrastructure for researchers to develop their scientific data management plans quickly and practically. In addition, researchers can make their data available on the platforms provided by STI after agreeing to a term of acceptance.

Scientific data management is a set of activities that aims to collect, store, manage, and share data from scientific research.

Effective data management enables a more rational use of resources through the reuse and sharing of data.

At USP, scientific data management is intended to assist researchers with:

  • planning, organization and security
  • documentation and sharing
  • preparation of the data sets for deposit
  • data preservation
  • copyright, licensing and intellectual property issues.

Scientific data management aims to meet the principles known as FAIR (Findable, Accessible, Interoperable, Reusable), widely adopted in scientific communities around the world. These principles state that scientific data must be findable, accessible, interoperable, and reusable. To achieve each of these goals, sub-principles are established for both the data itself and the metadata (the description of the data).

  • TO BE FINDABLE:

F1. (meta)data are assigned a globally unique and eternally persistent identifier.

F2. data are described with rich metadata.

F3. (meta)data are registered or indexed in a searchable resource.

F4. metadata specify the data identifier.

  • TO BE ACCESSIBLE:

A1. (meta)data are retrievable by their identifier using a standardized communications protocol.

A1.1. the protocol is open, free, and universally implementable.

A1.2. the protocol allows for an authentication and authorization procedure, where necessary.

A2. metadata are accessible, even when the data are no longer available.

  • TO BE INTEROPERABLE:

I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

I2. (meta)data use vocabularies that follow FAIR principles.

I3. (meta)data include qualified references to other (meta)data.

  • TO BE RE-USABLE:

R1. (meta)data have a plurality of accurate and relevant attributes.

R1.1. (meta)data are released with a clear and accessible data usage license.

R1.2. (meta)data are associated with their provenance.

R1.3. (meta)data meet domain-relevant community standards.
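
As a rough illustration of how some of these sub-principles translate into a metadata record, the sketch below checks whether a record carries an identifier (F1), a description (F2), a usage license (R1.1), and provenance (R1.2). The class, the field names, and the placeholder identifier are assumptions made for this example; they are not the schema of any USP platform or of the FAIR specification itself.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str = ""     # F1: globally unique, persistent identifier
    title: str = ""
    description: str = ""    # F2: descriptive metadata
    license: str = ""        # R1.1: data usage license
    provenance: str = ""     # R1.2: origin of the data
    keywords: list = field(default_factory=list)


def missing_fair_fields(record: DatasetMetadata) -> list:
    """Return the names of the checked fields that are still empty."""
    checked = ["identifier", "title", "description", "license", "provenance"]
    return [name for name in checked if not getattr(record, name)]


record = DatasetMetadata(
    identifier="doi:10.0000/example",  # placeholder, not a real DOI
    title="Example survey data",
    description="Illustrative record only.",
)
print(missing_fair_fields(record))
```

A check like this only verifies that the fields are filled in; whether the metadata is actually rich and accurate still depends on the researcher.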

The main reasons for the public availability of scientific data generated by your project are:

  • enable faster progress in the research area through the reuse and sharing of the generated data;
  • enable auditing and replication of experiments;
  • increase the visibility of the research;
  • comply with the obligation, set by funding agencies, to make the data publicly available.

These reasons aim to meet the principles known as FAIR (Findable, Accessible, Interoperable, Reusable), described above.

The Data Management Plan (DMP) is a formal document related to the research project that should answer two basic questions:

1. What data is generated?

The researcher must state the format of the data and briefly describe it, so that those who want to use the data can understand it. The data generated may, for example, take the form of spreadsheets, text, or digital databases.

2. How and where is this data stored and made available?

The researcher must state where the data will be stored and made available, which may be a public repository or the researcher's private repository. The answer to this question should address any ethical and legal aspects that may be involved.

In its simplest form, a Data Management Plan (DMP) can be written in a text editor by answering the two questions above.

Example of a simplified model of Data Management Plan (PDF format)
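
The two basic questions can also be captured in a small script that produces a plain-text plan. This is only a minimal sketch: the function name, its parameters, and the sample values are hypothetical, and the layout is not an official USP or FAPESP template.

```python
def render_dmp(project: str, data_description: str, data_format: str,
               storage: str, access_notes: str) -> str:
    """Build a plain-text DMP that answers the two basic questions."""
    lines = [
        f"Data Management Plan: {project}",
        "",
        "1. What data is generated?",
        f"   Description: {data_description}",
        f"   Format: {data_format}",
        "",
        "2. How and where is this data stored and made available?",
        f"   Repository: {storage}",
        f"   Ethical and legal notes: {access_notes}",
    ]
    return "\n".join(lines)


# Sample values are invented for illustration.
plan = render_dmp(
    project="Example project",
    data_description="Survey responses (illustrative)",
    data_format="Spreadsheet (CSV)",
    storage="Public institutional repository",
    access_notes="Personal data removed before deposit",
)
print(plan)
```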

There are tools available on the Internet for developing DMPs, which help draft the text by following templates (questions that the user answers). These tools make it possible to publish and share the DMP on the Internet, as well as to edit it collaboratively and print it.

USP has become an affiliate of dmptool.org, the organization that provides DMPTool for creating DMPs quickly and practically. STI has configured the tool so that USP researchers can answer questions (in Portuguese or English) that result in a DMP.

To access DMPTool, go to: https://dmptool.org/

Instructions for using DMPTool:

Use option 1 (Option 1: If your institution is affiliated with DMPTool).

Click the "Your institution" button and enter "University of São Paulo". Then press the "Go" button.

For FAPESP's instructions on how to prepare a Data Management Plan, see: http://www.fapesp.br/gestaodedados/

The data of USP researchers (in any format) may be made available by USP, which will be responsible for keeping it secure for a certain period of time. In addition to the data itself, the researcher should provide metadata (a description of the data) to facilitate its understanding and reuse. Currently, USP offers three platforms built on free software:

CKAN

Dataverse

DSpace

See also the Frequently Asked Questions (FAQ) section

Some USP researchers take part in a Working Group led by FAPESP that aims to establish guidelines for managing the scientific data of projects funded by that agency. In this context, USP is participating, together with other universities, in the development of a metasearch engine that aims to gather in a single place the scientific data made available by researchers from the institutions involved.

Access to the metasearch engine prototype: http://metabuscador.sc.usp.br/