CRIS
CRIS stands for Current Research Information System and is the research information system at FAU. It stores information about research output, such as publications, research projects, research data and inventions. Only metadata is stored: the full text of a publication, for example, is not held in CRIS itself, but CRIS indicates where the full text can be found (for example, by specifying the DOI).
In research information systems, the various data areas are linked: publications are assigned not only to persons but also to projects, and projects are in turn assigned to specific research areas. Internal university data sources are also used for this purpose, which offers added value compared with conventional list formats and data providers such as Scopus or Web of Science.
Data management plan
According to forschungsdaten.info, a data management plan (DMP) structures the handling of research data, or its "collection, saving, documentation, maintenance, processing, transfer, publication and storage, as well as the necessary resources, legal framework and responsible persons." A DMP thus documents the entire data lifecycle.
Many third-party funding organizations (DFG, FWF, SNSF, Horizon Europe, Volkswagen Foundation) expect applications in certain funding lines to include information on the handling of research data.
The DMP describes how research data is handled from the planning stage through collection to long-term archiving or, where applicable, planned deletion. At a minimum, the data management plan answers the following questions:
- What is collected?
- Which bodies must be consulted before collecting data?
- In what form and where will research data be stored in the various project phases?
- Who can access the data and when will it be available?
- Who is responsible for the individual steps?
- Which legal requirements must be observed?
The DMP is a useful and necessary part of the project application.
What exactly does this mean for research? Why this approach is meaningful and sustainable is explained in this video.
Copyright
Literary, artistic and scientific works are protected by German copyright law.
In practice, this means that without a corresponding license, a work can be reused only to a limited extent.
We recommend licensing that is as open as possible, because this increases data reusability and boosts the reputation of researchers. If you have any questions, the CDI will be happy to advise you.
Read this article for further details.
Data organization
For many scientists, dealing with research data is the basis of their daily work. It therefore saves time and effort if this data is efficiently structured, documented and backed up from the outset.
Most data is initially stored in files. Files have different types, or file formats, which are often identified by the file name extension, for example in the Windows operating system. Files are in turn stored in directories (folders). Naming files and directories systematically is very important; see, for example, the Stanford File Naming Handout.
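Systematic naming can be automated so that every file in a project follows the same pattern. The following Python sketch is one possible illustration: the convention it uses (date first, hyphenated lowercase parts, zero-padded version number) and the function name build_filename are assumptions made for this example, not rules taken from a particular guide.

```python
import re
from datetime import date


def build_filename(project: str, description: str, day: date,
                   version: int, extension: str) -> str:
    """Assemble a file name from fixed components in a fixed order.

    The pattern YYYYMMDD_project_description_vNN.ext is an illustrative
    assumption; adapt it to the conventions agreed on in your project.
    """

    def slug(text: str) -> str:
        # Lowercase the text and replace every run of characters other
        # than a-z and 0-9 with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".").lower()
    return f"{day:%Y%m%d}_{slug(project)}_{slug(description)}_v{version:02d}.{ext}"


if __name__ == "__main__":
    print(build_filename("Soil Study", "pH measurements",
                         date(2024, 3, 5), 2, ".CSV"))
```

Putting the date first in year-month-day order means that an alphabetical listing of the directory is also a chronological one.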
Alternatively, data can be stored in databases. This takes more effort, because a database management system such as MySQL must first be set up, and a database schema that provides the structure for storing the data must be defined. Here, too, naming is very important. Databases support managed, shared access to data much better than files do. There are different types of databases: relational, hierarchical, graph-based, RDF triple stores and a few more.
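To give an impression of what defining a schema involves, here is a minimal sketch of a relational schema with two linked tables. It uses SQLite from the Python standard library instead of MySQL, simply because SQLite needs no server setup; the table and column names (sample, measurement and so on) are hypothetical.

```python
import sqlite3


def create_example_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE sample (
            sample_id    INTEGER PRIMARY KEY,
            label        TEXT NOT NULL UNIQUE,
            collected_on TEXT NOT NULL          -- ISO date, e.g. 2024-03-05
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            sample_id      INTEGER NOT NULL REFERENCES sample(sample_id),
            quantity       TEXT NOT NULL,
            value          REAL NOT NULL,
            unit           TEXT NOT NULL
        );
        """
    )
    return conn


if __name__ == "__main__":
    conn = create_example_db()
    conn.execute(
        "INSERT INTO sample (label, collected_on) VALUES (?, ?)",
        ("S-001", "2024-03-05"),
    )
    conn.execute(
        "INSERT INTO measurement (sample_id, quantity, value, unit) "
        "VALUES (?, ?, ?, ?)",
        (1, "temperature", 21.5, "degC"),
    )
    query = (
        "SELECT s.label, m.quantity, m.value, m.unit "
        "FROM measurement AS m JOIN sample AS s ON s.sample_id = m.sample_id"
    )
    for row in conn.execute(query):
        print(row)
```

Because every measurement must reference an existing sample, the database itself rejects orphaned records, which is one of the advantages over loosely related files.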
The following animated video clearly summarizes the topic of data organization.
FAIR principles
The FAIR principles are requirements intended to ensure that research data is sustainable and reusable. The acronym FAIR stands for Findable, Accessible, Interoperable and Reusable. A number of research funders (including the EU, the DFG and the SNSF) regard the FAIR principles as an important requirement for sustainable research and therefore expect researchers to comply with them. Persistent identifiers and detailed metadata are considered particularly important for making data findable. Using standards for interfaces, metadata and data supports the accessibility and interoperability of data. Extensive content-related metadata, documentation and clear reuse rights make data easier to reuse. Data does not have to be open in order to meet the FAIR principles, but its metadata ought to be freely accessible. Compliance with these guidelines is intended to ensure "machine-actionability": a computer-aided system can find, access and reuse the digital objects with minimal human input.
Persistent Identifiers (PI)
Persistent identifiers (PI) are long-lasting references to digital resources. A PI is a unique name for a digital object of any kind (essays, data, software, etc., and in research data management especially data records). This name, usually a fairly long sequence of digits and/or alphanumeric characters, is linked to the web URL of the digital resource. If the URL of the resource changes, only the address to which the PI refers has to be updated, while the PI itself stays the same. This ensures, for example, that a resource cited by its PI can still be found even if its storage location has changed. Examples of persistent identifiers are digital object identifiers (DOI), uniform resource names (URN) and handles.
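The indirection described above can be sketched in a few lines of Python. This is a simplified model and not how DOI, URN or handle resolvers are actually implemented; the class IdentifierRegistry, the identifier "pid:example/dataset-42" and the example.org URLs are all made up for illustration.

```python
class IdentifierRegistry:
    """Toy resolver: maps a persistent identifier to its current URL."""

    def __init__(self):
        self._targets = {}

    def register(self, pid: str, url: str) -> None:
        # An identifier is assigned once and never reused.
        if pid in self._targets:
            raise ValueError(f"{pid} is already registered")
        self._targets[pid] = url

    def update_target(self, pid: str, new_url: str) -> None:
        # Only the address changes; the identifier stays the same.
        if pid not in self._targets:
            raise KeyError(pid)
        self._targets[pid] = new_url

    def resolve(self, pid: str) -> str:
        return self._targets[pid]


if __name__ == "__main__":
    registry = IdentifierRegistry()
    pid = "pid:example/dataset-42"  # hypothetical identifier
    registry.register(pid, "https://old-server.example.org/data/42")
    # The data moves to a new server; citations keep using the same PI.
    registry.update_target(pid, "https://repository.example.org/records/42")
    print(registry.resolve(pid))
```

A citation only ever contains the identifier, so it remains valid as long as the registry entry is kept up to date.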
Using a specific example, this video clearly explains what persistent identifiers are.
Data publication
To ensure transparent research and traceable results, research data should be published wherever possible. To comply with the FAIR principles, the corresponding metadata must be recorded. A repository is required to publish research data.
FAUWissKICloud
The purpose of FAUWissKICloud is to host and maintain the FAU WissKI instances. The CDI manages the software and RRZE is responsible for maintaining the hardware. WissKIs are maintained and updated using the WissKI Distillery.
Ontology
An ontology is a system of terms that attempts to relate, and thereby define, all concepts of a subject area as far as possible. Relationships include "superordinate term – subordinate term", "whole – component" and "means the same as" (synonym). The terms are identified not simply by words but, more precisely and unambiguously, by URIs.
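As a rough illustration, such relationships can be written as triples of subject, relationship and object, each identified by a URI. The sketch below uses plain Python tuples rather than an ontology toolkit. The relationship URIs come from the SKOS and Dublin Core vocabularies, chosen here as one possible way to express the three relationship types; the terms under https://example.org/terms/ are invented for this example.

```python
# Relationship URIs from existing vocabularies (SKOS, Dublin Core Terms).
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
DCT_HAS_PART = "http://purl.org/dc/terms/hasPart"

# Hypothetical namespace for the example terms.
EX = "https://example.org/terms/"

TRIPLES = [
    # subordinate term -> superordinate term
    (EX + "violin", SKOS_BROADER, EX + "string-instrument"),
    (EX + "string-instrument", SKOS_BROADER, EX + "musical-instrument"),
    # whole -> component
    (EX + "violin", DCT_HAS_PART, EX + "fingerboard"),
    # means the same as
    (EX + "violin", SKOS_EXACT_MATCH, EX + "fiddle"),
]


def superordinate_terms(term, triples):
    """Return all direct and indirect superordinate terms of a term."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == SKOS_BROADER and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


if __name__ == "__main__":
    for uri in superordinate_terms(EX + "violin", TRIPLES):
        print(uri)
```

Because every term is a URI rather than a bare word, two projects that both use the word "violin" cannot be confused unless they deliberately refer to the same URI.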
eLabFTW
eLabFTW is open source software used as an electronic lab notebook (ELN), data management platform and laboratory inventory management system.
Data can be exported in various formats, such as JSON and CSV, which makes it easier to import into another system.
