OpenAire Guide for Researchers: How to make your data FAIR
FORCE11's Guiding Principles for Findable, Accessible, Interoperable, and Re-Usable Data
"How FAIR are your data?" checklist, by Sarah Jones & Marjan Grootveld, EUDAT
Crosas, M. et al (2018) "Data policies of highly-ranked social science journals"
Research data management (RDM) concerns the organisation of data, from its entry to the research cycle through to the dissemination and archiving of valuable results. RDM helps researchers navigate the increasingly complex landscape of data planning, storage, and sharing. It is part of the research process and aims to make that process as efficient as possible while meeting the expectations and requirements of the university, research funders, and legislation.
RDM concerns how you:
Canadian Institutes of Health Research (CIHR)
"deposit bioinformatics, atomic, and molecular coordinate data into the appropriate public database (e.g. gene sequences deposited in GenBank) immediately upon publication of research results"
"retain original data sets for a minimum of five years (or longer if other policies apply)"
Many journals, especially journals with higher impact factors or journals associated with major publishers, have instituted policies regarding the availability of research data underlying publications. In many cases, journals require that data are made openly available as a condition for publishing an article.
The following list covers a few major publishers as examples:
Open data/research is the practice of research in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods. Open research data is data that can be freely accessed, reused, remixed and redistributed, for academic research and teaching purposes and beyond. Ideally, open data have no restrictions on reuse or redistribution, and are appropriately licensed as such. Openly sharing data exposes it to inspection, forming the basis for research verification and reproducibility, and opens up a pathway to wider collaboration.
However, there are also special considerations: not all data can or should be open. For example, limited restrictions on access may be implemented to maintain Indigenous Knowledge sovereignty and Indigenous Data sovereignty (see the CARE principles below on this page), or to protect the identity of human subjects.
Read more about Open research: https://book.fosteropenscience.eu/ (CC-0)
Since the publication in 2016 of "FAIR Guiding Principles for scientific data management and stewardship" in Scientific Data, the best practice for managing data is to adhere to the FAIR principles. The FAIR principles are a framework for ensuring that data collected by researchers across all disciplines and fields meet specific standards to promote open science, reproducibility of research, and maximize the benefits of research to academia and society.
The following description of the FAIR principles is taken directly from https://www.go-fair.org/fair-principles/
Findability:
The first step in (re)using data is to find them. Metadata (the description of the data) and data should be easy to find for both humans and computers. This means assigning a persistent identifier (PID) to the data/dataset (usually in the form of a digital object identifier, or DOI). Identifiers consist of an internet link (e.g., a URL that resolves to a web page where the data are located). Identifiers will help others to properly cite your work when reusing your data.
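As a small illustration of how a DOI works as a resolvable link, the sketch below turns a DOI string into its https://doi.org/ URL. The function name doi_to_url and the simplified pattern check are assumptions made for this example, not part of any official tool; the DOI shown is that of the 2016 FAIR Guiding Principles article in Scientific Data.

```python
import re

# Pragmatic pattern for modern DOIs: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is a simplified check for illustration, not a full validation of DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ link for a DOI, accepting a few common prefixes."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


# DOI of the 2016 FAIR Guiding Principles article.
print(doi_to_url("doi:10.1038/sdata.2016.18"))
# prints https://doi.org/10.1038/sdata.2016.18
```

Citing the resulting link rather than a repository page address means the reference keeps working even if the dataset's landing page moves.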
Accessibility:
Once the user finds the required data, they need to know how the data can be accessed, possibly including authentication and authorisation. This does not necessarily mean that data should be open. There are many reasons to restrict access to data (e.g. the data contain personally identifiable information (PII), are proprietary/licensed as intellectual property (IP), or contain other sensitive information). Accessibility essentially means that it should be clear under what conditions access is allowed. The rule with accessibility can be distilled to: "As Open as Possible, as Closed as Necessary"
Interoperability:
Interoperability refers to the ease with which data can be integrated with other/new data. In practice, storing data in open formats makes it easier to integrate new data later, whereas storing data in proprietary formats hinders this effort. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing. This means that, when possible, it's best practice to use standardized vocabularies/variable labels/terms.
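The sketch below shows one way this could look in practice: ad hoc column labels are renamed to agreed variable names and the result is written as CSV, an open plain-text format. The mapping STANDARD_NAMES, the variable names, and the sample rows are all hypothetical and chosen only for illustration; a real project would take its terms from a vocabulary used in its own discipline.

```python
import csv
import io

# Hypothetical mapping from ad hoc column labels to agreed, documented variable names.
STANDARD_NAMES = {
    "Temp (C)": "air_temperature_celsius",
    "Date collected": "collection_date",
    "Site": "site_id",
}


def standardise_rows(rows):
    """Rename the keys of each row using STANDARD_NAMES; unknown keys are kept unchanged."""
    return [{STANDARD_NAMES.get(key, key): value for key, value in row.items()} for row in rows]


def to_csv(rows):
    """Serialise a list of dict rows as CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample records, as they might come out of a spreadsheet.
raw = [
    {"Site": "A1", "Date collected": "2023-06-01", "Temp (C)": "18.5"},
    {"Site": "B2", "Date collected": "2023-06-02", "Temp (C)": "20.1"},
]
print(to_csv(standardise_rows(raw)))
```

Keeping the mapping in one place also documents how local labels relate to the standard terms, which is useful to record alongside the data.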
Reusability:
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings. In practice, this involves creating a README file with details on how to clean, transform, or manage the data, if applicable. This also involves applying a license to let others know if the data are public domain or if copyright is retained to some degree or completely.
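A README does not need special tooling; a plain text file is enough. The sketch below assembles one from a few fields so that the description, processing steps, and license are always present. The function build_readme, its field names, and the example values are hypothetical; the layout is only one possible template, not a required standard.

```python
def build_readme(title, creators, description, processing_steps, license_name):
    """Return README text covering what the data are, how they were processed, and reuse terms."""
    lines = [
        f"Title: {title}",
        f"Creators: {', '.join(creators)}",
        "",
        "Description:",
        description,
        "",
        "Processing steps:",
    ]
    # Number the cleaning/transformation steps so others can follow them in order.
    lines += [f"{number}. {step}" for number, step in enumerate(processing_steps, start=1)]
    lines += ["", f"License: {license_name}"]
    return "\n".join(lines) + "\n"


# Made-up example values for illustration only.
readme_text = build_readme(
    title="Example field temperature measurements",
    creators=["A. Researcher", "B. Collaborator"],
    description="Daily air temperature readings from two hypothetical sites.",
    processing_steps=[
        "Removed rows with missing temperature values.",
        "Converted dates to ISO 8601 (YYYY-MM-DD).",
    ],
    license_name="CC BY 4.0",
)
print(readme_text)
```

The text can then be saved as README.txt next to the data files before deposit.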
The CARE Principles for Indigenous Data Governance are people and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination. These principles complement the existing FAIR principles (www.go-fair.org) encouraging open and other data movements to consider both people and purpose in their advocacy and pursuits.
Collective Benefit:
Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.
Authority to Control:
Indigenous Peoples’ rights and interests in Indigenous data must be recognised and their authority to control such data be empowered. Indigenous data governance enables Indigenous Peoples and governing bodies to determine how Indigenous Peoples, as well as Indigenous lands, territories, resources, knowledges and geographical indicators, are represented and identified within data.
Responsibility:
Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples’ self-determination and collective benefit. Accountability requires meaningful and openly available evidence of these efforts and the benefits accruing to Indigenous Peoples.
Ethics:
Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle and across the data ecosystem.
To learn about the principles of Ownership, Control, Access, and Possession, please visit https://fnigc.ca/ocap-training/
The following information is quoted from the First Nations Information Governance Centre's website:
"OCAP® asserts that First Nations alone have control over data collection processes in their communities, and that they own and control how this information can be stored, interpreted, used, or shared.
Ownership refers to the relationship of First Nations to their cultural knowledge, data, and information. This principle states that a community or group owns information collectively in the same way that an individual owns his or her personal information.
Control affirms that First Nations, their communities, and representative bodies are within their rights in seeking control over all aspects of research and information management processes that impact them. First Nations control of research can include all stages of a particular research project, from start to finish. The principle extends to the control of resources and review processes, the planning process, management of the information and so on.
Access refers to the fact that First Nations must have access to information and data about themselves and their communities regardless of where it is held. The principle of access also refers to the right of First Nations’ communities and organizations to manage and make decisions regarding access to their collective information. This may be achieved, in practice, through standardized, formal protocols.
Possession: While ownership identifies the relationship between a people and their information in principle, possession or stewardship is more concrete: it refers to the physical control of data. Possession is the mechanism by which ownership can be asserted and protected."
Please note: “OCAP® is a registered trademark of the First Nations Information Governance Centre (FNIGC)”