Data Management Plan Guidance
Introduction
Data management plans (DMPs) are highly contextual to research field, project, and mode of inquiry, but every DMP should address several key topics: data collection; documentation and metadata; storage, deposit and preservation; sharing and reuse; responsibilities and resources; and ethics and legal compliance. See "Data Management Plans – Guidance on Specific Sections" below for guidance on completing each of these sections.
Typically, an ideal DMP will be complete, precise, and in line with disciplinary best practices. Shortcomings in a DMP will normally stem from lacking one of these features – e.g., it will not discuss data deposit (incomplete), or it will not say where data will be deposited (imprecise), or the chosen repository is poorly suited for the data (not in line with disciplinary best practices). CIHR recognizes that for many research fields, data management practices are in development and disciplinary best practices have not yet been established (e.g., preferred repositories, metadata standards, etc.).
DMPs should describe how data will be FAIR – findable, accessible, interoperable, and reusable. This does not mean that the DMP needs to include a section specifically devoted to making data FAIR. Rather, a DMP that addresses each section completely, precisely, and in line with disciplinary best practices (where they exist) will describe how the data will be FAIR. CIHR recognizes that the extent to which data can be FAIR could be constrained by infrastructure limitations (e.g., lack of suitable repositories) and disciplinary practices (e.g., metadata standards have not been established). CIHR also recognizes that 'accessible' data is not synonymous with 'open' data. In many instances, due to ethical, commercial or legal obligations, access to data will need to be controlled; and in some instances, access to data cannot be provided at all. See CIHR's guidance on how to make data FAIR.
For research conducted by and with First Nations, Inuit and Métis communities, DMPs should be co-developed with these communities, in accordance with research data management principles that they accept, such as the CARE (collective benefit, authority to control, responsibility, and ethics) or OCAP® (ownership, control, access and possession) principles. Where co-development is not possible, the DMP should respect Indigenous data sovereignty, include considerations related to Indigenous research data management, and acknowledge the possibility that the DMP will be revised in line with the community's values and principles.
Data Management Plans – Guidance on Specific Sections
Data Collection
- Explain what data will be collected, created or used, and how – e.g., through observational studies, experiments, simulations, and the use of specific software or tools (e.g., REDCap).
Possible Shortcomings to avoid in this Section:
- The plan describes what data will be created but does not identify the software or platform being used to generate the data.
Documentation and Metadata
- Explain what information will be provided to help others understand and reuse the data – e.g., a 'readme' text file, code books, or lab notebooks. Ideally, dataset documentation should be provided in machine-readable, openly accessible formats (e.g., .csv, .txt file formats).
- Where possible and applicable, state what metadata standard will be followed. The standard can be general (e.g., Dublin Core), but will ideally be domain-specific (in which case, it may be supported by the repository where you plan to deposit the data). Stating the metadata standard will be particularly pertinent to research teams planning to establish a data platform or hub.
Possible Shortcomings to avoid in this Section:
- The plan commits to adequately documenting the data but does not explain how this will be done – e.g., whether through a 'readme' text file, a code book, or another form of documentation.
- The plan does not specify whether a metadata standard will be used or not, and why.
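As an illustration of the documentation described above, the following is a minimal sketch of how a research team might generate a plain-text readme and a machine-readable code book from a single metadata record. The element names are those of the general Dublin Core standard; the function names, the readme layout, and all example values are hypothetical and are not prescribed by CIHR. A domain-specific standard, where one exists, would replace the element list used here.

```python
import csv
import io

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_elements(record):
    """Return, sorted, any keys in `record` that are not Dublin Core elements."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)


def build_readme(record, variables):
    """Render a plain-text readme from a metadata record and a variable list.

    `variables` is a list of (name, unit, description) tuples.
    """
    lines = ["DATASET README", ""]
    for element, value in record.items():
        lines.append(f"{element}: {value}")
    lines += ["", "VARIABLES"]
    for name, unit, description in variables:
        lines.append(f"- {name} [{unit}]: {description}")
    return "\n".join(lines) + "\n"


def build_codebook_csv(variables):
    """Render the same variable list as a machine-readable CSV code book."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["variable", "unit", "description"])
    writer.writerows(variables)
    return buffer.getvalue()


# Hypothetical example values, for illustration only.
record = {
    "title": "Example survey dataset",
    "creator": "Example Research Team",
    "date": "2024-01-31",
    "format": "text/csv",
    "rights": "CC BY 4.0",
}
variables = [
    ("participant_id", "none", "Pseudonymous participant identifier"),
    ("age", "years", "Age at enrolment"),
]

readme_text = build_readme(record, variables)
codebook_text = build_codebook_csv(variables)
```

Keeping the variable list in one place and rendering both files from it is one way to avoid the readme and the code book drifting apart; a DMP could simply state that such files will accompany the deposited data.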
Storage, Deposit and Preservation
- Explain where and how data will be stored and backed up during the research project. Often this will be at your institution, but sometimes storage during the research project will be supplied by external providers, particularly if you plan to collaborate with industry or other non-academic partners.
- Explain where data will be retained and deposited following completion of the research project, and for how long. Sometimes researchers will decide to keep the data in their institution's repository; sometimes they will deposit the data in an external repository; and sometimes they will choose to keep some data at their institution, and deposit other data in an external repository. It is important to decide this at the outset of the project so that a clear plan is in place in case the NPI retires or moves to another institution.
Possible Shortcomings to avoid in this Section:
- The plan indicates that the data will be stored and backed up during the project, but it lacks details regarding the location and procedures being followed.
- The plan indicates that data will be deposited, but it does not specify where and/or for how long; or it names the repository, but the repository is inadequate or not aligned with disciplinary best practices.
Sharing and Reuse
- Explain which data will be shared, and in what form (e.g., raw, processed). If applicable, explain whether data are subject to access controls or limitations due to confidentiality, privacy, and/or intellectual property considerations.
- If applicable, describe the data access procedures.
Possible Shortcomings to avoid in this Section:
- The plan says that data cannot be shared due to privacy or research ethics board (REB) requirements but does not explain why, or the explanation is not compelling.
- The plan does not say how data will be made available to others – e.g., whether the repository makes the data available to anyone on the web or whether access will be controlled, or whether the data will be assigned a persistent identifier to be included in publications.
Responsibilities and Resources
- Identify who will be responsible for managing the project's data during and after the project, and the major data management tasks for which they will be responsible.
- Provide an estimate of costs related to data management – e.g., data curation, file storage and backup, archiving.
Possible Shortcomings to avoid in this Section:
- The plan does not say who will be responsible for managing the project's data (e.g., overall responsibility, and/or responsibility for specific tasks), or these responsibilities are not reflected in other application documents (e.g., participant table).
- The plan provides estimated costs, but they are not included in, or do not align with, the project's overall budget (submitted as a separate document).
Ethics and Legal Compliance
- Describe any ethical or legal obligations the data are subject to. If the project includes sensitive data, describe how these data will be securely managed.
- The DMP should explain, if applicable, potential risks of managing and sharing sensitive data (e.g., risks to privacy or to the terms of participant consent) and describe how these risks will be mitigated.
Possible Shortcomings to avoid in this Section:
- The plan indicates that sensitive data will be subject to controlled access, but the procedures for accessing controlled data are not explained.
- The plan mentions privacy risks but does not explain how they will be mitigated.
How to Make Your Data FAIR – 5 Essential Steps
Researchers should follow international best practices for ensuring that research data are shared in a manner such that they are Findable, Accessible, Interoperable and Reusable, referred to as the FAIR principles. This document provides guidance on how to make research data FAIR.
The FAIR principles do not concern, and this document does not provide guidance on, ethical issues and approaches that must be considered to determine whether and how access should be provided to research data. For information and guidance on ethical issues related to data sharing, please reach out to data librarians and/or research ethics officers at your institution.
For research conducted by and with First Nations, Métis and Inuit communities, data should be made FAIR only with the knowledge and consent of the Indigenous community and in accordance with research data management principles the community accepts, such as the CARE (collective benefit, authority to control, responsibility, and ethics) and OCAP® (ownership, control, access and possession) principles.
The list below gives a quick summary of how to make data FAIR in five steps. For more detailed and technical guidance, consult the FAIR Principles webpage at GO FAIR.
Making Your Data FAIR – 5 Essential Steps
- Create or save a version of your data using a commonly understood and non-proprietary file format (e.g., .txt, .csv). If your research field has data standards or expectations on data formats, follow them.
- Deposit your data in a domain repository that is recognized in your research field. If no domain repository exists, choose a generalist one (e.g., FRDR, Zenodo, Dryad). The repository should be indexed in leading database aggregators (e.g., OpenAIRE, Google Dataset Search, DataMed).
- The repository should use a metadata standard – follow the metadata standard when completing the metadata record for the dataset. You can use online tools such as the CEDAR Workbench to help you complete the metadata record without mistakes.
- Choose an appropriate license for re-use of your data.
- The repository should assign a persistent identifier (PID) to the dataset and/or metadata record – for example, a Digital Object Identifier (DOI). When you publish a paper that relies on the data, ensure that the paper has a data availability statement and includes the PID in the statement.
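The first and last steps above can be sketched in code. The example below is a minimal illustration, assuming tabular data held in memory: it writes the data in a non-proprietary .csv format and drafts a data availability statement that includes the dataset's DOI. The function names, the statement wording, the DOI shape check (a heuristic, not an official validation rule), and the DOI value itself are hypothetical placeholders; the repository assigns the real identifier on deposit, and journals may require their own statement wording.

```python
import csv
import io
import re

# Heuristic shape check only: "10.", a registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def rows_to_csv(header, rows):
    """Serialize tabular data to CSV text, a non-proprietary format."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def data_availability_statement(repository, doi, license_name):
    """Draft a statement that cites the dataset by its persistent identifier."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a plausible DOI: {doi!r}")
    return (
        f"The data supporting this article are available in {repository} "
        f"at https://doi.org/{doi} under a {license_name} license."
    )


# Hypothetical example values, for illustration only.
csv_text = rows_to_csv(["sample_id", "value"], [["S1", 0.5], ["S2", 1.25]])
statement = data_availability_statement(
    "Zenodo", "10.1234/example.5678", "CC BY 4.0"
)
```

Writing the DOI as a full https://doi.org/ link in the statement lets readers resolve the identifier directly from the paper.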