
NIH DMSP Guide – Element 1: Data Type

This article provides a comprehensive, step-by-step guide to Element 1, Data Type, of a Data Management and Sharing (DMS) Plan under the National Institutes of Health (NIH) DMS Policy.


Suggestions: Review the Overview of 2023 NIH Data Management and Sharing Policy for information on all areas and requirements for submitting an NIH DMS Plan with your grant application.

For example plans, see List of Sample Data Management and Sharing Plans.


Element 1: Data Type – Content

Requirement 1: Examples, Fill-in-the-Blank Template, NIH Guidance
Requirement 2: Examples, Fill-in-the-Blank Template, NIH Guidance
Requirement 3: Examples, Fill-in-the-Blank Template, NIH Guidance
Additional Resources

Element 1: Requirement 1

Summarize the types and estimated amount of scientific data expected to be generated in the project.

Describe data in general terms that address the type and amount/size of scientific data expected to be collected and used in the project (e.g., 256-channel EEG data and fMRI images from ~50 research participants). Descriptions may indicate the data modality (e.g., imaging, genomic, mobile, survey); level of aggregation (e.g., individual, aggregated, summarized); and/or the degree of data processing that has occurred (i.e., how raw or processed the data will be).
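One way to arrive at the size estimate this requirement asks for is to multiply expected file counts by typical file sizes and add a margin. The sketch below illustrates that arithmetic only. The source labels, file counts, per-file sizes, and the 20% margin are hypothetical placeholders, not values from NIH guidance or from the sample plan, and should be replaced with figures from your own instruments and pilot data.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    label: str                 # free-text description of the file type
    files_per_experiment: int  # hypothetical count, replace with your own
    mb_per_file: float         # hypothetical typical file size in megabytes


def estimate_total_gb(sources, experiments, safety_margin=0.2):
    """Return a rough projected total in gigabytes (1 GB = 1000 MB).

    safety_margin is an assumed allowance for reruns and intermediate files.
    """
    mb_per_experiment = sum(s.files_per_experiment * s.mb_per_file for s in sources)
    total_mb = mb_per_experiment * experiments
    return total_mb * (1 + safety_margin) / 1000


# Hypothetical inputs for illustration only.
sources = [
    DataSource("confocal images (.czi)", 40, 500.0),
    DataSource("light microscopy images (.tiff)", 100, 25.0),
    DataSource("qPCR result tables (.csv)", 10, 0.5),
]

estimate = estimate_total_gb(sources, experiments=3)
print(f"Projected size: about {estimate:.0f} GB")
```

Because the plan only needs a general estimate, rounding the result to a convenient figure is usually sufficient.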

Element 1: Requirement 1 Examples

Sample Plan Text  

Basic sciences data: In this proposed project, data will be generated via the following methods: cell culture, light microscopy, confocal microscopy, real-time quantitative polymerase chain reaction (PCR), and stereological counting techniques. These data will be collected from a minimum of three independent experiments, with each independent experiment consisting of three groups: wild-type (Rest+/+), heterozygous (Rest+/–), and homozygous (Rest–/–) from both embryonic stem (ES) cells and the corresponding neural stem/progenitor (NS/P) cells. The total size of the data collected is projected to be 300 GB.

We expect to generate the following data file types and formats during this project: Carl Zeiss microscopic image (.CZI), images (.TIFF), tabular (.CSV), and Affymetrix GeneChip (.CEL).

Raw-data files will be analyzed to generate CSV files containing counts of cell type and total number of stem cells, and to enable statistical analysis.

Fill-in-the-Blank Template

This project will produce _________ [data type, e.g., imaging, sequencing, experimental measurements] data generated/obtained from __________ [data modality, e.g., instrument, method, survey, experiment, data source]. Data will be collected from ___ [number] of research participants/specimens/experiments, generating ___ [number] datasets totaling approximately ___ [amount of data] in size. The following data files will be used or produced in the course of the project: ______ [list input data files, intermediate files, and final, post-processed files]. Raw data will be transformed by ____ [analysis, method], and the subsequent processed dataset used for statistical analysis. To protect research participant identities, ___________ [e.g., de-identified individual, aggregated, summarized] data will be made available for sharing.

If working with human subjects, consider adding: Data collection will be performed at clinical sites in the ____ [location] area(s) with ____ [population(s) being studied; e.g., people with type 2 diabetes].

Element 1: Requirement 1 NIH Guidance

NIH Genomic Data Sharing Policy Considerations

Check whether your research is subject to the NIH Genomic Data Sharing (GDS) Policy using these criteria. If it is, list the relevant data and their levels of processing in this Element.

Data types expected to be shared under the GDS Policy should be described in this Element 1 section. Note that the GDS Policy expects certain types of data to be shared that may not be covered by the DMS Policy’s definition of “scientific data.” For more information on the data types to be shared under the GDS Policy, consult Data Submission and Release Expectations.

Individual NIH Institutes and Centers (ICs) may have additional expectations or requirements for GDS. Please check the IC-specific genomic data sharing requirements.


Element 1: Requirement 2

Describe which scientific data from the project will be preserved and shared, and provide the rationale for this decision.

Summarize what scientific data you will preserve and share. The NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions.

Element 1: Requirement 2 Examples

Sample Plan Text 

In this proposed project, the cleaned, item-level spreadsheet data for all variables will be shared openly, along with example quantifications and transformations from initial raw data. Final files used to generate specific analyses to answer the Specific Aims and related results will also be shared. The rationale for sharing only cleaned data is to foster ease of data reuse.  

Fill-in-the-Blank Template

Based on _______ [ethical, legal, technical] considerations, only the following data produced in the course of the project will be preserved and shared: ____ [list subsets of the data to be shared].

OR 

All data produced in the course of the project will be preserved and shared. 

If working with human subjects, consider adding: The final dataset will include _______ [e.g., self-reported demographic and behavioral data from interviews with participants and laboratory data from blood and urine specimens provided]. We will share de-identified individual participant data (IPD). Appropriate measures such as _______ [describe specific de-identification practices to be used] will be used for data de-identification and sharing, and informed consent forms will reflect those plans.

Element 1: Requirement 2 NIH Guidance

Protecting Privacy When Sharing Human Research Participant Data

If human subjects’ data will be collected and only de-identified subsets are to be shared, consider specific de-identification approaches that fit the population and purposes. Guidance on protecting privacy is available in NOT-OD-22-213. If you are generating genomic data, follow the data submission and release expectations of the NIH GDS Policy, which defines five levels of data processing and the sharing expectations associated with each.
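As an illustration of what "specific de-identification practices" might look like in a plan, the sketch below shows three commonly described steps: dropping direct identifiers, assigning random study codes, and generalizing exact age into bands. It is a minimal sketch under stated assumptions, not a compliance recipe. The column names, the 10-year band width, and the helper functions are hypothetical, and whether these steps are adequate depends on your population, consent terms, and applicable regulations.

```python
import secrets

# Hypothetical column names treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def age_band(age, width=10):
    """Generalize an exact age into a band such as '40-49' (assumed width)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(records):
    """Return copies of the records with identifiers removed and age coarsened."""
    shared = []
    for record in records:
        # Keep only columns that are not direct identifiers.
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        # Random code that is not derived from any identifying value.
        row["study_id"] = secrets.token_hex(4)
        # Replace the exact age with its band.
        row["age_band"] = age_band(row.pop("age"))
        shared.append(row)
    return shared


# Fabricated example records for illustration only.
records = [
    {"name": "Example Person", "email": "person@example.org", "phone": "555-0100",
     "age": 47, "hba1c": 7.1},
]
deidentified = deidentify(records)
print(sorted(deidentified[0].keys()))
```

If participants ever need to be re-contacted, the mapping between study codes and identities would have to be stored separately under restricted access, which this sketch does not do.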


Element 1: Requirement 3 

Briefly list the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.

Summarize the necessary information or documentation that will be provided to make collected data interpretable and reusable. 

Element 1: Requirement 3 Examples

Sample Plan Text

To facilitate the interpretation and reuse of the data, a README file and data dictionary will be generated and deposited into a repository along with all shared datasets. The README file will include method description; instrument settings; RRIDs of resources such as antibodies, model organisms, cell lines, plasmids, and other tools (e.g., software, databases, services); and Protocol DOIs issued from protocols.io. The data dictionary will define and describe all variables in the dataset.
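A data dictionary like the one described above can be drafted from a tabular file's header and then completed by hand. The sketch below, which uses only the Python standard library, lists each variable with a coarsely inferred type and a missing-value count, and leaves the description blank for the researcher to fill in. The sample table, column names, and type labels are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse type ('integer', 'decimal', 'text') from non-empty values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for caster, name in ((int, "integer"), (float, "decimal")):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"


def draft_data_dictionary(csv_text):
    """Build one draft entry per column; descriptions are left to the researcher."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [(row[name] or "") for row in rows]
        entries.append({
            "variable": name,
            "type": infer_type(values),
            "n_missing": sum(1 for v in values if not v.strip()),
            "description": "",  # to be written by hand
        })
    return entries


# Hypothetical table for illustration only.
SAMPLE = """sample_id,genotype,cell_count,mean_intensity
S1,Rest+/+,120,0.82
S2,Rest+/-,98,
S3,Rest-/-,101,0.77
"""

dictionary = draft_data_dictionary(SAMPLE)
for entry in dictionary:
    print(entry)
```

The inferred types are only a starting point. Units, allowed values, and coding schemes still need to be documented manually before the dictionary is deposited.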

Fill-in-the-Blank Template 

To facilitate interpretation of the data, ______ [e.g., data dictionary, metadata, documentation, statistical analysis plans, bench protocols, data collection instruments] will be created, shared, and associated with the relevant datasets.

If working with human subjects, consider adding: In addition to ______ [individual participant data (IPD) dataset being shared under restricted access and/or aggregate data being shared openly], the researcher will share the ______ [describe any other elements of the final data package not already addressed]. Documentation and support materials will be compatible with the ClinicalTrials.gov (https://clinicaltrials.gov/) Protocol Registration Data Elements.

Element 1: Requirement 3 NIH Guidance

In addition to the documentation examples, consider metadata that will provide additional information intended to make scientific data interpretable and reusable (e.g., date; independent sample and variable construction and description; methodology; data provenance; data transformations; and any intermediate or descriptive observational variables).


Tip: Using the DMPTool

The NIH DMS Application Guide currently includes no specific formatting requirements. However, DMPTool, a free online wizard, walks you through the process of creating an NIH-compliant DMS Plan. This article includes examples from DMPTool.


NIH Guide Notice: As outlined in the NIH Guide Notice Supplemental Policy Information: Elements of an NIH Data Management and Sharing Plan, DMS Plans should address six elements (areas): Data Type; Tools, Software, and Code; Data Standards; Data Preservation, Access, and Associated Timelines; Access, Distribution, or Reuse Considerations; and Oversight of Data Management and Sharing, as described in the Application Guide. The NIH suggests that a DMS Plan be no more than two pages. The plan should be attached to the application as a PDF file, as outlined in the NIH’s Format Attachments page.


 


Additional Resources


Sources

NIH Template Working Group*. (2023). DMPTool NIH-Default DMSP template, v9. In California Digital Library (Ed.), DMPTool [DMP authoring software]. Retrieved from https://dmptool.org/template_export/118304408.pdf

* More information on the NIH Template Working Group history and membership can be found at https://blog.dmptool.org/2022/08/18/supporting-the-upcoming-nih-data-sharing-requirements-with-the-dmptool/