The NIH requires this information in narrative form, which can be created easily with the UNMC-required DMPTool. Below is the information expected in the two-page DMSP, which is uploaded with the application submission (see Submitting Your DMS Plan).
Element 1: Data Type
A. Types and amount of scientific data expected to be generated in the project:
From the NIH:
Summarize the types and estimated amount of scientific data expected to be generated in the project.
For example:
“This project will produce sequencing, transcriptomic, and epigenetic data generated from 10x snRNAseq. Data will be collected from 100 patients, generating approximately 350 datasets. Estimated size of data is about 10,000 gigabytes (GB). Data file types include: comma-separated values (CSV) files, FASTQ sequencing files, and R code (R).”
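The size estimate in a statement like the one above can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the figures are the placeholder numbers from the sample statement, not measurements from a real project.

```python
# Rough storage estimate for a DMS Plan narrative.
# All names are hypothetical; the numbers come from the sample statement.

def estimate_total_gb(n_datasets: int, avg_gb_per_dataset: float) -> float:
    """Return the estimated total data volume in gigabytes (GB)."""
    if n_datasets < 0 or avg_gb_per_dataset < 0:
        raise ValueError("counts and sizes must be non-negative")
    return n_datasets * avg_gb_per_dataset


def average_gb_per_dataset(total_gb: float, n_datasets: int) -> float:
    """Return the average dataset size implied by a total estimate."""
    if n_datasets <= 0:
        raise ValueError("n_datasets must be positive")
    return total_gb / n_datasets


# Figures from the sample statement: 100 patients, ~350 datasets, ~10,000 GB.
patients = 100
datasets = 350
total_gb = 10_000

print(f"Datasets per patient: {datasets / patients:.1f}")
print(f"Average size per dataset: {average_gb_per_dataset(total_gb, datasets):.1f} GB")
print(f"Total in decimal terabytes: {total_gb / 1000:.0f} TB")
```

Checking the implied per-dataset size against what the sequencing core actually delivers is a quick way to confirm that the total quoted in the plan is plausible.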
B. Scientific data that will be preserved and shared, and the rationale for doing so:
From the NIH:
Describe which scientific data from the project will be preserved and shared and provide the rationale for this decision.
For example:
“Raw sequencing files, validated data, processed data derived from raw sequencing files, and the code used to process files will be preserved and shared. Data from research participants and family members will be de-identified using masking techniques. Only de-identified individual data will be made available.”
C. Metadata, other relevant data, and associated documentation:
From the NIH:
Briefly list the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.
For example:
“README files, code, clinical metadata in the form of persistent unique identifiers, biospecimen metadata such as specimen IDs, and assay metadata such as valid barcode reads will be shared to help interpret the data.”
Element 2: Related Tools, Software and/or Code
From the NIH:
State whether specialized tools, software, and/or code are needed to access or manipulate shared scientific data, and if so, provide the name(s) of the needed tool(s) and software and specify how they can be accessed.
For example:
“Raw sequencing files in FASTQ format will be made available and may need specialized programs to be manipulated. Metadata and processed sequencing data are available in comma-separated format (.csv files) and do not need specialized tools to access or manipulate.”
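To illustrate the claim that .csv files need no specialized tools, the sketch below reads a small CSV using only the Python standard library. The column names and values are hypothetical stand-ins for a shared metadata file, not the structure of any actual dataset.

```python
import csv
import io

# Hypothetical CSV content standing in for a shared metadata file.
SAMPLE_CSV = (
    "specimen_id,assay,valid_barcode_reads\n"
    "S001,snRNAseq,5123\n"
    "S002,snRNAseq,4876\n"
)


def read_rows(text_stream):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(text_stream))


# For a file on disk, open(path, newline="") would replace io.StringIO.
rows = read_rows(io.StringIO(SAMPLE_CSV))
total_reads = sum(int(row["valid_barcode_reads"]) for row in rows)

print(f"{len(rows)} specimens, {total_reads} valid barcode reads in total")
```

The same file would also open in any spreadsheet program or text editor, which is the point the sample statement is making.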
Element 3: Standards
From the NIH:
State what common data standards will be applied to the scientific data and associated metadata to enable interoperability of datasets and resources, and provide the name(s) of the data standards that will be applied and describe how these data standards will be applied to the scientific data generated by the research proposed in this project. If applicable, indicate that no consensus standards exist.
For example:
“FAIR data sharing protocols will be applied, so that data is Findable, Accessible, Interoperable, and Reusable. Sequencing data will be structured and described using the following standards: 1) Description of the biological system, samples, and variables studied, 2) Sequence read data of each assay, 3) Final processed data for the set of assays, 4) General information about the experiment and sample-data relationships, and 5) Metadata appropriate to the datasets so that they can be linked.”
Element 4: Data Preservation, Access, and Associated Timelines
A. Repository where scientific data and metadata will be archived:
From the NIH:
Provide the name of the repository(ies) where scientific data and metadata arising from the project will be archived (see Selecting a Data Repository).
For example:
“All sharable datasets will be deposited in the National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS) repository.”
B. How scientific data will be findable and identifiable:
From the NIH:
Describe how the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.
For example:
“The NIAGADS repository provides metadata, persistent identifiers (DOIs), and long-term access.”
C. When and how long the scientific data will be made available:
From the NIH:
Describe when the scientific data will be made available to other users (i.e., no later than time of an associated publication or end of the performance period, whichever comes first) and for how long data will be available.
For example:
“The data will be made available as soon as possible or at the start of the publication process, whichever comes first.”
Element 5: Access, Distribution, or Reuse Considerations
A. Factors affecting subsequent access, distribution, or reuse of scientific data:
From the NIH:
NIH expects that in drafting Plans, researchers maximize the appropriate sharing of scientific data. Describe and justify any applicable factors or data use limitations affecting subsequent access, distribution, or reuse of scientific data related to informed consent, privacy and confidentiality protections, and any other considerations that may limit the extent of data sharing. See Frequently Asked Questions for examples of justifiable reasons for limiting sharing of data.
For example:
“Following all federal, Tribal, and state laws, all data from donors who do not allow sharing will be excluded from shared datasets. Most participants allow sharing for the study of neurodegenerative diseases, with some allowing sharing only for academic research use. Data from participants who allow partial sharing will be shared with NIAGADS under the conditions specified in the consent documentation.”
B. Whether access to scientific data will be controlled:
From the NIH:
State whether access to the scientific data will be controlled (i.e., made available by a data repository only after approval).
For example:
“All data will be shared in the controlled-access data repository, NIAGADS. Access to this repository is limited to qualified investigators with a legitimate research interest and must be approved by an independent committee of researchers (the Data Use Committee) designated by NIAGADS.”
C. Protections for privacy, rights, and confidentiality of human research participants:
From the NIH:
If generating scientific data derived from humans, describe how the privacy, rights, and confidentiality of human research participants will be protected (e.g., through de-identification, Certificates of Confidentiality, and other protective measures).
For example:
“IRB documentation and informed consent documents will include language describing plans for data management and sharing data, describing the motivation for sharing, and explaining that personal identifying information will be removed. To protect participant and family member privacy and confidentiality, shared data will be de-identified according to all federal and state guidelines and following the safe-harbor method. Only the minimum of PHI will be collected for the purposes of the study, and all team members are HIPAA trained.”
Element 6: Oversight of Data Management and Sharing
From the NIH:
Describe how compliance with this Plan will be monitored and managed, frequency of oversight, and by whom at your institution (e.g., titles, roles).
For example:
“The following individual, XXXX, will ultimately be responsible for data collection, management, storage, retention, and dissemination of project data, including updating and revising the Data Management and Sharing Plan when necessary, and will report on data sharing and compliance in the annual project progress reports. This person is the Principal Investigator of the project and an Associate Professor of XXXX at UNMC. Their email is xxxxx@unmc.edu. A second individual, the Research Project Coordinator in Dr. XXXX’s lab, will also maintain the Data Management and Sharing Plan and coordinate permissions with data repositories.”