Best Practices for Organizing Research Data

Understanding the Structure

MaterialsCommons organizes data in a hierarchical structure:

  • Project: The top-level container for related research

  • Study: A subdivision within a project that represents a specific experiment, simulation, or research phase

Example Directory Structure

Organize your research data locally before uploading it to MaterialsCommons. The following example shows one way to lay out your files:

ProjectName/
├── README.md           # Project overview
├── Study1/
│   ├── README.md       # Study overview
│   ├── study1.xlsx     # Spreadsheet with metadata to be processed
│   ├── raw_data/       # Original, unprocessed data
│   ├── processed_data/ # Analyzed or transformed data
│   ├── metadata/       # Documentation specific to this study
│   └── scripts/        # Analysis scripts and code
├── Study2/
│   ├── README.md       # Study overview
│   ├── study2.xlsx     # Spreadsheet with metadata to be processed
│   ├── raw_data/
│   ├── processed_data/
│   ├── metadata/
│   └── scripts/
└── project_docs/       # Project-level documentation
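
The layout above can also be created with a short script. The following is a minimal sketch, assuming Python 3.9 or later; the function names `create_project_skeleton` and `_touch_readme` are hypothetical helpers for illustration and are not part of MaterialsCommons. The spreadsheet files (study1.xlsx and so on) are not created here because their contents depend on your study.

```python
from pathlib import Path

# Subdirectories taken from the example layout above.
STUDY_SUBDIRS = ("raw_data", "processed_data", "metadata", "scripts")


def _touch_readme(path: Path, text: str) -> None:
    """Write a placeholder README only if none exists yet."""
    if not path.exists():
        path.write_text(text, encoding="utf-8")


def create_project_skeleton(root: Path, studies: list[str]) -> None:
    """Create the example project layout under `root` without overwriting files."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "project_docs").mkdir(exist_ok=True)
    _touch_readme(root / "README.md", f"# {root.name}\n\nProject overview.\n")
    for study in studies:
        study_dir = root / study
        for sub in STUDY_SUBDIRS:
            (study_dir / sub).mkdir(parents=True, exist_ok=True)
        _touch_readme(study_dir / "README.md", f"# {study}\n\nStudy overview.\n")
```

For example, `create_project_skeleton(Path("ProjectName"), ["Study1", "Study2"])` would produce the directories shown in the tree. Running it again leaves existing README.md files untouched.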

File Naming Conventions

  1. Use Descriptive Names

    • Include relevant information in file names, such as the date, sample identifier, and test type

    • Example: 2024-03-15_sample-A_tensile-test.csv

  2. Avoid Special Characters

    • Use hyphens (-) or underscores (_) instead of spaces

    • Avoid characters like: * ? " < > | : \ /

  3. Include Dates

    • Use ISO format: YYYY-MM-DD

    • Place dates at the beginning of filenames for easy sorting
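
The naming conventions above can be applied and checked programmatically. The following is a minimal sketch; `build_filename` and `check_filename` are hypothetical helper names, and the underscore separator between the date and the descriptive parts is an assumption based on the example filename in this section.

```python
import re
from datetime import date

# Characters the guidance above says to avoid, plus whitespace.
FORBIDDEN = re.compile(r'[*?"<>|:\\/\s]')
# Shape check only: four digits, two digits, two digits, then an underscore.
# It does not verify that the digits form a real calendar date.
ISO_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}_")


def build_filename(day: date, parts: list[str], extension: str) -> str:
    """Join an ISO date and descriptive parts into a single file name."""
    # Replace each forbidden character or space inside a part with a hyphen.
    cleaned = [FORBIDDEN.sub("-", part.strip()) for part in parts]
    return "_".join([day.isoformat(), *cleaned]) + "." + extension.lstrip(".")


def check_filename(name: str) -> list[str]:
    """Return a list of problems found in `name` (empty if none)."""
    problems = []
    if FORBIDDEN.search(name):
        problems.append("contains spaces or special characters")
    if not ISO_PREFIX.match(name):
        problems.append("does not start with a YYYY-MM-DD date")
    return problems
```

Calling `build_filename(date(2024, 3, 15), ["sample A", "tensile-test"], "csv")` yields `2024-03-15_sample-A_tensile-test.csv`, which matches the example given above.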

Metadata Organization

  1. Project-Level Metadata

    • Create a README.md file in the project root

    • Include:

      • Project title and description

      • Principal investigator

      • Project dates

      • Funding sources

      • Contact information

  2. Study-Level Metadata

    • Create a README.md file in each study directory

    • Include:

      • Study objectives

      • Experimental methods

      • Computational tools and versions

      • Equipment used

      • Data collection procedures

      • Analysis protocols

    • Create a metadata file or directory for each study to document the metadata field names and their meanings

    • Optionally, include a study Excel file to capture important values, workflow steps, samples, and computations. See Using Spreadsheets For Data.
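
One way to keep study README files consistent is to generate them from a fixed list of sections. The following is a minimal sketch: the section titles come from the study-level list above, while the function name `render_study_readme`, the Markdown heading layout, and the "TODO" placeholder are assumptions made for illustration.

```python
# Section titles taken from the study-level metadata list above.
STUDY_FIELDS = (
    "Study objectives",
    "Experimental methods",
    "Computational tools and versions",
    "Equipment used",
    "Data collection procedures",
    "Analysis protocols",
)


def render_study_readme(title: str, content: dict[str, str]) -> str:
    """Return README text with one section per field; unfilled fields get TODO."""
    lines = [f"# {title}", ""]
    for field in STUDY_FIELDS:
        lines += [f"## {field}", "", content.get(field, "TODO"), ""]
    return "\n".join(lines)
```

Sections left as "TODO" make missing metadata easy to spot during a review before upload.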

Best Practices

  1. Version Control

    • Keep original data files unchanged

    • Store processed versions in separate folders

    • Use version numbers in filenames when applicable

    • Example: analysis_v1.2.xlsx

  2. Documentation

    • Document any data transformations

    • Include README.md files in each directory

    • Note any software or tools required

    • Document units of measurement

  3. Data Quality

    • Validate data before uploading

    • Check for completeness

    • Verify file formats are correct

    • Ensure consistent units across files

  4. Before Uploading

    • Review folder structure

    • Verify all necessary metadata is included

    • Check that file permissions allow the files to be read during upload

    • Remove any temporary or unnecessary files
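
Some of the checks above can be automated before an upload. The following is a minimal sketch that reports directories without a README.md, likely temporary files, and empty files. The function name `pre_upload_report` is hypothetical, and the patterns in `TEMP_PATTERNS` are illustrative assumptions that you should adjust to the tools you use.

```python
from fnmatch import fnmatch
from pathlib import Path

# Illustrative patterns for temporary files; adjust to your own tools.
TEMP_PATTERNS = ("*.tmp", "*.bak", "*~", "~$*", ".DS_Store", "Thumbs.db")


def pre_upload_report(root: Path) -> dict[str, list[str]]:
    """Collect paths (relative to `root`) that may need attention before upload."""
    report = {"missing_readme": [], "temporary_files": [], "empty_files": []}
    # The project root itself is checked first and reported as ".".
    if not (root / "README.md").is_file():
        report["missing_readme"].append(".")
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root).as_posix()
        if path.is_dir():
            if not (path / "README.md").is_file():
                report["missing_readme"].append(rel)
        elif path.is_file():
            if any(fnmatch(path.name, pattern) for pattern in TEMP_PATTERNS):
                report["temporary_files"].append(rel)
            elif path.stat().st_size == 0:
                report["empty_files"].append(rel)
    return report
```

The report only lists candidates. Deciding whether a file is truly unnecessary, or whether an empty file is intentional, remains a manual step.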

Tips for Specific Research Data Types

  1. Experimental and Simulation Data

    • Group related experiments and simulations in the same study

    • Include calibration data

    • Store raw instrument output separately from processed results

  2. Image Data

    • Use consistent naming for series of images

    • Include scale bars and measurement information

    • Store original and processed images separately

  3. Analysis Scripts

    • Document dependencies

    • Include version information

    • Document input data requirements

    • Add comments explaining analysis steps
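
For Python analysis scripts, dependency and version information can be recorded alongside the scripts. The following is a minimal sketch using the standard library's `importlib.metadata`; the function name `describe_environment` and the output layout are assumptions, and the list of package names is something you supply yourself.

```python
import platform
from importlib import metadata


def describe_environment(packages: list[str]) -> str:
    """Return one line per package with its installed version, plus the Python version."""
    lines = [f"python=={platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Keep the entry so that a missing dependency is visible in the record.
            lines.append(f"{name}  # not installed")
    return "\n".join(lines)
```

The returned text could be saved in the study's scripts/ directory, for example in a file next to the analysis code, so that the versions used for a given analysis remain documented.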

Regular Maintenance

  1. Periodic Review

    • Check organization structure regularly

    • Update documentation as needed

    • Archive completed studies

    • Remove redundant files

  2. Backup Strategy

    • Maintain local copies of important data

    • Document any changes made in MaterialsCommons

    • Keep track of file versions
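
One way to keep track of file versions in a local copy is a checksum manifest. The following is a minimal sketch using SHA-256 from Python's standard library. The function names are hypothetical, and this approach is an illustration only, not a description of how MaterialsCommons tracks files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so that large data files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified between two manifests."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

Comparing a manifest saved at backup time with a freshly built one shows which files have changed since the last backup.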

Final Note

This organizational structure helps ensure that your research data is:

  • Easy to find and access

  • Well-documented

  • Preserved for long-term use

  • Ready for collaboration

  • Suitable for publication and sharing

Note

Good organization before upload makes your research more efficient, reproducible, and valuable to the scientific community.