Best Practices for Organizing Research Data
Understanding the Structure
MaterialsCommons organizes data in a hierarchical structure:
Project: The top-level container for related research
Studies: Subdivisions within a project that represent specific experiments, simulations or research phases
Example Directory Structure
Organize your research data locally before uploading it to MaterialsCommons. The layout below is one example of how to arrange your local files:
ProjectName/
├── README.md # Project overview
├── Study1/
│ ├── README.md # Study overview
│ ├── study1.xlsx # Spreadsheet with metadata to be processed
│ ├── raw_data/ # Original, unprocessed data
│ ├── processed_data/ # Analyzed or transformed data
│ ├── metadata/ # Documentation specific to this study
│ └── scripts/ # Analysis scripts and code
├── Study2/
│ ├── README.md # Study overview
│ ├── study2.xlsx # Spreadsheet with metadata to be processed
│ ├── raw_data/
│ ├── processed_data/
│ ├── metadata/
│ └── scripts/
└── project_docs/ # Project-level documentation
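The layout above can be scaffolded with a short script. What follows is a minimal sketch using only the Python standard library; the function names and the placeholder README text are illustrative assumptions and are not part of MaterialsCommons. It does not create the study spreadsheets, which you would add yourself.

```python
from pathlib import Path

# Sub-folders created inside every study, matching the example layout above.
STUDY_SUBDIRS = ("raw_data", "processed_data", "metadata", "scripts")


def write_if_missing(path, text):
    """Write a placeholder file without overwriting an existing one."""
    if not path.exists():
        path.write_text(text, encoding="utf-8")


def create_project_skeleton(root, project_name, study_names):
    """Create the project and study folders; return the project path."""
    project = Path(root) / project_name
    (project / "project_docs").mkdir(parents=True, exist_ok=True)
    write_if_missing(project / "README.md", f"# {project_name}\n")

    for study in study_names:
        study_dir = project / study
        for sub in STUDY_SUBDIRS:
            (study_dir / sub).mkdir(parents=True, exist_ok=True)
        write_if_missing(study_dir / "README.md", f"# {study}\n")
    return project
```

For example, `create_project_skeleton(".", "ProjectName", ["Study1", "Study2"])` would create the tree shown above under the current directory. Because existing files and folders are left alone, the script can be rerun safely when a new study is added.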
File Naming Conventions
Use Descriptive Names
Include relevant information in file names
Example:
2024-03-15_sample-A_tensile-test.csv
Avoid Special Characters
Use hyphens (-) or underscores (_) instead of spaces
Avoid characters like:
* ? " < > | : \ /
Include Dates
Use ISO format: YYYY-MM-DD
Place dates at the beginning of filenames for easy sorting
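The naming rules above can be checked mechanically. The sketch below is one possible approach, not a MaterialsCommons feature: `make_filename` and `check_filename` are hypothetical helper names, and the requirement that every name start with a date is treated here as a reportable issue, although some files (such as versioned analysis files) may reasonably omit it.

```python
import re
from datetime import date

# Characters the guidelines above say to avoid in file names.
FORBIDDEN_CHARS = set('*?"<>|:\\/')
ISO_DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})_")


def check_filename(name):
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("contains forbidden characters: " + " ".join(bad))
    match = ISO_DATE_PREFIX.match(name)
    if match is None:
        problems.append("does not start with a YYYY-MM-DD date")
    else:
        try:
            date.fromisoformat(match.group(1))
        except ValueError:
            problems.append("date prefix is not a valid calendar date")
    return problems


def make_filename(day, parts, extension):
    """Build '<ISO date>_<part>_<part>.<ext>', turning spaces into hyphens."""
    cleaned = [re.sub(r"\s+", "-", part.strip()) for part in parts]
    return "_".join([day.isoformat(), *cleaned]) + "." + extension.lstrip(".")
```

Calling `make_filename(date(2024, 3, 15), ["sample A", "tensile test"], "csv")` produces `2024-03-15_sample-A_tensile-test.csv`, the example name given above, and `check_filename` returns an empty list for it.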
Metadata Organization
Project-Level Metadata
Create a README.md file in the project root
Include:
Project title and description
Principal investigator
Project dates
Funding sources
Contact information
Study-Level Metadata
Create a README.md file in each study directory
Include:
Study objectives
Experimental methods
Computational tools and versions
Equipment used
Data collection procedures
Analysis protocols
Create a metadata file or directory for each study that documents the metadata names and their meanings
Optionally, keep a study-level Excel file that captures important values, workflows, samples, and computations. See Using Spreadsheets For Data.
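To keep study READMEs consistent, the study-level items listed above can be turned into a reusable template. This is only a sketch: the Markdown heading layout, the `TODO` placeholder, and the function name `study_readme_template` are assumptions made for illustration.

```python
# Section titles taken from the study-level metadata list above.
STUDY_README_SECTIONS = (
    "Study objectives",
    "Experimental methods",
    "Computational tools and versions",
    "Equipment used",
    "Data collection procedures",
    "Analysis protocols",
)


def study_readme_template(study_name, sections=STUDY_README_SECTIONS):
    """Return README.md text with one placeholder section per metadata item."""
    lines = [f"# {study_name}", ""]
    for section in sections:
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)
```

The returned text can be written to `README.md` in a study directory and filled in by hand. Passing a different `sections` tuple gives a project-level template built from the project-level list instead.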
Best Practices
Version Control
Keep original data files unchanged
Store processed versions in separate folders
Use version numbers in filenames when applicable
Example:
analysis_v1.2.xlsx
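One way to confirm that original data files really do stay unchanged is to record a checksum for each file and compare it later. The sketch below assumes SHA-256 checksums held in a plain dictionary; the function names are hypothetical, and nothing here is a MaterialsCommons feature.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir):
    """Map each file under raw_dir (by relative path) to its SHA-256 checksum."""
    raw = Path(raw_dir)
    return {
        path.relative_to(raw).as_posix(): sha256_of(path)
        for path in sorted(raw.rglob("*"))
        if path.is_file()
    }


def changed_files(raw_dir, manifest):
    """List files that were added, removed, or modified since the manifest was built."""
    current = build_manifest(raw_dir)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

A manifest built right after data collection could be saved alongside the study metadata (for example as JSON), and `changed_files` run against it before each upload.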
Documentation
Document any data transformations
Include README.md files in each directory
Note any software or tools required
Document units of measurement
Data Quality
Validate data before uploading
Check for completeness
Verify file formats are correct
Ensure consistent units across files
Before Uploading
Review folder structure
Verify all necessary metadata is included
Check file permissions
Remove any temporary or unnecessary files
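Parts of this checklist can be automated. The following sketch walks a project folder and reports study directories without a README.md, likely temporary files, and empty files. The temporary-file patterns and the choice to skip `project_docs` are assumptions to adjust for your own project, and the function name `pre_upload_report` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumed patterns for temporary or system files; extend as needed.
TEMP_PATTERNS = ("*.tmp", "*.bak", "*~", "~$*", ".DS_Store", "Thumbs.db")


def pre_upload_report(project_dir):
    """Collect likely problems in a project folder before uploading it."""
    project = Path(project_dir)
    report = {"missing_readme": [], "temporary_files": [], "empty_files": []}

    # The project root and each top-level study directory should have a README.md.
    if not (project / "README.md").is_file():
        report["missing_readme"].append(".")
    for child in sorted(project.iterdir()):
        if child.is_dir() and child.name != "project_docs":
            if not (child / "README.md").is_file():
                report["missing_readme"].append(child.name)

    # Flag temporary files and zero-byte files anywhere in the tree.
    for path in sorted(project.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(project).as_posix()
        if any(fnmatch(path.name, pattern) for pattern in TEMP_PATTERNS):
            report["temporary_files"].append(rel)
        elif path.stat().st_size == 0:
            report["empty_files"].append(rel)
    return report
```

The function only reports; it never deletes anything, so the decision about what to remove stays with you. It does not check file permissions or metadata completeness, which still need a manual review.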
Tips for Specific Research Data Types
Experimental and Simulation Data
Group related experiments and simulations in the same study
Include calibration data
Store raw instrument output separately from processed results
Image Data
Use consistent naming for series of images
Include scale bars and measurement information
Store original and processed images separately
Analysis Scripts
Document dependencies
Include version information
Record the input data each script requires
Add comments explaining analysis steps
Regular Maintenance
Periodic Review
Check organization structure regularly
Update documentation as needed
Archive completed studies
Remove redundant files
Backup Strategy
Maintain local copies of important data
Document any changes made in MaterialsCommons
Keep track of file versions
Final Note
Following this organizational structure helps ensure that your research data is:
Easy to find and access
Well-documented
Preserved for long-term use
Ready for collaboration
Suitable for publication and sharing
Note
Good organization before upload makes your research more efficient, reproducible, and valuable to the scientific community.