Overview
Researchers at the National Institute for Materials Science (NIMS) have developed Research Data Express (RDE), a data management system designed to automate data processing and create AI-ready datasets for materials research. The system is described in a paper published in the journal Science and Technology of Advanced Materials: Methods.
Challenge in Materials Research
Materials research frequently generates large volumes of data in manufacturer-specific formats with inconsistent terminology. This inconsistency complicates the aggregation, comparison, and reuse of data. Researchers have traditionally spent significant time on tasks such as format conversion, metadata assignment, and the extraction of characteristic values. These manual processes can deter data sharing, which is increasingly important because AI-driven materials discovery depends on high-quality datasets.
RDE System and Functionality
RDE addresses these challenges by automatically interpreting experimental data from raw files and manually entered measurements. It then restructures this information and stores it in a more readable format.
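The general idea of combining a raw instrument file with manually entered values can be illustrated with a minimal Python sketch. This is not RDE's implementation: the raw file layout, the field names, and the `build_record` helper are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical raw instrument output: "*KEY=VALUE" header lines, then numeric rows.
RAW_FILE = """\
*SAMPLE=TiO2-film-03
*WAVELENGTH=1.5406
10.0 120
10.5 135
11.0 410
"""

# Hypothetical values a researcher typed into an entry form.
MANUAL_ENTRY = {"operator": "example-user", "annealing_temp_C": 450}


def build_record(raw_text, manual_entry):
    """Combine a raw file and manual entries into one structured record."""
    header, rows = {}, []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Header line: split on the first "=" and normalize the key.
            key, _, value = line[1:].partition("=")
            header[key.lower()] = value
        else:
            # Data line: two whitespace-separated numbers.
            x, y = line.split()
            rows.append({"two_theta_deg": float(x), "intensity": float(y)})
    return {
        "instrument_metadata": header,
        "manual_metadata": dict(manual_entry),
        "measurements": rows,
    }


record = build_record(RAW_FILE, MANUAL_ENTRY)
print(json.dumps(record, indent=2))
```

The resulting record keeps the instrument metadata, the manual entries, and the measurements under separate, consistently named keys, which is the kind of restructuring that makes later aggregation and comparison easier.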
Jun Fujima, a corresponding author and researcher at NIMS's Materials Data Platform, stated that RDE reduces the burden of routine data processing and improves the findability, interoperability, and reusability of data, in line with the FAIR principles, as well as its traceability. The goal is to promote collaborative, data-driven materials research.
Key Innovation: Dataset Templates
A core innovation of RDE is its "Dataset Template," which defines how data from various experiment types should be processed. This differs from other systems that define the data format itself. For instance, a Dataset Template can be configured to interpret X-ray measurement spreadsheets from different sources, enabling the system to automatically perform advanced analyses and create visualizations. Researchers can prepare multiple templates for different materials research themes or easily create custom templates. Many templates have been developed and shared among users.
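To make the distinction concrete, the following Python sketch shows a template that describes how a file should be processed rather than prescribing a single file format. It is an illustration only, not the RDE or RDEToolKit API: the `DatasetTemplate` class, its fields, the two vendor layouts, and the peak-finding step are assumptions made for this example.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DatasetTemplate:
    """Hypothetical template: describes how to process a file, not its format."""
    name: str
    delimiter: str
    column_map: dict  # source column name -> common term

    def process(self, text):
        reader = csv.DictReader(io.StringIO(text), delimiter=self.delimiter)
        # Rename vendor-specific columns to common terms and convert to floats.
        rows = [
            {self.column_map[k]: float(v) for k, v in row.items() if k in self.column_map}
            for row in reader
        ]
        # Simple stand-in for an "analysis" step: angle of the strongest reading.
        peak = max(rows, key=lambda r: r["intensity"])
        return {
            "template": self.name,
            "rows": rows,
            "peak_two_theta_deg": peak["two_theta_deg"],
        }


# Two hypothetical X-ray measurement exports with different layouts.
VENDOR_A = "2Theta,Counts\n20.0,100\n20.5,900\n21.0,150\n"
VENDOR_B = "angle;I\n20.0;80\n20.5;120\n21.0;700\n"

template_a = DatasetTemplate("xrd-vendor-a", ",", {"2Theta": "two_theta_deg", "Counts": "intensity"})
template_b = DatasetTemplate("xrd-vendor-b", ";", {"angle": "two_theta_deg", "I": "intensity"})

result_a = template_a.process(VENDOR_A)
result_b = template_b.process(VENDOR_B)
print(result_a["peak_two_theta_deg"], result_b["peak_two_theta_deg"])
```

Both inputs end up with the same column names and the same derived value, even though the source files differ in delimiter and terminology. Supporting a new source in this sketch means adding a template rather than changing the stored format.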
Impact and Adoption
Since its launch in January 2023, RDE has been adopted across Japan's materials research community, demonstrating its scalability. The system has over 5,000 users, with more than 1,900 Dataset Templates implemented for various experimental methods, over 16,000 datasets created, and more than three million data files accumulated. RDE serves as a data infrastructure for major national initiatives, including the Materials Research DX Platform initiative of Japan's Ministry of Education, Culture, Sports, Science and Technology (MEXT). The NIMS team has also released an open-source software toolkit, RDEToolKit, to encourage broader community use.