Difference between revisions of "Journal:Kadi4Mat: A research data infrastructure for materials science"

From LIMSWiki

Revision as of 19:48, 22 February 2021

Full article title Kadi4Mat: A research data infrastructure for materials science
Journal Data Science Journal
Author(s) Brandt, Nico; Griem, Lars; Herrmann, Christoph; Schoof, Ephraim; Tosato, Giovanna; Zhao, Yinghan; Zschumme, Philipp; Selzer, Michael
Author affiliation(s) Karlsruhe Institute of Technology, Karlsruhe University of Applied Sciences, Helmholtz Institute Ulm
Primary contact Email: nico dot brandt at kit dot edu
Year published 2021
Volume and issue 20(1)
Article # 8
DOI 10.5334/dsj-2021-008
ISSN 1683-1470
Distribution license Creative Commons Attribution 4.0 International
Website https://datascience.codata.org/articles/10.5334/dsj-2021-008/
Download https://datascience.codata.org/articles/10.5334/dsj-2021-008/galley/1048/download/ (PDF)

Abstract

The concepts and current developments of a research data infrastructure for materials science are presented, extending and combining the features of an electronic laboratory notebook (ELN) and a repository. The objective of this infrastructure is to combine structured data storage and data exchange with documented and reproducible data analysis and visualization, ultimately leading to the publication of the data. This way, researchers can be supported throughout the entire research process. The software is being developed as a web-based and desktop-based system, offering both a graphical user interface (GUI) and a programmatic interface. The focus of the development is on the integration of technologies and systems based on both established and new concepts. Due to the heterogeneous nature of materials science data, the current features are kept mostly generic, and the structuring of the data is largely left to the users. As a result, an extension of the research data infrastructure to other disciplines is possible in the future. The source code of the project is publicly available under a permissive Apache 2.0 license.

Keywords: research data management, electronic laboratory notebook, repository, open source, materials science

Introduction

In the engineering sciences, the handling of digital research data plays an increasingly important role in all fields of application.[1] This is especially the case due to the growing amount of data obtained from experiments and simulations.[2] The extraction of knowledge from these data is referred to as a data-driven, fourth paradigm of science, filed under the keyword "data science."[3] This is particularly true in materials science, as the research and understanding of new materials are becoming more and more complex.[4] Without suitable analysis methods, the ever-growing amount of data will no longer be manageable. To perform appropriate data analyses smoothly, the structured storage of research data and associated metadata is essential. Specifically, uniform research data management is needed, which is made possible by appropriate infrastructures such as research data repositories. In addition to uniform data storage, such systems can help to overcome inter-institutional hurdles in data exchange, compare theoretical and experimental data, and provide reproducible workflows for data analysis. Furthermore, linking the data with persistent identifiers enables other researchers to directly reference them in their work.



References

  1. Sandfeld, S.; Dahmen, T.; Fischer, F.O.R. et al. (2018). "Strategiepapier - Digitale Transformation in der Materialwissenschaft und Werkstofftechnik". Deutsche Gesellschaft für Materialkunde e.V. https://www.tib.eu/en/search/id/TIBKAT%3A1028913559/. 
  2. Hey, T.; Trefethen, A. (2003). "Chapter 36: The Data Deluge: An e‐Science Perspective". In Berman, F.; Fox, G.; Hey, T. (eds.). Grid Computing: Making the Global Infrastructure a Reality. John Wiley & Sons, Ltd. doi:10.1002/0470867167.ch36. ISBN 9780470867167.
  3. Hey, T.; Tansley, S.; Tolle, K. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research. ISBN 9780982544204. https://www.microsoft.com/en-us/research/publication/fourth-paradigm-data-intensive-scientific-discovery/. 
  4. Hill, J.; Mulholland, G.; Persson, K. et al. (2016). "Materials science with large-scale data and informatics: Unlocking new opportunities". MRS Bulletin 41 (5): 399–409. doi:10.1557/mrs.2016.93. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references in alphabetical order; however, this version lists them in order of appearance, by design.