Difference between revisions of "Journal:Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences"

From LIMSWiki


Revision as of 22:43, 7 December 2021

Full article title: Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences
Journal: Data Science Journal
Author(s): Damerow, Joan E.; Varadharajan, Charuleka; Boye, Kristin; Brodie, Eoin L.; Burrus, Madison; Chadwick, K. Dana; Crystal-Ornelas, Robert; Elbashandy, Hesham; Alves, Ricardo J.E.; Ely, Kim S.; Goldman, Amy E.; Haberman, Ted; Hendrix, Valerie; Kakalia, Zarine; Kemner, Kenneth M.; Kersting, Annie B.; Merino, Nancy; O'Brien, Fianna; Perzan, Zach; Robles, Emily; Sorensen, Patrick; Stegen, James C.; Walls, Ramona L.; Weisenhorn, Pamela; Zavarin, Mavrik; Agarwal, Deborah
Author affiliation(s): Lawrence Berkeley National Laboratory, SLAC National Accelerator Laboratory, Stanford University, Brookhaven National Laboratory, Pacific Northwest National Laboratory, Metadata Game Changers, Argonne National Laboratory, Lawrence Livermore National Laboratory, University of Arizona
Primary contact: Email: JoanDamerow at lbl dot gov
Year published: 2021
Volume and issue: 20(1)
Article #: 11
DOI: 10.5334/dsj-2021-011
ISSN: 1683-1470
Distribution license: Creative Commons Attribution 4.0 International
Website: https://datascience.codata.org/articles/10.5334/dsj-2021-011/
Download: https://datascience.codata.org/articles/10.5334/dsj-2021-011/galley/1055/download/ (PDF)

Abstract

Physical samples are foundational entities for research across the biological, Earth, and environmental sciences. Data generated from sample-based analyses are not only the basis of individual studies, but can also be integrated with other data to answer new and broader-scale questions. Ecosystem studies increasingly rely on multidisciplinary team-based science to study climate and environmental changes. While there are widely adopted conventions within certain domains to describe sample data, these have gaps when applied in a multidisciplinary context.

In this study, we reviewed existing practices for identifying, characterizing, and linking related environmental samples. We then tested practicalities of assigning persistent identifiers to samples, with standardized metadata, in a pilot field test involving eight United States Department of Energy projects. Participants collected a variety of sample types, with analyses conducted across multiple facilities. We address terminology gaps for multidisciplinary research and make recommendations for assigning identifiers and metadata that support sample tracking, integration, and reuse. Our goal is to provide a practical approach to sample management, geared towards ecosystem scientists who contribute and reuse sample data.

Keywords: International GeoSample Numbers (IGSN), physical samples, soil, water, plant, leaf, microbial communities, related identifiers, persistent identifiers

Introduction

The study of natural ecosystems requires multidisciplinary science teams to understand and model processes from molecular to global scales.[1] Many research activities involve diverse collections of samples and associated field or laboratory measurements.[2][3] For example, studies of organic matter cycling through plants and soil involve analysis of samples to represent soil biogeochemistry, microbial communities, plant structures, leaf gas exchange, and traits of the specific organisms involved.[4][5][6] Each scientific expert, project team, and discipline has a responsibility to ensure that others can interpret, integrate, and reuse their sample data to help solve emerging problems as our global environment continues to change.[7]

Collaboration across disciplines requires a more unified approach to report basic information about key data entities, such as samples. One challenge in promoting a unified way of reporting sample data is that some research communities have already developed community-specific conventions, including those for omics samples[8][9][10], biodiversity records[11], and geoscience samples.[2][12] A larger challenge is that many researchers use no formal reporting conventions, or exclude information needed to interpret and reuse the data.[13] More coordination is needed across these communities to develop a multidisciplinary reporting format for physical samples that is widely adopted, or to ensure that standards are interoperable. Common reporting would support effective discovery, integration, and reuse of sample data that spans scientific domains.

Sample identifiers are also needed to associate and manage important information describing a sample (i.e., metadata), such as the location, date, environmental context, and purpose of sample collection. For multidisciplinary studies, the task of generating and managing unique sample identifiers and associated metadata can be complicated, particularly as important contextual information is added throughout the data lifecycle.[14] Samples are sent to different collaborators, laboratories, and user facilities, and then combined into a variety of digital records and publications (Figure 1).[15] As a result, scientists face challenges with data management, metadata management, sample tracking, and the ability to integrate and reuse valuable sample data. If left unaddressed, these inefficiencies result in data and metadata loss and inhibit the potential for scientific discovery.


Figure 1. Tracking interdisciplinary samples throughout the cycle of field collection, transport to collaborators and other labs, various analyses, and digital records
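
The tracking problem described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the `SampleRecord` class, its field names, the `lineage` helper, and the identifiers are all hypothetical. It assumes only that each sample record carries a unique identifier and an optional link to the parent sample it was derived from, so that a subsample analyzed at a remote facility can be traced back to the original field collection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SampleRecord:
    """Hypothetical minimal sample record; field names are illustrative."""
    sample_id: str                   # unique identifier for this sample
    material: str                    # e.g. "soil", "water", "leaf"
    held_by: str                     # lab or facility currently holding it
    parent_id: Optional[str] = None  # identifier of the sample it came from


def lineage(sample_id: str, registry: Dict[str, SampleRecord]) -> List[str]:
    """Return identifiers from the given sample back to its root sample."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = sample_id
    while current is not None and current not in seen:
        seen.add(current)            # guard against accidental cycles
        chain.append(current)
        record = registry.get(current)
        current = record.parent_id if record else None
    return chain


# Made-up identifiers: a field core, a subsample, and a DNA extract.
registry = {
    r.sample_id: r
    for r in [
        SampleRecord("EX-0001", "soil", "field team"),
        SampleRecord("EX-0002", "soil", "geochemistry lab", parent_id="EX-0001"),
        SampleRecord("EX-0003", "DNA extract", "sequencing facility", parent_id="EX-0002"),
    ]
}
print(lineage("EX-0003", registry))
```

Keeping the parent link on the child record means a facility that receives only a subsample can still register it without editing the original collector's record.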

Our overall goal was to address the sample identification and metadata needs of ecosystem scientists. This work was driven by the user community of the U.S. Department of Energy’s (DOE’s) data repository for Earth and environmental sciences, the Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE).[16] The DOE’s Environmental Systems Science (ESS) program relies on multidisciplinary, team-based science to study complex processes within terrestrial ecosystems, spanning from the bedrock through the rhizosphere and vegetation to the atmospheric surface layer.[17] This community is well-positioned to help address specific challenges in standardizing and integrating data and metadata about a variety of environmental samples (e.g., soil, water, plant, and associated biological material used for omics analyses), which applies broadly to environmental research.[18][19][20][21][22]

We focus on sample identifiers and metadata that support the FAIR Guiding Principles (findability, accessibility, interoperability, and reusability) from the multidisciplinary domain-science perspective.[23][24][25][26][27] We therefore use a community-focused approach to: (a) evaluate existing options for sample identifiers and metadata descriptions for ecosystem science samples; (b) pilot the process of standardizing sample information to evaluate practical issues from domain-science perspectives; and (c) outline practical recommendations for sample identifier allocation, tracking, and associated metadata.

Methods

Review of existing sample identifiers, metadata conventions, and standards

ESS-DIVE’s work on sample identifiers and metadata began in response to a specific problem that DOE ESS scientists raised during community meetings: tracking multidisciplinary samples as they are sent to different labs and user facilities. As a community-focused data repository, our approach to this issue involved leading or participating in a variety of community discussions on sample identifiers and/or associated metadata. These included:

  • presenting identifier options in an ESS community webinar and whitepaper;
  • engaging in discussion with each pilot test participant;
  • holding several meetings with U.S. DOE user facilities and data systems representatives (Joint Genome Institute, National Microbiome Data Collaborative, Environmental Molecular Sciences Laboratory, and DOE Systems Biology Knowledgebase);
  • participating in broader community meetings on identifier and metadata practices for physical samples (Earth Science Information Partners [ESIP] and Research Data Alliance [RDA]);
  • participating in a National Microbiome Data Collaborative (NMDC) Ontology workshop;
  • participating in a USGS workshop on sample collection metadata for the National Digital Catalogue; and
  • participating in the IGSN 2040 Steering Committee and business planning.

After reviewing the scope and use of available persistent identifier (PID) options (Table 1) and community discussions, we focused additional identifier comparison on International GeoSample Numbers (IGSNs) and Archival Resource Keys (ARKs), which are most commonly used for a variety of sample types (Additional files, Supplemental Table 1). Considerations in the identifier assessment included association with a broader international community focused on sample identification and description, associated metadata to describe samples and their relationships, availability of user-friendly infrastructure to mint identifiers and validate metadata, general ease of use, and other technical identifier characteristics, listed in Additional files, Supplemental Table 1.

Table 1. Examples of PIDs that have been used for samples, modified from Guralnick et al.[28]

ARK = Archival Resource Keys, URN = Uniform Resource Name, URI = Uniform Resource Identifier, DOI = Digital Object Identifier, UUID = Universally Unique Identifier, IGSN = International GeoSample Number, CETAF = Consortium of the European Taxonomic Facilities, RRID = Research Resource Identifier.

Identifier type | Identifier example | Scope
ARK | ark:/12148/btv1b8449691v | Flexible
URN | urn:catalog:UMMZ:Mammals:171041 | Flexible
HTTP URI | http://data.rbge.org.uk/herb/E00115694 | Flexible
DOI | 10.7299/X7VQ32SJ | Flexible, mostly papers and datasets
UUID | EF0A4D3E-702F-4882-81B8-CA737AEB7B28 | Flexible
IGSN | IGSN: IECUR0002 | Geoscience, working to become general physical sample identifier
CETAF URI (based on HTTP URI) | http://data.rbge.org.uk/herb/E00421503 | Species Occurrence, Specimens from CETAF institutions
RRID | RRID:MGI:5630441 | Biomedical Research Resources
BioSample accession number | SAMN03983893 | Biological source materials used in experimental assays
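
The identifier types in Table 1 have recognizably different surface forms, which matters when a project receives sample identifiers from several sources. The sketch below guesses an identifier's type from its string form. It is illustrative only: the regular expressions are assumptions derived from the example strings in Table 1, not from the formal specification of any identifier scheme, and the function name `guess_identifier_type` is hypothetical.

```python
import re
import uuid

# Assumption: patterns are inferred from the Table 1 examples only and are
# looser (or stricter) than the official syntax of each identifier scheme.
PATTERNS = [
    ("ARK", re.compile(r"^ark:/\d+/\S+$")),
    ("URN", re.compile(r"^urn:[A-Za-z0-9-]+:\S+$")),
    ("HTTP URI", re.compile(r"^https?://\S+$")),
    ("RRID", re.compile(r"^RRID:\S+$")),
    ("IGSN", re.compile(r"^IGSN:\s*[A-Z0-9]+$")),
    ("DOI", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("BioSample accession number", re.compile(r"^SAM[NED][A-Z]?\d+$")),
]


def guess_identifier_type(value: str) -> str:
    """Return a best-guess identifier type name, or "unknown"."""
    text = value.strip()
    for name, pattern in PATTERNS:
        if pattern.match(text):
            return name
    try:
        uuid.UUID(text)  # raises ValueError if text is not a valid UUID
        return "UUID"
    except ValueError:
        return "unknown"


for example in [
    "ark:/12148/btv1b8449691v",
    "urn:catalog:UMMZ:Mammals:171041",
    "10.7299/X7VQ32SJ",
    "EF0A4D3E-702F-4882-81B8-CA737AEB7B28",
    "IGSN: IECUR0002",
    "SAMN03983893",
]:
    print(example, "->", guess_identifier_type(example))
```

A CETAF URI cannot be told apart from any other HTTP URI by its form alone, so this sketch reports both as "HTTP URI".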

We also reviewed existing metadata standards and templates that are relevant for samples collected by environmental scientists, including general digital object standards[29][30][31], biodiversity records[11][32], omics (e.g. genomics, metagenomics) material[8][10][33], and geoscience samples[12][34] (see Additional files, Supplemental Table 2). We created a translation table comparing 49 metadata elements (see Additional files, Supplemental Table 3) in human-readable format. The translation table depicts linkages where metadata elements were common across standards, as well as differences.
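
A translation table of this kind can also be expressed in machine-readable form. The following is a minimal sketch under stated assumptions: the three concepts and the term names listed for each standard, including the gap marked `None`, are placeholders chosen for illustration. They are not taken from Supplemental Table 3 and have not been verified against the standards themselves. The `translate` helper is likewise hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative crosswalk: one row per concept, one column per standard.
# None marks a concept assumed to have no direct counterpart in a standard.
# All term names are placeholders and must be checked against each standard.
CROSSWALK: Dict[str, Dict[str, Optional[str]]] = {
    "collection date": {
        "darwin_core": "eventDate",
        "mixs": "collection_date",
        "sesar": "collection_start_date",
    },
    "collector": {
        "darwin_core": "recordedBy",
        "mixs": None,
        "sesar": "collector",
    },
    "sample name": {
        "darwin_core": "materialSampleID",
        "mixs": "samp_name",
        "sesar": "name",
    },
}


def translate(
    record: Dict[str, str], source: str, target: str
) -> Tuple[Dict[str, str], List[str]]:
    """Rename record keys from source terms to target terms.

    Returns the translated record plus the source keys that had no
    counterpart in the target standard (the "differences").
    """
    lookup = {
        row[source]: row[target]
        for row in CROSSWALK.values()
        if row.get(source)
    }
    translated: Dict[str, str] = {}
    unmapped: List[str] = []
    for key, value in record.items():
        new_key = lookup.get(key)
        if new_key is None:
            unmapped.append(key)   # unknown term or known gap
        else:
            translated[new_key] = value
    return translated, unmapped


record = {"eventDate": "2019-07-01", "recordedBy": "J. Doe"}
print(translate(record, "darwin_core", "mixs"))
```

Returning the unmapped keys alongside the translated record keeps the gaps visible instead of silently dropping metadata during conversion.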

The core IGSN Descriptive Metadata Schema[35] includes basic metadata associated with sample collection, which is generally relevant across sample types. This schema links metadata profiles that differ across the six currently functioning IGSN allocating agents. The System For Earth Sample Registration (SESAR; the first allocating agent) has no access restrictions for obtaining IGSNs and provides user-friendly services for sample management.[36] The SESAR metadata profile and controlled terms are currently focused on geoscience samples, but the IGSN organization seeks to accommodate multiple disciplines and has already expanded into plant and other biological samples for some IGSN allocating agents. Our translation table for sample metadata allowed us to identify metadata elements and terms that could be revised or extended within the SESAR profile for improved representation of other sample types (see Additional files, Supplemental Table 3).

Biology-related standards are well established, commonly used in the community, and particularly important for ecosystem science samples. Genomic and metagenomic analyses and data publication require use of standards developed by the Genomic Standards Consortium (GSC)[8], namely Minimum Information about any (x) Sequence (MIxS) and Minimum Information about a Metagenome Sequence (MIMS).[10] Darwin Core is a metadata standard for biodiversity records that has been widely adopted across the biocollections community.[11] It is also required for submitting data to the Global Biodiversity Information Facility (GBIF), which allows global search and integration of biodiversity records.[37][38] GBIF provides a valuable service as a data aggregator, and thus has driven standards adoption, enabling a wide range of data reuse applications in published biodiversity studies[37][39], including over 5,000 known citations from studies using biodiversity records.[40]

We researched ontologies that could be used to describe a broad set of environmental sample types, including the Biological Collections Ontology (BCO)[41], Environment Ontology (ENVO)[42], Population and Community Ontology (PCO)[43], and Plant Ontology (PO)[44] to identify additional or alternate terms to generally describe other types of soil, sediment, water, gas, and biology-related samples.[45]

References

  1. Weart, Spencer (26 February 2013). "Rise of interdisciplinary research on climate". Proceedings of the National Academy of Sciences 110 (Supplement 1): 3657–3664. doi:10.1073/pnas.1107482109. PMC 3586608. PMID 22778431. https://www.pnas.org/content/110/Supplement_1/3657.
  2. Devaraju, A.; Klump, J.; Cox, S.J.D. et al. (1 November 2016). "Representing and publishing physical sample descriptions" (in en). Computers & Geosciences 96: 1–10. doi:10.1016/j.cageo.2016.07.018. ISSN 0098-3004. https://www.sciencedirect.com/science/article/pii/S0098300416302023.
  3. Ponsero, Alise J; Bomhoff, Matthew; Blumberg, Kai; Youens-Clark, Ken; Herz, Nina M; Wood-Charlson, Elisha M; Delong, Edward F; Hurwitz, Bonnie L (31 July 2020). "Planet Microbe: a platform for marine microbiology to discover and analyze interconnected ‘omics and environmental data". Nucleic Acids Research 49 (D1): D792–D802. doi:10.1093/nar/gkaa637. ISSN 0305-1048. PMC 7778950. PMID 32735679. https://academic.oup.com/nar/article/49/D1/D792/5879428.
  4. Cordeiro, Amanda L.; Norby, Richard J.; Andersen, Kelly M.; Valverde-Barrantes, Oscar; Fuchslueger, Lucia; Oblitas, Erick; Hartley, Iain P.; Iversen, Colleen M. et al. (2020). "Fine-root dynamics vary with soil depth and precipitation in a low-nutrient tropical forest in the Central Amazonia" (in en). Plant-Environment Interactions 1 (1): 3–16. doi:10.1002/pei3.10010. ISSN 2575-6265. https://onlinelibrary.wiley.com/doi/abs/10.1002/pei3.10010. 
  5. Malik, Ashish A.; Martiny, Jennifer B. H.; Brodie, Eoin L.; Martiny, Adam C.; Treseder, Kathleen K.; Allison, Steven D. (1 January 2020). "Defining trait-based microbial strategies with consequences for soil carbon cycling under climate change" (in en). The ISME Journal 14 (1): 1–9. doi:10.1038/s41396-019-0510-0. ISSN 1751-7370. PMC 6908601. PMID 31554911. https://www.nature.com/articles/s41396-019-0510-0.
  6. Treseder, Kathleen K.; Balser, Teri C.; Bradford, Mark A.; Brodie, Eoin L.; Dubinsky, Eric A.; Eviner, Valerie T.; Hofmockel, Kirsten S.; Lennon, Jay T. et al. (3 September 2011). "Integrating microbial ecology into ecosystem models: challenges and priorities". Biogeochemistry 109 (1-3): 7–18. doi:10.1007/s10533-011-9636-5. ISSN 0168-2563. http://dx.doi.org/10.1007/s10533-011-9636-5. 
  7. Soranno, Patricia A.; Schimel, David S. (2014). "Macrosystems ecology: big data, big ecology" (in en). Frontiers in Ecology and the Environment 12 (1): 3–3. doi:10.1890/1540-9295-12.1.3. ISSN 1540-9309. https://onlinelibrary.wiley.com/doi/abs/10.1890/1540-9295-12.1.3. 
  8. Field, Dawn; Amaral-Zettler, Linda; Cochrane, Guy; Cole, James R.; Dawyndt, Peter; Garrity, George M.; Gilbert, Jack; Glöckner, Frank Oliver et al. (21 June 2011). "The Genomic Standards Consortium" (in en). PLOS Biology 9 (6): e1001088. doi:10.1371/journal.pbio.1001088. ISSN 1545-7885. PMC 3119656. PMID 21713030. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001088.
  9. Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna et al. (27 October 2014). "The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification". Nucleic Acids Research 43 (D1): D1099–D1106. doi:10.1093/nar/gku950. ISSN 1362-4962. PMC 4384021. PMID 25348402. https://academic.oup.com/nar/article/43/D1/D1099/2439522.
  10. Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn; Knight, Rob; Cole, James R.; Amaral-Zettler, Linda; Gilbert, Jack A.; Karsch-Mizrachi, Ilene et al. (1 May 2011). "Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications" (in en). Nature Biotechnology 29 (5): 415–420. doi:10.1038/nbt.1823. ISSN 1546-1696. PMC 3367316. PMID 21552244. https://www.nature.com/articles/nbt.1823.
  11. Wieczorek, John; Bloom, David; Guralnick, Robert; Blum, Stan; Döring, Markus; Giovanni, Renato; Robertson, Tim; Vieglais, David (6 January 2012). "Darwin Core: An Evolving Community-Developed Biodiversity Data Standard" (in en). PLOS ONE 7 (1): e29715. doi:10.1371/journal.pone.0029715. ISSN 1932-6203. PMC 3253084. PMID 22238640. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029715.
  12. System For Earth Sample Registration (SESAR) (6 February 2020) (in en). SESAR Batch Registration Quick Guide. doi:10.5281/ZENODO.3874923. https://zenodo.org/record/3874923.
  13. Roche, Dominique G.; Kruuk, Loeske E. B.; Lanfear, Robert; Binning, Sandra A. (10 November 2015). "Public Data Archiving in Ecology and Evolution: How Well Are We Doing?" (in en). PLOS Biology 13 (11): e1002295. doi:10.1371/journal.pbio.1002295. ISSN 1545-7885. PMC 4640582. PMID 26556502. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002295.
  14. Treloar, Andrew; Klump, Jens (20 December 2019). "Updating the Data Curation Continuum" (in en). International Journal of Digital Curation 14 (1): 87–101. doi:10.2218/ijdc.v14i1.643. ISSN 1746-8256. http://www.ijdc.net/article/view/643. 
  15. Chase, John H.; Bolyen, Evan; Rideout, Jai Ram; Caporaso, J. Gregory (22 December 2015). "cual-id: Globally Unique, Correctable, and Human-Friendly Sample Identifiers for Comparative Omics Studies" (in EN). mSystems. doi:10.1128/mSystems.00010-15. PMC 5069752. PMID 27822516. https://journals.asm.org/doi/abs/10.1128/mSystems.00010-15.
  16. Varadharajan, C.; Cholia, S.; Snavely, C. et al. (8 January 2019). "Launching an Accessible Archive of Environmental Data" (in en-US). Eos. doi:10.1029/2019eo111263. http://eos.org/science-updates/launching-an-accessible-archive-of-environmental-data. 
  17. Biological and Environmental Research Advisory Committee (2017). "Grand Challenges for Biological and Environmental Research: Progress and Future Vision" (PDF). U.S. Department of Energy. https://genomicscience.energy.gov/BERfiles/BERAC-2017-Grand-Challenges-Report.pdf. 
  18. Chadwick, K. Dana; Brodrick, Philip G.; Grant, Kathleen; Goulden, Tristan; Henderson, Amanda; Falco, Nicola; Wainwright, Haruko; Williams, Kenneth H. et al. (2020). "Integrating airborne remote sensing and field campaigns for ecology and Earth system science" (in en). Methods in Ecology and Evolution 11 (11): 1492–1508. doi:10.1111/2041-210X.13463. ISSN 2041-210X. https://onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.13463. 
  19. Serbin, Shawn P.; Wu, Jin; Ely, Kim S.; Kruger, Eric L.; Townsend, Philip A.; Meng, Ran; Wolfe, Brett T.; Chlus, Adam et al. (2019). "From the Arctic to the tropics: multibiome prediction of leaf mass per area using leaf reflectance" (in en). New Phytologist 224 (4): 1557–1568. doi:10.1111/nph.16123. ISSN 1469-8137. https://onlinelibrary.wiley.com/doi/abs/10.1111/nph.16123. 
  20. Stegen, James C.; Goldman, Amy E. (9 October 2018). "WHONDRS: a Community Resource for Studying Dynamic River Corridors" (in EN). mSystems. doi:10.1128/mSystems.00151-18. PMC PMC6178584. PMID 30320221. https://journals.asm.org/doi/abs/10.1128/mSystems.00151-18. 
  21. Wu, Jin; Rogers, Alistair; Albert, Loren P.; Ely, Kim; Prohaska, Neill; Wolfe, Brett T.; Oliveira, Raimundo Cosme; Saleska, Scott R. et al. (2019). "Leaf reflectance spectroscopy captures variation in carboxylation capacity across species, canopy environment and leaf age in lowland moist tropical forests" (in en). New Phytologist 224 (2): 663–674. doi:10.1111/nph.16029. ISSN 1469-8137. https://onlinelibrary.wiley.com/doi/abs/10.1111/nph.16029. 
  22. Wu, Jin; Serbin, Shawn P.; Ely, Kim S.; Wolfe, Brett T.; Dickman, L. Turin; Grossiord, Charlotte; Michaletz, Sean T.; Collins, Adam D. et al. (2020). "The response of stomatal conductance to seasonal drought in tropical forests" (in en). Global Change Biology 26 (2): 823–839. doi:10.1111/gcb.14820. ISSN 1365-2486. https://onlinelibrary.wiley.com/doi/abs/10.1111/gcb.14820. 
  23. Beck, Marcus W.; O’Hara, Casey; Lowndes, Julia S. Stewart; Mazor, Raphael D.; Theroux, Susanna; Gillett, David J.; Lane, Belize; Gearheart, Gregory (20 July 2020). "The importance of open science for biological assessment of aquatic environments" (in en). PeerJ 8: e9539. doi:10.7717/peerj.9539. ISSN 2167-8359. PMC PMC7377246. PMID 32742805. https://peerj.com/articles/9539. 
  24. Conze, Ronald; Lorenz, Henning; Ulbricht, Damian; Elger, Kirsten; Gorgas, Thomas (25 January 2017). "Utilizing the International Geo Sample Number Concept in Continental Scientific Drilling During ICDP Expedition COSC-1" (in en). Data Science Journal 16: 2. doi:10.5334/dsj-2017-002. ISSN 1683-1470. http://datascience.codata.org/articles/10.5334/dsj-2017-002/. 
  25. Lehnert, Kerstin; Wyborn, Lesley; Klump, Jens (2019). "FAIR Geoscientific Samples and Data Need International Collaboration" (in en). Acta Geologica Sinica - English Edition 93 (S3): 32–33. doi:10.1111/1755-6724.14236. ISSN 1755-6724. https://onlinelibrary.wiley.com/doi/abs/10.1111/1755-6724.14236. 
  26. Stall, Shelley; Yarmey, Lynn; Cutcher-Gershenfeld, Joel; Hanson, Brooks; Lehnert, Kerstin; Nosek, Brian; Parsons, Mark; Robinson, Erin et al. (1 June 2019). "Make scientific data FAIR" (in en). Nature 570 (7759): 27–29. doi:10.1038/d41586-019-01720-7. https://www.nature.com/articles/d41586-019-01720-7. 
  27. Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC PMC4792175. PMID 26978244. https://www.nature.com/articles/sdata201618. 
  28. Guralnick, Robert P.; Cellinese, Nico; Deck, John; Pyle, Richard L.; Kunze, John; Penev, Lyubomir; Walls, Ramona; Hagedorn, Gregor et al. (4 June 2015). "Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data" (in en). ZooKeys 494: 133–154. doi:10.3897/zookeys.494.9352. ISSN 1313-2970. PMC PMC4400380. PMID 25901117. https://zookeys.pensoft.net/article/5042/. 
  29. DataCite Metadata Working Group (2019). DataCite Metadata Schema for the Publication and Citation of Research Data v4.2. Madeleine de Smaele, Amy Hatfield Hart, Jan Ashton, Isabel Bernal Martinez, Stefanie Dietiker, Jannean Elliot. doi:10.5438/RV0G-AV03. http://schema.datacite.org/meta/kernel-4.2/. 
  30. DCMI Usage Board (20 January 2020). "DCMI Metadata Terms". Dublin Core Metadata Initiative. DCMI. https://www.dublincore.org/specifications/dublin-core/dcmi-terms/. Retrieved 16 September 2020. 
  31. Cox, Simon Jonathan David (2011) (in en). ISO 19156:2011 - Geographic information -- Observations and measurements. International Organization for Standardization. doi:10.13140/2.1.1142.3042. http://rgdoi.net/10.13140/2.1.1142.3042. 
  32. Darwin Core Task Group (8 November 2014), "Darwin Core: 2014-11-08", Biodiversity Information Standards (TDWG) (Zenodo), doi:10.5281/zenodo.12694, https://zenodo.org/record/12694 
  33. Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna et al. (27 October 2014). "The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification". Nucleic Acids Research 43 (D1): D1099–D1106. doi:10.1093/nar/gku950. ISSN 1362-4962. PMC PMC4384021. PMID 25348402. https://doi.org/10.1093/nar/gku950. 
  34. System For Earth Sample Registration (SESAR) (17 February 2020) (in en). SESAR XML Schema for samples. doi:10.5281/ZENODO.3875531. https://zenodo.org/record/3875531. 
  35. IGSN (24 August 2017). "IGSN metadata". GitHub. https://github.com/IGSN/metadata. 
  36. "Welcome to SESAR". SESAR. 2021. https://www.geosamples.org/. 
  37. Samy, Gaiji; Chavan, Vishwas; Ariño, Arturo H.; Otegui, Javier; Hobern, Donald; Sood, Rajesh; Robles, Estrella (9 July 2013). "Content assessment of the primary biodiversity data published through GBIF network: Status, challenges and potentials". Biodiversity Informatics 8 (2). doi:10.17161/bi.v8i2.4124. ISSN 1546-9735. http://dx.doi.org/10.17161/bi.v8i2.4124. 
  38. Robertson, Tim; Döring, Markus; Guralnick, Robert; Bloom, David; Wieczorek, John; Braak, Kyle; Otegui, Javier; Russell, Laura et al. (6 August 2014). "The GBIF Integrated Publishing Toolkit: Facilitating the Efficient Publishing of Biodiversity Data on the Internet" (in en). PLOS ONE 9 (8): e102623. doi:10.1371/journal.pone.0102623. ISSN 1932-6203. PMC PMC4123864. PMID 25099149. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0102623. 
  39. Ball-Damerow, Joan E.; Brenskelle, Laura; Barve, Narayani; Soltis, Pamela S.; Sierwald, Petra; Bieler, Rüdiger; LaFrance, Raphael; Ariño, Arturo H. et al. (11 September 2019). "Research applications of primary biodiversity databases in the digital age" (in en). PLOS ONE 14 (9): e0215794. doi:10.1371/journal.pone.0215794. ISSN 1932-6203. PMC PMC6738577. PMID 31509534. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0215794. 
  40. "Global Biodiversity Information Facility". Global Biodiversity Information Facility. 2021. https://www.gbif.org/. 
  41. Walls, Ramona L.; Deck, John; Guralnick, Robert; Baskauf, Steve; Beaman, Reed; Blum, Stanley; Bowers, Shawn; Buttigieg, Pier Luigi et al. (3 March 2014). "Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies" (in en). PLOS ONE 9 (3): e89606. doi:10.1371/journal.pone.0089606. ISSN 1932-6203. PMC PMC3940615. PMID 24595056. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0089606. 
  42. Buttigieg, Pier Luigi; Pafilis, Evangelos; Lewis, Suzanna E.; Schildhauer, Mark P.; Walls, Ramona L.; Mungall, Christopher J. (23 September 2016). "The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation". Journal of Biomedical Semantics 7 (1): 57. doi:10.1186/s13326-016-0097-6. ISSN 2041-1480. PMC PMC5035502. PMID 27664130. https://doi.org/10.1186/s13326-016-0097-6. 
  43. Osumi-Sutherland, D.; Zheng, J.; Buttigieg, P.L. et al. (n.d.). "Population and Community Ontology". https://raw.githubusercontent.com/PopulationAndCommunityOntology/pco/master/pco.owl. 
  44. Avraham, Shulamit; Tung, Chih-Wei; Ilic, Katica; Jaiswal, Pankaj; Kellogg, Elizabeth A.; McCouch, Susan; Pujar, Anuradha; Reiser, Leonore et al. (1 January 2008). "The Plant Ontology Database: a community resource for plant structure and developmental stages controlled vocabulary and annotations". Nucleic Acids Research 36 (suppl_1): D449–D454. doi:10.1093/nar/gkm908. ISSN 0305-1048. PMC PMC2238838. PMID 18194960. https://academic.oup.com/nar/article/36/suppl_1/D449/2507667. 
  45. Damerow, Joan; Varadharajan, Charu; Boye, Kristin; Brodie, Eoin; Burrus, Madison; Chadwick, Dana; Cholia, Shreyas; Crystal-Ornelas, Robert et al. (2020), "ESS-DIVE Global Sample Numbers and Metadata Reporting Format for Environmental Systems Science (IGSN-ESS)" (in en), Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE), doi:10.15485/1660470, https://www.osti.gov/servlets/purl/1660470/. Retrieved 2021-12-07 

Notes

This version is faithful to the original, with only a few minor changes to presentation. Where important information was missing from the references, it was added. The original article lists references in alphabetical order; this version, by design, lists them in order of appearance.