Journal:Establishing a common nutritional vocabulary: From food production to diet

From LIMSWiki
Revision as of 17:49, 17 September 2022 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title: Establishing a common nutritional vocabulary: From food production to diet
Journal: Frontiers in Nutrition
Author(s): Andrés-Hernández, Liliana; Blumberg, Kai; Walls, Ramona L.; Dooley, Damion; Mauleon, Ramil; Lange, Matthew; Weber, Magalie; Chan, Lauren; Malik, Adnan; Møller, Anders; Ireland, Jayne; Segovia, Lucia; Zhang, Xuhuiqun; Burton-Freeman, Britt; Magelli, Paul; Schriever, Andrew; Forester, Shavawn M.; Liu, Lei; King, Graham J.
Author affiliation(s): Southern Cross University, University of Arizona, Critical Path Institute, Simon Fraser University, IC-FOODS, INRAE BIA, Oregon State University, EMBL-EBI, Danish Food Informatics, University of London, Illinois Institute of Technology, WISEcode LLC, Nutrient Institute LLC, University of Nottingham
Primary contact: graham dot king at scu dot edu dot au
Editors: Harsa, Hayriye S.
Year published: 2022
Volume and issue: 9
Article #: 928837
DOI: 10.3389/fnut.2022.928837
ISSN: 2296-861X
Distribution license: Creative Commons Attribution 4.0 International
Website: https://www.frontiersin.org/articles/10.3389/fnut.2022.928837/full
Download: https://www.frontiersin.org/articles/10.3389/fnut.2022.928837/pdf (PDF)

Abstract

Informed policy and decision-making for food systems, nutritional security, and global health would benefit from standardization and comparison of food composition data, spanning production to consumption. To address this challenge, we present a formal controlled vocabulary of terms, definitions, and relationships within the Compositional Dietary Nutrition Ontology (CDNO) that enables description of nutritional attributes for material entities contributing to the human diet. We demonstrate how ongoing community development of CDNO classes can harmonize trans-disciplinary approaches for describing nutritional components from food production to diet.

Keywords: dietary composition, food composition, ontologies, nutritional security, FAIR data, knowledge representation, human health

Introduction

Food production and supply systems affect human nutrition and health in personalized and global contexts. [1] However, nutrition-based decisions and data are seldom integrated along the production and supply chain. This information may affect selection of cultivars and conservation of genetic resources, the management of food supply, processing and distribution, and analysis of dietary consumption patterns segmented by various demographics. [2] Although various conventions exist for naming individual chemicals and physical attributes of dietary components, comparison of data and feedback within food systems is often constrained by divergence in formal definitions and classifications. [3] The exchange of knowledge and operational data between domains would benefit from a consistent framework that defines nutritional and phytochemical composition, as well as other attributes of food, including their dietary role and physiological function.

Knowledge representation underpins communication, and it is particularly important for sharing complex data and information within and between diverse domains such as crop biodiversity, food supply, and nutrition. [4] Defining and classifying commonly understood terminology facilitates data acquisition, exchange, and interoperability, where formal systems of domain-specific controlled vocabularies such as ontologies contribute to the representation and sharing of complex knowledge. [5] They do this by defining terms with human-readable definitions alongside machine-readable relationships that facilitate the annotation, exchange, analysis, and interpretation of data. [6] Establishment of clearly defined ontology classes representing domain-specific terminology is the first step to building common platforms that are of practical value to data curators and to end-users searching for relevant information. An approachable lexical representation of objects or concepts from different perspectives, which also helps reduce ambiguities in terminology for non-specialists, is particularly important for describing datasets in food supply chains [7] (Supplementary Figure 1). For instance, nutritional composition may vary depending on factors such as cultivars, cultivation systems, processing variables, food storage, and preparation. Moreover, there is a need to distinguish between individual chemical components and the method by which their concentration is determined. In many standard food composition tables (FCTs) and databases (FCDBs), such information is often conflated or absent. [8]

The Open Biological and Biomedical Ontologies (OBO) Foundry and Library is responsible for the establishment and development of a wide range of formal vocabularies in the life sciences and related domains. [9] The OBO includes the ontology for Chemical Entities of Biological Interest (ChEBI) [10], which provides a valuable resource for structured sets of chemical definitions. OBO principles emphasize the value of reusing terms (formally known as "classes" or "properties") between ontologies. From these seeds the Compositional Dietary Nutrition Ontology (CDNO) [4] was born.

The development of the CDNO [4] was prompted by the need to follow the FAIR principles (findable, accessible, interoperable, and reusable) [11] of data sharing. CDNO was initially focused on vocabulary to describe nutritional components in plant-derived materials contributing to human diet, and particularly those that may vary according to crop variety or within genetic resource collections. [2, 4] However, we found that the structured reusable definitions of nutritional components were equally applicable to a wide range of food raw materials derived from livestock, fish, or any other organic or inorganic source described in the Food Ontology (FoodOn) [12] (Figure 1B).



Figure 1. Compositional Dietary Nutrition Ontology (CDNO) class relationships and interaction with FoodOn. (A) Relationships and associations between major ontology classes. Solid symbols (circles and triangle) represent class hierarchies that may be used individually or in combination by curators to annotate datasets in the continuum between agriculture and health. Many terms within the "dietary nutritional component" (blue solid circle) hierarchy are imported and reused from ChEBI (purple arrow). Grey arrows indicate relationships between independent classes that may be adopted in the future where evidence is available. The "dietary nutritional component" [CDNO:0000001] class provides a framework whose terms are reused in the "concentration of dietary nutritional component in material entity" [CDNO:0200001] class hierarchy (green solid circle). The latter class is kept distinct from an independent "analytical methods" class (Figure 1, blue triangle), which provides vocabulary to describe analytical methods; terms from the two classes would be used in combination to represent relevant metadata (Figure 1, green dotted arrow). The "dietary material physical attribute" class [CDNO:0400001] (cream solid circle) provides structured subclasses to describe properties that may either inhere in a food material or be associated with a specific "dietary nutritional component" [CDNO:0000001]. The "nutritional functional attribute" [CDNO:0300001] class hierarchy (pink solid circle) allows the description of quantifiable functional attributes that may be associated with or inhere in terms from the "dietary nutritional component" [CDNO:0000001] class hierarchy. Where evidence is available, terms from this class may also be associated with a human dietary role (Figure 1, pink dotted line). The "human dietary role" [CDNO:0500001] class (orange solid circle) includes structured terms representing biological roles that may be assigned to a specific "dietary nutritional component" [CDNO:0000001], where it is left to experts and data curators to assign supporting evidence that indicates a function defined at the levels of molecular interaction, cellular process, or physiological role. (B) The interaction between CDNO and FoodOn is shown with a purple double arrow. FoodOn reuses ~500 terms from the CDNO "dietary nutritional component" [CDNO:0000001] hierarchy within the "chemical food component" [FOODON:03411041] hierarchy (cyan solid circle). The FoodOn "food product by organism" [FOODON:00002381] class (olive solid circle) is not directly associated with CDNO classes, but can be used to describe a food source. These represent independent classes that data curators may combine in a relational, RDF, or graph database to annotate datasets and to extract information based on particular evidence that may require annotation.
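As an illustration of how a curator might combine these independent classes in a single record, the following minimal Python sketch keeps the food source, the nutritional component, the concentration class, and the analytical method as separate fields. The record layout, the field names, and the numeric value are assumptions made for illustration only; they are not part of CDNO or FoodOn. The identifiers are the top-level classes named in the caption, used here as placeholders for more specific terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompositionAnnotation:
    """Hypothetical annotation record; the layout is illustrative, not a CDNO schema."""
    food_source: str                  # FoodOn term describing the food source
    component: str                    # term from the "dietary nutritional component" hierarchy
    concentration_class: str          # term from the concentration class hierarchy
    analytical_method: Optional[str]  # kept separate from the component; None if unreported
    value: float                      # measured value (made-up number below)
    unit: str


def has_method(record: CompositionAnnotation) -> bool:
    """Report whether the analytical method was recorded for this measurement."""
    return record.analytical_method is not None


# Placeholder record built from the top-level classes named in Figure 1.
record = CompositionAnnotation(
    food_source="FOODON:00002381",       # "food product by organism"
    component="CDNO:0000001",            # "dietary nutritional component"
    concentration_class="CDNO:0200001",  # "concentration of ... in material entity"
    analytical_method=None,              # method not reported in this example
    value=1.5,
    unit="mg/100 g",
)
```

Keeping the method in its own field means a dataset with an unreported method can still be annotated, while the gap remains visible rather than being conflated with the component itself.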

Methods

While developing and expanding CDNO, we have followed the OBO principles [13], which emphasize community development of interoperable ontologies. We focused on reuse and import of existing OBO terms, as well as ensuring open discussion within the CDNO GitHub repository. [14] In order to generate terms that are subclasses of CDNO "dietary nutritional component" [CDNO:0000001], a modified version of the Crop Dietary Nutrition Data Framework (CDN-DF) v.1.0 by Halimi et al. [15] was used, with definition and organization of additional terms arising from discussions with plant chemist domain specialists and curators from the International Network of Food Data Systems (INFOODS) collated by the Food and Agriculture Organization (FAO) [16], USDA FoodData Central [17], and the European Food Information Resource (EuroFIR) [18] food composition databases and repositories. The CDN-DF v.2.0 was used as an input for a Python script that parsed the CDN-DF_v.2.0.xlsx into the nutritional_components_framework.csv and sugar_derivatives.csv files, which were converted into input files for ROBOT templates. These templates were used to generate a revised organization of classes/terms compiled into the reference CDNO in a Web Ontology Language (OWL) [19] file. Dietary nutritional components not present in ChEBI were proposed and accepted as new entities using the ChEBI submission tool and imported into CDNO. The remaining terms that did not fit within the ChEBI scope were formally defined in CDNO, supported by reference to peer-reviewed literature and authoritative online resources. These terms were described by following existing ontology definition guidelines for development of genus-differentia definitions. [20] The class "concentration of dietary nutritional component in material entity" [CDNO:0200001], as well as its subclasses, was created using a Dead Simple OWL Design Pattern (DOS-DP) [21] modified from The Environment Ontology (ENVO). [22, 23] The DOS-DP combined terms from the Phenotype and Trait Ontology (PATO) [24], CDNO, and the Basic Formal Ontology (BFO) [25] with OWL equivalence axioms. The remaining major classes were proposed and discussed via the CDNO GitHub issue forum [14] and in online workshops and seminars.
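The conversion of tabular framework rows into a ROBOT template input can be sketched as follows. This is a minimal illustration and not the authors' script: the input rows, the placeholder term ID and label, and the column layout are assumptions. The second row of the output uses template strings that, to our understanding, follow ROBOT's documented conventions (`ID`, `LABEL`, `SC %` for a superclass, `A` followed by an annotation property); these should be checked against the ROBOT documentation before use.

```python
import csv
import io

# Hypothetical rows standing in for nutritional_components_framework.csv.
# The term ID, label, and definition are placeholders, not real CDNO terms;
# only the parent CDNO:0000001 ("dietary nutritional component") is from the text.
FRAMEWORK_ROWS = [
    {
        "id": "CDNO:9999001",
        "label": "example nutritional component",
        "parent": "CDNO:0000001",
        "definition": "A dietary nutritional component that serves as a placeholder.",
    },
]

# Row 1: human-readable column names. Row 2: ROBOT template strings
# (assumed here; IAO:0000115 is the OBO annotation property for "definition").
HEADER = ["ID", "Label", "Parent", "Definition"]
TEMPLATE_STRINGS = ["ID", "LABEL", "SC %", "A IAO:0000115"]


def to_robot_template(rows):
    """Serialize framework rows as CSV text in the two-header-row template layout."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerow(TEMPLATE_STRINGS)
    for row in rows:
        writer.writerow([row["id"], row["label"], row["parent"], row["definition"]])
    return buffer.getvalue()


template_csv = to_robot_template(FRAMEWORK_ROWS)
```

A file written from `template_csv` would then be passed to ROBOT's `template` command to generate OWL classes. The placeholder definition also follows the genus-differentia form: it names the parent class first and then states what distinguishes the term.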

The CDNO ontology and accompanying code were initially created using the Ontology Development Kit (ODK) [26], and later versions of CDNO were developed using the templates module from the ROBOT software. [27] The reference CDNO OWL file and the source code are available from the CDNO GitHub repository. [14] Additional database tables were added to the core CropStoreDB MySQL schema [28] to manage different nutritional data sources, along with an "ontology register" lookup table to CDNO, FoodOn, ChEBI, NCBI taxon [29] and Plant Ontology (PO) [30] terms.
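The role of an "ontology register" lookup table can be sketched as below. The actual CropStoreDB schema is not described here, so the table name, its columns, and the `resolve` helper are all hypothetical, and Python's built-in `sqlite3` module stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical register: one row per ontology, keyed by its identifier prefix.
conn.execute(
    """
    CREATE TABLE ontology_register (
        prefix        TEXT PRIMARY KEY,
        ontology_name TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO ontology_register (prefix, ontology_name) VALUES (?, ?)",
    [
        ("CDNO", "Compositional Dietary Nutrition Ontology"),
        ("FOODON", "Food Ontology"),
        ("CHEBI", "Chemical Entities of Biological Interest"),
        ("NCBITaxon", "NCBI taxon"),
        ("PO", "Plant Ontology"),
    ],
)


def resolve(curie):
    """Return the registered ontology name for a prefixed ID, or None if unknown."""
    prefix, _, _local_id = curie.partition(":")
    row = conn.execute(
        "SELECT ontology_name FROM ontology_register WHERE prefix = ?",
        (prefix,),
    ).fetchone()
    return row[0] if row else None
```

With such a table, a nutritional data row that stores an identifier like `CDNO:0000001` can be traced back to its source ontology with a single lookup, and identifiers with unregistered prefixes can be flagged during curation.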

Results and discussion

References

Notes

This presentation is faithful to the original, with only a few minor changes to presentation and updates to spelling and grammar. In some cases important information was missing from the references, and that information was added.