Revision as of 17:18, 17 February 2020

Full article title ROBOT: A Tool for Automating Ontology Workflows
Journal BMC Bioinformatics
Author(s) Jackson, Rebecca C.; Balhoff, James P.; Douglass, Eric; Harris, Nomi L.; Mungall, Christopher J.; Overton, James A.
Author affiliation(s) Knocean, Inc.; University of North Carolina; Lawrence Berkeley National Laboratory
Primary contact Email: via SpringerLink
Year published 2019
Volume and issue 20
Page(s) 407
DOI 10.1186/s12859-019-3002-3
ISSN 1471-2105
Distribution license Creative Commons Attribution 4.0 International
Website https://link.springer.com/article/10.1186/s12859-019-3002-3
Download https://link.springer.com/content/pdf/10.1186/s12859-019-3002-3.pdf (PDF)

Abstract

Background: Ontologies are invaluable in the life sciences, but building and maintaining ontologies often requires a challenging number of distinct tasks such as running automated reasoners and quality control checks, extracting dependencies and application-specific subsets, generating standard reports, and generating release files in multiple formats. Similar to more general software development, automation is the key to executing and managing these tasks effectively and to releasing more robust products in standard forms.

For ontologies using the Web Ontology Language (OWL), the OWL API (application programming interface) Java library is the foundation for a range of software tools, including the Protégé ontology editor. In the Open Biological and Biomedical Ontologies (OBO) community, we recognized the need to package a wide range of low-level OWL API functionality into a library of common higher-level operations and to make those operations available as a command-line tool.

Results: ROBOT (a recursive acronym for “ROBOT is an OBO Tool”) is an open-source library and command-line tool for automating ontology development tasks. The library can be called from any programming language that runs on the Java Virtual Machine (JVM). Most usage is through the command-line tool, which runs on macOS, Linux, and Windows. ROBOT provides commands for a variety of ontology processing tasks, including converting formats, running a reasoner, creating import modules, and running reports. These commands can be combined into larger workflows using a separate task execution system such as GNU Make, and workflows can be automatically executed within continuous integration systems.
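
As a rough illustration of how such commands might be combined, the following Python sketch builds (but does not run) a single chained ROBOT command line. The `merge`, `reason`, and `convert` commands and the `--input`, `--reasoner`, and `--output` options belong to the ROBOT command-line tool; the file names and the helper `build_release_command` are hypothetical and appear here only for illustration. A real project would more likely place the equivalent line in a Makefile rule.

```python
import shlex
from typing import List


def build_release_command(edit_file: str, release_file: str,
                          reasoner: str = "ELK") -> List[str]:
    """Return argv for one chained ROBOT invocation (hypothetical helper).

    ROBOT hands the ontology from each command to the next, so only the
    first command needs --input and only the last needs --output.
    """
    return [
        "robot",
        "merge", "--input", edit_file,         # load the edit file
        "reason", "--reasoner", reasoner,      # run a reasoner over it
        "convert", "--output", release_file,   # write the release file
    ]


if __name__ == "__main__":
    # File names below are assumptions, not taken from the article.
    argv = build_release_command("my-ontology-edit.owl", "my-ontology.obo")
    # Print the shell form; on a machine with ROBOT installed, the list
    # could instead be passed to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Building the argument list separately from executing it keeps the sketch runnable without ROBOT installed and makes the workflow definition easy to inspect.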

Conclusions: ROBOT supports automation of a wide range of ontology development tasks, focusing on OBO conventions. It packages common high-level ontology development functionality into a convenient library and makes it easy to configure, combine, and execute individual tasks in comprehensive, automated workflows. This helps ontology developers to efficiently create, maintain, and release high-quality ontologies so they can spend more time focusing on development tasks. It also helps guarantee released ontologies are free of certain types of logical errors and conform to standard quality control checks, increasing the overall robustness and efficiency of the ontology development lifecycle.
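
The idea of a workflow that blocks a release when a check fails can be sketched in a few lines of Python. This is a minimal sketch, assuming each quality control step signals failure through a non-zero exit status, which is the usual convention for command-line tools and the one continuous integration systems rely on. The helpers `run_steps` and `stand_in` are hypothetical, and the placeholder commands stand in for real reasoning or reporting steps so that the example runs without ROBOT installed.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[List[str]]) -> int:
    """Run each command in order; stop at the first non-zero exit code."""
    for argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            print("step failed: " + " ".join(argv), file=sys.stderr)
            return result.returncode
    return 0


def stand_in(exit_code: int) -> List[str]:
    """Placeholder command that exits with the given code (assumption:
    a real workflow would list reasoning and report commands instead)."""
    return [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]


if __name__ == "__main__":
    # Two passing placeholder steps; the script exits with status 0.
    sys.exit(run_steps([stand_in(0), stand_in(0)]))
```

Because later steps never run once an earlier one fails, a release file would only be produced when every preceding check has passed.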

Keywords: ontology development, automation, ontology release, reasoning, workflows, quality control, import management

Background

Ontologies are vital parts of the informatics ecosystem, supporting life science research, enabling analysis of high-throughput datasets, data standardization and integration, search, and discovery. However, there is a lack of tools supporting the complete ontology development lifecycle, especially when compared with the software development lifecycle. This has resulted in many groups developing their own ad hoc ontology development workflows, often with time-consuming and inefficient manual steps. In some cases, groups release ontologies without any kind of systematic workflow or quality control process, which can result in errors or problems with downstream applications or analyses.

Noy et al. (2010) describe a general ontology lifecycle, with a focus on bio-ontologies.[1] First, requirements for the ontology are gathered. Then, the ontology is collaboratively developed in an ontology editor such as Protégé.[2] Once the requirements have been fulfilled, the ontology is published and feedback is solicited from the community. Feedback is integrated back into development, and the ontology is continuously updated and released. At any point after the initial publication, the ontology may be deployed in other applications.

In broad strokes, this ontology development lifecycle reflects much of our experience of ontology development in the Open Biological and Biomedical Ontologies (OBO) community[3], circa 2010. A wide range of Semantic Web-based software exists to support these steps, including many tools for Web Ontology Language (OWL) ontology development. In practice, though, the OBO community has relied predominantly on the free and open-source Protégé OWL editor for manual editing and conversion, and on a small set of other tools supporting OBO conventions.


References

1. Noy, N.; Tudorache, T.; Nyulas, C. et al. (2010). "The ontology life cycle: Integrated tools for editing, publishing, peer review, and evolution of ontologies". AMIA Annual Symposium Proceedings 2010: 552–6. PMID 21347039. PMC PMC3041389.
2. Horridge, M.; Tsarkov, D. (2006). "Supporting early adoption of OWL 1.1 with Protégé-OWL and FaCT++". OWL: Experiences and Directions 2006: 1–7. http://webont.org/owled/2006/accepted06.html
3. Smith, B.; Ashburner, M.; Rosse, C. et al. (2007). "The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration". Nature Biotechnology 25 (11): 1251–5. doi:10.1038/nbt1346. PMID 17989687. PMC PMC2814061.

Notes

This presentation is faithful to the original, with only a few minor changes to presentation, spelling, and grammar. We also added PMCID and DOI when they were missing from the original reference.