Journal:Using interactive digital notebooks for bioscience and informatics education

From LIMSWiki
Revision as of 19:53, 22 May 2021 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title Using interactive digital notebooks for bioscience and informatics education
Journal PLOS Computational Biology
Author(s) Davies, Alan; Hooley, Frances; Causey-Freeman, Peter; Eleftheriou, Iliada; Moulton, Georgina
Author affiliation(s) University of Manchester
Primary contact Email: alan dot davies-2 at manchester dot ac dot uk
Editors Ouellette, Francis
Year published 2020
Volume and issue 16(11)
Article # e1008326
DOI 10.1371/journal.pcbi.1008326
ISSN 1553-734X
Distribution license Creative Commons Attribution 4.0 International
Website https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008326
Download https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1008326 (PDF)

Abstract

Interactive digital notebooks provide an opportunity for researchers and educators to carry out data analysis and report results in a single digital format. Beyond simply being digital, the format supports rich content that interacts with the code and data contained in the notebook to form an educational narrative. This primer introduces some of the fundamental aspects involved in using Jupyter Notebook in an educational setting for teaching in the bioinformatics and health informatics disciplines. We also provide two case studies that detail (1) how we used Jupyter Notebooks to teach non-coders programming skills on a blended master’s degree module for a health informatics program, and (2) how we used them in a fully online distance learning unit on programming for a postgraduate certificate (PG Cert) in clinical bioinformatics, with a more technical audience.

Keywords: bioinformatics, health informatics, programming, data analysis, Jupyter Notebook, education

Introduction

Universities and other higher education institutions are under increasing pressure to provide more online and distance learning courses, and to deliver them cost-effectively and rapidly.[1] This increase in demand comes partly from students who want more flexible study options than traditional higher education course delivery offers, so that they can study around employment and family commitments. It is also driven by financial considerations that allow higher education institutions to scale course delivery while managing infrastructural provision (e.g., access to rooms for teaching and limited capacity for face-to-face delivery).[2] To meet this challenge, we require tools that cater for students with varying levels of digital literacy and that reduce the burden of downloading and installing software, which requires support that is more difficult to provide at a distance. This is further complicated when students use managed equipment (e.g., National Health Service [NHS] employees) and may not have administrator rights to install software.

Digital notebooks provided us with a way of meeting these needs, as they are easy to set up, straightforward to use, and can support rich and interactive content. Here, we present a primer on how to use digital notebooks (specifically Jupyter Notebooks) for teaching and assessment, along with details of two case studies where we used notebooks to teach Python programming and database skills for clinical bioinformatics and health informatics students of varying levels of technical experience. The case studies and methods presented can be applied to both distance learning and face-to-face teaching scenarios.

We will start by covering what a Jupyter Notebook is along with the different “cell” types available. We then look at how they can be run and enhanced with extensions to add items like exercise tasks and other interactivity before looking at how they can be used in assessment. Next, we present two case studies where we have applied notebooks to teach different groups of students to give some examples of the different contexts they can be used in. Finally, we end with a discussion to synthesise our experiences of using notebooks to educate students and their further potential, with considerations for education.

What is a Jupyter Notebook?

Jupyter Notebook is an open-source web application that runs in an internet browser. It allows the sharing of code, data analysis, data visualizations (which can be interactive), math formulas, and other embedded media (e.g., YouTube videos, images, and web links), all in a single document combining interactive and narrative components. This takes the form of a document that is composed of multiple cells that encapsulate the content of the notebook (Figure 1).


Fig. 1 A new Python 3 notebook with three empty cells denoted by the grey rectangles. The currently selected cell is highlighted in green.

Jupyter notebooks were created by Project Jupyter, which, according to its website, “exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages.”[3] This includes various standards for interactive computing, including the notebook document format, which is based on JavaScript Object Notation (JSON). The name Jupyter is derived from the three languages initially supported: Julia, Python, and R.[4]
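
Because the notebook document format is based on JSON, a notebook file can be inspected with ordinary JSON tooling. The following is a minimal sketch that builds a notebook-like structure by hand with Python's standard `json` module. The field names reflect version 4 of the notebook format as we understand it; the `build_minimal_notebook` helper and the cell contents are hypothetical and are not taken from the original article.

```python
import json


def build_minimal_notebook():
    """Return a dict shaped like a version 4 notebook with two cells.

    The helper name and the cell contents are illustrative assumptions.
    """
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": ["# Week 1\n", "Introduction to variables."],
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # None until the cell has been run
        "metadata": {},
        "outputs": [],
        "source": ["print('Hello, notebook')"],
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = build_minimal_notebook()
# Serialise to JSON text, as it would be stored in an .ipynb file.
notebook_text = json.dumps(notebook, indent=1)

# Reading the JSON back lets us list the cell types in document order.
cell_types = [cell["cell_type"] for cell in json.loads(notebook_text)["cells"]]
print(cell_types)
```

In practice, notebooks are created and saved through the Jupyter interface rather than written by hand; the sketch only shows that the stored document is plain JSON.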

Anatomy of a notebook

Jupyter notebooks currently support over 40 different programming languages.[3] These include languages that are popular for data science, such as Julia, Python, and R (Figure 2).


Fig. 2 A simple function that returns the value of the sum of two numbers showing different kernels (programming languages) in the notebooks; this example shows Python (left), Julia (middle), and R (right).

The notebooks are made up of units called “cells” that can be executed (run) in order to render their contents in different ways.

Cell types

There are two principal cell types. The first is the “Markdown” cell, which is used to present text, images, equations, and other resources. The second is the “code” cell, which allows the user to enter code written in a chosen programming language that will execute in the notebook. To execute the contents of any cell, the user can press the Shift and Enter keys together, or alternatively click on the “Run” button in the main menu bar across the top of the screen. Running a code cell executes the code in the cell and displays any output immediately below it. This is indicated by the “In” and “Out” labels located to the left of the cells, as seen in Figure 2.
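
As an illustration, a Python code cell similar to the one described for Figure 2 might contain the following. The function name `add_numbers` and the argument values are assumptions made for this sketch; they are not copied from the figure.

```python
def add_numbers(first, second):
    """Return the sum of two numbers."""
    return first + second


# In a notebook, the value of the last expression in a code cell is shown
# beside the "Out" label. print() is used here so the sketch also works
# when run as a plain script.
result = add_numbers(2, 3)
print(result)
```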

Styling cells

Markdown cells can be styled with Markdown, a lightweight markup language for formatting text.[5] When the cell is run, the Markdown text is converted into HTML (Figure 3).


Fig. 3 Example of a Markdown cell (left) and the output of the styled cell when the cell is run (right).
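
To make the idea of turning Markdown text into HTML concrete, the sketch below converts a very small subset of Markdown (headings, bold, and italic) using regular expressions. It is a deliberately simplified, hypothetical helper written for illustration; it is not how Jupyter renders Markdown, and the name `toy_markdown_to_html` and the sample text are assumptions.

```python
import html
import re


def toy_markdown_to_html(text):
    """Convert a tiny subset of Markdown (headings, bold, italic) to HTML.

    Hypothetical helper for illustration only; not Jupyter's renderer.
    """
    html_lines = []
    for line in text.splitlines():
        heading = re.match(r"(#{1,6}) (.+)", line)
        if heading:
            # One to six leading '#' characters select the heading level.
            tag = f"h{len(heading.group(1))}"
            body = heading.group(2)
        elif line.strip():
            tag, body = "p", line
        else:
            continue  # skip blank lines
        body = html.escape(body, quote=False)
        body = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", body)  # **bold**
        body = re.sub(r"\*(.+?)\*", r"<em>\1</em>", body)  # *italic*
        html_lines.append(f"<{tag}>{body}</{tag}>")
    return "\n".join(html_lines)


sample = "# Heading\nSome **bold** and *italic* text."
print(toy_markdown_to_html(sample))
```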

Markdown cells can also display plain text as output with no styling. Another useful feature for teaching math-based courses or sharing formulas is the integration of LaTeX support. LaTeX is a popular document preparation system[6] built on the TeX typesetting language, which was originally developed by the American computer scientist Donald Knuth.[6] LaTeX is widely used by the scientific community (e.g., computer scientists) to write academic publications (journal and conference papers). LaTeX math notation can be added to Markdown cells to display formulas using common math notation. For example, the code below produces the output seen in Figure 4:

$$
\sigma = \sqrt{\frac{1}{N}\sum_{i = 1}^{N} (x_i-\mu)^2}
$$


Fig. 4 Output of LaTeX math notation producing the formula for the population standard deviation.
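
The same formula can be evaluated in a code cell. The sketch below is one possible implementation that follows the formula term by term and cross-checks the result against `statistics.pstdev` from Python's standard library. The function name `population_std` and the data values are made up for illustration.

```python
import math
import statistics


def population_std(values):
    """Population standard deviation, following the formula term by term."""
    n = len(values)  # N, the number of values
    mu = sum(values) / n  # the population mean
    squared_deviations = sum((x - mu) ** 2 for x in values)
    return math.sqrt(squared_deviations / n)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values
sigma = population_std(data)
print(sigma)

# Cross-check against the standard library implementation.
print(math.isclose(sigma, statistics.pstdev(data)))
```

For these values the mean is 5 and the mean squared deviation is 4, so the population standard deviation is 2.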


References

  1. Gregory, J.; Salmon, G. (2013). "Professional development for online university teaching". Distance Education 34 (3): 256–70. doi:10.1080/01587919.2013.835771. 
  2. Georgina, D.A.; Olson, M.R. (2008). "Integration of technology in higher education: A review of faculty self-perceptions". The Internet and Higher Education 11 (1): 1–8. doi:10.1016/j.iheduc.2007.11.002. 
  3. "Jupyter". Project Jupyter. https://jupyter.org/. Retrieved 27 January 2020.
  4. Richardson, M.L.; Amini, B. "Scientific Notebook Software: Applications for Academic Radiology". Current Problems in Diagnostic Radiology 47 (6): 368–77. doi:10.1067/j.cpradiol.2017.09.005. PMID 29122394.
  5. Cone, M. "Markdown Guide". https://www.markdownguide.org/. Retrieved 23 September 2019.

Notes

This presentation attempts to remain faithful to the original, with only a few minor changes to presentation. Grammar and punctuation have been updated where reasonable to improve readability. In some cases important information was missing from the references, and that information was added.