Difference between revisions of "Journal:University-level practical activities in bioinformatics benefit voluntary groups of pupils in the last 2 years of school"

From LIMSWiki


DNA sequences and related data are available at low cost (for new sequencing work) or free in online databases such as GenBank (Benson et al.<ref name="BensonGen">{{cite journal |title=GenBank |journal=Nucleic Acids Research |author=Benson, D.A.; Clark, K.; Karsch-Mizrachi, I.; Lipman, D.J.; Ostell, J.; Sayers, E.W. |volume=43 |issue=Database issue |pages=D30–D35 |year=2015 |doi=10.1093/nar/gku1216 |pmc=PMC4383990}}</ref>), Ensembl (Cunningham et al.<ref name="CunninghamEns">{{cite journal |title=Ensembl 2015 |journal=Nucleic Acids Research |author=Cunningham, F.; Amode, M.R.; Barrell, D., et al. |volume=43 |issue=Database issue |pages=D662-D669 |year=2015 |doi=10.1093/nar/gku1010}}</ref>) and hundreds of others (Galperin et al.<ref name="GalperinThe2015">{{cite journal |title=The 2015 ''Nucleic Acids Research'' Database Issue and Molecular Biology Database Collection |journal=Nucleic Acids Research |author=Galperin, M.Y.; Rigden, D.J.; Fernández-Suárez, X.M. |volume=43 |issue=Database issue |pages=D1-D5 |year=2015 |doi=10.1093/nar/gku1241 |pmid=25593347 |pmc=PMC4383995}}</ref>). Software for bioinformatics research is usually free, for example the very widely used sequence database search software, BLAST (Altschul et al.<ref name="AltschulGap">{{cite journal |title=Gapped BLAST and PSI-BLAST: a new generation of protein database search programs |journal=Nucleic Acids Research |author=Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. |volume=25 |issue=17 |pages=3389-3402 |year=1997 |doi=10.1093/nar/25.17.3389 |pmid=9254694 |pmc=PMC146917}}</ref>). Free resources are also available for bioinformatics teaching and learning, for example 4273''π'' (Barker et al.<ref name="Barker4273n">{{cite journal |title=4273''π'': Bioinformatics education on low cost ARM hardware |journal=BMC Bioinformatics |author=Barker, D.; Ferrier, D.E.K.; Holland, P.W.H.; Mitchell, J.B.O.; Plaisier, H.; Ritchie, M.G.; Smart, S.D. 
|volume=14 |pages=243 |year=2013 |doi=10.1186/1471-2105-14-243 |pmid=23937194 |pmc=PMC3751261}}</ref>), Bioinformática na escola (Marques et al.<ref name="MarquesBio">{{cite journal |title=Bioinformatics Projects Supporting Life-Sciences Learning in High Schools |journal=PLOS Computational Biology |author=Marques, I.; Almeida, P.; Alves, R.; João Dias, M.; Godinho, A.; Pereira-Leal, J.B. |volume=10 |issue=1 |pages=e1003404 |year=2014 |doi=10.1371/journal.pcbi.1003404 |pmc=PMC3900377}}</ref>), GOBLET (Corpas et al.<ref name="CorpasTheGOB">{{cite journal |title=The GOBLET training portal: a global repository of bioinformatics training materials, courses and trainers |journal=Bioinformatics |author=Corpas, M.; Jimenez, R.C.; Bongcam-Rudloff, E.; et al. |volume=31 |issue=1 |pages=140–142 |year=2015 |doi=10.1093/bioinformatics/btu601 |pmid=25189782 |pmc=PMC4271145}}</ref>), [http://www.nbic.nl/nl/education/high-school-programmes/bioinformaticsschool Bioinformatics@school] and the [http://evoed.evolutionsociety.org EvoEd Digital Library]. These publicly available data, software and materials present excellent opportunities for relatively low-cost teaching.  
There has been a recent, encouraging increase in exposure of school pupils to bioinformatics (e.g. Gallagher et al.<ref name="GallagherAFirst">{{cite journal |title=A first attempt to bring computational biology into advanced high school biology classrooms |journal=PLOS Computational Biology |author=Gallagher, S.R.; Coon, W.; Donley, K.; Scott, A.; Goldberg, D.S. |volume=7 |issue=10 |pages=e1002244 |year=2011 |doi=10.1371/journal.pcbi.1002244 |pmid=22046118 |pmc=PMC3203055}}</ref>; Lewitter and Bourne<ref name="LewitterTeach">{{cite journal |title=Teaching bioinformatics at the secondary school level |journal=PLOS Computational Biology |author=Lewitter, F.; Bourne, P.E. |volume=7 |issue=10 |pages=e1002242 |year=2011 |doi=10.1371/journal.pcbi.1002242 |pmc=PMC3203059}}</ref>; McQueen et al.<ref name="McQueenDesign">{{cite journal |title=Design and implementation of a genomics field trip program aimed at secondary school students |journal=PLOS Computational Biology |author=McQueen, J.; Wright, J.J.; Fox, J.A. |volume=8 |issue=8 |pages=e1002636 |year=2012 |doi=10.1371/journal.pcbi.1002636 |pmid=22956895 |pmc=PMC3431290}}</ref>; Kovarik et al.<ref name="KovarikBio">{{cite journal |title=Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers |journal=CBE – Life Sciences Education |author=Kovarik, D.N.; Patterson, D.G.; Cohen, C.; Sanders, E.A.; Peterson, K.A.; Porter, S.G.; Chowning, J.T. |volume=12 |issue=3 |pages=441–59 |year=2013 |doi=10.1187/cbe.12-11-0193 |pmid=24006393 |pmc=PMC3763012}}</ref>; Machluf and Yarden<ref name="MachlufInt">{{cite journal |title=Integrating bioinformatics into senior high school: design principles and implications |journal=Briefings in Bioinformatics |author=Machluf, Y.; Yarden, A. 
|volume=14 |issue=5 |pages=648-60 |year=2013 |doi=10.1093/bib/bbt030 |pmid=23665511}}</ref>; Wood and Gebhardt<ref name="WoodBio">{{cite journal |title=Bioinformatics goes to school – new avenues for teaching contemporary biology |journal=PLOS Computational Biology |author=Wood, L.; Gebhardt, P. |volume=9 |issue=6 |pages=e1003089 |year=2013 |doi=10.1371/journal.pcbi.1003089 |pmid=23785266 |pmc=PMC3681668}}</ref>; Marques et al.<ref name="MarquesBio" />; Toby and Pope<ref name="TobyBio">{{cite journal |title=Bioinformatics tools available for K-12 students to engage in research |journal=Aviation, Space, and Environmental Medicine |author=Toby, I.; Pope, A. |volume=85 |issue=4 |pages=484-485 |year=2014 |doi=10.3357/ASEM.3953.2014 |pmid=24754215}}</ref>). Genomics and associated topics have started to appear in many official school curricula, for example in Scotland (see “Discussion”, below), the Netherlands (College voor Examens, p. 17<ref name="CollegeBio">{{cite web |url=http://www.examenblad.nl/examenstof/syllabus-2016-biologie-vwo-nader/2016/vwo/f=/biologie_vwo_2016_def_voor_hervaststelling.pdf |format=PDF |title=Biologie VWO. Syllabus Centraal Examen 2016 |author=College voor Examens |pages=59 |year=April 2014 |accessdate=21 October 2015}}</ref>) and the USA (Wefer and Sheppard<ref name="WeferBio">{{cite journal |title=Bioinformatics in high school biology curricula: a study of state science standards |journal=CBE – Life Sciences Education |author=Wefer, S.H.; Sheppard, K. |volume=7 |issue=1 |pages=155–162 |year=2008 |doi=10.1187/cbe.07-05-0026 |pmc=PMC2262119}}</ref>). From a different angle, computer science is now a major part of the [https://www.gov.uk/government/publications/national-curriculum-in-england-computing-programmes-of-study primary school curriculum for England]. This is in line with a “back to basics” approach to computing currently emerging, as opposed to more traditional information and communications technology (ICT). 
In the UK, this change has been particularly associated with the low-cost [http://www.raspberrypi.org Raspberry Pi computer], which is suitable for educational projects in electronics and engineering as well as general use and has [http://www.wired.co.uk/news/archive/2015-02/18/raspberry-pi-5-million sold over 5 million units]. However, a practical link between computers and STEMM — which we will refer to as computational science, as opposed to computer science — still does not feature strongly on the UK school curriculum. DNA sequencing has a pervasive and increasing influence across traditionally disparate subject areas, including biochemistry, biomedical research, clinical medicine, evolutionary biology, ecology, neuroscience and anthropology. DNA sequencing is used to diagnose genetic and infectious diseases, discover drugs, characterise environments, monitor the progress of cancers, identify species and reveal evolutionary patterns. We consider increased amounts of practical bioinformatics at school to be a priority.
Motivated by the increasing importance of bioinformatics to the life sciences and its appearance on school curricula, we conducted a preliminary investigation of the benefits of bringing university-level bioinformatics teaching material to voluntary groups of children in the last 2 years of school in Scotland (S5 and S6; pupils aged 15–17). The material was originally developed for an optional, final-year undergraduate module at the University of St Andrews, [https://www.st-andrews.ac.uk/coursecatalogue/ug/2015-2016 BL4273 Bioinformatics for Biologists]. To better match bioinformatics as it is actually used in research at universities, institutes and industry, the material uses the Linux operating system, in this case a variant of Raspbian Linux running on low-cost Raspberry Pi hardware. This material has been released under an open access licence, as part of [http://4273pi.org 4273''π''] (Barker et al.<ref name="Barker4273n" />). Our proposition was that school pupils can benefit from practical, undergraduate-level bioinformatics teaching material. Compared to the undergraduates for whom this material was originally developed, school pupils are less experienced and knowledgeable about biology in general. However, their levels of practical bioinformatics experience are broadly similar: zero in the case of the school pupils, and approximately ten actual contact hours among undergraduates at the time of starting the module.
Many of the skills developed in our activities, and 4273''π'' or bioinformatics in general, are generic skills in computational science. For example, although the programming language taught in the “INTRO” component — Perl — is particularly widely used in bioinformatics (e.g. Stajich et al.<ref name="StajichTheBio">{{cite journal |title=The BioPerl toolkit: Perl modules for the life sciences |journal=Genome Research |author=Stajich, J.E.; Block, D.; Boulez, K.; et al. |volume=12 |issue=10 |pages=1611–1618 |year=2002 |doi=10.1101/gr.361602 |pmid=12368254 |pmc=PMC187536}}</ref>; Stabenau et al.<ref name="StabenauTheEns">{{cite journal |title=The Ensembl core software libraries |journal=Genome Research |author=Stabenau, A.; McVicker, G.; Melsopp, C.; Proctor, G.; Clamp, M.; Birney, E. |volume=14 |issue=5 |pages=929–933 |year=2004 |doi=10.1101/gr.1857204 |pmid=15123588 |pmc=PMC479122}}</ref>), it is structurally similar to other programming languages widely used in science, including C, Fortran, Java, Python and R. Use of the command-line, emphasised in 4273''π'', is also essential in computational physics, computational chemistry and, indeed, computer science. Although computational chemistry is not yet part of the Higher qualification in Chemistry, several simulations are suggested by the Scottish Qualifications Authority.<ref name="ScottHigher">{{cite web |url=http://www.sqa.org.uk/files_ccc/CfE_CourseUnitSupportNotes_Higher_Sciences_Chemistry.pdf |format=PDF |title=Higher Chemistry Course Support Notes |author=Scottish Qualifications Authority |pages=98 |year=2015 |accessdate=20 October 2015}}</ref> Computational skills, as taught in 4273''π'', will be valuable to students taking chemistry, physics and other STEMM subjects at university.
Judged by pupil self-assessment forms, our preliminary trial was a success, though caution is required due to the small sample size. We will continue developing peer-reviewed bioinformatics material, targeted at school pupils and/or undergraduates, and applying it in practice. This will simultaneously lead to expansion of the 4273''π'' resource and the gathering of larger, more complex and conclusive educational data at a future date. 4273''π'' itself, and links to relevant social media groups, may be found at http://4273pi.org.
===Methods===
Two activities were carried out, each using a voluntary group of seven pupils studying science from a single school in Scotland. One group was from Kilgraston, an independent girls’ school, and the other was from Forfar Academy, a comprehensive school. In the case of Kilgraston, five pupils were at S5 and two were at S6 level, and instruction and assistance were provided by D.B., M.M.C., G.T.P.M. and H.P. In the case of Forfar Academy, all pupils were at S5 level, of whom two were girls and five were boys, and instruction and assistance were provided by R.G.A., D.B., L.D., J.L.M. and S.D.S. Generally, university staff or PhD students provided detailed instruction on the bioinformatics activity, and school staff highlighted links to material already taught and the curriculum. With a combination of university staff or PhD students and school staff, students were guided through the practical material of two components (modules) of 4273''π'' Bioinformatics for Biologists. At Kilgraston, the event was held at the school, occupying an entire day on which no other classes were scheduled. With Forfar Academy, the event was held at the University of St Andrews, where students participated in an afternoon and evening session, primarily held in the same room used, at other times, by undergraduates on the BL4273 module. Refreshment breaks were included, using the school’s usual facilities (Kilgraston) or the [http://www.st-andrews.ac.uk/museum/bellpettigrew Bell Pettigrew Museum] (St Andrews). In total, the teaching time was approximately 4 h.
Raspberry Pi Model B hardware was used, one per student (plus one connected to a projector for demonstration). Prior to the first event, at Kilgraston, tasks were selected from existing material in discussion between D.B. and H.P. (familiar with 4273''π'') and M.M.C. (familiar with the school curriculum). For both groups of pupils, the first task corresponded to the “INTRO” component of 4273''π'' Bioinformatics for Biologists, providing an introduction to the Raspberry Pi computer hardware, the Linux command-line, BLAST sequence similarity search software and programming in the Perl language. The second task corresponded to the “DNA” component, involving an introduction to the FlyBase database (Dos Santos et al.<ref name="DosSantosIntro">{{cite journal |title=FlyBase: introduction of the Drosophila melanogaster Release 6 reference genome assembly and large-scale migration of genome annotations |journal=Nucleic Acids Research |author=Dos Santos, G.; Schroeder, A.J.; Goodman, J.L.; Strelets, V.B.; Crosby, M.A.; Thurmond, J.; Emmert, D.B.; Gelbart, W.M.; The FlyBase Consortium |volume=43 |issue=Database issue |pages=D690-7 |year=2015 |doi=10.1093/nar/gku1099 |pmid=25398896 |pmc=PMC4383921}}</ref>) and genome annotation with BLAST (Altschul et al.<ref name="AltschulGap" />), GeneWise (Birney et al.<ref name="BirneyGene">{{cite journal |title=GeneWise and Genomewise |journal=Genome Research |author=Birney, E.; Clamp, M.; Durbin, R. |volume=14 |issue=5 |pages=988–95 |year=2004 |doi=10.1101/gr.1865504 |pmid=15123596 |pmc=PMC479130}}</ref>) and SNAP (Korf<ref name="Korf">{{cite journal |title=Gene finding in novel genomes |journal=BMC Bioinformatics |author=Korf, I. |volume=5 |pages=59 |year=2004 |doi=10.1186/1471-2105-5-59 |pmid=15144565 |pmc=PMC421630}}</ref>). Hard-copy handouts were provided. 
The handouts for Kilgraston and Forfar Academy were identical in content apart from date, location of the event, staff details and location of files (~/kilgraston or ~/forfar_academy). For the record, the specific handouts used are available as Additional file 1 (Kilgraston) and Additional file 2 (Forfar Academy), but with names and contact details redacted. The latest, open access versions of these will be found in [http://4273pi.org 4273''π''].
Hard-copy, paired, “before” (prior to the use of the computers) and “after” questionnaires were used for pupils and school staff, involving questions on a 1–5 Likert scale for self-assessment of attitudes and free text (Table 1; Additional file 3). In preparing the questionnaires, for those questions on a Likert scale, the sequence of questions was randomised and the sense of each question (“1” corresponding to “good” on our subjective scale, vs “1” corresponding to “bad”) was randomised. The same sequence and sense were used for each questionnaire handed out; within each group (pupils or staff), the sequence of paired questions was the same “before” and “after”. Results of the questions on the Likert scale were summarised per question as a bar chart, and as a likelihood ratio sign test for evidence of systematic change over the course of the activity. We apply a likelihood approach to statistical inference (Birnbaum [1962]; Edwards [1992]; Royall [1997]; Barker [2015]). In common with other approaches to statistical inference, this provides no absolute threshold beyond which evidence is considered conclusive. By convention, we define “strong” evidence as a log (ln) likelihood ratio, Δℓ, of at least 2, or a likelihood ratio of at least 8 (Edwards [1992], pp. 199–202; Royall [1997]). Were Δℓ converted to a p value under the assumptions of a likelihood ratio test (Wilks [1938]), then for one free parameter Δℓ ≥ 2 corresponds to p ≤ 0.046, approximately the traditional threshold for statistical significance prior to any correction for multiple testing (i.e. p < 0.05). Calculations were performed in R (R Development Core Team [2010]).


==References==

Revision as of 20:26, 11 November 2015

Full article title University-level practical activities in bioinformatics benefit voluntary groups of pupils in the last 2 years of school
Journal International Journal of STEM Education
Author(s) Barker, Daniel; Alderson, Rosanna G.; McDonagh, James L.; Plaisier, Heleen; Comrie, Muriel M.; Duncan, Leigh; Muirhead, Gavin T.P.; Sweeney, Stuart D.
Author affiliation(s) University of St. Andrews, University of Manchester, Kilgraston School, Forfar Academy, Portlethen Academy
Primary contact Email: db60@st-andrews.ac.uk
Year published 2015
Volume and issue 2
Page(s) 17
DOI 10.1186/s40594-015-0030-z
ISSN 2196-7822
Distribution license Creative Commons Attribution 4.0 International
Website http://www.stemeducationjournal.com/content/2/1/17
Download http://www.stemeducationjournal.com/content/pdf/s40594-015-0030-z.pdf (PDF)

Abstract

Background: Bioinformatics — the use of computers in biology — is of major and increasing importance to biological sciences and medicine. We conducted a preliminary investigation of the value of bringing practical, university-level bioinformatics education to the school level. We conducted voluntary activities for pupils at two schools in Scotland (years S5 and S6; pupils aged 15–17). We used material originally developed for an optional final-year undergraduate module and now incorporated into 4273π, a resource for teaching and learning bioinformatics on the low-cost Raspberry Pi computer.

Results: Pupils’ feedback forms suggested our activities were beneficial. The forms provide strong evidence of increases, over the course of the activity, in the following: pupils’ perception of the value of computers within biology; their knowledge of the Linux operating system and the Raspberry Pi; their willingness to use computers rather than phones or tablets; their ability to program a computer; and their ability to analyse DNA sequences with a computer. We found no strong evidence of negative effects.

Conclusions: Our preliminary study supports the feasibility of bringing university-level, practical bioinformatics activities to school pupils.

Keywords: Bioinformatics; Computational biology; Secondary school; Raspberry Pi; Open access teaching material; Case study

Findings

Introduction

Progress in Science, Technology, Engineering, Mathematics and Medicine (STEMM) subjects is increasingly dominated by computational analyses. In biological sciences, for example, the exceptional pace of recent advances in technology for DNA and genome sequencing has created a demand for computationally able researchers, to analyse the large amounts of data produced. A field specialising in application of computation to biological problems has emerged, known as bioinformatics. The development of bioinformatics is discussed by Hogeweg[1], and university-level bioinformatics education has been reviewed by Magana et al.[2]

DNA sequences and related data are available at low cost (for new sequencing work) or free in online databases such as GenBank (Benson et al.[3]), Ensembl (Cunningham et al.[4]) and hundreds of others (Galperin et al.[5]). Software for bioinformatics research is usually free, for example the very widely used sequence database search software, BLAST (Altschul et al.[6]). Free resources are also available for bioinformatics teaching and learning, for example 4273π (Barker et al.[7]), Bioinformática na escola (Marques et al.[8]), GOBLET (Corpas et al.[9]), Bioinformatics@school and the EvoEd Digital Library. These publicly available data, software and materials present excellent opportunities for relatively low-cost teaching.

There has been a recent, encouraging increase in exposure of school pupils to bioinformatics (e.g. Gallagher et al.[10]; Lewitter and Bourne[11]; McQueen et al.[12]; Kovarik et al.[13]; Machluf and Yarden[14]; Wood and Gebhardt[15]; Marques et al.[8]; Toby and Pope[16]). Genomics and associated topics have started to appear in many official school curricula, for example in Scotland (see “Discussion”, below), the Netherlands (College voor Examens, p. 17[17]) and the USA (Wefer and Sheppard[18]). From a different angle, computer science is now a major part of the primary school curriculum for England. This is in line with a “back to basics” approach to computing currently emerging, as opposed to more traditional information and communications technology (ICT). In the UK, this change has been particularly associated with the low-cost Raspberry Pi computer, which is suitable for educational projects in electronics and engineering as well as general use and has sold over 5 million units. However, a practical link between computers and STEMM — which we will refer to as computational science, as opposed to computer science — still does not feature strongly on the UK school curriculum. DNA sequencing has a pervasive and increasing influence across traditionally disparate subject areas, including biochemistry, biomedical research, clinical medicine, evolutionary biology, ecology, neuroscience and anthropology. DNA sequencing is used to diagnose genetic and infectious diseases, discover drugs, characterise environments, monitor the progress of cancers, identify species and reveal evolutionary patterns. We consider increased amounts of practical bioinformatics at school to be a priority.

Motivated by the increasing importance of bioinformatics to the life sciences and its appearance on school curricula, we conducted a preliminary investigation of the benefits of bringing university-level bioinformatics teaching material to voluntary groups of children in the last 2 years of school in Scotland (S5 and S6; pupils aged 15–17). The material was originally developed for an optional, final-year undergraduate module at the University of St Andrews, BL4273 Bioinformatics for Biologists. To better match bioinformatics as it is actually used in research at universities, institutes and industry, the material uses the Linux operating system, in this case a variant of Raspbian Linux running on low-cost Raspberry Pi hardware. This material has been released under an open access licence, as part of 4273π (Barker et al.[7]). Our proposition was that school pupils can benefit from practical, undergraduate-level bioinformatics teaching material. Compared to the undergraduates for whom this material was originally developed, school pupils are less experienced and knowledgeable about biology in general. However, their levels of practical bioinformatics experience are broadly similar: zero in the case of the school pupils, and approximately ten actual contact hours among undergraduates at the time of starting the module.

Many of the skills developed in our activities, and 4273π or bioinformatics in general, are generic skills in computational science. For example, although the programming language taught in the “INTRO” component — Perl — is particularly widely used in bioinformatics (e.g. Stajich et al.[19]; Stabenau et al.[20]), it is structurally similar to other programming languages widely used in science, including C, Fortran, Java, Python and R. Use of the command-line, emphasised in 4273π, is also essential in computational physics, computational chemistry and, indeed, computer science. Although computational chemistry is not yet part of the Higher qualification in Chemistry, several simulations are suggested by the Scottish Qualifications Authority.[21] Computational skills, as taught in 4273π, will be valuable to students taking chemistry, physics and other STEMM subjects at university.

Judged by pupil self-assessment forms, our preliminary trial was a success, though caution is required due to the small sample size. We will continue developing peer-reviewed bioinformatics material, targeted at school pupils and/or undergraduates, and applying it in practice. This will simultaneously lead to expansion of the 4273π resource and the gathering of larger, more complex and conclusive educational data at a future date. 4273π itself, and links to relevant social media groups, may be found at http://4273pi.org.

Methods

Two activities were carried out, each using a voluntary group of seven pupils studying science from a single school in Scotland. One group was from Kilgraston, an independent girls’ school, and the other was from Forfar Academy, a comprehensive school. In the case of Kilgraston, five pupils were at S5 and two were at S6 level, and instruction and assistance were provided by D.B., M.M.C., G.T.P.M. and H.P. In the case of Forfar Academy, all pupils were at S5 level, of whom two were girls and five were boys, and instruction and assistance were provided by R.G.A., D.B., L.D., J.L.M. and S.D.S. Generally, university staff or PhD students provided detailed instruction on the bioinformatics activity, and school staff highlighted links to material already taught and the curriculum. With a combination of university staff or PhD students and school staff, students were guided through the practical material of two components (modules) of 4273π Bioinformatics for Biologists. At Kilgraston, the event was held at the school, occupying an entire day on which no other classes were scheduled. With Forfar Academy, the event was held at the University of St Andrews, where students participated in an afternoon and evening session, primarily held in the same room used, at other times, by undergraduates on the BL4273 module. Refreshment breaks were included, using the school’s usual facilities (Kilgraston) or the Bell Pettigrew Museum (St Andrews). In total, the teaching time was approximately 4 h.

Raspberry Pi Model B hardware was used, one per student (plus one connected to a projector for demonstration). Prior to the first event, at Kilgraston, tasks were selected from existing material in discussion between D.B. and H.P. (familiar with 4273π) and M.M.C. (familiar with the school curriculum). For both groups of pupils, the first task corresponded to the “INTRO” component of 4273π Bioinformatics for Biologists, providing an introduction to the Raspberry Pi computer hardware, the Linux command-line, BLAST sequence similarity search software and programming in the Perl language. The second task corresponded to the “DNA” component, involving an introduction to the FlyBase database (Dos Santos et al.[22]) and genome annotation with BLAST (Altschul et al.[6]), GeneWise (Birney et al.[23]) and SNAP (Korf[24]). Hard-copy handouts were provided. The handouts for Kilgraston and Forfar Academy were identical in content apart from date, location of the event, staff details and location of files (~/kilgraston or ~/forfar_academy). For the record, the specific handouts used are available as Additional file 1 (Kilgraston) and Additional file 2 (Forfar Academy), but with names and contact details redacted. The latest, open access versions of these will be found in 4273π.

Hard-copy, paired, “before” (prior to the use of the computers) and “after” questionnaires were used for pupils and school staff, involving questions on a 1–5 Likert scale for self-assessment of attitudes and free text (Table 1; Additional file 3). In preparing the questionnaires, for those questions on a Likert scale, the sequence of questions was randomised and the sense of each question (“1” corresponding to “good” on our subjective scale, vs “1” corresponding to “bad”) was randomised. The same sequence and sense were used for each questionnaire handed out; within each group (pupils or staff), the sequence of paired questions was the same “before” and “after”. Results of the questions on the Likert scale were summarised per question as a bar chart, and as a likelihood ratio sign test for evidence of systematic change over the course of the activity. We apply a likelihood approach to statistical inference (Birnbaum [1962]; Edwards [1992]; Royall [1997]; Barker [2015]). In common with other approaches to statistical inference, this provides no absolute threshold beyond which evidence is considered conclusive. By convention, we define “strong” evidence as a log (ln) likelihood ratio, Δℓ, of at least 2, or a likelihood ratio of at least 8 (Edwards [1992], pp. 199–202; Royall [1997]). Were Δℓ converted to a p value under the assumptions of a likelihood ratio test (Wilks [1938]), then for one free parameter Δℓ ≥ 2 corresponds to p ≤ 0.046, approximately the traditional threshold for statistical significance prior to any correction for multiple testing (i.e. p < 0.05). Calculations were performed in R (R Development Core Team [2010]).
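
The likelihood ratio sign test and the Δℓ-to-p conversion described above can be illustrated with a short sketch. This is one plausible formulation, not the authors' R code: it assumes tied before/after responses are dropped, that the null hypothesis fixes the probability of an increase at 0.5, and that the alternative uses the maximum-likelihood estimate of that probability. The function names and the example scores are hypothetical and are not data from the study. Python is used here purely for illustration.

```python
import math


def sign_test_log_lr(before, after):
    """Return (n_up, n_down, delta_l) for paired scores.

    delta_l is the log (ln) likelihood ratio of the fitted binomial model
    (probability of an increase estimated from the data) against the null
    model (probability of an increase fixed at 0.5). Ties are dropped.
    """
    n_up = sum(1 for b, a in zip(before, after) if a > b)
    n_down = sum(1 for b, a in zip(before, after) if a < b)
    n = n_up + n_down
    if n == 0:
        return n_up, n_down, 0.0
    p_hat = n_up / n

    def k_log_p(k, p):
        # Treat 0 * ln(0) as 0 so that p_hat = 0 or 1 is handled.
        return 0.0 if k == 0 else k * math.log(p)

    log_l_alt = k_log_p(n_up, p_hat) + k_log_p(n_down, 1.0 - p_hat)
    log_l_null = n * math.log(0.5)
    return n_up, n_down, log_l_alt - log_l_null


def wilks_p_value(delta_l):
    """Convert delta_l to a p value, assuming 2 * delta_l follows a
    chi-squared distribution with one degree of freedom (Wilks)."""
    return math.erfc(math.sqrt(delta_l))


# Hypothetical Likert responses (1-5) for seven pupils, already oriented
# so that a higher score is "better"; NOT data from the study.
before = [2, 1, 3, 2, 2, 1, 3]
after = [4, 3, 3, 4, 5, 2, 4]

n_up, n_down, delta_l = sign_test_log_lr(before, after)
strong = delta_l >= 2  # the "strong evidence" convention used in the text
print(n_up, n_down, round(delta_l, 3), strong)
print(round(wilks_p_value(2.0), 3))  # p value corresponding to delta_l = 2
```

Under these assumptions, `wilks_p_value(2.0)` reproduces the threshold quoted in the text (p of approximately 0.046 for Δℓ = 2). Questions whose sense was reversed on the questionnaire would need to be re-oriented before counting increases and decreases.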

References

  1. Hogeweg, P. (2011). "The roots of bioinformatics in theoretical biology". PLOS Computational Biology 7 (3): e1002021. doi:10.1371/journal.pcbi.1002021. PMC PMC3068925. PMID 21483479. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3068925. 
  2. Magana, A.J.; Taleyarkhan, M.; Alvarado, D.R.; Kane, M.; Springer, J.; Clase, K. (2014). "A survey of scholarly literature describing the field of bioinformatics education and bioinformatics educational research". CBE – Life Sciences Education 13 (4): 607–23. doi:10.1187/cbe.13-10-0193. PMC PMC4255348. PMID 25452484. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4255348. 
  3. Benson, D.A.; Clark, K.; Karsch-Mizrachi, I.; Lipman, D.J.; Ostell, J.; Sayers, E.W. (2015). "GenBank". Nucleic Acids Research 43 (Database issue): D30–D35. doi:10.1093/nar/gku1216. PMC PMC4383990. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383990. 
  4. Cunningham, F.; Amode, M.R.; Barrell, D., et al. (2015). "Ensembl 2015". Nucleic Acids Research 43 (Database issue): D662-D669. doi:10.1093/nar/gku1010. 
  5. Galperin, M.Y.; Rigden, D.J.; Fernández-Suárez, X.M. (2015). "The 2015 Nucleic Acids Research Database Issue and Molecular Biology Database Collection". Nucleic Acids Research 43 (Database issue): D1-D5. doi:10.1093/nar/gku1241. PMC PMC4383995. PMID 25593347. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383995. 
  6. Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. (1997). "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs". Nucleic Acids Research 25 (17): 3389–3402. doi:10.1093/nar/25.17.3389. PMC PMC146917. PMID 9254694. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC146917. 
  7. Barker, D.; Ferrier, D.E.K.; Holland, P.W.H.; Mitchell, J.B.O.; Plaisier, H.; Ritchie, M.G.; Smart, S.D. (2013). "4273π: Bioinformatics education on low cost ARM hardware". BMC Bioinformatics 14: 243. doi:10.1186/1471-2105-14-243. PMC PMC3751261. PMID 23937194. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3751261. 
  8. Marques, I.; Almeida, P.; Alves, R.; João Dias, M.; Godinho, A.; Pereira-Leal, J.B. (2014). "Bioinformatics Projects Supporting Life-Sciences Learning in High Schools". PLOS Computational Biology 10 (1): e1003404. doi:10.1371/journal.pcbi.1003404. PMC PMC3900377. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3900377. 
  9. Corpas, M.; Jimenez, R.C.; Bongcam-Rudloff, E.; et al. (2015). "The GOBLET training portal: a global repository of bioinformatics training materials, courses and trainers". Bioinformatics 31 (1): 140–142. doi:10.1093/bioinformatics/btu601. PMC PMC4271145. PMID 25189782. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271145. 
  10. Gallagher, S.R.; Coon, W.; Donley, K.; Scott, A.; Goldberg, D.S. (2011). "A first attempt to bring computational biology into advanced high school biology classrooms". PLOS Computational Biology 7 (10): e1002244. doi:10.1371/journal.pcbi.1002244. PMC PMC3203055. PMID 22046118. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203055. 
  11. Lewitter, F.; Bourne, P.E. (2011). "Teaching bioinformatics at the secondary school level". PLOS Computational Biology 7 (10): e1002242. doi:10.1371/journal.pcbi.1002242. PMC PMC3203059. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203059. 
  12. McQueen, J.; Wright, J.J.; Fox, J.A. (2012). "Design and implementation of a genomics field trip program aimed at secondary school students". PLOS Computational Biology 8 (8): e1002636. doi:10.1371/journal.pcbi.1002636. PMC PMC3431290. PMID 22956895. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431290. 
  13. Kovarik, D.N.; Patterson, D.G.; Cohen, C.; Sanders, E.A.; Peterson, K.A.; Porter, S.G.; Chowning, J.T. (2013). "Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers". CBE – Life Sciences Education 12 (3): 441–59. doi:10.1187/cbe.12-11-0193. PMC PMC3763012. PMID 24006393. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3763012. 
  14. Machluf, Y.; Yarden, A. (2013). "Integrating bioinformatics into senior high school: design principles and implications". Briefings in Bioinformatics 14 (5): 648-60. doi:10.1093/bib/bbt030. PMID 23665511. 
  15. Wood, L.; Gebhardt, P. (2013). "Bioinformatics goes to school – new avenues for teaching contemporary biology". PLOS Computational Biology 9 (6): e1003089. doi:10.1371/journal.pcbi.1003089. PMC PMC3681668. PMID 23785266. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3681668. 
  16. Toby, I.; Pope, A. (2014). "Bioinformatics tools available for K-12 students to engage in research". Aviation, Space, and Environmental Medicine 85 (4): 484-485. doi:10.3357/ASEM.3953.2014. PMID 24754215. 
  17. College voor Examens (April 2014). "Biologie VWO. Syllabus Centraal Examen 2016" (PDF). pp. 59. http://www.examenblad.nl/examenstof/syllabus-2016-biologie-vwo-nader/2016/vwo/f=/biologie_vwo_2016_def_voor_hervaststelling.pdf. Retrieved 21 October 2015. 
  18. Wefer, S.H.; Sheppard, K. (2008). "Bioinformatics in high school biology curricula: a study of state science standards". CBE – Life Sciences Education 7 (1): 155–162. doi:10.1187/cbe.07-05-0026. PMC PMC2262119. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2262119. 
  19. Stajich, J.E.; Block, D.; Boulez, K.; et al. (2002). "The BioPerl toolkit: Perl modules for the life sciences". Genome Research 12 (10): 1611–1618. doi:10.1101/gr.361602. PMC PMC187536. PMID 12368254. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC187536. 
  20. Stabenau, A.; McVicker, G.; Melsopp, C.; Proctor, G.; Clamp, M.; Birney, E. (2004). "The Ensembl core software libraries". Genome Research 14 (5): 929–933. doi:10.1101/gr.1857204. PMC PMC479122. PMID 15123588. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC479122. 
  21. Scottish Qualifications Authority (2015). "Higher Chemistry Course Support Notes" (PDF). pp. 98. http://www.sqa.org.uk/files_ccc/CfE_CourseUnitSupportNotes_Higher_Sciences_Chemistry.pdf. Retrieved 20 October 2015. 
  22. Dos Santos, G.; Schroeder, A.J.; Goodman, J.L.; Strelets, V.B.; Crosby, M.A.; Thurmond, J.; Emmert, D.B.; Gelbart, W.M.; The FlyBase Consortium (2015). "FlyBase: introduction of the Drosophila melanogaster Release 6 reference genome assembly and large-scale migration of genome annotations". Nucleic Acids Research 43 (Database issue): D690-7. doi:10.1093/nar/gku1099. PMC PMC4383921. PMID 25398896. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383921. 
  23. Birney, E.; Clamp, M.; Durbin, R. (2004). "GeneWise and Genomewise". Genome Research 14 (5): 988–95. doi:10.1101/gr.1865504. PMC PMC479130. PMID 15123596. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC479130. 
  24. Korf, I. (2004). "Gene finding in novel genomes". BMC Bioinformatics 5: 59. doi:10.1186/1471-2105-5-59. PMC PMC421630. PMID 15144565. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC421630. 

Notes

This version is faithful to the original, with only a few minor changes to presentation. In several cases citation information was missing and was added to make the reference more useful.