Journal:Generalized procedure for screening free software and open-source software applications/Initial evaluation and selection recommendations
|Full article title||Generalized Procedure for Screening Free Software and Open Source Software Applications|
|Author affiliation(s)||Arcana Informatica; Scientific Computing|
|Primary contact||Email: firstname.lastname@example.org|
|Distribution license||Creative Commons Attribution-ShareAlike 4.0 International|
Free software and open-source software projects have become a popular alternative tool in both scientific research and other fields. However, selecting the optimal application for use in a project can be a major task in itself, as the list of potential applications must first be identified and screened to determine promising candidates before an in-depth analysis of systems can be performed. To simplify this process, we have initiated a project to generate a library of in-depth reviews of free software and open-source software applications. Before beginning this project, we reviewed the evaluation methods available in the literature. As we found no one method that stood out, we synthesized a general procedure using a variety of available sources for screening a designated class of applications to determine which ones to evaluate in more depth. In this paper, we examine a number of currently published processes to identify their strengths and weaknesses. By selecting from these processes, we synthesize a proposed screening procedure to triage available systems and identify those most worthy of pursuit. To illustrate the functionality of this technique, this screening procedure is executed against a selected class of applications.
Initial evaluation and selection recommendations
At this point, we'll take a step back from the evaluation methodology papers and examine some of the more general recommendations regarding evaluating and selecting FLOSS applications. The consistency of their recommendations may provide a more useful guide for an initial survey of FLOSS applications.
In TechRepublic, de Silva recommends 10 questions to ask when selecting a FLOSS application. While he provides a brief discourse on each question in his article to ensure you understand the point of the question, I've collected the 10 questions into the following list. Once we see what overlap, if any, exists among our general recommendations, we'll address some of the consolidated questions in more detail.
- Are the open source license terms compatible with my business requirements?
- What is the strength of the community?
- How well is the product adopted by users?
- Can I get a warranty or commercial support if I need it?
- What quality assurance processes exist?
- How good is the documentation?
- How easily can the system be customized to my exact requirements?
- How is this project governed and how easily can I influence the road map?
- Will the product scale to my enterprise's requirements?
- Are there regular security patches?
Similarly, in InfoWorld, Phipps lists seven questions you should have answered before even starting to select a software package. His list of questions, pulled directly from his article, is:
- Am I granted copyright permission?
- Am I free to use my chosen business model?
- Am I unlikely to suffer patent attack?
- Am I free to compete with other community members?
- Am I free to contribute my improvements?
- Am I treated as a development peer?
- Am I inclusive of all people and skills?
This list of questions shows a moderately different point of view, as it is not just about selecting an open-source system but also about getting directly involved in its development. Padin, of 8th Light, Inc., takes the viewpoint of a developer who might incorporate open-source software into their projects. The list of criteria pulled directly from his blog includes:
- Does it do what I need it to do?
- How much more do I need it to do?
- Easy to review source code
- Tests and specs
Metcalfe of OSS Watch lists his top tips as:
- Ongoing effort
- Standards and interoperability
- Support (Community)
- Support (Commercial)
- Version 1.0
- Skill setting
- Project Development Model
In his LIMSexpert blog, Joel Limardo of ForwardPhase Technologies, LLC lists the following as components to check when evaluating an open-source application:
- Check licensing
- Check code quality
- Test setup time
- Verify extensibility
- Check for separation of concerns
- Check for last updated date
- Check for dependence on outdated toolkits/frameworks
Perhaps the most referenced of the general articles on selecting FLOSS applications is David Wheeler's "How to Evaluate Open Source Software / Free Software (OSS/FS) Programs." The detailed functionality to consider will vary with the types of applications being compared, but there are a number of general features that are relevant to almost any type of application. While we will cover them in more detail later, Wheeler categorizes the features to consider as the following:
- System functionality
- System cost – direct and indirect
- Popularity of application, i.e. its market share for that type of application
- Varieties of product support available
- Maintenance of application, i.e., is development still taking place
- Reliability of application
- Performance of application
- Scalability of application
- Usability of application
- Security of application
- Adaptability/customizability of application
- Interoperability of application
- Licensing and other legal issues
While a hurried glance might suggest a lot of diversity in the features these various resources recommend, a closer look at what they are saying reveals a recurring set of concerns. The main differences between the suggested lists are actually due mainly to how broad a portion of the analysis process the authors are considering, as well as to the underlying features that they are concerned with.
With a few additions, the high-level screening template described in the rest of this communication is based on Wheeler's previously mentioned document describing his recommended process for evaluating open-source software and free software programs. Structuring the items thus will make it easier to locate the corresponding sections in his document, which includes many useful specific recommendations as well as a great deal of background information to help you understand the why of the topic. I highly recommend reading it and following up on some of the links he provides. I will also include evaluation suggestions from several of the previously mentioned procedures where appropriate.
Wheeler defines four basic steps to this evaluation process, as listed below:
- Identify candidate applications.
- Read existing product reviews.
- Compare attributes of these applications to your needs.
- Analyze the applications best matching your needs in more depth.
Wheeler categorizes this process with the acronym IRCA. In this paper we will be focusing on the IRC components of this process. To confirm the efficacy of this protocol we will later apply it to several classes of open-source applications and examine the output of the protocol.
Realistically, before you can perform a survey of applications to determine which ones best match your needs, you must determine what your needs actually are. The product of determining these needs is frequently referred to as the user requirements specification (URS). This document can be generated in several ways, including having all of the potential users submit a list of the functions and capabilities that they feel are important. While the requirements document can be created by a single person, it is generally best to make it a group effort with multiple reviews of the draft document, including all of the users who will be working with the application. The reason for this is to ensure that an important requirement is not missed. When a requirement is missed, it is frequently because the requirement is so basic that it never occurs to anyone that it needs to be stated explicitly in the requirements document. Admittedly, a detailed URS is not required at the survey level, but it is worth having if only to identify, by their implications, other features that might be significant.
Needs will, of course, vary with the type of application you are looking for and what you are planning to do with it. Keep in mind that the URS is a living document, subject to change through this whole process. Developing a URS is generally an iterative process, since as you explore systems, you may well see features that you hadn't considered that you find desirable. This process will also be impacted by whether the application to be selected will be used in a regulated environment. If it is, there will be existing documents that describe the minimum functionality that must be present in the system. Even if it is not to be used in a regulated environment, documents exist for many types of systems that describe the recommended functional requirements that would be expected for that type of system.
For a clarifying example, if you were attempting to select a laboratory information management system (LIMS), you can download checklists and standards of typical system requirements from a variety of sources. These will provide you with examples of the questions to ask, but you will have to determine which ones are important to, or required for, your particular effort.
Depending on the use to which this application is to be applied, you may be subject to other specific regulatory requirements as well. Which regulations apply may vary, since the same types of analysis performed for different industries fall under different regulatory organizations. This aspect is further complicated by the fact that you may be affected by more than one country's regulations if your analysis is applicable to products being shipped to other countries. While some specific regulations may have changed since its publication, an excellent resource to orient you to the diverse factors that must be considered is Siri Segalstad's book International IT Regulations and Compliance. My understanding is that an updated version of this book is currently in preparation. Keep in mind that while the regulatory requirements that you must meet will vary, these regulations by and large also describe best practices, or at least the minimal allowed practices. These requirements are (generally) not put in place arbitrarily or to make things difficult for you, but to ensure the quality of the data produced. As such, any deviations should be carefully considered, whether working in a regulated environment or not. Proper due diligence would be to determine which regulations and standards would apply to your operation.
For a LIMS, an example of following best practices is to ensure that the application has a full and detailed audit trail. An audit trail allows you to follow the processing of items through your system, determining who did what and when. In any field where it might become important to identify the actions taken during a processing step, an audit trail should be mandatory. While your organization's operations may not fall under the FDA's 21 CFR Part 11 regulations, which address data access and security (including audit trails), it is still extremely prudent that the application you select complies with them. If it does not, then almost anyone could walk up to your system and modify data, either deliberately or accidentally, and you would have no idea of who made the changes or what changes they made. For that matter, you might not even be able to tell a change was made at all, which likely will raise concerns both inside and outside of your organization. This would obviously cause major problems if it became a pivotal issue in any type of liability lawsuit.
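To make the "who did what and when" idea concrete, here is a minimal Python sketch of an append-only audit trail. It is only an illustration of the concept: the `AuditedStore` class, its method names, and the sample values are all hypothetical, and nothing here is claimed to satisfy 21 CFR Part 11 or to reflect how any particular LIMS implements its audit trail.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One immutable record of who changed what, and when."""
    user: str
    record_id: str
    field: str
    old_value: str
    new_value: str
    timestamp: datetime


class AuditedStore:
    """Hypothetical in-memory data store that logs every change it accepts."""

    def __init__(self):
        self._records = {}  # record_id -> {field: value}
        self._trail = []    # entries are only ever appended, never edited

    def set_value(self, user, record_id, field, new_value):
        """Apply a change and record the user, old value, new value, and time."""
        record = self._records.setdefault(record_id, {})
        old_value = record.get(field, "")
        record[field] = new_value
        self._trail.append(
            AuditEntry(user, record_id, field, old_value, new_value,
                       datetime.now(timezone.utc))
        )

    def history(self, record_id):
        """Return every logged change for one record, oldest first."""
        return [entry for entry in self._trail if entry.record_id == record_id]


# Illustrative usage with invented users and values.
store = AuditedStore()
store.set_value("asmith", "SAMPLE-001", "ph", "7.1")
store.set_value("bjones", "SAMPLE-001", "ph", "7.4")

for entry in store.history("SAMPLE-001"):
    print(entry.timestamp.isoformat(), entry.user, entry.field,
          repr(entry.old_value), "->", repr(entry.new_value))
```

When screening a candidate application, the question to ask is whether it can answer the equivalent of `history()` for any record: every change, the previous value, the user, and the time.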
For this screening procedure, you do not have to have a fully detailed URS, but it is expedient to have a list of your make-or-break issues. This list will be used later for comparing systems and determining which ones justify a more in-depth evaluation.
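As a rough sketch of how such a make-or-break list might be applied, the Python below checks each candidate against a set of mandatory items and reports what is missing. The criteria names, the candidate names, and their feature sets are invented for illustration; your own list would come from your requirements.

```python
# Hypothetical make-or-break criteria; replace with your own list.
MUST_HAVES = {"audit_trail", "open_source_license", "active_development"}

# Hypothetical candidates and the features found during the survey.
CANDIDATES = {
    "Candidate A": {"audit_trail", "open_source_license",
                    "active_development", "barcode_support"},
    "Candidate B": {"open_source_license", "active_development"},
    "Candidate C": {"audit_trail", "open_source_license"},
}


def screen(candidates, must_haves):
    """Split candidates into those meeting every must-have and those that do not."""
    shortlisted = []
    rejected = {}
    for name, features in candidates.items():
        missing = must_haves - features  # set difference: required but absent
        if missing:
            rejected[name] = sorted(missing)
        else:
            shortlisted.append(name)
    return shortlisted, rejected


shortlisted, rejected = screen(CANDIDATES, MUST_HAVES)

print("Evaluate in more depth:", shortlisted)
for name, missing in rejected.items():
    print(f"{name} is missing: {', '.join(missing)}")
```

Recording the missing items, rather than just dropping the candidate, keeps the decision traceable if the requirements change later.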
To evaluate potential applications against your functional criteria, you must initially generate a list of potential systems. While this might sound easy, generating a comprehensive list frequently proves to be a challenge. When initiating the process, you must first determine the type of system that you are looking for, be it a LIMS, a hospital management system, a database, etc. At this point, you should be fairly open in building your list of candidates. By that, I mean that you should be careful not to select applications based solely on the utilization label applied to them. The same piece of software can frequently be applied to solve multiple problems, so you should cast a wide net and not automatically reject a system because the label you were looking for hadn't been applied to it. While the label may give you a convenient place to start searching, it is much more important to look for the functionality that you need, not what the system is called. In any case, many times the applied labels are vague and mean very different things to different people.
There are a variety of ways to generate your candidate list. A good place to start is simply talking with colleagues in your field. Have they heard of or used a FLOSS application of the appropriate type that they like? Another way is to just flip through journals and trade magazines that cover your field. Any sufficiently promising applications are likely to be mentioned there. Many of the trade magazines will have a special annual issue that covers equipment and software applicable to their field. It is difficult to generate a list of all potential resources, as many of these trade publications are little-known outside of their field. Also keep in mind that with the continued evolution of the World Wide Web, many of these trade publications also have associated web sites that you can scan or search. The table below includes just a small fraction of the sites that are available. (We would welcome the suggestion of any additional resource sites that you are aware of. Please e-mail the fields covered, the resource name, and either its general URL or the URL of the specific resource section to the corresponding editor.)
Table 2. Examples of focused FLOSS resource sites available on the web
I also recommend checking some of the general open source project lists, such as the ones generated by Cynthia Harvey at Datamation, which has been covering the computer and data-processing industry since 1957. In particular, you might find their article "Open Source Software List: 2015 Ultimate List" useful. It itemizes over 1,200 open source applications, including some in categories that I didn't even know existed.
It would also be prudent to search the major open source repositories such as SourceForge and GitHub. Wikipedia includes a comparison of source code hosting facilities that would be worth reviewing as well. Keep in mind that you will need to be flexible with your search terms, as the developers might be looking at the application differently than you are. While they were created for a different purpose, an examination of the books in The Architecture of Open Source Applications might prove useful as well. Other sites where you might find interesting information regarding new open-source applications are the various open-source award sites, such as the InfoWorld Best of Open Source Software Awards, colloquially known as the Bossies.
When searching the web, don't rely on just Google or Bing. Don't forget to check out all of the journal web sites such as SpringerLink, Wiley, ScienceDirect, PubMed, and others, as they contain a surprising amount of information on FLOSS. If you don't wish to search each of them individually, there are other search engines out there which can give you an alternate view of the research resources available. To name just two, be sure to try both Google Scholar and Microsoft Academic Search. These tools can also be used to search for master's theses and doctoral dissertations, which likewise contain a significant amount of information regarding open source.
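One simple way to stay flexible with search terms across repositories and search engines is to generate the combinations systematically rather than typing them ad hoc. The Python sketch below does this for a LIMS search. The term lists are my own illustrative guesses, and the base URL is a placeholder rather than the address of any real search service; substitute the query format of whichever site you are searching.

```python
from itertools import product
from urllib.parse import urlencode

# Illustrative alternative labels a developer might have used for the same need.
DOMAIN_TERMS = ["LIMS", "laboratory information management", "sample tracking"]
LICENSE_TERMS = ["open source", "free software"]


def build_queries(domain_terms, license_terms):
    """Return every pairing of a domain label with a licensing label."""
    return [f"{domain} {licence}"
            for domain, licence in product(domain_terms, license_terms)]


def to_search_url(query, base_url="https://example.org/search"):
    """Encode a query for a placeholder search page (base_url is hypothetical)."""
    return f"{base_url}?{urlencode({'q': query})}"


queries = build_queries(DOMAIN_TERMS, LICENSE_TERMS)
for query in queries:
    print(to_search_url(query))
```

Keeping the list of queries alongside your candidate list also documents how the list was generated, which is useful if the survey is repeated later.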
While working on creating your candidate list, be sure to pull any application reviews that you come across. If done well, these reviews can save you a significant amount of time in screening a potential system. However, unless you're familiar with the credentials of the author, be cautious of relying on them for all of your information. While not common, people have been known to post fake reviews online, sometimes when it is not even April 1! Another great resource, both for identifying projects and obtaining information about them, is Open HUB, a web site dedicated to open source and maintained by Black Duck Software, Inc. Open HUB allows you to search by project, person, organization, forums, and code. For example, if I search for Bika LIMS, it currently returns the results for Bika Open Source LIMS 3 along with some basic information regarding the system. If I then click on the project's name, a much more detailed page regarding this project is displayed. Moving your mouse cursor over the graphs displays the corresponding information for that date.
Once a list of candidate applications has been generated, the list of entries must be compared. Some of this comparison can be performed objectively, but it also requires subjective analysis of some components. As Stol and Babar have shown, there is no single recognized procedure for either the survey or detailed comparison of FLOSS applications that has shown a marked popularity above the others.
The importance of any one specific aspect of the evaluation will vary with the needs of the organization. General system functionality will be an important consideration, but specific aspects of the contained functionality will have different values to different groups. For instance, interoperability may be very important to some groups, while others may be using this application as their only data system and have no interest in exchanging data files with others, so interoperability is not a concern to them. While you can develop a weighting system for different aspects of the system, this can easily skew selections, resulting in a system that has a very good rating yet is unable to perform the required function. Keep in mind that though this is a high-level survey, we are asking broad critical questions, not attempting to compare detailed minutiae. Also keep in mind that a particular requirement might potentially fall under multiple headings. For example, compliance with 21 CFR Part 11 regulations might be included under functionality or security.
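The skewing effect of a weighting system is easy to demonstrate with a small, entirely hypothetical example. In the Python sketch below, the weights, the 1-to-5 ratings, and the minimum threshold are all invented; the point is only that a weighted total can rank a candidate first even though it fails a required function, which is why a separate minimum check is worth keeping alongside any score.

```python
# Hypothetical weights (summing to 1.0) for a few of the categories above.
WEIGHTS = {
    "functionality": 0.30,
    "usability": 0.20,
    "support": 0.20,
    "security": 0.15,
    "interoperability": 0.15,
}

# Hypothetical floor: anything rated below 3 on functionality cannot do the job.
MINIMUMS = {"functionality": 3}

# Invented 1-5 ratings for two imaginary candidates.
RATINGS = {
    "Candidate A": {"functionality": 3, "usability": 3, "support": 3,
                    "security": 4, "interoperability": 3},
    "Candidate B": {"functionality": 1, "usability": 5, "support": 5,
                    "security": 5, "interoperability": 5},
}


def weighted_score(ratings, weights):
    """Sum of rating times weight over every weighted category."""
    return sum(weights[category] * ratings[category] for category in weights)


def meets_minimums(ratings, minimums):
    """True only if every make-or-break category reaches its floor."""
    return all(ratings[category] >= floor for category, floor in minimums.items())


scores = {name: weighted_score(r, WEIGHTS) for name, r in RATINGS.items()}
eligible = [name for name, r in RATINGS.items() if meets_minimums(r, MINIMUMS)]
top_by_score = max(scores, key=scores.get)

for name in RATINGS:
    print(f"{name}: score {scores[name]:.2f}, meets minimums: {name in eligible}")
print("Highest weighted score:", top_by_score)
print("Eligible after minimum check:", eligible)
```

Here the candidate with the highest weighted score is not among the eligible ones, so ranking by score alone would select a system that cannot perform the required function.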
Completing the evaluation
- Silva, Chamindra de (20 December 2009). "10 questions to ask when selecting open source products for your enterprise". TechRepublic. CBS Interactive. http://www.techrepublic.com/blog/10-things/10-questions-to-ask-when-selecting-open-source-products-for-your-enterprise/. Retrieved 13 April 2015.
- Phipps, Simon (21 January 2015). "7 questions to ask any open source project". InfoWorld. InfoWorld, Inc. http://www.infoworld.com/article/2872094/open-source-software/seven-questions-to-ask-any-open-source-project.html. Retrieved 10 April 2015.
- Padin, Sandro (3 January 2014). "How I Evaluate Open-Source Software". 8th Light, Inc. https://blog.8thlight.com/sandro-padin/2014/01/03/how-i-evaluate-open-source-software.html. Retrieved 01 June 2015.
- Metcalfe, Randy (1 February 2004). "Top tips for selecting open source software". OSSWatch. University of Oxford. http://oss-watch.ac.uk/resources/tips. Retrieved 23 March 2015.
- Limardo, J. (2013). "DIY Evaluation Process". LIMSExpert.com. ForwardPhase Technologies, LLC. http://www.limsexpert.com/cgi-bin/bixchange/bixchange.cgi?pom=limsexpert3&iid=readMore;go=1363288315&title=DIY%20Evaluation%20Process. Retrieved 07 February 2015.
- Wheeler, David A. (5 August 2011). "How to Evaluate Open Source Software / Free Software (OSS/FS) Programs". dwheeler.com. http://www.dwheeler.com/oss_fs_eval.html. Retrieved 19 March 2015.
- "User Requirements Specification (URS)". validation-online.net. Validation Online. http://www.validation-online.net/user-requirements-specification.html. Retrieved 08 August 2015.
- O'Keefe, Graham (1 March 2015). "How to Create a Bullet-Proof User Requirement Specification (URS)". askaboutgmp. http://www.askaboutgmp.com/296-how-to-create-a-bullet-proof-urs. Retrieved 08 August 2015.
- "ASTM E1578-13, Standard Guide for Laboratory Informatics". West Conshohocken, PA: ASTM International. 2013. doi:10.1520/E1578. http://www.astm.org/Standards/E1578.htm. Retrieved 14 March 2015.
- "User Requirements Checklist". Autoscribe Informatics. http://www.autoscribeinformatics.com/services/user-requirements. Retrieved 10 April 2015.
- Laboratory Informatics Institute, ed. (2015). "The Complete Guide to LIMS & Laboratory Informatics – 2015 Edition". LabLynx, Inc. http://www.limsbook.com/the-complete-guide-to-lims-laboratory-informatics-2015-edition/. Retrieved 10 April 2015.
- "Part 11, Electronic Records; Electronic Signatures — Scope and Application". U.S. Food and Drug Administration. 26 August 2015. http://www.fda.gov/regulatoryinformation/guidances/ucm125067.htm. Retrieved 10 June 2015.
- Segalstad, Siri H. (2008). International IT Regulations and Compliance: Quality Standards in the Pharmaceutical and Regulated Industries. John Wiley & Sons, Inc. pp. 338. ISBN 9780470758823. http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470758821.html.
- "More articles by Cynthia Harvey". Datamation. QuinStreet, Inc. 2015. http://www.datamation.com/author/Cynthia-Harvey-6460.html. Retrieved 12 April 2015.
- Harvey, Cynthia (5 January 2015). "Open Source Software List: 2015 Ultimate List". Datamation. QuinStreet, Inc. http://www.datamation.com/open-source/open-source-software-list-2015-ultimate-list-1.html. Retrieved 12 April 2015.
- "SourceForge - Download, Develop and Publish Free Open Source Software". SourceForge. Slashdot Media. 2015. https://sourceforge.net/. Retrieved 14 June 2015.
- "GitHub: Where software is built". GitHub. GitHub, Inc. 2015. https://github.com/. Retrieved 14 June 2015.
- "Comparison of source code hosting facilities". Wikipedia. Wikimedia Foundation. 21 September 2015. https://en.wikipedia.org/w/index.php?title=Comparison_of_source_code_hosting_facilities&oldid=682090863. Retrieved 28 September 2015.
- "The Architecture of Open Source Applications". AOSA. AOSA Editors. 2015. http://aosabook.org/en/index.html. Retrieved 08 October 2015.
- Knorr, Eric (28 September 2015). "5 key trends in open source". InfoWorld. InfoWorld, Inc. http://www.infoworld.com/article/2986769/open-source-tools/5-key-trends-in-open-source.html. Retrieved 28 September 2015.
- "Google Scholar". Google, Inc. 2015. https://scholar.google.com/. Retrieved 08 August 2015.
- "Microsoft Academic Search". Microsoft Corporation. 2015. http://academic.research.microsoft.com/. Retrieved 08 August 2015.
- Wasike, Sylvia Nasambu (October 2010). "Selection Process of Open Source Software Component". http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.227.5951. Retrieved 10 August 2015.
- "Open HUB, the open source network". Open HUB. Black Duck Software, Inc. 2015. https://www.openhub.net/. Retrieved 10 August 2015.
This article has not officially been published in a journal. However, this presentation is largely faithful to the original paper. The content has been edited for grammar, punctuation, and spelling. Additional error correction of a few reference URLs and types as well as cleaning up of the glossary also occurred. Redundancies and references to entities that don't offer open-source software were removed from the FLOSS examples in Table 2. DOIs and other identifiers have been added to the references to make them more useful. This article is being made available for the first time under the Creative Commons Attribution-ShareAlike 4.0 International license, the same license used on this wiki.