LII:Laboratory Technology Planning and Management: The Practice of Laboratory Systems Engineering
Title: Laboratory Technology Planning and Management: The Practice of Laboratory Systems Engineering
Author for citation: Joe Liscouski, with editorial modifications by Shawn Douglas
License for content: Creative Commons Attribution 4.0 International
Publication date: December 2020
- 1 Introduction
- 2 Making laboratory informatics and automation work
- 3 Different ways of looking at laboratories
- 4 Labs in transition, from manual operation to modern facilities
- 5 The seven goals of planning and managing lab technologies
- 5.1 First goal: Support an environment that fosters productivity and innovation
- 5.2 Second goal: Develop high-quality data and information
- 5.3 Third goal: Manage K/I/D effectively, putting them in a structure that encourages use and protects value
- 5.4 Fourth goal: Ensure a high level of data integrity at every step
- 5.5 Fifth goal: Addressing security throughout the lab
- 5.6 Sixth goal: Acquiring and developing "products" that support regulatory requirements
- 5.7 Seventh goal: Addressing systems integration and harmonization
- 6 Laboratory systems engineers
- 7 Closing
- 8 Abbreviations, acronyms, and initialisms
- 9 Footnotes
- 10 About the author
- 11 References
What separates successful advanced laboratories from all the others? It's largely their ability to meet their goals, with the effective use of resources: people, time, money, equipment, data, and information. The fundamental goals of laboratory work haven’t changed, but labs are under increased pressure to do more and do it faster, with a better return on investment (ROI). Laboratory managers have turned to electronic technologies (e.g., computers, networks, robotics, microprocessors, database systems, etc.) to meet those demands. However, without effective planning, technology management, and education, those technologies will only get labs part of the way to meeting their needs. We need to learn how to close the gap between getting part-way there and getting where we need to be. The practice of science has changed; we need to meet that change to be successful.
This document was written to get people thinking more seriously about the technologies used in laboratory work and how those technologies contribute to meeting the challenges labs are facing. There are three primary concerns:
- The need for planning and management: When digital components began to be added to lab systems, it was a slow incremental process: integrators and microprocessors grew in capability as the marketplace accepted them. That development gave us the equipment we have now, equipment that can be used in isolation or in a networked, integrated system. In either case, they need attention in their application and management to protect electronic laboratory data, ensure that it can be effectively used, and ensure that the systems and products put in place are both the right ones, and that they fully contribute to improvements in lab operations.
- The need for more laboratory systems engineers (LSEs): There is increasing demand for people who have the education and skills needed to accomplish the points above and provide research and testing groups with the support they need.[a]
- The need to collaborate with vendors: In order to develop the best products needed for laboratory work, vendors should be provided more user input. Too often vendors have an idea for a product or modifications to existing products, yet they lack a fully qualified audience to bounce ideas off of. With the planning in the first concern in place, we should be able to approach vendors and say, with confidence, "this is what is needed" and explain why.
If the audience for this work were product manufacturing or production facilities, everything being said here would already be history. The efficiency and productivity of production operations directly impact profitability and customer satisfaction, so the effort to optimize operations would have been an essential goal. When it comes to laboratory operations, that same level of attention found in production operations must be in place to accelerate laboratory research and testing operations, reducing cost and improving productivity. Aside from a few lab installations in large organizations, this same level of attention isn’t given, as people aren’t educated as to its importance. The purpose of this work is to present ideas of what laboratory technology challenges can be addressed through planning activities using a series of goals.
This material is an expansion upon two presentations:
- "Laboratory Technology Management & Planning," 2nd Annual Lab Asset & Facility Management in Pharma 2019, San Diego, CA, October 22, 2019
- "How Digital Technologies are Changing the Landscape of Lab Operations," Lab Manager webinar, April 2020
Directions in lab operations
The lab of the future
People often ask what the lab of the future (LOF) is going to look like, as if there were a design or model that we should be aspiring toward. There isn’t. Your lab's future is in your hands to mold, a blank sheet of paper upon which you define your lab's future by setting objectives, developing a functional physical and digital architecture, planning processes and implementations, and managing technology that supports both scientific and laboratory information management. If that sounds scary, it’s understandable. But you must take the time to educate yourself and bring in people (e.g., LSEs, consultants, etc.) who can assist you.
Too often, if vendors and consultants are asked what the LOF is going to look like, the response lines up with their corporate interests. No one knows what the LOF is because there isn’t a singular future, but rather different futures for different types of labs. (Just think of all the different scientific disciplines that exist; one future doesn’t fit all.) Your lab's future is in your hands. What do you want it to be?
The material in this document isn’t intended to define your LOF, but to help you realize it once the framework has been created, and you are in the best position to create it. As you create that framework, you'll be asking:
- Are you satisfied with your lab's operations? What works and what doesn’t? What needs fixing and how shall it be prioritized?
- Has management raised any concerns?
- What do those working in the lab have to say?
- How is your lab going to change in the next one to five years?
- Does your industry have a working group for lab operations, computing, and automation?
Adding to question five, many companies tend to keep the competition at arm's length, minimizing contact for fear of divulging confidential information. However, if practically everyone is using the same set of test procedures from a trusted neutral source (e.g., ASTM International, United States Pharmacopeia, etc.), there’s nothing confidential there. Instead of developing automated versions of the same procedure independently, companies can join forces, spread the cost, and perhaps come up with a better solution. With that effort as a given, you collectively have something to approach the vendor community with and say “we need this modification or new product.” This is particularly beneficial to the vendor when they receive a vetted product requirements document to work from.
Again, you don’t wait for the lab of the future to happen, you create it. If you want to see the direction lab operations can take in the future, look to the manufacturing industry: it has everything from flexible manufacturing to cooperative robotics.[b] This is appropriate in both basic and applied research, as well as quality control.
Both manufacturing and lab work are process-driven with a common goal: a high-quality product whose quality can be defended through appeal to process and data integrity.
Lab work can be broadly divided into two activities, with parallels to manufacturing: experimental procedure development (akin to manufacturing process development) and procedure execution (product production). (Note: Administrative work is part of lab operations but not an immediate concern here.) As such, we have to address the fact that lab work is part original science and part production work based on that science, e.g., as seen with quality control, clinical chemistry, and high-throughput screening labs. The routine production work of these and other labs can benefit most from automation efforts. We need to think more broadly about the use of automation technologies—driving their development—instead of waiting to see what vendors develop.
Where manufacturing and lab work differ is in the scale of the work environment, the nature of the work station equipment, the skills needed to carry out the work, and the adaptability of those doing the work to unexpected situations.
My hope is that this guide will get laboratory managers and other stakeholders to begin thinking more about planning and technology management, as well as the need for more education in that work.
Trends in science applications
If new science isn’t being developed, vendors will add digital hardware and software technology to existing equipment to improve capabilities and ease-of-use, separating themselves from the competition. However, there is still an obvious need for an independent organization to evaluate that technology (i.e., the lab version of Consumer Reports); as is, that evaluation process, done properly, would be time consuming for individual labs and would require a consistent methodology. With the increased use of automation, we need to do this better, such that the results can be used more widely (rather than every lab doing their own thing) and with more flexibility, using specialized equipment designed for automation applications.
- Having a system that can bring up all relevant information on a research question—a sort of super Google—or a variation of IBM’s Watson could have significant benefits.
- Analyzing complex data or large volumes of data could be beneficial, e.g., the analysis of radio astronomy data to find fast radio bursts (FRB).
- "[A] team at Glasgow University has paired a machine-learning system with a robot that can run and analyze its own chemical reaction. The result is a system that can figure out every reaction that's possible from a given set of starting materials."
- HelixAI is using Amazon's Alexa as a digital assistant for laboratory work.
However, there are problems using these technologies. ML systems have been shown to be susceptible to biases in their output depending on the nature and quality of the training materials. As for AI, at least in the public domain, we really don’t know what that is, and what we think it is keeps changing as purported examples emerge. One large problem for lab use is whether or not you can trust the results of an AI's output. We are used to the idea that lab systems and methods have to be validated before they are trusted, so how do you validate a system based on ML or AI?
The major issue in all of this is having people educated to the point where they can successfully handle the planning and management of laboratory technology. One key point: most lab management programs focus on personnel issues, but managers also have to understand the capabilities and limitations of information technology and automation systems.
One result of the COVID-19 pandemic is that we are seeing the limitations of the four-year undergraduate degree program in science and engineering, as well as the state of remote learning. With the addition of information technologies, general systems thinking and modeling[c], statistical experimental design, and statistical process control have become multidisciplinary fields. We need options for continuing education throughout people’s careers so they can maintain their competence and learn new material as needed.
Making laboratory informatics and automation work
Making laboratory informatics and automation work? "Isn’t that a job for IT or lab personnel?" someone might ask. One of the problems in modern science is the development of specialists in disciplines. The laboratory and IT fields have many specialties, and specialists can be very good within those areas while at the same time not having an appreciation of wider operational issues. Topics like lab operations, technology management, and planning aren’t covered in formal education courses, and they're often not well-covered in short courses or online programs.
“Making it work” depends on planning performed at a high enough level in the organization to encompass all affected facilities and departments, including information technology (IT) and facilities management. This wider perspective gives us the potential for synergistic operations across labs, consistent policies for facilities management and IT, and more effective use of outside resources (e.g., lab information technology support staff [LAB-IT], laboratory automation engineers [LAEs][d], equipment vendors, etc.).
We need to apply the same diligence to planning lab operations as we do any other critical corporate resource. Planning provides a structure for enabling effective and successful lab operations.
Introduction to this section
The common view of science laboratories is that of rooms filled with glassware, lab benches, and instruments being used by scientists to carry out experiments. While this is a reasonable perspective, what isn’t as visually obvious is the end result of that work: the development of knowledge, information, and data.
The progress of laboratory work, as well as the planning, documentation, and analytical results related to that work, has been recorded in paper-based laboratory notebooks for generations, and people are still using them today. However, these aren't the only paper records that have existed and are still in use; scientists also depend on charts, log books, administrative records, reports, indexes, and reference material. The latter half of the twentieth century introduced electronics into the lab and with it electronic recording in the form of computers and data storage systems. Early adopters of these technologies had to extend their expertise into the information technology realm because there were few people who understood both these new devices and their application to lab work; you had to be an expert in both laboratory science and computer science.
In the 1980s and 90s, computers became commonplace and where once you had to understand hardware, software, operating systems, programming and application packages, you then simply had to know how to turn them on; no more impressive arrays of blinking lights, just a blinking cursor waiting for you to do something.
As systems gained ease-of-use, however, we lost the basic understanding of what these systems were and what they did, that they had faults, and that if we didn’t plan for their effective use and counter those faults, we were opening ourselves to unpleasant surprises. The consequences at times were system crashes, lost data, and a lack of a real understanding of how the output of an instrument was transformed into a set of numbers, which meant we couldn’t completely account for the results we were reporting.
We need to step back, take control, and institute effective technology planning and management, with appropriate corresponding education, so that the various data we are putting into laboratory informatics technologies have the desired outcome. We need to ensure that these technologies are providing a foundation for improving laboratory operations efficiency and a solid return on investment (ROI), while substantively advancing your business' ability to work and be productive. That's the purpose of the work we'll be discussing.
The point of planning
The point of planning and technology management is pretty simple: to ensure ...
- that the right technologies are in people's hands when they need them, and
- that those technologies complement each other as much as possible.
These are straightforward statements with a lot packed into them.
Regarding the first point, the key words are “the right technologies.” In order to define what that means, lab personnel have to understand the technologies in question and how they apply to their work. If those personnel have used or were taught about the technologies under consideration, it should be easy enough to do. However, laboratory informatics doesn’t fall into that basket of things. The level of understanding has to be more than superficial. While personnel don’t have to be software developers, they do have to understand what is happening within informatics systems, and how data processing handles their data and produces results. Determining the “right technologies” depends on the quality and depth of education possessed by lab personnel, and eventually by lab information technology support staff (LAB-IT) as they become involved in the selection process.
The second point also has a lot buried inside it. Lab managers and personnel are used to specifying and purchasing items (e.g., instruments) as discrete tools. When it comes to laboratory informatics, we’re working with things that connect to each other, in addition to performing a task. When we explore those connections, we need to assess how they are made, what we expect to gain, what compatibility issues exist, how to support them, how to upgrade them, what their life cycle is, etc. Most of the inter-connected devices people encounter in their daily lives are things that were expected to be connected with using a limited set of choices; the vendors know what those choices are and make it easy to do so, or otherwise their products won’t sell. The laboratory technology market, on the other hand, is too open-ended. The options for physical connections might be there, but are they the right ones, and will they work? Do you have a good relationship with your IT people, and are they able to help (not a given)? Again, education is a major factor.
Who is responsible for laboratory technology planning and management (TPM)?
When asking who is responsible for TPM, the question really is "who are the TPM stakeholders," or "who has a vested interest in seeing TPM prove successful?"
- Corporate or organizational management: These stakeholders set priorities and authorize funding, while also rationalizing and coordinating goals between groups. Unless the organization has a strong scientific base, they may not appreciate the options and benefits of TPM in lab work, or the possibilities of connecting the lab into the rest of the corporate data structure.
- Laboratory management: These stakeholders are responsible for developing and implementing plans, as well as translating corporate goals into lab priorities.
- Laboratory personnel: These stakeholders are the ones that actually do the work, which puts them in the best position to understand where technologies can be applied. They would also be relied on to provide user requirements documents for new projects and meet both internal and external (e.g., Food and Drug Administration [FDA], Environmental Protection Agency [EPA], International Organization for Standardization [ISO], etc.) performance guidelines.
- IT management and their support staff: While these stakeholders' traditional role is the support of computers, connected devices (e.g., printers, etc.) and network infrastructure, they may also be the first line of support for computers connected to lab equipment. IT staff either need to be educated to meet that need and support lab personnel, or have additional resources available to them. They may also be asked to participate in planning activities as subject matter experts on computing hardware and software.
- LAB-IT specialists: These stakeholders act as the "additional resources" alluded to in the previous point. These are crossover specialists that span the lab and IT spaces and can provide informed support to both. In most organizations, aside from large science-based companies, this isn’t a real "position," although once stated, its role is immediately recognized. In the past, I’ve also referenced these stakeholders as being “laboratory automation engineers.”
- Facility management: These stakeholders need to ensure that the facilities support the evolving state of laboratory workspace requirements as traditional formats change to support robotics, instrumentation, computers, material flow, power, and HVAC requirements.
Carrying out this work is going to rely heavily on expanding the education of those participating in the planning work; the subject matter goes well beyond material covered in degree programs.
Why put so much effort into planning and technology management?
Earlier we mentioned paper laboratory notebooks, the most common recording device since scientific research began (although for sheer volume, it may have been eclipsed by computer hard drives). Have you ever wondered about the economics of laboratory notebooks? Cost is easy to understand, but the value of the data and information that is recorded there requires further explanation.
The value of the material recorded in a notebook depends on two key factors: the quality of the work and an inherent ability to put that documented work to use. The quality of the work is a function of those doing the work, how diligent they are, and the veracity of what has been written down. The inherent ability to use it depends upon the clarity of the writing, people’s ability to understand it without recourse to the author, and access to the material. That last point is extremely important. Just by glancing at Figure 1, you can figure out where this is going.
As a scientist’s notebook fills with entries, it gains value because of the content. Once filled, it reaches an upper limit and is placed in a library. There it takes a slight drop in value because its ease-of-access has changed; it isn’t readily at hand. As library space fills, the notebooks are moved to secondary storage (in one company I worked at, secondary storage consisted of trailers in a parking lot). Costs go up due to the cost of owning or renting the secondary storage and the space the notebooks take. The object's value drops, not because of the content but due to the difficulty in retrieving that content (e.g., which trailer? which box?). Unless the project is still active, the normal turn-over of personnel (e.g., via promotions, movement around the company, leaving the company) means that institutional memory diminishes and people begin to forget the work exists. If few researchers can remember it, find it, and access it, the value drops regardless of the resources that went into the work. That is compounded by the potential for physical deterioration of the object (e.g., water damage, mice, etc.).
Preventing the loss of access to the results of your investment in R&D projects will rely on information technology. That reliance will be built upon planning an effective informatics environment, which is precisely where this discussion is going. How is putting your lab results into a computer system any different than a paper-based laboratory notebook? There are obvious things like faster searching and so on, but from our previous discussion on them, not much is different; you still have essentially a single point of failure, unless you plan for that eventuality. That is the fundamental difference and what will drive the rest of this writing:
- Planning builds in reliability, security, and protection against loss. (Oh, and it allows us to work better, too!)
You could plan for failure in a paper-based system by making copies, but those copies still represent paper that has to be physically managed. With electronic systems, we can plan for failure by using automated backup procedures that make faithful copies, as many as we’d like, at low cost. This issue isn’t unique to laboratory notebooks, but it is a problem for organizations that depend on paper records.
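As a rough illustration of what "automated backup procedures that make faithful copies" can mean in practice, the following Python sketch copies a file to several locations and confirms that each copy is byte-for-byte identical by comparing SHA-256 checksums. The function names, the `results.csv` file, and the `backup_a` and `backup_b` folders are hypothetical and exist only for this example; a real lab would more likely rely on its organization's backup tooling and policies.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy `source` into each destination folder and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves file metadata such as timestamps where possible
        copy = Path(shutil.copy2(source, folder / source.name))
        if sha256_of(copy) != expected:
            raise IOError(f"Backup copy {copy} does not match {source}")
        copies.append(copy)
    return copies


def demo() -> bool:
    """Back up a made-up results file to two made-up locations."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "results.csv"
        source.write_text("sample_id,pH\nS-001,6.5\n")
        copies = back_up(source, [root / "backup_a", root / "backup_b"])
        return all(sha256_of(c) == sha256_of(source) for c in copies)
```

Calling `demo()` runs the whole procedure in a temporary folder. The checksum comparison is what makes a copy "faithful" in this sketch: a copy that cannot be verified against the original is reported as a failure rather than silently accepted.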
The difference between writing on paper and using electronic systems isn’t limited to how the document is realized. If you were to use a typewriter, the characters would show up on the paper and you'd be able to read them; all you needed was the ability to read (which could include braille formats) and understand what was written. However, if you were using a word processor, the keystrokes would be captured by software, displayed on the screen, placed in the computer’s memory, and then written to storage. If you want to read the file, you need something—software—to retrieve it from storage, interpret the file contents, determine how to display it, and then display it. Without that software the file is useless. A complete backup process has to include the software needed to read the file, plus all the underlying components that it depends upon. You could correctly argue that the hardware is required as well, but there are economic tradeoffs as well as practical ones; you could transfer the file to other hardware and read it there for example.
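One way to picture a "complete backup" is as a manifest that records, for every file, the software needed to read it and what that software depends on. The Python sketch below is only an assumption about how such a manifest might be structured; the class names, field names, and the "ExampleWriter" and "ExampleOS" products are invented for illustration.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ReaderSoftware:
    name: str       # application needed to open the file
    version: str    # version the file was written with
    depends_on: list[str] = field(default_factory=list)  # OS, runtimes, libraries


@dataclass
class BackupEntry:
    file_name: str
    file_format: str
    reader: ReaderSoftware


def missing_from_backup(entries, archived_items):
    """List every piece of software the entries need that is not in the archive."""
    archived = set(archived_items)
    missing = []
    for entry in entries:
        needed = [f"{entry.reader.name} {entry.reader.version}", *entry.reader.depends_on]
        for item in needed:
            if item not in archived and item not in missing:
                missing.append(item)
    return missing


def manifest_json(entries):
    """Serialize the manifest so it can be stored alongside the backed-up files."""
    return json.dumps([asdict(entry) for entry in entries], indent=2)


# Hypothetical example: a report that needs a word processor and its operating system
entries = [
    BackupEntry(
        file_name="report.doc",
        file_format="word processor document",
        reader=ReaderSoftware("ExampleWriter", "2.1", ["ExampleOS 10"]),
    )
]
```

With these made-up entries, `missing_from_backup(entries, ["ExampleWriter 2.1"])` reports that the operating system the reader depends on has not been archived, which is the kind of gap that leaves a backed-up file unreadable.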
That point brings us to the second subject of this writing: technology management. What do I have to do to make sure that I have the right tools to enable me to work? The problem is simple enough when all you're concerned with is writing and preparing accompanying graphics. Upon shifting the conversation to laboratory computing, it gets more complicated. Rather than being concerned with one computer and a few software packages, you have computers that acquire and process data in real-time[e], transmit it to other computers for storage in databases, and systems that control sample processing and administrative work. Not only do the individual computer systems and the equipment and people they support have to work well, but also they have to work cooperatively, and that is why we have to address planning and technology management in laboratory work.
That brings us to a consideration of what lab work is all about.
Different ways of looking at laboratories
When you think about a “laboratory,” a lot depends on your perspective: are you on the outside looking in, do you work in a lab, or are you taking that high school chemistry class? When someone walks into a science laboratory, the initial impression is that of a confusing collection of stuff, unless they're familiar with the setting. “Stuff” can consist of instruments, glassware, tubing, robots, incubators, refrigerators and freezers, and even petri dishes, cages, fish tanks, and more depending on the kind of work that is being pursued.
From a corporate point of view, a "laboratory" can appear differently and have different functions. Possible corporate views of the laboratory include:
- A laboratory is where questions are studied, which may support other projects or provide a source of new products, acting as basic and applied R&D. What is expected out of these labs is the development of new knowledge, usually in the form of reports or other documentation that can move a project forward.
- A laboratory acts as a research testing facility (e.g., analytical, physical properties, mechanical, electronics, etc.) that supports research and manufacturing through the development of new test methods, special analysis projects, troubleshooting techniques, and both routine and non-routine testing. The laboratory's results come in the form of reports, test procedures, and other types of documented information.
- A laboratory acts as a quality assurance/quality control (QA/QC) facility that provides routine testing, producing information in support of production facilities. This can include incoming materials testing, product testing, and product certification.
Typically, stakeholders outside the lab are looking for some form of result that can be used to move projects and other work forward. They want it done quickly and at low cost, but also want the work to be of high quality and reliability. Those considerations help set the goals for lab operations.
Within the laboratory there are two basic operating modes or workflows: project-driven or task-driven work. With project-driven workflows, a project goal is set, experiments are planned and carried out, the results are evaluated, and a follow-up course of action is determined. This all requires careful documentation for the planning and execution of lab work. This can also include developing and revising standard operating procedures (SOPs). Task-driven workflows, on the other hand, essentially depend on the specific steps of a process. A collection of samples needs to be processed according to an SOP, and the results recorded. Depending upon the nature of the SOP and the number of samples that have to be processed, the work can be done manually, using instruments, or with partial or full automation, including robotics. With the exception of QA/QC labs, a given laboratory can use a combination of these modes or workflows over time as work progresses and the internal/external resources become available. QA/QC labs are almost exclusively task-driven; contract testing labs are as well, although they may take on project-driven work.
Within the realm of laboratory informatics, project-focused work centers on the electronic laboratory notebook (ELN), which can be described as a lab-wide diary of work and results. Task-driven work is organized around the laboratory information management system (LIMS)—or laboratory information system (LIS) in clinical lab settings—which can be viewed as a workflow manager of tests to be done, results to be recorded, and analyses to be finalized. Both of these technologies replaced the paper-based laboratory notebook discussed earlier, coming with considerable improvements in productivity. And although ELNs are considerably more expensive than paper systems, the short- and long-term benefits of an ELN overshadow that cost issue.
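To make the "workflow manager" description of task-driven work a little more concrete, here is a minimal Python sketch of a worklist that tracks tests to be done, results recorded, and analyses finalized. It is a toy model under stated assumptions, not a description of how any actual LIMS is built; the class names, the status labels, and the `SOP-pH` identifier are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "test to be done"
    RESULT_RECORDED = "result recorded"
    FINALIZED = "analysis finalized"


@dataclass
class AnalysisRequest:
    sample_id: str
    sop: str                          # identifier of the SOP to follow (made up here)
    status: Status = Status.PENDING
    result: Optional[float] = None


class Worklist:
    """Tracks each request through the three stages named in the text."""

    def __init__(self):
        self._requests = []

    def add(self, sample_id, sop):
        request = AnalysisRequest(sample_id, sop)
        self._requests.append(request)
        return request

    def record_result(self, request, value):
        if request.status is not Status.PENDING:
            raise ValueError("A result has already been recorded")
        request.result = value
        request.status = Status.RESULT_RECORDED

    def finalize(self, request):
        # A request cannot be finalized until its result has been recorded
        if request.status is not Status.RESULT_RECORDED:
            raise ValueError("Cannot finalize before a result is recorded")
        request.status = Status.FINALIZED

    def outstanding(self):
        """Return every request that has not yet been finalized."""
        return [r for r in self._requests if r.status is not Status.FINALIZED]
```

The point of the sketch is the enforced ordering: each sample moves through the same fixed steps, which is what distinguishes task-driven work from the more open-ended, diary-like record kept for project-driven work.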
Labs in transition, from manual operation to modern facilities
For the most part, the skills you learned in school were the skills you needed to be successful here as far as technical matters went; management education was another issue. That changed when electronic instrumentation became available. Analog instruments such as scanning spectrophotometers, chromatographs, mass spectrometers, differential scanning calorimeters, tensile testers, and so on introduced a new career path to laboratory work: the instrument specialist, who combined an understanding of the basic science with an understanding of the instrument’s design, as well as how to use it (and modify it where needed), maintain it, troubleshoot issues, and analyze the results. Specialization created a problem for schools: they couldn’t afford all the equipment, find knowledgeable instructors, or make room in the curriculum for the expanding subject matter. Schools were no longer able to educate people to meet the requirements of industry and graduate-level academia. And then digital electronics happened. Computers first became attached to instruments, and then incorporated into the instrumentation.[f]
The addition of computer hardware and software to an instrument increased the depth of specialization in those techniques. Not only did you have to understand the science noted above, but also the use of computer programs used to work with the instrument, how to collect the data, and how to perform the analysis. An entire new layer of skills was added to an already complex subject.
The latest level of complexity added to laboratory operations has been the incorporation of LIMS, ELNs, scientific data management systems (SDMS), and laboratory execution systems (LES) either as stand-alone modules or combined into more integrated packages or "platforms."
There's a plan for that?
It is rare to find a lab that has an informatics plan or strategy in place before the first computer comes through the door; those machines enter as part of an instrument-computer control system. Several computers may use that route to become part of the lab's technology base before people realize that they need to start taking lab computing seriously, including how to handle backups, maintenance, support, etc.
First computers come into the lab, and then the planning begins, often months later, as an incremental planning effort, which is the complete reverse of how things need to be developed. Planning is essential as soon as you decide that a lab space will be created. That almost never happens, in part because no one has told you that it is required, let alone why or how to go about it.
Thinking about a model for lab operations
The basic purpose of laboratory work is to answer questions. “How do we make this work?” “What is it?” “What’s the purity of this material?” These questions and others like them occur in chemistry, physics, and the biological sciences. Answering those questions is a matter of gathering data and information through observation and experimental work, organizing it, analyzing it, and determining the next steps needed as the work moves forward (Figure 3). Effective organization is essential, as lab personnel will need to search data and information, extract it, move it from one data system to another for analysis, make decisions, update planning, and produce interim and ultimately final reports.
Once the planning is done, scientific work generally begins with collecting observations and measurements (Data/Information Sources 1–4, Figure 3) from a variety of sources. Lab bench work usually involves instrumentation, and many instruments have computer controls and data systems as part of them. This is the more visible part of lab work and the one that matches people’s expectations for a “scientific lab.” This is where most of the money is spent on equipment, materials, and people’s expertise and time. All that expenditure of resources results in “the pH of the glowing liquid is 6.5,” “the concentration of iron in the icky stuff is 1500 ppm,” and so on. That’s the end result of all those resources, time, and effort put into the scientific workflow. That’s why you built a million-dollar facility (in some spheres of science such as astronomy, high energy physics, and the space sciences, the cost of collection is significantly higher). So what do you do with those results? Prior to the 1970s, the collection points were paper: forms, notebooks, and other documents, all with the issues discussed earlier.
The material on those instrument data systems needs to be moved to an intermediate system for long-term storage and reference (the second step of Figure 3). This is needed because those initial data systems may fail, be replaced, or added to as the work continues. After all, the data and information they’ve collected needs to be preserved, organized, and managed to support continued lab work.
The analyzed results need to be collected into a reference system that is the basis of long-term analysis, management/administration work, and reporting. This last system in the flow is the central hub of lab activities; it is also the distribution point for material sent to other parts of the organization (the third and fourth stages of Figure 3). While it is natural for scientists to focus on the production of data and information, the organization and centralized management of the results of laboratory work needs to be a primary consideration. That organization will be focused on short- and long-term data analysis and evaluation. The results of this get used to demonstrate the lab's performance towards meeting its goals, and it will show those investing in your work that you’ve got your management act together, which is useful when looking for continued support.
Today, those systems come in two basic forms: LIMS and ELN. The details of those systems are the subject of a number of articles and books. Without getting into too much detail:
- LIMS are used to support testing labs managing sample workflows and planning, as well as cataloging results (e.g., short text and numerical information).
- ELNs are usually found in research, functioning as an electronic diary of lab work for one or more scientists and technicians. The entries may contain extensive textual material, numerical entries, charts, graphics, etc. The ELN is generally more flexible than a LIMS.
That distinction is simplistic; some labs support both activities and need both types of systems, or even a hybrid package. However, the description is sufficient to get us to the next point: the lifespan of systems varies, depending on where you are looking in Figure 3's model. Figure 4 gives a comparison.
The experimental methods/procedures used in lab work will change over time as the needs of the lab change. Older instruments may be updated and new ones introduced. Retirement is a problem, particularly if data systems are part of the equipment. You have to have access to the data. That need will live on long past the equipment's life. That is one reason that moving data and information to an intermediate system like an SDMS is important. However, in some circumstances, even that isn’t going to be sufficient (e.g., in regulated industries where the original data structures and the software that generated them need to be preserved as an operating entity). In those cases, you may have old computers stacked up just in case you need access to their contents. A better way is to virtualize the systems as containers on servers that support a virtualized environment.
Virtualization (making an electronic copy of a computer system and running it on a server) is potentially a useful technology in lab work; while it won’t participate in day-to-day activities, it does have a role. Suppose you have an instrument-data system that is being replaced or retired. Maybe the computer is showing signs of aging or failing. What do you do with the files and software that are on the computer portion of the combination? You can’t dispose of them because you may need access to those data files and software later. On the other hand, do you really want to collect computer systems that have to be maintained just to have access to the data if and when you need it? Instead, virtualization is a software/hardware technology that allows you to make a complete copy of everything that is on that computer (including operating system files, applications, and data files) and store it in one big file referred to as a “container.” That container can be moved to a computer that is a virtual server and has software that emulates various operating environments, allowing the software in the container to run as if it were on its own computer hardware. A virtual server can support a lot of containers, and the operating systems in those containers can be updated as needed. The basic idea is that you don’t need access to a separate physical computer; you just need the ability to run the software that was on it. If your reaction to that is one of dismay and confusion, it’s time to buy your favorite IT person a cup of coffee and have a long talk. We’ll get into more details when we cover data backup issues.
Why is this important to you?
While the science behind producing results is the primary reason your lab exists, gaining the most value from the results is essential to the organization overall. That value is going to be governed by the quality of the results, ease of access, the ability to find and extract needed information easily, and a well-managed K/I/D architecture. All of that addresses a key point from management’s perspective: return on investment or ROI. If you can demonstrate that your data systems are well organized and maintained, and that you can easily find and use the results from experimental work and contribute to advancing the organization’s goals, you’ll make it easier to demonstrate solid ROI and gain funding for projects, equipment, and people needed to meet your lab's goals.
The seven goals of planning and managing lab technologies
The preceding material described the need for planning and managing lab technologies, and making sure lab personnel are qualified and educated to participate in that work. The next step is the actual planning. There are at least two key aspects to that work: planning activities that are specific and unique to your lab(s) and addressing broader scope issues that are common to all labs. The discussion found in the rest of this guide is going to focus on the latter points.
Effective planning is accomplished by setting goals and determining how you are going to achieve them. The following sections of this guide look at those goals, specifically:
- Supporting an environment that fosters productivity and innovation
- Developing high-quality data and information
- Managing knowledge, information, and data effectively, putting them in a structure that encourages use and protects value
- Ensuring a high level of data integrity at every step
- Addressing security throughout the lab
- Acquiring and developing "products" that support regulatory requirements
- Addressing systems integration and harmonization
The material below begins the sections on goal setting. Some of these goals are obvious and understandable, others like “harmonization” are less so. The goals are provided as an introduction rather than an in-depth discussion. The intent is to offer something suitable for the purpose of this material and a basis for a more detailed exploration at a later point. The intent of these goals is not to tell you how to do things, but rather what things need to be addressed. The content is provided as a set of questions that you need to think about. The answers aren't mine to give, but rather yours to develop and implement; it's your lab. In many cases, developing and implementing those answers will be a joint effort by all stakeholders.
First goal: Support an environment that fosters productivity and innovation
In order to successfully plan for and manage lab technologies, the business environment should ideally be committed to fostering a work environment that encourages productivity and innovation. This requires:
- proven, supportable workflow methodologies;
- educated personnel;
- fully functional, inter-departmental cooperation;
- management buy-in; and
- systems that meet users' needs.
This is one of those statements that people tend to read, say “sure,” and move on. But before you do that, let’s take a look at a few points. Innovation may be uniquely human (not even going to consider AI), and the ability to be “innovative” may not be universal.
People need to be educated, be able to separate true facts from “beliefs,” and question everything (which may require management support). Innovation doesn’t happen in a highly structured environment; you need the freedom to question, challenge, etc. You also need the tools to work with. The inspiration that leads to innovation can happen anywhere, anytime. All of a sudden all the pieces fit. And then what? That is where a discussion of tools and this work come together.
If a sudden burst of inspiration hits, you want to act on it now and not after traveling to an office, particularly if it is a weekend or vacation. You need access to knowledge (e.g., documents, reports), information, and data (K/I/D). In order to do that, a few things have to be in place:
- Business and operations K/I/D must be accessible.
- Systems security has to be such that a qualified user can gain access to K/I/D remotely, while preventing its unauthorized use.
- Qualified users must have the hardware and software tools required to access the K/I/D, work with it, and transmit the results of that work to whoever needs to see it.
- Qualified users must also be able to remotely initiate actions such as testing.
Those elements depend on a well-designed laboratory and corporate informatics infrastructure. Laboratory infrastructure is important because that is where the systems are that people need access to, and corporate infrastructure is important since corporate facilities have to provide access, controls, and security. Implementation of those corporate components has to be carefully thought through; they must be strong enough to frustrate unwarranted access (e.g., multi-factor logins) while allowing people to get real work done.
All of this requires flexibility and trust in people, an important part of corporate culture. This will become more important as society adjusts to new modes of working (e.g., working online due to a pandemic) and the realization that the fixed format work week isn’t the only way people can be productive. For example, working from home or off-site is increasingly commonplace. Laboratory professionals work in two modes: intellectual, which can be done anywhere, and the lab bench, where physical research tasks are performed. We need to strike a balance between those modes and the need for in-person vs virtual contact.
Let's take another look at the previous Figure 3, which offered one possible structure for organizing lab systems:
This use of an intermediate file storage system like an SDMS and the aggregation of some instruments to a common computer (e.g., one chromatographic data system for all chromatographs vs. one per instrument) becomes more important for two reasons: (1) it limits the number of systems that have to be accessed to search, organize, extract, and work with K/I/D, and (2) it makes it easier to address security concerns. There are additional reasons why this organization of lab systems is advantageous, but we’ll cover those in later installments. The critical point here is that a sound informatics architecture is key to supporting innovation. People need access to tools and K/I/D when they are working, regardless of where they are working from. As such, those same people need to be well-versed in the capabilities of the systems available to them, how to access them, how to use them, and how to recognize “missing technologies,” capabilities they need but don’t have access to or that simply don't exist.
Imagine this. A technology expert consults for two large organizations, one tightly controlled (Company A), the other with a liberal view of trusting people to do good work (Company B). In the first case, getting work done can be difficult, with the expert fighting through numerous reviews, sign-offs, and politics. Company A has a stated philosophy that they don’t want to be the first in the market with a new product, but would rather be a strong number two. They justify their position through the cost of developing markets for new products: let someone else do the heavy lifting and follow behind them. This is not a culture that spawns innovation. Company B, however, thrives on innovation. While processes and procedures are certainly in place, the company has a more relaxed philosophy about work assignments. If the expert has a realizable idea, Company B lets them run with it, as long as they complete their assigned workload in a timely fashion. This is what spurs the human side of innovation.
Second goal: Develop high-quality data and information
Asking staff to "develop high-quality data and information" seems like a pretty obvious point, but this is where professional experience and the rest of the world part company. Most of the world treats “data” and “information” as interchangeable words. Not here.
There are three key words that are going to be important in this discussion of goals: knowledge, information, and data (K/I/D). We’ll start with “knowledge”. The type of knowledge we will be looking at is at the laboratory/corporate level, the stuff that governs how a laboratory operates, including reports, administrative material, and most importantly standard operating procedures (SOPs). SOPs tell us how lab work is carried out via its methods, procedures, etc. (This subject parallels the topic of “data integrity,” which will be covered later.) Figure 5 positions K/I/D with respect to each other within laboratory processes.
The diagram in Figure 5 is a little complicated, and we’ll get into the details as the material develops. For the moment, we’ll concentrate on the elements in black.
As noted above, SOPs guide activities within the lab. As work is defined—both research and testing—SOPs have to be developed so that people know how to carry out their tasks consistently. Our first concern then is proper management of SOPs. Sounds simple, but in practice it isn’t. It’s a matter of first developing and updating the procedures, documenting them, and then managing both the documents and the lab personnel using them.
When developing, updating, and documenting procedures, a lab will primarily be looking at the science it's working with and how regulatory requirements affect it, particularly in research environments. Once developed, those procedures will eventually need to be updated. But why is an update to a procedure needed? What will the effects of the update be based on the changes that were made, and how do the results of the new version compare to the previous version? That last point is important, and to answer it you need a reference sample that has been run repeatedly under the older version so that you have a solid history of the results (i.e., a control chart) over time. You also need the ability to run that same reference sample under the new procedure to show that there are no differences, or that differences can be accounted for. If differences persist, what do you do about the previous test results under the old procedure?
The idea of running one or more stable reference samples periodically is a matter of instituting statistical process control over the analysis process. It can show that a process is under control, detect drift in results, and demonstrate that the lab is doing its job properly. If multiple analysts are doing the same work, it can also reveal how their work compares and if there are any problems. It is in effect looking over their shoulders, but that just comes with the job. If you find that the amount of reference material is running low, then phase in a replacement, running both samples in parallel to get a documented comparison with a clean transition from one reference sample to another. It’s a lot of work and it’s annoying, but you’ll have a solid response when asked “are you confident in these results?” You can then say, “Yes, and here is the evidence to back it up.”
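The control-chart idea can be sketched in a few lines of Python. This is a minimal illustration, not a validated SPC implementation: the function names, the three-sigma limits, and the reference-sample values are all assumptions made for the example, and a real lab would choose its own rules and limits.

```python
from statistics import mean, stdev


def control_limits(history, sigmas=3.0):
    """Return (center, lower, upper) from historical reference-sample results."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation of the history
    return center, center - sigmas * spread, center + sigmas * spread


def out_of_control(results, lower, upper):
    """Return (index, value) pairs for results that fall outside the limits."""
    return [(i, x) for i, x in enumerate(results) if x < lower or x > upper]


# Hypothetical history of one reference sample (e.g., ppm iron) under the old SOP.
history = [1498.0, 1502.0, 1500.0, 1497.0, 1503.0, 1501.0, 1499.0, 1500.0]
center, lower, upper = control_limits(history)

# Hypothetical runs of the same reference sample under the updated procedure.
new_runs = [1500.5, 1498.7, 1512.0]
flagged = out_of_control(new_runs, lower, upper)

print(f"center={center:.1f}, limits=({lower:.1f}, {upper:.1f})")
print("flagged:", flagged)
```

With these made-up numbers, the third new run falls outside the limits and would prompt a closer look before the new procedure is accepted.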
After the SOPs have been documented, they must then be effectively managed and implemented. First, take note of the education and experience required for lab personnel to properly implement any SOP. Periodic evaluation (or even certification) would be useful to ensure things are working as they should. This is particularly true of procedures that aren’t run often, as people may forget things.
Another issue of concern with managing SOPs is how to manage versioning. Consider two labs. Lab 1 is a well-run lab. When a new procedure is issued, the lab secretary visits each analyst, takes their copy of the old method, destroys it, provides a copy of the new one, requires the analyst sign for receipt, and later requires a second signature after the method has been reviewed and understood. Additional education is also provided on an as-needed basis. Lab 2 has good intentions, but it's not as proactive as Lab 1. Lab 2 retains all documents on a central server. Analysts are able to copy a method to their machines and use it. However, there is no formalized method of letting people know when a new method is released. At any given time there may be several analysts running the same method using different versions of the related SOP. The end result is having a mix of samples run by different people according to different SOPs.
This comparison of two labs isn’t electronic versions vs. paper, but rather a formal management structure vs. a loose one. There’s no problem maintaining SOPs in an electronic format, as there are many benefits, but there shouldn’t be any question about the current version, and there should be a clear process for notifying people about updates while also ensuring that analysts are currently educated in the new method's use.
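One way to picture the formal structure that Lab 1 follows, translated to an electronic setting, is a small registry that knows the single current version of each SOP and which analysts have signed off on it. The sketch below is hypothetical: the class, method names, and SOP identifiers are invented for illustration and do not describe any particular LIMS, ELN, or LES product.

```python
from dataclasses import dataclass, field


@dataclass
class SopRegistry:
    """Tracks the current version of each SOP and who has acknowledged it."""

    current: dict = field(default_factory=dict)       # sop_id -> current version
    acknowledged: dict = field(default_factory=dict)  # (sop_id, version) -> analysts

    def release(self, sop_id, version):
        """Make `version` the only current version of `sop_id`."""
        self.current[sop_id] = version
        self.acknowledged.setdefault((sop_id, version), set())

    def acknowledge(self, sop_id, version, analyst):
        """Record that an analyst has reviewed the current version."""
        if self.current.get(sop_id) != version:
            raise ValueError(f"{sop_id} v{version} is not the current version")
        self.acknowledged[(sop_id, version)].add(analyst)

    def may_run(self, sop_id, analyst):
        """An analyst may run a method only after acknowledging its current version."""
        version = self.current.get(sop_id)
        if version is None:
            return False
        return analyst in self.acknowledged.get((sop_id, version), set())


registry = SopRegistry()
registry.release("SOP-017", 1)
registry.acknowledge("SOP-017", 1, "analyst_a")
before_update = registry.may_run("SOP-017", "analyst_a")

registry.release("SOP-017", 2)  # a new version supersedes the old one
after_update = registry.may_run("SOP-017", "analyst_a")

registry.acknowledge("SOP-017", 2, "analyst_a")
after_review = registry.may_run("SOP-017", "analyst_a")
print(before_update, after_update, after_review)
```

The design choice worth noting is that releasing a new version automatically withdraws permission until the analyst signs off again, which is the electronic equivalent of collecting the old copy and requiring a signature.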
Managing this set of problems (analyst education, versions of SOPs, qualification of equipment, current reagents, etc.) was the foundation for one of the early ELNs, SmartLab by Velquest, now developed as an LES by Dassault Systèmes as part of the BIOVIA product line. And while Dassault's LES, and much of the BIOVIA product line, narrowly focuses on their intended market, the product remains suitable for any lab where careful control over procedure execution is warranted. This is important to note, as an LES is designed to guide a person through a procedure from start to finish, making it one step away from engaging in a full robotics system (robotics may play a role in stages of the process). The use of an LES doesn’t mean that personnel aren’t trusted or are deemed incompetent; rather, it is a mechanism for developing documented evidence that methods have been executed correctly. That evidence builds confidence in results.
LESs are available from several vendors, often as part of their LIMS or ELN offerings. Using any of these systems requires planning and scripting (a gentler way of saying “programming”), and the cost of implementation has to be balanced against the need (does the execution of a method require that level of sophistication) and ROI.
Up to this point, we’ve looked at developing and managing SOPs, as well as at least one means of controlling experiment/procedure execution. However, there are other ways of going about this, including manual and full robotics systems. Figure 6 takes us farther down the K/I/D model to elaborate further on experiment/procedure execution.[g]
As we move from knowledge development and management (i.e., SOPs), and then on to sample preparation (i.e., pre-experiment), the next step is usually some sort of measurement by an instrument, whether it is a pH meter or a spectrometer, yielding your result. That brings us to two words we noted earlier: "data" and "information." We'll note the differences between the two using a gas chromatography system as an example (Figure 7), as it and other base chromatography systems are among the most widely used of upper-tier instrumentation and widely found in labs where chemical analysis is performed.
As we look at Figure 7, we notice to the right of the vertical blue line is an output signal from a gas chromatograph. This is what chromatographers analyzed and measured when they carried out their work. The addition of a computer made life easier by removing the burden of calculations, but it also added complexity to the work in the form of having to manage the captured electronic data and information. An analog-to-digital (A/D) converter transformed those smooth curves to a sequence of numbers that are processed to yield parameters that described the peaks, which in turn were used to calculate the amount of substance in the sample. Everything up to that last calculation (left of the vertical blue line) is “data,” a set of numerical values that, taken individually, have no meaning by themselves. It is only when we combine it with other data sets that we can calculate a meaningful result, which gives us “information.”
The paragraph above describes two different types of data:
1. the digitized detector output or "raw data," constituting a series of readings that could be plotted to show the instrument output; and
2. the processed digitized data that provides descriptors about the output, with those descriptors depending largely upon the nature of the instrument (in the case of chromatography, the descriptors would be peak height, retention time, uncorrected peak area, peak widths, etc.).
Both are useful and neither of them should be discarded; the fact that you have the descriptors doesn’t mean you don’t need the raw data. The descriptors are processed data that depends on user-provided parameters. Changing the parameter can change the processing and the values assigned to those descriptors. If there are accuracy concerns, you need the raw data as a backup. Since storage is cheap, there really isn’t any reason to discard anything, ever. (And in some regulatory environments, keeping raw data is mandated for a period of time.)
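To make the dependence on user-provided parameters concrete, here is a deliberately simplified sketch. It is not how any vendor's chromatography software integrates peaks; the function, the single `baseline_threshold` parameter, and the nine-point trace are assumptions chosen only to show that the same raw data yields different descriptors under different settings.

```python
def peak_descriptors(times, signal, baseline_threshold):
    """Compute simple descriptors for the largest peak in a digitized trace.

    Points above `baseline_threshold` count as part of the peak; the area is
    integrated with the trapezoidal rule after subtracting the threshold.
    """
    apex = max(range(len(signal)), key=lambda i: signal[i])
    # Walk outward from the apex while the signal stays above the threshold.
    start = apex
    while start > 0 and signal[start - 1] > baseline_threshold:
        start -= 1
    end = apex
    while end < len(signal) - 1 and signal[end + 1] > baseline_threshold:
        end += 1
    area = 0.0
    for i in range(start, end):
        left = signal[i] - baseline_threshold
        right = signal[i + 1] - baseline_threshold
        area += 0.5 * (left + right) * (times[i + 1] - times[i])
    return {
        "retention_time": times[apex],
        "height": signal[apex] - baseline_threshold,
        "width": times[end] - times[start],
        "area": area,
    }


# Hypothetical raw data: detector readings sampled once per second.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
signal = [1, 1, 2, 6, 10, 6, 2, 1, 1]

strict = peak_descriptors(times, signal, baseline_threshold=1.5)
loose = peak_descriptors(times, signal, baseline_threshold=0.5)
print(strict)
print(loose)
```

Both calls report the same retention time, but the height, width, and area differ. Only the retained raw trace lets you go back and reprocess with a different setting.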
If you want to study the data and how it was processed to yield a result, you need more data, specifically the reference samples (standards) used to evaluate each sample. An instrument file by itself is almost useless without the reference material run with that sample. Ideally, you’d want a file that contains all the sample and reference data that was analyzed in one session. That might be a series of manual samples analyzed or an entire auto-sampler tray.
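The step from "data" to "information" can be illustrated with a single-point external-standard calculation, which is one common textbook approach and is used here only as an assumption; real methods often use multi-point calibration curves or internal standards. The session layout, identifiers, and numbers below are invented for the example.

```python
def concentration_from_standard(sample_area, standard_area, standard_conc):
    """Single-point external-standard estimate.

    Assumes detector response is proportional to the amount of substance.
    """
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return sample_area / standard_area * standard_conc


# Hypothetical session: the reference standard plus the samples run with it.
session = {
    "standard": {"id": "STD-1", "conc_ppm": 1000.0, "peak_area": 20.0},
    "samples": [
        {"id": "S-001", "peak_area": 30.0},
        {"id": "S-002", "peak_area": 15.0},
    ],
}

standard = session["standard"]
results = {
    sample["id"]: concentration_from_standard(
        sample["peak_area"], standard["peak_area"], standard["conc_ppm"]
    )
    for sample in session["samples"]
}
print(results)
```

Keeping the standard in the same structure as the samples mirrors the point above: a sample's peak area cannot be turned into a concentration without the reference run that accompanied it.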
Everything we've discussed here positively contributes to developing high-quality data and information. When methods are proven and you have documented evidence that they were executed by properly educated personnel using qualified reagents and instruments, you then have the instrument data to support each sample result and any other information gleaned from that data.
You might wonder what laboratorians did before computers. They dealt with stacks of spectra, rolls of chromatograms, and laboratory notebooks, all on paper. If they wanted to find the data (e.g., a pen trace on paper) for a sample, they turned to the lab's physical filing system to locate it.[h] Why does this matter? That has to do with our third goal.
Third goal: Manage K/I/D effectively, putting them in a structure that encourages use and protects value
In the previous section we introduced three key elements of laboratory work: knowledge, information, and data (K/I/D). Each of these is a “database” structure (“data” in the general sense). We also looked at SOP management as an example of knowledge management, and distinguished “data” and “information” management as separate but related concerns. We also introduced flow diagrams (Figures 5 and 6) that show the relationship and development of each of those elements.
In order for those elements to justify the cost of their development, they have to be placed in systems that encourage utilization and thus retain their value. Modern informatics tools assist in many ways:
- Document management systems support knowledge databases (and some LIMS and ELNs inherently support document management).
- LIMS and ELNs provide a solid base for laboratory information, and they may also support other administrative and operational functions.
- Instrument data systems and SDMS collect instrument output in the form of reports, data, and information.
You may notice there is significant functional redundancy as vendors try to create the “ultimate laboratory system.” Part of lab management’s responsibility is to define what the functional architecture should look like based on their current and perceived needs, rather than having it defined for them. It’s a matter of knowing what is required and seeing what fits rather than fitting requirements into someone else’s idea of what's needed.
Managing large database systems is only one aspect of handling K/I/D. Another aspect involves the consideration of cloud vs. local storage systems. What option works best for your situation, is the easiest to manage, and is supported by IT? We also have to address the data held in various desktop and mobile computing devices, as well as bench top systems like instrument data systems. There are a number of considerations here, not the least of which is product turnover (e.g., new systems, retired systems, upgrades/updates, etc.). (Some of these points will be covered later on in other sections.)
What you should think about now is the number of computer systems and software packages that you use on a daily basis, some of which are connected to instruments. How many different vendors are involved? How big are the vendors (e.g., small companies with limited staff, large organizations)? How often do they upgrade their systems? What’s the likelihood they’ll be around in two or five years?
Also ask what data file formats the vendor uses; these formats vary widely among vendors. Some put everything in CSV files, others in proprietary formats. In the latter case, you may not be able to use the data files without the vendor's software. In order to maintain the ability to work with instrument data, you will have to manage the software needed to open the files and work with them, in addition to just making sure you have copies of the data files. In short, if you have an instrument-computer combination that does some really nice stuff and you want to preserve the ability to gain value from that instrument's data files, you have to make a backup copy of the software environment and the data files. This is particularly important if you're considering retiring a system that you'll still want to access data from, plus you may have to maintain any underlying software license. This is where the previous conversation about virtualization and containers comes in.
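As a rough sketch of what "backing up the data files and knowing what software opens them" might look like in practice, the following builds a manifest that pairs each archived file with a checksum and records the software needed to read it. The manifest layout, the software name "ExampleChrom," and the file name are all hypothetical; only the Python standard library calls are real.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir, software_name, software_version):
    """List every file under data_dir with its checksum and required software."""
    data_dir = Path(data_dir)
    files = sorted(p for p in data_dir.rglob("*") if p.is_file())
    return {
        "software": {"name": software_name, "version": software_version},
        "files": [
            {"path": str(p.relative_to(data_dir)), "sha256": sha256_of(p)}
            for p in files
        ],
    }


def verify_manifest(data_dir, manifest):
    """Return relative paths that are missing or whose checksum has changed."""
    data_dir = Path(data_dir)
    problems = []
    for entry in manifest["files"]:
        target = data_dir / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            problems.append(entry["path"])
    return problems


# Demonstration with a throwaway directory and a made-up instrument file.
with tempfile.TemporaryDirectory() as tmp:
    run_file = Path(tmp) / "run_001.dat"
    run_file.write_bytes(b"raw detector readings")
    manifest = build_manifest(tmp, "ExampleChrom", "4.2")
    unchanged = verify_manifest(tmp, manifest)
    run_file.write_bytes(b"altered readings")
    changed = verify_manifest(tmp, manifest)

print(json.dumps(manifest["software"]), unchanged, changed)
```

A manifest like this does not preserve the software itself; it only documents what is needed and lets you confirm later that the archived files are the ones originally stored.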
If you think about a computer system, it has two parts: hardware (e.g., circuit boards, hard drive, memory, etc.) and software (e.g., the OS, applications, data files, etc.). From the standpoint of the computer’s processor, everything is either data or instructions read from one big file on the hard drive, which the operating system has segmented for housing different types of files (that segmentation is done for your convenience; the processor just sees it all as a source of instructions and data). Virtualization takes everything on the hard drive, turns it into a complete file, and places that file onto a virtualization server where it is stored as a file called a “container.” That server allows you to log in, open a container, and run it as though it were still on the original computer. You may not be able to connect the original instruments to the containerized environment, but all the data processing functions will still be there. As such, a collection of physical computers can become a collection of containers. An added benefit of virtualization applies when you're worried about an upgrade creating havoc with your application: make a container as a backup before upgrading.[i]
The advantage of all this is that you continue to have the ability to gain value and access to all of your data and information even if the original computer has gone to the recycle bin. This of course assumes your IT group supports virtualization servers, which provide an advantage in that they are easier to maintain and don’t take up much space. In larger organizations this may already be happening, and in smaller organizations a conversation may be had to determine IT's stance. The potential snag in all this is whether or not the software application's vendor license will cover the operation of their software on a virtual server. That is something you may want to negotiate as part of the purchase agreement when you buy the system.
This section has shown that effective management of K/I/D is more than just the typical consideration of database issues, system upgrades, and backups. You also have to maintain and support the entire operating system, the application, and the data file ecosystem so that you have both the files needed and the ability to work with them.
Fourth goal: Ensure a high level of data integrity at every step
“Data integrity” is an interesting couple of words. It shows up in marketing literature to get your attention, often because it's a significant regulatory concern. There are different aspects to the topic, and the attention given often depends on a vendor's product or the perspective of a particular author. In reality, it touches on all areas of laboratory work. The following is an introduction to the goal, with more detail given in later sections.
Definitions of data integrity
There are multiple definitions of "data integrity." A broad encyclopedic definition can be found at Wikipedia, described as "the maintenance of, and the assurance of, data accuracy and consistency over its entire life-cycle" and "a critical aspect to the design, implementation, and usage of any system that stores, processes, or retrieves data."
Another definition to consider is from a more regulatory perspective, that of the FDA. In their view, data integrity focuses on the completeness, consistency, accuracy, and validity of data, particularly through a mechanism called the ALCOA+ principles. This means the data should be:
- Attributable: You can link the creation or alteration of data to the person responsible.
- Legible: The data can be read both visually and electronically.
- Contemporaneous: The data was created at the same time that the activity it relates to was conducted.
- Original: The source or primary documents relating to the activity the data records are available, or certified versions of those documents are available, e.g., a notebook or raw database. (This is one reason why you should collect and maintain as much data and information from an instrument as possible for each sample.)
- Accurate: The data is free of errors, and any amendments or edits are documented.
Plus, the data should be:
- Complete: The data must include all related analyses, repeated results, and associated metadata.
- Consistent: The complete data record should maintain the full sequence of events, with date and time stamps, such that the steps can be repeated.
- Enduring: The data should be able to be retrieved throughout its intended or mandated lifetime.
- Available: The data is able to be accessed readily by authorized individuals when and where they need it.
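As a rough illustration of how a few of these principles can be reflected in software, the following Python sketch keeps an append-only record of results in which every entry names its author (Attributable), carries a timestamp taken when the entry is written (Contemporaneous), and is chained to the previous entry by a hash so that later edits to earlier entries can be detected (Original, Accurate). The function names, field names, and sample values are hypothetical and chosen only for this example; this is not how any particular LIMS or ELN implements its audit trail.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_entry(prev_hash, user, action, value):
    """Build one audit-trail entry that is chained to the previous entry."""
    entry = {
        "user": user,                                         # Attributable
        "timestamp": datetime.now(timezone.utc).isoformat(),  # Contemporaneous
        "action": action,                                     # e.g., "create" or "amend"
        "value": value,                                       # the recorded result
        "prev_hash": prev_hash,                               # link to the earlier entry
    }
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    return entry


def append_entry(trail, user, action, value):
    """Append only; amendments are new entries, earlier ones are never rewritten."""
    prev_hash = trail[-1]["hash"] if trail else ""
    trail.append(make_entry(prev_hash, user, action, value))


def verify_trail(trail):
    """Return True if every entry still matches its hash and its link to the prior entry."""
    prev_hash = ""
    for entry in trail:
        body = {key: val for key, val in entry.items() if key != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if entry["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical usage: one original result and one documented amendment.
trail = []
append_entry(trail, "analyst_a", "create", {"sample": "S-001", "pH": 7.02})
append_entry(trail, "analyst_b", "amend", {"sample": "S-001", "pH": 7.20})
print(verify_trail(trail))        # an untouched trail verifies
trail[0]["value"]["pH"] = 6.5     # simulate an undocumented edit
print(verify_trail(trail))        # the edit is detected
```

The design choice worth noting is that an amendment is recorded as a new entry rather than an overwrite, so both the original value and the change remain visible.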
Both definitions revolve around the same point: the data a lab produces has to be reliable. The term "data integrity" and its associated definitions are a bit misleading. If you read the paragraphs above you get the impression that the focus is on the results of laboratory work, when in fact it is about every aspect of laboratory work, including the methods used and those who conduct those methods.
In order to gain meaningful value from laboratory K/I/D, you have to be assured of its integrity; “the only thing worse than no data, is data you can’t trust.” That is the crux of the matter. How do you build that trust? Building a sense of confidence in a lab's data integrity efforts requires addressing three areas of concern and their paired intersections: science, people, and informatics technology. Once we have successfully managed those areas and intersection points, we are left with the intersection common to all of them: constructed confidence in a laboratory's data integrity efforts (Figure 8).
We’ll begin with a look at the scientific component of the conversation (Figure 9). Regardless of the kinds of questions being addressed, the process of answering them is rooted in methods and procedures. Within the context of this guide, those methods have to be validated or else your first step in building confidence has failed. If those methods end with electronic measurements, then that equipment (including settings, algorithms, analysis, and reporting) has to be fully understood and qualified for use in the validated process. The manufacturer's default settings should either be demonstrated as suitable or avoided.
Another aspect of “people” is the development of a culture that contributes to data integrity. Lab personnel need to be educated on the organization’s expectations of how lab work needs to be managed and maintained. This includes items such as records retention, dealing with erroneous results, and what constitutes original data. They should also be fully aware of corporate and regulatory guidelines and the effort needed to enforce them.[j] This is another instance where education beyond that provided in the undergraduate curriculum is needed.
The implementation and use of informatics technology should be the result of careful product selection and intentional design, from the lab bench to central database systems such as LIMS, ELN, SDMS, etc., rather than a haphazard aggregate of lab computers.
Other areas of concern with informatics technology include backups, security, and product life cycles, which will be addressed in later sections. If, as we continue onward through these goals, it appears that everything touches on data integrity, it's because it does. Data integrity can be considered an optimal result of the sum of well-executed laboratory operations.
The intersection points
Two of the three intersection points deserve minor elaboration (Figure 12). First, the intersection of people and informatics technologies has several aspects to address. The first is laboratory personnel’s responsibility (which may be shared with corporate or LAB-IT) for the selection and management of informatics products. The second is the fact that this requires those personnel to be knowledgeable concerning the application of informatics technologies in laboratory environments. Ensure the selected personnel have the appropriate backgrounds and knowledge to consider, select, and effectively use those products and technologies.
The other intersection point to be addressed is that of science with informatics technology. Here, stakeholders are concerned with product selection, system design (for automated processes), and system integration and communication with other systems and instruments. Again, as noted above, we go into more detail in later sections. The primary point here, however, can be summed up as determining whether or not the products selected for your scientific endeavors are compatible with your data integrity goals.
Addressing the needs of these two intersection points requires deliberate effort and many planning questions regarding vendor support, quality of design, system interoperability, result output, and scientific support mechanisms. Questions to ask include:
- Vendor support: How responsive are vendors to product issues? Do you get a fast, usable response or are you left hanging? A product that is having problems can affect data quality and reliability.
- Quality of design: How easy is the system to use? Are controls, settings, and working parameters clearly defined and easily understood? Do you know what effect changes in those points will have on your results? Has the system been tuned to your needs (not adjusted to give you the answers you want, but set to give results that truly represent the analysis)? Problems with adjusting settings properly can distort results. (This is one area where data integrity may be maintained throughout a process, and then lost because of improper or untested controls on an instrument's operation.)
- System interoperability: Will there be any difficulty in integrating a software product or instrument into a workflow? Problems with sample container compatibility, operation, control software, etc. can cause errors to develop in the execution of a process flow. For example, problems with pipette tips can cause errors in fluid delivery.
- Result output: Is an electronic transfer of data possible, or does the system produce printed output (which means someone typing results into another system)? How effective is the communications protocol; is it based on a standard or does it require custom coding, which could be error prone or subject to interference? Is the format of the data file one that prevents changes to the original data? For example, CSV files allow easy editing and have the potential for corruption, nullifying data integrity efforts.
- Scientific support mechanisms: Does the product fully meet the intended need for functionality, reliability, and accuracy?
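On the result output question, one common way to make changes to an easily edited file (such as a CSV export) at least detectable is to record a cryptographic digest of the file when it is first produced and compare against it later. The sketch below uses Python's standard hashlib module; the file name and contents are hypothetical, and a digest only reveals that a file changed, not who changed it or why, so it complements rather than replaces access controls and audit trails.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """Return True if the file still matches the digest recorded earlier."""
    return sha256_of_file(path) == recorded_digest


# Hypothetical usage with a throwaway instrument export.
with tempfile.TemporaryDirectory() as folder:
    export_path = os.path.join(folder, "results.csv")
    with open(export_path, "w", newline="") as handle:
        handle.write("sample,result\nS-001,7.02\n")

    recorded = sha256_of_file(export_path)       # store this with the record
    print(is_unchanged(export_path, recorded))   # file as exported

    with open(export_path, "a", newline="") as handle:
        handle.write("S-002,7.10\n")              # someone edits the file
    print(is_unchanged(export_path, recorded))   # the edit is detected
```

For this to be useful, the recorded digest has to be stored somewhere the person editing the file cannot also change, such as a database record with its own access controls.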
The underlying goal in this section goes well beyond the material that is covered in schools. Technology development in instrumentation and the application of computing and informatics is progressing rapidly, and you can’t assume that everything is working as advertised, particularly for your application. Software has bugs and hardware has limitations. Applying healthy skepticism towards products and requiring proof that things work as needed protect the quality of your work.
If you’re a scientist reading this material, you might wonder why you should care. The answer is simply this: it is the modern evolution of how laboratory work gets done and how results are put to use. If you don’t pay attention to the points noted, data integrity may be compromised. You may also find yourself the unhappy recipient of a regulatory warning letter.
While there are some outcomes that could occur that you prefer didn't, there are also positive outcomes to come from your data integrity efforts: your work will be easier and protected from loss, results will be easier to organize and analyze, and you’ll have a better functioning lab. You’ll also have fewer unpleasant surprises when technology changes occur and you need to transition from one way of doing things to another. Yet there's more to protecting the integrity of your K/I/D than addressing the science, people, and information technology of your lab. The security of your lab and its information systems must also be addressed.
Fifth goal: Addressing security throughout the lab
Security is about protection, and there are two considerations in this matter: what are we protecting and how do we enact that protection? The first is easily answered by stating that we're protecting our ability to effectively work, as well as the results of that work. This is largely tied to the laboratory's data integrity efforts. The second consideration, however, requires a few more words.
Broadly speaking, security is not a popular subject in science, as it is viewed as not advancing scientific work or the development of K/I/D. Security is often viewed as inhibiting work by imposing a behavioral structure on people's freedom to do their work how they wish. Given these perceptions, it should be a lab's goal to create a functional security system that provides the protection needed while at the same time minimizing the intrusion in people’s ability to work.
This section will look at a series of topics that address the physical and electronic security of laboratory work. Those major topics are shown in Figure 13 below. The depth of the commentary will vary, with some topics getting discussed at length and others by brief reference to others' work.
Why must security be addressed in the laboratory? There are many reasons, which are best diagramed, as seen in Figure 14:
All of these reasons have one thing in common: they affect our ability to work and access the results of that work. This requires a security plan. In the end, implemented security efforts either preserve those abilities, or they reduce the value and utility of the work and results, particularly if security isn't implemented well or adds a burden to personnel's ability to work. While addressing these reasons and their corresponding protections, we should keep in mind a number of issues when developing and implementing a security plan within the lab (Figure 15). Issues like remote access have taken on particular significance over the course of the COVID-19 pandemic.
When the subject of security comes up, people's minds usually go in one of two directions: physical security (i.e., controlled access) and electronic security (i.e., malware, viruses, ransomware, etc.). We’re going to come at it from a different angle: how do the people in your lab want to work? Instead of looking at a collection of solutions to security issues, we’re going to first consider how lab personnel want to be working and within what constraints, and then we'll see what tools can be used to make that possible. Coming at security from that perspective will impact the tools you use and their selection, including everything from instrument data systems to database products, analytical tools, and cloud computing. The lab bench is where work is executed, and the planning and thinking take place between our ears, something that can happen anywhere. How do we provide people with the freedom to be creative and work effectively (something that may be different for each of us) while maintaining a needed level of physical and intellectual property security? Too often security procedures seem to be designed to frustrate work, as noted in the previous Figure 15.
The purpose of security procedures is to protect intellectual property, data integrity, resources, our ability to work, and lab personnel, all of which can be impacted by the reasons given in the prior Figure 14. However, the planning for how to approach these security procedures requires the coordination with and cooperation of several stakeholders within and tangentially related to the laboratory. Ensure these and any other necessary stakeholders are involved with the security planning efforts of your laboratory:
- Facilities management: These stakeholders manage the physical infrastructure you are working in and have overall responsibility for access control and managing the human security assets in larger companies. In smaller companies and startups, the first line of security may be the receptionist; how well trained are they to deal with the subject?
- IT groups: These stakeholders will be responsible for designing and maintaining (along with facilities management) the electronic security systems, which range from passkeys to networks.
- Legal: These stakeholders may work with human resources to set personnel standards for security, reviewing licensing agreements and contracts/leases for outside contractors and buildings (more later).
- Lab personnel: From the standpoint of this guide, this is all about the people doing the analytical and research work within the laboratory.
- Consultants: Security is a complex and rapidly developing subject, and you will likely need outside support to advise you on what is necessary and possible, as well as how to go about making that a reality.
But what else must be considered during your and your stakeholders' planning efforts? Before we can get into the specific technologies and practices that may be implemented within a facility, we need to look at the facility itself.
Examine aspects of the facility itself
Does your company own the building you are working in? Is it leased? Is it shared with other companies in a single industrial complex? If you own the facility, life is simpler since you control everything. Working in a shared space that is leased or rented requires more planning and thought, preferably before you sign an agreement. You're likely to have additional aspects to seriously consider about your facility. Have the locks and door codes been changed since the last tenant left? Is power shared across other businesses in your building? Is the backup generator—if there is one—sufficient to run your systems? What fire protections are in place? How is networking managed in the facility? Are security personnel attuned to the needs of your company? Let's take a look at some of these and other questions that should be addressed.
Is the physical space well-defined, and does building maintenance have open access to your various spaces?
Building codes vary from place to place. Some are very specific and strict, while others are almost live-and-let-live. One thing you want to be able to do is to define and control your organization's physical space and set up any necessary and additional protective boundaries. Physical firewalls are one way of doing that. A firewall should be a solid structure that acts as a barrier to fire propagation between your space and neighboring spaces, extending from below-ground areas to the roof. If it is a multi-level structure, the levels should be isolated. This may seem obvious, but in some single-level shared buildings (e.g., strip malls) the walls may not go to the roof, which makes it easier to route utilities like HVAC, power, and fire suppression. That gap can act as an access point to your space.
Building maintenance is another issue. Do they have access to your space? Does that access come with formal consent or is that consent assumed as part of the lease or rental agreement? Several problems must be considered. First, know that anyone who has access to your physical space should be considered a weak point in your security. Employees should inherently have a vested interest in protecting your assets, but building maintenance is a different matter. Who vets them? Since these notes are focused on laboratory systems, who trains them about what to touch and what not to? (For example, an experiment could be ruined because maintenance personnel opened a fume hood, disturbing the airflow, despite any signage placed on the hood glass.) Consider more than just office systems in your space analysis, including other equipment that may be running after-hours that doesn’t handle tampering, curiosity, or power outages well. Do you have robotics running multiple shifts or other moving equipment that might attract someone’s curiosity? Security cameras would be useful, as would “Do Not Enter” signs.
Second, most maintenance staff will notify you (hopefully in writing) about their activities so you can plan accordingly, but what about emergency issues? If they have to fix a leak or a power problem, what are the procedures for safely shutting down systems? Do they have a contact person on your staff in case a problem occurs? Is there hazardous material on-site that requires special handling? Are the maintenance people aware of it and how to handle it? Answers to these questions should be formalized in policy and disseminated to both maintenance and security management personnel, and be made available to new personnel who may not be up to speed.
Is power shared across other businesses in your building?
Shared power is another significant issue in any building environment. Unless someone paid careful attention to a lab's needs during construction, it can affect any facility. A number of issues can arise from misconfigured or unsupported power systems. Real-life examples of issues a computer support specialist friend of mine has encountered in the past include computers that:
- were connected to the same circuit box as heavy duty garage doors. Deliveries would come in early in the morning and when the doors opened the computers crashed.
- were on the same circuit as air conditioners. The computers didn’t crash, but the electrical noise and surging power use created havoc with systems operations and disk drives.
- were connected to circuits that didn’t have proper grounding or had separate grounding systems in the same room. Some didn’t have external grounding at all. We worked on a problem with one computer-instrument system that had each device plugged into different power outlets. The computer’s power supply was grounded, but the instrument's wasn’t; once that was fixed everything worked.
- were too close to a radio tower. Every night when the radio system changed its antenna configuration, the computer experienced problems. Today, many devices generate radio signals that might interfere with each other. The fact that they are “digital” systems doesn’t matter; they are made of analog components.
Is the power clean, and is the backup generator—if there is one—sufficient to run your systems?
Another problem is power continuity and quality. Laboratories depend on clean, reliable power. What will the impact of power outages—lasting anywhere from seconds to days—be on your ability to function? The longer end of the scale is easy; you stop working or relocate critical operations. Generators are one solution option, and we’ll come back to those. The shorter outages, particularly if they are of the power up-down-up variety, are a separate issue. Networkable sensors with alarms and alerts for monitoring power, refrigeration, etc., permitting remote monitoring, may be required. Considerations for these intermittent outages include:
- Do you know when they happened? What was their duration? How can you tell? (Again, consider sensor-based monitoring.)
- What effect did intermittent outages have on experiments that were running? Did the systems and instruments reset? Was data lost? Were in-process samples compromised?
- What effect did they have on stored samples? If samples had to be maintained under controlled climate conditions, were they compromised?
- Did power loss and power cycling cause any problems with instrumentation? How do you check?
- Did systems fail into a safe mode?
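The first of these questions (when did the outages happen, and for how long?) can be approached with very simple monitoring. The sketch below assumes a hypothetical setup in which a device on the monitored circuit, with no battery backup of its own, logs a timestamp ("heartbeat") at a fixed interval; a gap in the log then suggests the device lost power. The function name, the tolerance value, and the sample timestamps are all assumptions made for this example, and a gap could also be caused by a network or software fault rather than a power loss.

```python
from datetime import datetime, timedelta


def find_outages(heartbeats, expected_interval, tolerance=1.5):
    """Return (last_seen, next_seen, gap) for each gap longer than tolerance * interval."""
    limit = expected_interval * tolerance
    ordered = sorted(heartbeats)
    outages = []
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap > limit:
            outages.append((earlier, later, gap))
    return outages


# Hypothetical log: one heartbeat per minute, with minutes 5-7 missing.
start = datetime(2020, 8, 15, 2, 0, 0)
minutes_seen = [0, 1, 2, 3, 4, 8, 9]
heartbeats = [start + timedelta(minutes=m) for m in minutes_seen]

for last_seen, next_seen, gap in find_outages(heartbeats, timedelta(minutes=1)):
    print(f"No heartbeat between {last_seen} and {next_seen} (gap of {gap})")
```

The reported gap is an upper bound on the outage duration, since the power may have returned some time before the device resumed logging.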
How real are power problems? As Ula Chrobak notes in an August 2020 Popular Science article, infrastructure failures, storms, climate change, etc. are not out of the realm of possibility; if you were in California during that time, you saw the reality first-hand.
If laboratory operations depend on reliable power, what steps can we take to ensure that reliability? First, site selection naturally tops the list. You want to be somewhere that has a reputation for reliable power and rapid repairs if service is lost. A site with buried wiring would be optimal, but that only benefits you a little if the industrial park has buried wiring but is actually fed with overhead wiring. Another consideration is the age of the site: an older established site may have outdated cables that are more likely to fail. The geography is also important. Nearby rivers, lakes, or an ocean may be prone to flooding, causing water intrusion into wiring. Also, don’t overlook the potential issues associated with earthquakes or nearby industries with hazardous facilities such as chemical plants or refineries. Areas prone to severe weather conditions are an additional consideration.
Second, address the overall quality of the building and its infrastructure. This affects buildings you own as well as lease; however, the difference is in your ability to make changes. How old is the wiring? Has it been inspected? Are the grounding systems well implemented? Do you have your own electrical meters, and is your power supply isolated from other units if you are leasing? Will your computers and instruments be on circuits isolated from heavy equipment and garage doors? Make an estimate of your power requirements, then at least double it. Given that, is there sufficient amperage coming into the site to manage all your instruments, computers, HVAC systems, and freezers? How long will you be occupying that space, and is there sufficient power capacity to support potential future expansion?
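The "estimate, then at least double it" advice can be turned into a quick back-of-the-envelope check. The sketch below sums hypothetical equipment loads, applies a safety factor of two, and converts the result to current using I = P / V for comparison against the service rating. All wattages, the 120 V supply, and the 100 A service figure are made-up illustrative values; a real assessment should come from an electrician and would account for startup (inrush) currents, power factor, and local electrical codes, none of which are modeled here.

```python
def required_amps(loads_watts, volts, safety_factor=2.0):
    """Estimate the service current needed: (sum of loads x safety factor) / voltage."""
    total_watts = sum(loads_watts.values())
    return total_watts * safety_factor / volts


def service_is_sufficient(loads_watts, volts, service_amps, safety_factor=2.0):
    """Return True if the incoming service covers the padded estimate."""
    return required_amps(loads_watts, volts, safety_factor) <= service_amps


# Hypothetical equipment list (watts) for a small lab on a 120 V supply.
loads = {
    "instruments": 1500,
    "freezers": 1200,
    "computers": 1200,
    "hvac_share": 900,
}

needed = required_amps(loads, volts=120)
print(f"Estimated requirement with safety factor: {needed:.0f} A")
print("100 A service sufficient:", service_is_sufficient(loads, 120, 100))
```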
Third, consider how to implement generators and battery backup power. These are obvious solutions to power loss, yet they come with their own considerations:
- Who has control over generator implementation? If you own the building, you do. If the building is leased, the owner does, and they may not even provide generator back-up power. If not, your best bet—unless you are planning on staying there for a long time—is to go somewhere else; the cost of installing, permitting, and maintaining a generator on a leased site may be prohibitive. A good whole-house system can run up to $10,000, plus the cost of a fueling system.
- How much power will you need and for how long, and is sufficient fuel available? Large propane tanks may need to be buried. Diesel is another option, though fire codes may limit fuel choices in multi-use facilities. The expected duration of an outage is also important. Often we think perhaps a few hours, but snow, ice, hurricanes, tornadoes, and earthquakes may push that out to a week or more.
- Is the generator’s output suitable for the computers and instruments in your facility? A major problem to acknowledge is electrical noise: too much and you’ll create more problems than you would have if the equipment had just been shut down.
- What is the startup delay of the generator? A generator can take anywhere from a few seconds to several minutes to get up to speed and produce power. Can you afford that delay? Probably not.
The answer to the problems noted in the last two bullets is battery backup power. These can range from individual units that are used one-per-device, like home battery backups for computers and other equipment, to battery-walls that are being offered for larger applications. The advantage is that they can come online anywhere from instantly (i.e., always-on, online systems) to a few milliseconds for standby systems. The always-on, online options contain batteries that are constantly being charged and at the same time constantly providing power to whatever they are connected to. More expensive than standby systems, they provide clean power even from a source that might otherwise be problematic. On the other hand, standby systems are constantly charging but pass through power without conditioning; noisy power in, noisy power out until a power failure occurs.
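If battery backup is meant to cover the generator's startup delay, it is worth checking that the battery can carry the load for comfortably longer than that delay. The sketch below uses the simple approximation runtime = (battery capacity x inverter efficiency) / load. The capacity, load, efficiency, and margin values are hypothetical, and real runtimes depend on battery age, temperature, and discharge rate, so vendor runtime charts should be preferred over this estimate.

```python
def runtime_minutes(battery_wh, load_watts, inverter_efficiency=0.9):
    """Approximate battery runtime in minutes for a constant load."""
    return battery_wh * inverter_efficiency / load_watts * 60


def bridges_generator_start(battery_wh, load_watts, generator_delay_minutes, margin=2.0):
    """Return True if runtime covers the generator startup delay with a safety margin."""
    return runtime_minutes(battery_wh, load_watts) >= generator_delay_minutes * margin


# Hypothetical figures: a 1,000 Wh unit feeding a 600 W instrument and computer.
print(f"Estimated runtime: {runtime_minutes(1000, 600):.0f} minutes")
print("Covers a 5-minute generator start:", bridges_generator_start(1000, 600, 5))
```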
Security and the working environment
When we are looking at security as a topic, we have to keep in mind that we are affecting people's ability to work. Some of the laboratory's work is done at the lab bench or on instruments (which, depending on the field you’re in, could range from pH meters to telescopes). However, significant work occurs away from the bench, thinking and planning wherever a thought strikes. What kind of flexibility do you want people to have? Security will often be perceived as stuff that gets in the way of personnel's ability to work, despite the fact that well-implemented security protects their work.
We need to view security as a support structure enabling flexibility in how people work, not simply as a series of barriers that frustrate people. You can begin by defining the work structure as you’d like it to be, at the same time recognizing that there are two sides to lab work: the intellectual (planning, thinking, evaluating, etc.) and the performed (where you have to work with actual samples, equipment, and instruments). One can be done anywhere, while the other is performed in a specific space. The availability of computers and networks can blur the distinction.
Keeping these things in mind, any security planning should consider the following:
- How much flexibility do personnel want in their work environment vs. what they can actually have? In some areas of work, there may be significant need for lockdown with little room for flexibility, while other areas may be pretty loose.
- Do people want to work from remote places? Does the nature of the work permit it? This can be motivated by anything from “the babysitter is sick” to “I just need to get away and think.”
- While working remotely, do people need access to lab computers for data, files (i.e., upload/download), or to interact with experiments? Some of this can be a matter of respecting people’s time. If you have an experiment running overnight or during the weekend, it would be nice to check the status remotely instead of driving back to work.
- Do people need after-hours access to the lab facilities?
The answers to these planning questions lay the groundwork for hardware, software, and security system requirements. Can you support the needs of personnel, and if so, how is security implemented to make it work? Will you be using gateway systems to the lab network, with additional logins for each system, two-factor authentication, or other mechanisms? The goal is to allow people to be as productive as possible while protecting the organization's resources and meeting regulatory requirements. That said, keep in mind that unless physical and virtual access points are well controlled, others may compromise the integrity of your facility and its holdings.
Employees need to be well educated in security requirements in general and how they are implemented in your facility. They need to be willing participants in the processes rather than grudgingly accepting them; a lack of willingness to work within the system is a security weak point, since people will try to circumvent procedures they don't accept. One obvious problem is with username-password combinations for computer access; using biometric features is faster and less error prone than typing that information in.
That said, personnel should readily accept that no system should be open to unauthorized access, and that hierarchical levels of control may be needed, depending on the type of system; some people will have access to some capabilities and not others. This type of "role-based" access shouldn’t be viewed as a matter of trust, but rather as a matter of protection. Unless the company is tiny, senior management, for example, shouldn’t have administrative system level access to database systems or robotics. If management is going to have access to those levels, ensure they know exactly what they are doing. By denying access to areas not needed in a role-based manner, you limit the ability of personnel to improperly interrogate or compromise those systems for nefarious purposes.
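A minimal sketch of the role-based idea follows: each role is mapped to the set of actions it needs, and anything not explicitly granted is denied. The role names and actions are hypothetical placeholders; in practice, roles are normally defined and enforced within the access control features of the LIMS, ELN, operating system, or directory service rather than in standalone code like this.

```python
# Hypothetical roles and the actions each one needs to do its job.
ROLE_PERMISSIONS = {
    "analyst": {"enter_results", "view_results"},
    "lab_manager": {"view_results", "approve_results"},
    "system_admin": {"view_results", "configure_system", "manage_users"},
    "senior_management": {"view_reports"},
}


def is_allowed(role, action):
    """Deny by default: only actions explicitly granted to the role are permitted."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("analyst", "enter_results"))               # granted to analysts
print(is_allowed("senior_management", "configure_system"))  # not part of the role
print(is_allowed("visitor", "view_results"))                # unknown role, denied
```

Denying by default means that forgetting to list a permission results in a request for access rather than an unintended opening.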
What are your security control requirements?
What is your policy on system backups?
When it comes to your computer systems, are you backing up their K/I/D? If so, how often? How much K/I/D can you afford to lose? Look at your backups on at least three levels. First, backing up the hard drive of a computer protects against failure of that computer and drive. Second, backing up all of a lab’s data systems to an on-site server in a separate building (or virtualized locally) protects against physical damage to the lab (e.g., fire, storms, earthquake, floods, etc.). Third, backing up all of a lab’s data systems to a remote server (or virtualized remotely) provides even more protection against physical damage to the lab facility, particularly if the server is located someplace that won’t be affected by the same problems your site may be facing. It should also be somewhere that doesn’t compromise legal control over your stuff; if it is on a third-party server farm in another country, that country’s laws apply to access and seizure of your files if legal action is taken against you.
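One way to keep the "how much can you afford to lose?" question visible is to check, for each system, whether a sufficiently recent backup exists at each of the three levels. The sketch below assumes you can obtain the time of the last successful backup per level from your backup tooling; the level names, the 24-hour loss window, and the timestamps are hypothetical, and a recent backup is only useful if restoring from it has actually been tested.

```python
from datetime import datetime, timedelta

# Hypothetical names for the three backup levels discussed above.
LEVELS = ("local", "onsite_server", "remote_server")


def stale_backups(last_backup, now, max_loss):
    """Return the levels whose latest backup is missing or older than max_loss."""
    problems = []
    for level in LEVELS:
        taken = last_backup.get(level)
        if taken is None or now - taken > max_loss:
            problems.append(level)
    return problems


# Hypothetical status for one data system, with a 24-hour acceptable loss window.
now = datetime(2020, 12, 1, 9, 0)
last_backup = {
    "local": datetime(2020, 12, 1, 2, 0),          # last night
    "onsite_server": datetime(2020, 11, 27, 2, 0),  # several days old
    # no remote backup recorded at all
}

print(stale_backups(last_backup, now, max_loss=timedelta(hours=24)))
```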
Should your various laboratory spaces and components be internet-connected?
When looking at your lab bench spaces, instruments, database systems, etc., determine whether they should be connected to the internet. This largely depends on what capabilities you expect to gain from internet access. Downloading updates, performing online troubleshooting with vendors, and conducting online database searches (e.g., spectra, images, etc.) are a few useful capabilities, but are they worth the potential risk of intrusion? Does your IT group have sufficient protection in place to allow access and still be protected? Note, however, any requirement for a cloud-based system would render this a moot point.
Lab systems should be protected against any intrusion, including by vendors. Vendor-provided files can be downloaded to flash drives, which can then be checked for malware and integrity before being manually transferred to lab systems. Consider what is more important: convenience or data protection? This may give you more to think about when you consider your style of working (e.g., remote access). However, having trusted employees access the lab network is different from granting access to third parties.
We’ve only been able to touch on a few topics; a more thorough review would require a well-maintained document, as things are changing that quickly.[k] In many labs, security is a layered-on activity: the work of the lab is planned, and security issues are considered afterward. We’d be far better off if security planning were instead conducted in concert with lab systems planning; support for security would then become part of the product selection criteria.
Sixth goal: Acquiring and developing "products" that support regulatory requirements
Products should be supportable. That seems pretty simple, but what exactly does that mean? How do we acquire them, and more importantly, how do we develop them? The methods and procedures you develop for lab use are “products”—we’ll come back to that.
First, here’s an analogy using an automobile. The oil pan on a car may need to be replaced if it is leaking due to damage or a failed gasket; if it isn’t repaired, the oil can leak out. Some vehicles are more difficult to work on than others given their engineering. For example, replacing the oil pan in some cars requires you to lift the engine block out of the car. That same car design could also force you to move the air conditioner compressor to change spark plugs. In the end, some automobile manufacturers have built a reputation for cars that are easy to service and maintain, which translates into lower repair costs and longer service life.
How does that analogy translate to the commercial products you purchase for lab use, as well as the processes and procedures you develop for your lab? The ability to effectively (i.e., with ease, at a low cost, etc.) support a product has to be baked into the design from the start. It can’t be retrofitted.
Let’s begin with the commercial products you purchase for lab use, including instruments, computer systems, and so on. One of the purchase criteria for those items is how well they are supported: mature products should have a better support infrastructure, built up over time and use. However, that doesn’t always translate to high-quality support; you may find a new product getting eager support because the vendor is heavily invested in market acceptance, working the bugs out, and using referral sites to support their sales. When it comes to supporting these commercial products, we expect to see:
- User Guides – This should tell you how the device works, what the components are (including those you shouldn’t touch), how to use the control functions, what the expected operating environment is, what you need to provide to make the item usable, and so on. For electronic devices with control signal communications and data communications, the vendor will describe how it works and how they expect it to be used, but not necessarily how to use it with third-party equipment. There are limitations of liability and implied support commitments that they prefer not to get involved with. They provide a level of capability, while it’s largely left up to you to make it work in your application.
- Training materials – This will take you from opening the box and setting up whatever you’ve purchased to walking through all the features and some examples of their use. The intent is to get you oriented and familiar with using it, with the finer details located in user guides. Either this document or the user guide should tell you how to ensure that the device is installed and operating properly, and what to do if it isn’t. This category can also include in-person short courses as well as online courses (an increasingly popular option as something you can do at your convenience).
- Maintenance and troubleshooting manuals – This material describes what needs to be periodically maintained (e.g., installing software upgrades, cleaning equipment, etc.) and what to do if something isn’t working properly.
- Support avenues – Be it telephone, e-mail, or online chat, there are typically many different ways of reaching the vendor for help. Online support can also include a “knowledgebase” of articles on related topics, as well as chat functions.
- User groups – Whether conducted in-person or online, venues for giving users a chance to solve problems and present material together can also prove valuable.
From the commercial side of laboratory equipment and systems, support is an easy thing to deal with. If you have good products and support, people will buy them. If your support is lacking, they will go somewhere else, or you will have fostered the development of a third-party support business if your product is otherwise desirable.
From the system user’s perspective, lab equipment support is a key concern. Users typically don’t want to take on a support role in the lab as that isn’t their job. This brings us to an interesting consideration: product life cycles. You buy something, put it to use, and eventually it has to be upgraded (particularly if it involves software) or possibly replaced (as with software, equipment, instruments, etc.). Depending on how that item was integrated into the lab’s processes, this can be a painful experience or an easy one. Product life cycles are covered in more detail later in this section, but for now know they are important because they apply, asynchronously, to every software system and device in your lab. Upgrade requirements may not be driven by a change in the functionality that is important to the lab, but rather due to a change to an underlying component, e.g., the computer's operating system. The reason that this is important in a discussion about support is this: when you evaluate a vendor's support capabilities, you need to cover this facet of the work. How well do they evaluate changes in the operating system (OS) in relation to the functionality of their product? Can they advise you about which upgrades are critical and those that can be done at a more convenient time? If a change to OS or a database product occurs, how quickly do they respond?
Now that we have an idea what support means for commercial products, let’s consider what support means for the "products"—i.e., the procedures and methods—developed in your lab.
The end result of a typical laboratory-developed method is a product that incorporates a process (Figure 17). This idea is nothing new in the commercial space. Fluid Management Systems, Inc. has complex sample preparation processing systems as products, as do instrument vendors that combine autosamplers, an instrument, and a data system (e.g., some of Agilent’s PAL autosampler systems incorporate sample preparation processing as part of their design). Those lab methods and procedures can range from a few steps to an extensive process whose implementations can include fully manual execution steps, semi-automated steps (e.g., manual plus instrumentation), and fully automated steps. In the first two cases, execution can occur with either printed or electronic documentation, or it can be managed by a LES. However, all of these implementations are subject to regulatory requirements (commercial products are subject to ISO 9000 requirements).
The importance of documentation
Regulatory requirements and guidelines (e.g., from the FDA, EPA, ISO, etc.) have been with production and R&D for decades. However, some still occasionally question those regulations' and guidelines' application to research work. Rather than viewing them as hurdles which a lab must cross to be deemed qualified, they should be viewed as hallmarks of a well-run lab. With that perspective, they remain applicable for any laboratory.
For purposes of this guide, there is one aspect of regulatory requirements that will be emphasized here: process validation, or more specifically the end result, which is a validated process. Laboratory processes, all of which have to be validated, are essentially products for a limited set of customers; in many cases it's one customer, in others the same process may be replicated in other labs as-is. The more complex the implementation, and the longer the process is expected to be in use, the more important it is to incorporate some of the tools from commercial developers into lab development (Table 1). However, regardless of development path, complete documentation is of the utmost concern.
Documentation is valuable because:
- It's educational: Quality documentation ensures those carrying out the process or maintaining it are thoroughly educated. Written documentation (with edits and audit trails, as appropriate) acts as a stable reference point for how things should be done. The “follow-me-and-I’ll-show-you” approach is flawed. That method depends on someone accurately remembering and explaining the details while having the time to actually do it, all while hoping bad habits don't creep in and become part of “how it’s done.”
- It informs: Quality documentation that is accessible provides a reference for questions and problems as they occur. The depth of that documentation, however, should be based on the nature of the process. Even manual methods that are relatively simple need some basic elements. To be informative, it should address numerous questions. Has the instrument calibration been accurately verified? How do you tell, and how do you correct the problem if the instrument is out of calibration? What information is provided about reagents, including their age, composition, strength, and purity? When is a technician qualified to use a reagent? How are reference materials incorporated as part of the process to ensure that it is being executed properly and consistently?
Note that the support documents noted in Table 1 are not usually part of process validation. The intent of process validation is to show that something works as expected once it is installed.
One aspect that hasn’t been mentioned so far is how to address necessary change within processes. Any lab process is going to change over time. There may be a need for increased throughput, lower operating costs, less manual work, the ability to run over multiple shifts, etc. There may also be new technologies that improve lab operations that eventually need to get incorporated into the process. As such, planning and process documentation should describe how processes are reviewed and modified, along with any associated documentation and training. This requires the original project development to be thoroughly documented, from functionality scoping to design and implementation. Including process review and modification as part of a process allows that process to be upgraded without having to rebuild everything from scratch. This level of documentation is rarely included due to the initial cost and impact on the schedule. It will affect both, but it will also show its value once changes have to be made. In the end, by adding a process review and modification mechanism, you ensure a process is supportable.
To be clear, the initial design and planning of processes and methods has to be done well for a supportable product. This means keeping in mind future process review and modification even as the initial process or method is being developed. It’s the difference between engineering a functional and supportable system and “just making something that works.” Here are three examples:
- One instrument system vendor, in a discussion between sessions of a meeting, described how several of his customers successfully connected a chromatography data system (CDS) to a LIMS. It was a successful endeavor until one of the systems had to be upgraded, then everything broke. The programmer had made programming changes to areas of the packages that they shouldn’t have. When the upgrade occurred, those changes were overwritten. The project had to be scrapped and re-developed.
- A sample preparation robotics system was similarly implemented by creating communications between devices in ways that were less than ideal. When it came time for an upgrade to one device, the whole system failed.
- A consultant was called in to evaluate a project to interface a tensile tester to a computer, as the original developer had left the company. The consultant recommended the project be scrapped and begun anew. The original developer had not left any design documentation, the code wasn’t documented, and no one knew if any of it worked or how it was supposed to work. Trying to understand someone else’s programming without documentation assistance is really a matter of trying to figure out their thinking process, and that can be very difficult.
There are a number of reasons why problems like this exist. Examples include lack of understanding of manual and automated systems design and engineering methodologies and pressure from management (e.g., “how fast can you get it done,” “keep costs down,” and “we’ll fix it in the next version”). Succumbing to these short-term views will inevitably come back to haunt you in the long-term. Upgrades, things you didn’t think of when the original project was planned, and support problems all tend to highlight work that could have been done better. Another saying that frequently comes up is “there is never time to do it right, but there is always time to do it over,” usually at a considerably higher cost.
Additional considerations in creating supportable systems and processes
Single-vendor or multi-vendor?
Let’s assume we are starting with a manual method that works and has been fully validated with all appropriate documentation. And then someone wants to change that method in order to meet a company goal such as increasing productivity or lowering operational costs. Achieving goals like those usually means introducing some sort of automation, anything from automated pipettes to instrumentation depending on the nature of the work. Even more important, if a change in the fundamental science underlying the methodology is proposed, that would also require re-validation of the process.
Just to keep things simple, let’s say the manual process has an instrument in it, and you want to add an autosampler to keep the instrument fed with samples to process. This means you also need something to capture and process the output or any productivity gains on the input may be lost in handling the output. We’ll avoid that discussion because our concern here is supportability. There are a couple directions you can go in choosing an add-on sampler: buy one from the same vendor as the instrument, or buy it from another vendor because it is less expensive or has some interesting features (though unless those features are critical to improving the method, they should be considered “nice but not necessary”).
How difficult is it going to be to make any physical and electronic (control) connections to the autosampler? Granted, particularly for devices like autosamplers, vendors strive for compatibility, but there may be special features that need attention. You need to consider not just the immediate situation but also how things might develop in the future. If you purchase the autosampler from the same vendor as the instrument and control system, they are going to ensure that things continue to work properly if upgrades occur or new generations of equipment are produced (see the product life cycle discussion in the next section). If the two devices are from different vendors, compatibility across upgrades is your issue to resolve. Both vendors will do what they can to make sure their products are operating properly and answer questions about how they function, but making them work together is still your responsibility.
From the standpoint of supportability, the simpler approach is the easiest to support. Single-vendor solutions put the support burden on them. If you use multi-vendor implementations, then all the steps in making the project work have to be thoroughly documented from the statement of need, through to the functional specifications and the finished product. The documentation may not be very long, but any assistance you can give someone who has to work with the system in the future—including yourself (i.e., “what was I thinking when I did this?”)—will be greatly appreciated.
On-device programming or supervisory control?
Another consideration is for semi- or fully-automated systems where a component is being added or substituted. When we are looking at programmable devices, one approach is to make connections between devices via on-device programming. For example, say Device A needs to work with Device B, so programming changes are made to both to accomplish the task. While this can be made to work and be fully documented, it isn’t a good choice since changing one of them (via upgrade or swap) will likely mean the programming has to be re-implemented. A better approach is to use a supervisory control system to control them both, and others that may be part of the same process. It allows for a more robust design, easier adaptations, and smoother implementation. It should be easier to support since programming changes will be limited to communications codes.
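The difference between the two approaches can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not a real control system: the class names (`DeviceAdapter`, `SimulatedDevice`, `Supervisor`), the command names, and the device codes such as "LD" and "RUN" are all hypothetical, and the devices are simulated rather than driven over a real interface. The point it shows is that the process sequence lives in the supervisor, while each device's communications codes are confined to its own adapter.

```python
from abc import ABC, abstractmethod


class DeviceAdapter(ABC):
    """Translate generic supervisor commands into one device's own codes."""

    @abstractmethod
    def send(self, command):
        """Send a generic command and return the device's reply as text."""


class SimulatedDevice(DeviceAdapter):
    """Stand-in for a real device; it only records the codes it is sent.

    Assumption: a real adapter would use the vendor's documented
    communications protocol instead of a lookup table.
    """

    def __init__(self, codes):
        self.codes = codes  # generic command -> device-specific code
        self.sent = []      # codes "transmitted" so far

    def send(self, command):
        self.sent.append(self.codes[command])
        return "OK"


class Supervisor:
    """Owns the process sequence; devices never talk to each other."""

    def __init__(self, devices):
        self.devices = devices

    def run(self, steps):
        results = []
        for device_name, command in steps:
            reply = self.devices[device_name].send(command)
            if reply != "OK":
                raise RuntimeError(f"{device_name} failed on {command}: {reply}")
            results.append((device_name, command, reply))
        return results


# The process is described once, in generic terms.
SEQUENCE = [
    ("autosampler", "load_sample"),
    ("instrument", "start_run"),
    ("autosampler", "eject_sample"),
]
```

In this arrangement, replacing the autosampler means writing a new adapter with its own code table; `SEQUENCE` and the instrument's adapter are left untouched.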
Third-party developers and contractors?
Frequently, third parties are brought in to provide services that aren’t available through the lab or onsite IT staff. When you engage them, the functional specification usually describes what you want as the end result and what their product is supposed to do, not how it is supposed to do it. This is left to the developer to figure out. You need to add supportability as a requirement: the end result should not only meet regulatory requirements, but also be designed and documented with sufficient information for someone unfamiliar with the project to understand what would have to be done if a change were made. That also requires you to think about where changes might be made in the future. This includes considering what components might be swapped for newer technologies, handling software upgrades (and what might break as a result of them), and knowing what to do if components reach their supported end-of-life and have to be replaced.
Consulting firms may respond with “if something needs to be changed or fixed, just call us, we built it,” which sounds reasonable. However, suppose the people who built it aren’t there anymore or aren't available because they're working on other projects. The reality is the “company” didn’t build the product, people working for them did.
Product life cycles
When discussing product life cycles, whether it's digital products or hardware systems, the bottom line problem is this: what do you do when a critical product needs to be updated or replaced? This can be an easy issue or a very painful one, depending on how much thought went into the design of the original procedure using that device. It's generally easy if you had the forethought of noting “someday this is going to be replaced, so how do we simplify that?" It's more difficult if you, through wiring or software, linked devices and systems together and then can’t easily separate them, particularly if no one documented how that might be accomplished. It’s all a matter of systems engineering done well.
Note: This material was originally published in Computerized Systems in the Modern Laboratory: A Practical Guide.
An analog pH meter, as long as it has been maintained, will still work today. So will double-pan balances, again as long as they have been maintained and people are proficient in their use. Old lab equipment that will still function properly has been replaced with more modern equipment due to better accuracy, ease of use, and other factors. Analog instruments can still be found operating decades after their design end-of-life. It is in the digital realm that equipment that should work (and probably does) can’t be used after a few years of service. The rate of technology change is such that tools become obsolete on the order of a half-decade. For example, rotating disks, an evolving computer staple that replaced magnetic tape drives, are now being replaced with solid-state storage.
Digital systems require two components to work: hardware (e.g., the computer, plus interfaces for the hard disk, and ports for cable connections) and software (e.g., operating systems plus software drivers to access the hardware). Both hardware packaging and operating systems are changing at an increasing rate. Hardware systems are faster, with more storage, and operating systems are becoming more complex to meet consumer demands, with a trend toward more emphasis on mobile or social computing. Those changes mean that device interfaces we rely on may not be there in the next computer you have to purchase. The RS-232 serial port, a standard for instrument connections, is being replaced with USB, Firewire, and Thunderbolt connections that give support to a much wider range of devices and simplify computer design, with more usable and less costly devices. It also means that the instrument with the RS-232 interface may not work with a new computer due to there being no RS-232 ports, and the operating system may also no longer be compatible with the instrumentation.
One aspect of technology planning in laboratory work is change management, specifically the impact of technology changes and product life cycles on your ability to work. The importance of planning around product life cycles has taken on an added dimension in the digital laboratory. Prior to the use of computers, instruments were the tools used to obtain data, which was recorded in paper laboratory notebooks. End result: getting the data and recording and managing it were separate steps in lab work. If the tools were updated or replaced, the data recorded wasn’t affected. In the digital realm, changes in tools can affect your ability to work with new and old data and information. The digital-enabled laboratory requires planning, with a time horizon of decades to meet legal and regulatory requirements. The systems and other tools you use may not last for decades; in fact, they will probably change several times. However, you will have to plan for the transfer of the data and information they contain and address the issue of database access and file formats. The primary situation to avoid is having data in files that you can’t read.
While we are going to begin looking at planning strategies for isolated products as a starting point, please keep in mind that in reality products do not exist in isolation. The laboratory’s K/I/D is increasingly interconnected, and changes in one part of your overall technology plan can have implications across your lab's working technology landscape. The drive toward integration and paperless laboratories has consequences that we are not fully prepared to address. We’ll start with the simpler cases and build upon that foundation.
Digital products change for a number of reasons:
- The underlying software that supports informatics applications could change (e.g., operating systems, database systems), and the layers of software that build on that base have to be updated to work properly.
- Products could see improvements due to market research, customer comments, and competitive pressures.
- Vendors could get acquired, merge with another company, or split up, resulting in products merging or one product being discarded in favor of another.
- Your company could be acquired, merge with another company, or split into two or more organizations.
- Products could simply fail.
- Your lab could require a change, replacing older technologies with systems that provide more capability.
In each of these cases, you have a decision to make about how K/I/D is going to be managed and integrated with the new system(s).
But how often do digital products change? Unfortunately, there isn't much detailed information published about changes in vendor products. Historically, operating systems used to be updated with new versions on an annual basis, with updates (e.g., bug fixes, minor changes) occurring more frequently. With a shift toward subscription services, version changes can occur more frequently. The impact of an OS version change will vary depending on the OS. Some vendors take responsibility and control for the hardware and software, and as a result, upgrades support both the hardware and OS until the vendor no longer supports new OS versions on older systems. Other computer systems, where the hardware and software components come from different vendors, can result in the inability to access hardware components due to an upgrade. The OS upgrade only supports certain hardware features. Support for specific add-on equipment (including components provided by the computer vendor) may require finding and reinstalling drivers from the original component vendor. As for the applications that run on operating systems, they will need to be tested with each OS version change.
Applications tend to be updated on an irregular basis, for both direct installs and for cloud-hosted solutions. Microsoft Office and Adobe’s Creative Cloud products may be updated as they see a need. Since both product suites are now accessed via the internet on a subscription basis (software as a service or SaaS), user action isn’t required. Lab-specific applications may be upgraded or updated as the vendor sees a need; SaaS implementations are managed by the vendor according to the vendor's internal planning. Larger, stable vendors may provide upgrades on a regular, annual basis for on-site installations. Small vendors may only update when a significant change is made, which might include new features, or when forced to because of OS changes. If those OS-compatibility changes aren’t made, you will find yourself running software that is increasingly out-of-date. That doesn’t necessarily mean it will stop working (for example, Microsoft dropped support for Windows XP in the spring of 2014, and computers running it didn’t suddenly stop). What it does mean is that if your computer hardware has to be replaced, you may not be able to re-install a working copy of the software. The working lifetime of an application, particularly a large one, can be on the order of a decade or more. The lifetime of a small application depends upon market acceptance and the vendor’s ability to stay in business. Your need for access to data may exceed the product's life.
The perception of the typical product life cycle runs like this: a need is perceived; product requirements are drafted; the product is developed, tested, and sold, based on market response to the product; new product requirements are determined; and the cycle continues. The reality is a bit more complicated. Figure 18 shows a more realistic view of a product’s life cycle. The letters in the circles refer to key points where decisions can have an impact on your lab ("H" = high, "M" = medium, "L" = low).
The process begins with an initial product concept, followed by the product's development, introduction, and marketing programs, and finally its release to customers. If the product is successful, the vendor gathers customer comments, analyzes competitive technologies and any new technologies that might be relevant, and determines a need for an upgrade.
This brings us to the first decision point: is an upgrade possible with the existing product? If it is, the upgrade requirements are researched and documented and the process moves to development, with generally a low impact on users. “Generally” because it depends on the nature of the product and what modifications, changes, and customizations have been made by the user. If it is an application that brings a data file in, processes it, and then saves the result in an easily accessible file format, allowing no user modifications to the application itself, “low impact” is a fair assessment. Statistical analysis packages, image processing, and other such applications fall into this set. Problems can arise when user modifications are overwritten by the upgrade and have to be reinstalled (only a minor issue if it is a plug-in with no programming changes) or re-implemented by making programming changes (a major problem since it requires re-validation). Normally any customization (e.g., naming database elements) and data held within an application's database should be transferred without any problems, though you do need to make some checks and evaluations to ensure that this is the case. This is the inner loop of Figure 18.
Significant problems can begin if the vendor determines that the current product generation needs to be replaced to meet market demands. If this is a hardware product (e.g., pH meter, balance, instrument, computer, etc.) there shouldn’t be any immediate impact (the hardware will continue to work). However, once there is a need for equipment replacement, it becomes a different matter; we’ll pick this thread up later when we discuss product retirement.
Software is a different situation. The new generation of software may not be compatible with the hardware you currently have. What you have will still work, but if there are features in the new generation that you’d like to have, you may find yourself having to re-implement any software changes that you’ve made to the existing system. It will be like starting over again, unless the vendor takes pains to ensure that the upgrade installation is compatible with the existing software system. This includes assurances that all the data, user programming, settings, and all the details you’ve implemented to make a general product work in your environment can successfully migrate. You will also have to address user education, and plan for a transition from the old system to the new one.
One problem that often occurs with new generations of software is change in the underlying data file structures. The vendor may have determined that in order to make the next generation work and to be able to offer the features they want included, the file structure and storage formats will change. This will require you to re-map the existing file structure into the new one. You may also find that some features do not work the same as they did before and your processes have to be modified. In the past, even new versions of Microsoft Office products have had compatibility issues with older versions. In large applications such as informatics or instrument data systems (e.g., multi-user chromatography data system), changes in formats can be significant. They can have an effect on importing instrument data into informatics products. For example, some vendors and users use a formatted PDF file as a means of exchanging instrument data with a LES, SDMS, or ELN. If the new version of an instrument data system changes its report formatting, the PDF parsing routine will have to be updated.
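One way to limit the damage from a report format change is to keep one parsing routine per format version and to refuse anything unrecognized rather than misread it. The Python sketch below is an illustration only, under several assumptions: it works on text that has already been extracted from the report (PDF text extraction itself is not shown), and the "Report-Format:" header, both report layouts, and the function names are hypothetical rather than taken from any vendor's product.

```python
import re

# Registry of parsing routines, keyed by the report's declared format version.
PARSERS = {}


def parser_for(version):
    """Decorator that registers a parsing routine for one format version."""
    def register(func):
        PARSERS[version] = func
        return func
    return register


@parser_for("1")
def parse_v1(text):
    # Hypothetical older layout: "Peak: <name>  Area: <value>"
    pattern = re.compile(r"Peak:\s*(\S+)\s+Area:\s*([\d.]+)")
    return {name: float(area) for name, area in pattern.findall(text)}


@parser_for("2")
def parse_v2(text):
    # Hypothetical newer layout: "<name> | area=<value>"
    pattern = re.compile(r"^(\S+)\s*\|\s*area=([\d.]+)", re.MULTILINE)
    return {name: float(area) for name, area in pattern.findall(text)}


def parse_report(text):
    """Dispatch to the routine for the declared version, or fail loudly."""
    match = re.search(r"Report-Format:\s*(\S+)", text)
    if match is None:
        raise ValueError("report does not declare a format version")
    version = match.group(1)
    if version not in PARSERS:
        raise ValueError(f"no parser registered for report format {version!r}")
    return PARSERS[version](text)
```

With this structure, an upgrade that changes the report layout means adding and validating one new routine; reports produced by the older version remain readable through the routine that was written for them.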
At this point, it's important to note that just because a vendor comes out with a new software or hardware package doesn’t mean that you have to upgrade. If what you have is working and the new version or generation doesn’t offer anything of significant value (particularly when the cost of upgrading and the impact on lab operations is factored in) then bypass the upgrade. Among the factors that can tug you into an upgrade is the potential loss of support for the products you are concerned with.
What we’ve been discussing in the last few paragraphs covers the outer loop to the right of Figure 18. The next point we need to note in that figure is the “No” branch from “New Product Generation Justified?” and “Product Fails,” both of which lead to product retirement. For both hardware and software, you face the loss of customer support and the eventual need for product replacement. In both cases, there are steps that you can take to manage the potential risks.
To begin with, unless the vendor is going out of business, they are going to want to maintain a good relationship with you. You are a current and potential future customer, and they’d like to avoid bad press and problems with customer relationships. Explain how the product retirement is going to affect you and get them to work with you on managing the issue; you aren’t the only one affected by this (see the commentary on user groups later). If you are successful, they will see a potential liability turn into a potential asset: you can now be a referral for the quality of their customer service and support. Realistically, however, your management of product retirement or major product changes has to occur much earlier in the process.
Your involvement begins at the time of purchase. At that point you should be asking what the vendor's update and upgrade policies are, how frequently updates and upgrades occur, what the associated costs are, how much advance notice they give for planning, and what level of support is provided. In addition, determine where the product you are considering lies in the product’s life cycle. Ask questions such as:
- Is it new and potentially at-risk for retirement due to a lack of market acceptance? If it is and the vendor is looking for reference sites, use that to drive a better purchase agreement. Make sure that the product is worth the risk, and be prepared in case the worst-case scenario occurs.
- Is it near the end-of-life with the potential for retirement? Look at the frequency of updates and upgrades. Are they tailing off or is the product undergoing active development?
- What is the firm’s financial position? Is it running into declining sales or are customers actively seeking it? Is there talk of acquisitions or mergers, either of which can put the product's future into question?
You should also ask for detailed technical documents that describe where programming modifications are permitted and preserved against vendor changes, and how data will be protected, along with any tools for data migration. Once you know what the limitations are for coding changes, device additions, and so on, the consequences of deviating from them are your responsibility; whatever you do should be done deliberately and with full awareness of its impact in the future.
One point that should be clarified during the purchase process is whether you are purchasing a product or a product license. If you are purchasing a product, you own it and can do what you like with it, at least for hardware products. Products that are combinations of hardware and software may be handled differently since the hardware won’t function without the software. Licenses are “rights to use” with benefits and restrictions. Those should be clearly understood, as well as what you can expect in terms of support, upgrades, the ability to transfer products, how products can be used, etc. If there are any questions, the time to get them answered is before you sign purchase agreements. You have the best leverage for gaining information and getting reasonable concessions that are important to you while the vendor is trying to sell you something. If you license a product, the intellectual property within the product belongs to the vendor while you own your K/I/D; if you decide to stop using a product, you should have the ability to extract your K/I/D in a usable form.
Another point: if the product is ever retired, what considerations are provided to you? For a large product, the vendor may not be willing to offer documented copies of the code so that you can provide self-support, but a small company trying to break into the market might. It doesn’t hurt to ask. Get any responses in writing and don’t trust someone’s verbal comments; that person may not be there when upgrades or product retirement occurs. Additionally, it's always beneficial to conduct negotiations on purchases and licenses in cooperation with your company's IT and legal groups. IT can advise on industry practices, and the legal department’s support will be needed for any agreements.
Another direction you should take is participating in user groups. Most major vendors and products have user groups that may exist as virtual organizations on LinkedIn, Yahoo, or other forums. Additionally, they often have user group meetings at major conferences. Company-sponsored group meetings provide a means for learning about product directions, raising issues, discussing problems, etc. Normally these meetings are divided into private (registered users only) and public sessions, the former being the most interesting since they provide a means of unrestricted comments. If a new version or upgrade is being considered, it will be announced and discussed at group meetings. These will also provide a mechanism for making needs known and, if a product is being retired, lobbying for support. The membership contact list will provide a resource for exchanging support dialogue, particularly if the vendor is reluctant to address points that are important to you.
If a group doesn’t exist, start a virtual conference and see where it goes. If participation is active, let the vendor know about it; they may take an interest and participate or make it a corporate function. It is in a company's best interest to work with its customers rather than antagonize them. Your company’s support may be needed for involvement in, or starting, user groups because of the potential for liability, intellectual property protection, or other issues. Activities performed in these types of groups can be wide-ranging, from providing support (e.g., trading advice, code, tutorials, etc.) and sharing information (e.g., where to get parts, information for out-of-warranty products) to identifying and critiquing repair options and meeting informally for conferences.
The key issue is to preserve your ability to carry out your work with as little disruption as possible. That means you have to protect your access to the K/I/D you’ve collected, along with the ability to work with it. In this regard, software systems have one possible advantage: virtualization.
Virtualization: An alternative to traditional computing models
There are situations in laboratory computing that are similar to the old joke “your teeth are in great shape but the gums have to go.” The equivalent situation is running a software package and finding out that the computer hardware is failing and the software isn’t compatible with new equipment. That can happen if the new computer uses a different processor than the one you are working with. An answer to the problem is a technology called virtualization. In the context of the joke, it lets you move your teeth to a new set of gums; or to put it another way, it allows you to run older software packages on new hardware and avoid losing access to older data (there are some limitations).
Briefly put, virtualization allows you to run software (including the operating system) designed for one computer on an entirely different system. An example: the Windows XP operating system and applications running on a Macintosh computer using the Mac OS X operating system via VMware’s Fusion product. In addition to rescuing old software, virtualization can:
- reduce computing costs by consolidating multiple software packages on servers;
- reduce software support issues by preventing operating system upgrades from conflicting with lab software;
- provide design options for multiple labs using informatics products without incurring hardware costs and giving up lab space to on-site computers; and
- reduce interference between software packages running on the same computer.
Regarding that last benefit, it's worth noting that with virtualization, adding software packages means each gets its own “computer” without additional hardware costs. Product warranties may state the software warranty is limited to instances where the software is installed on a “clean machine” (just the current operating system and that software package, nothing else). Most people put more than one application on a computer, technically voiding the warranty. Virtualized containers let you go back to that clean machine concept without buying extra hardware.
In order to understand virtualization, we have to discuss computing, but just the basics. Figure 19 shows an arrangement of the elements. When the computer is first turned on, there are three key items engaged: the central processing unit (CPU), memory, and mass storage. The first thing that happens (after the hardware startup sequence) is that portions of the operating system are placed in the memory where the CPU can read instructions and begin working. The key point is that the operating system, applications, and files are a collection of binary data elements (words) that are passed on to the CPU.
The behavior of the CPU can be emulated by a software program. We can have a program that acts like an Intel processor, for example, or a processor from another vendor. If we feed that program the instructions from an application, it will execute that application. There are emulators, for example, that will allow your computer to emulate an Atari 2600 game console and run Asteroids. There are also emulators for other game consoles, so your computer can behave like any game console you like, as long as you have an emulator for it. Each emulator has all the programming needed to execute copies of the original game programming. They don’t wear out or break. This configuration is shown in Figure 20.
You have a series of game emulators, each with its collection of games. Any game emulator can be loaded into memory and execute a game from its collection; games for other emulators won’t work. Each game emulator and game collection is called a container. When you want to play, you access the appropriate container and go. If the mass storage is on a shared server, other people can access the same containers and run the games on their computers without interfering with each other.
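To make the idea of a software-emulated CPU concrete, here is a minimal sketch in Python. The "processor" and its four opcodes (LOAD, ADD, OUT, HALT) are entirely hypothetical and far simpler than any real CPU or commercial emulator; the point is only that a program can read another program's instructions one at a time and carry them out.

```python
def run(program):
    """Emulate a toy accumulator CPU.

    `program` is a list of (opcode, operand) pairs. The instruction set is
    hypothetical and exists only for illustration.
    """
    acc = 0        # the emulated CPU's single register
    pc = 0         # program counter: index of the next instruction
    output = []    # values the emulated program "prints"
    while pc < len(program):
        op, arg = program[pc]
        if op == "LOAD":
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "OUT":
            output.append(acc)
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return output


# The "application" is just data handed to the emulator.
application = [("LOAD", 2), ("ADD", 3), ("OUT", None), ("HALT", None)]
print(run(application))
```

The application never touches the real hardware directly; it only needs an emulator that understands its instructions, which is why the same stored program can be run on any machine that hosts the emulator.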
How does this apply to a computer with failing hardware that is running a data analysis program? Virtualized systems allow you to make a copy of mass storage on the computer, create a container containing the CPU emulator, OS, applications, and data files, and place it on a server for later access. The hardware no longer matters because it is being replaced with the CPU emulator. Your program’s container can be copied, stored, backed up, etc. It will never wear out or grow old. When you want to run it, access the container and run it. A server can support many containers.
There are some restrictions, however. First, most computers that are purchased from stores or online come pre-loaded with an operating system. Those operating systems are OEM (original equipment manufacturer) copies of the OS whose license cost is buried in the purchase price. They can’t be copied or transferred legally, and some virtual servers will recognize OEM copies and not transfer them. As a result, in order to make a virtualized container, you need a fully licensed copy of the OS. Your IT group may have corporate licenses for widely used operating systems, so that may not pose a problem. Next, recognize that some applications will require a separate license for use on a virtualized system. As frequently noted, planning ahead is key: explore this option as part of the purchase agreement, as you may get a better deal. Third, it's important to note that virtualized systems cannot support real-time applications such as direct analog, clock-driven, time-critical data acquisition from an instrument. The virtualized software shares resources with other containers in a time-sharing mode, and as a result the close coordination for data acquisition will not work. Fortunately, direct data acquisition (as contrasted with computer-to-instrument communications via RS-232, USB, Ethernet, etc.) is occurring less often in favor of buffered data communications with dedicated data acquisition controllers, so this is becoming less of a problem. If you need direct computer-controlled data acquisition and experiment control, this isn’t the technology for you. Finally, containerized software running on virtualized systems cannot access hardware that wasn’t part of the original configuration. If the computer you are using has a piece of hardware that you’d like to use but that wasn’t on the original virtualized computer, the container won’t be able to use it since it doesn’t have the software drivers to access it.
If the applications software permits it, applications can have shared access to common database software. A virtualized LIMS may be a good way to implement the application since it doesn’t require hardware in the lab and uses servers that are probably under IT control, and as a result the systems are backed up regularly. The major hang-up on these installations is instrument connections. IT groups tend to get very conservative about that subject. Middleware could help isolate actual instrument connections from the network and could potentially resolve the situation. The issue is part technical, part political. However, virtualized LIMS containers still prove beneficial for educational purposes. A student can work with the contents of a container, experiment as needed, and when done dismiss the container without saving it; the results of the experiments are gone, mistakes and all.
There are different types of virtualization. One has containers sharing a common emulator and operating system. As a result, you update or upgrade the emulator software and/or operating system once and the change is made across all containers. That can cause problems for some applications; however, they can be moved to a second type of virtualization in which each container has its own copy of the operating system and they can be excluded from updates.
If you find this technology appealing, check with your vendor to see if the products of interest will function in a virtualized environment (not all will). Carefully ask questions, perhaps asking if their software will run under VMware’s products or Microsoft’s Desktop Virtualization products, or even Microsoft’s Hyper-V server. Some vendors don’t understand the difference between virtualization and client-server computing. Get any responses in writing.
Retirement of hardware
Replacing retired hardware can be a challenge. If it is a stand-alone, isolated product (not connected to anything else), the problem can be resolved by determining the specifications for a replacement, conducting due diligence, etc. It is when data systems, storage, and connections to computers enter the picture that life gets interesting. For example, replacing an instrument sans data system, such as a chromatograph or spectrometer with analog and digital I/O (sense switches not data) connections to a computer, is essentially just a hardware replacement.
Hardware interfaced to a computer has issues because of software controls and data exchanges. What appears to be the simplest and most common situation is with serial communications (RS-232, RS-xxx). Complications include:
- Wiring: Serial communications products do not always obey conventions for wiring, so wiring changes have to be considered and tested.
- Control functions and data exchange: Interacting with serial devices via a computer requires both control functions and data exchange. There are no standards for these, so a new purchase will likely require software changes to these. That may be avoided if the replacement device (e.g., a balance) is from the same vendor as the one you currently have, and is part of a family of products. The vendor may preserve the older command set and add new commands to access new features. If that is the case, you still have a plug-compatible replacement that needs to be tested and qualified for use.
- Interfaces: Moving from an RS-232 or similar RS- device to another interface such as USB will require a new interface (although USB ports are on almost every computer) and software changes.
If you are using a USB device, the wiring problems go away but the command structure and data transfer issues remain. Potential software problems are best addressed when the software is first planned and designed; good design means planning ahead for change. The primary issue is control of the external device, though data formats may also change. Those points can be addressed by device-independent programming. That means placing all device-dependent commands in one place (a subroutine) and formatting data into a device-independent format. Doing this makes changes, testing, and other aspects easier.
Let’s take a single pan balance that has two functions: tare and get_weight. Each of those has a different command sequence of characters that are sent to the balance, and each returns either a completion code or a numerical value that may be encoded in ASCII, BCD, or binary depending on the vendor’s choice. If the commands to work with the balance are scattered throughout a program, you have a lot of changes to find, make, test, and certify as working. Device-independent programming puts them in two areas: one for the tare command and one for the get_weight command, which returns a floating-point value (e.g., 1.67).
If you have to replace the device with a new one, the command codes are changed in two places, and the returned numeric code reformatted into a standard floating point value in one place. The rest of the program works with the value without any concern for its source. That allows for a lot of flexibility in choosing balances in the lab, as different units can be used for different applications with minor software adjustments.
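The balance example can be sketched in Python as follows. This is one possible arrangement, not a prescribed design: the two vendor command sets ("T"/"W" and "ZERO"/"READ"), their reply formats, and the `send` callable (which in practice would wrap a serial or USB connection) are all hypothetical and stand in for whatever a real balance's manual specifies.

```python
from abc import ABC, abstractmethod


class Balance(ABC):
    """Device-independent interface: the application sees only tare and get_weight."""

    def __init__(self, send):
        # `send` is any callable that takes a command string and returns the
        # raw reply string (hypothetical stand-in for a serial/USB link).
        self._send = send

    @abstractmethod
    def tare(self) -> bool:
        """Zero the balance; return True on success."""

    @abstractmethod
    def get_weight(self) -> float:
        """Return the current weight in grams as a float."""


class VendorABalance(Balance):
    # Hypothetical protocol: "T" replies "OK"; "W" replies like "W +0001.67 g".
    def tare(self) -> bool:
        return self._send("T").strip() == "OK"

    def get_weight(self) -> float:
        return float(self._send("W").split()[1])


class VendorBBalance(Balance):
    # Hypothetical protocol: "ZERO" replies "0"; "READ" replies milligrams, e.g. "1670".
    def tare(self) -> bool:
        return self._send("ZERO").strip() == "0"

    def get_weight(self) -> float:
        return int(self._send("READ").strip()) / 1000.0


def net_weight(balance: Balance, place_sample) -> float:
    """Application code: it does not know which vendor's balance is attached."""
    if not balance.tare():
        raise RuntimeError("tare failed")
    place_sample()  # e.g., prompt the analyst or trigger a sample handler
    return balance.get_weight()
```

Swapping balances then means writing one new subclass with its own tare and get_weight; `net_weight` and everything built on it stay unchanged, which keeps retesting and requalification confined to the device-specific code.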
As noted when we first started talking about this goal, the ability to support a product has to be designed and built in, not added on. The issues can be difficult enough when you are working with one vendor. When a second or third vendor is added to the mix, you have an entirely new level of issues to deal with. This is a matter of engineering, not just science. Supportable systems and methods have to be designed, documented, engineered, and validated to be supportable. A system or method isn’t supportable simply because its individual components or steps are.
Seventh goal: Addressing systems integration and harmonization
In the 1960s, audio stereo systems came in three forms: a packaged, integrated product that combined an AM/FM radio tuner, turntable, and speakers; all those components but purchasable individually; and a do-it-yourself, get-out-the-soldering-iron format. The integrated products were attractive because you could just plug them into a power source and they worked. The packaging was attractive, and you didn’t have to know much beyond choosing the functions you wanted to use. In terms of component quality, that was up to the manufacturer; it was rarely top-quality, but the components still met the basic needs of the application at a particular price point.
Component systems appealed to a different type of customer. They wanted to pick the best components that met their budgets, trading off the characteristics of one item against another, always with the idea of upgrading elements as needed. Each component manufacturer would guarantee that their product worked, but making the entire system work was your issue. In the 1960s, everything was analog so it was a matter of connecting wires. Some went so far as to build the components from kits (lower cost), or design something and get it to work just for the fun of it. HDTVs have some of the same characteristics as component systems, as they can work out of the box, but if you want better sound or want to add a DVD or streaming box, you have to make sure that the right set of connectors exists on the product and that there are enough of them.
In the product cases noted above, there is a limited set of choices for mixing components, so user customization isn't that much of a problem. Most things work together from a hardware standpoint, but software apps are another matter. The laboratory world isn’t quite so neat and tidy.
From a lab standpoint, integrated systems are attractive for a number of reasons:
- It suggests that someone has actually thought about how the system should function and what components need to be present, and they put a working package together (may be more than one component).
- It’s installed as a package: when the installation is done, all of it works, both hardware and software.
- It’s been tested as a system and all the components work together.
- You have a single point of contact for support.
- If an upgrade occurs, someone (hopefully) has made sure that upgrading some portions of the system doesn’t mean others are not working; an upgrade embraces the functionality of the entire system.
- The documentation addresses the entire system, from training and support to maintenance, etc.
- It should be easier to work with because the system’s functionality and organization have been thought through.
- It looks nice. Someone designed a packaged system that doesn’t have a number of separate boxes with wires exposed. In the lab, that may be pushing it.
Achieving that on a component-by-component basis may be a bit of a challenge. Obtaining an integrated system comes down to a few considerations, not the least of which is what you define as a “system.” A system is a collection of elements that are used to accomplish a task. Figure 21 shows a view of overall laboratory operations.
Laboratory operations have three levels of systems: the corporate level, the lab administrative level (darker grey, including office work), and the lab bench level. Our concern is going to be primarily with the latter two; however, we have to be aware that the results of lab work (both research and testing) will find their way into the corporate sphere. If the grey box is our system, where does integration fit in and what strategies are available for meeting that goal? Integration is about connecting devices and systems so that there is a smooth flow of command/control messages, data, and information that does not depend on human intervention. We are not, for example, taking a printout from one instrument system and then manually entering it into another; that transfer should be electronic and bi-directional where appropriate. Why is this important? Improving the efficiency of lab operations, as well as ROI, has long been on the list of desired outcomes from the use of lab automation and computing. Developing integrated systems with carefully designed mechanisms for the flow, storage, and management of K/I/D is central to achieving those goals.
There are different strategies for building integrated systems like these. One is to create the all-encompassing computer system that does and controls everything. Think HAL in the movie 2001, or popular conceptions of an advanced AI. However, aside from the pitfalls in popular sci-fi, that isn’t an advisable strategy. First, it will most likely never be finished. Trying to come up with a set of functional specifications would take years, if they were ever completed. People would be constantly adding features, some conflicting, and that alone (called scope creep) would doom the process, as it has in similar situations. Even if somehow the project were completed, it would reflect the thinking of those involved at the start. In the time that the project was underway, the needs of the lab would change, and the system would be out-of-date as soon as it was turned on. If you were to develop an adaptable system, you'd still be dealing with scope creep. Other problems would crop up too. If any component needed maintenance, the entire system could be brought to a halt and nothing would get done. Additionally, staff turnover would be a constant source of delays as new people were brought on board and trained, and as this system would be unique, you couldn't find people with prior experience. Finally, the budget would be hard to deal with, from the initial estimate to the likely overruns.
Another approach is to redefine the overall system as a cooperative set of smaller systems, each with its own integration strategy, with the entire unit interconnected. Integrated systems in the lab world are difficult to define as a product set, since the full scope of a lab's processes is highly variable, drawing on a wide range of instruments and equipment. We can define a functional instrument system (e.g., titrators, chromatographic equipment, etc.) but sample prep variability frustrates a complete package. One place that has overcome this is the clinical chemistry market.
At this point, we have to pan back a bit and take a few additional aspects of laboratory operations into consideration. Let's look back at Figure 17 again:
We are reminded that processes are what rule lab work, and the instrumentation and other equipment play an important but subservient role. We need to define the process first and then move on from there; this is the standard validation procedure. In regards to the integration of that instrumentation and equipment, any integration has to support the entire process, not just the individual instrument and equipment packages. This is one of the reasons integration is as difficult as it is. A vendor can create an instrument package, but their ability to put components together is limited by their understanding of how those components will be used.
Unless the vendor is trying to fit in a well-defined process with enough of a market to justify the work, there is a limit to what they can do. This is why pre-validated products are available in the clinical chemistry market: the process is pre-defined and everyone does the same thing. Note that there is nothing to prevent the same thing happening in other industries; if there were a market for a product that fully implemented a particular ASTM or USP method, vendors might take notice. There is one issue with that, however. When a lab purchases laboratory instrumentation, they often buy general-purpose components. You may purchase an infrared spectrophotometer that covers a wide spectral range that can be used in a variety of applications to justify its cost. Yet you rarely, if ever, purchase one instrument for each lab process that it might be used in unless there is sufficient demand to justify it. And that's the rub: if a vendor were to create an equipment package for a specific procedure, the measuring instrument would be stripped down and tailored to the application, and it may not be usable in another process. Is there enough demand for testing to warrant the development of a packaged system? If you were doing blood work, yes, because all blood testing is done the same way; it’s just a question of whether or not your lab is getting enough samples. If it’s ASTM xxx, maybe not.
Additionally, the development of integrated systems needs to take the real world into account. Ask whether or not the same equipment can be used for different processes. If the same equipment setup is being used in different processes with different reagents or instrument setups (e.g., different columns in chromatography), developing an integrated electro-mechanical computer system may be a viable project since all the control and electro-mechanical systems would be the same. Returning to the cookie analogy, it's the same equipment, different settings, and different dough mixes (e.g., chocolate chip, sugar cookies, etc.). You just have to demonstrate that the proper settings are being used to ensure that you are getting consistent results.
If this sounds a little confusing, it’s because “integration” can occur on two levels: the movement of data and information and the movement of material. On one hand, we may be talking about integration in the context of merging the sources of data and information (where they are generated) into a common area where they can be used, managed, and accessed according to the needs of the lab and the corporation. Along the way, as we’ve seen in the K/I/D discussions, different types of data and information will be produced, all of which have to be organized and coordinated. This flow is bi-directional: generated data and information in one direction, and work lists in the other. On the other hand, regarding the movement of materials, we may be talking about automated devices and robotics. Those two hands are joined at the measuring instrument.
We’ll begin by discussing the movement of data and information: integration is a matter of bringing laboratory-generated data and information into a structured system where it can be accessed, used, and managed. That “system” may be a single database or a collection of interconnected data structures. The intent in modern lab integration is that connections between the elements of that structure are electronic, not manual, and transfers may be initiated by user commands or automated processes. Yet note that someone may consider a completely manual implementation process to be “integrated,” and be willing to accept slower and less efficient data transfers (that’s what we had in the last century). However, that methodology doesn’t give us the improvements in productivity and ROI that are desired.
The idea that integration is moving all laboratory K/I/D into a structured system is considerably different than the way we viewed things in the past. Previously, the goal was to accumulate lab results into a LIMS or an ELN, with the intermediate K/I/D (e.g., instrument data files, etc.) placed in an SDMS. That was a short-sighted attempt at considering only data storage, without fully considering the topic of data utilization. The end goal of using a LIMS or ELN to accumulate lab results was still valid—particularly for further research, reporting, planning, and administrative work—but it didn’t deal effectively with the material those results were based on or the potential need to revisit that work.
In this discussion we’ll consider the functional hub of the lab to be a LIMS and/or ELN, with an optional SDMS; it’s your lab, and you get to choose. We’re referring to this as a hub for a couple of reasons. First, it is the center of K/I/D management, as well as planning and administrative efforts. Second, these should be the most stable information systems in the lab; durable and slow to change. Instruments and their data systems will change and be replaced as the lab’s operational needs progress. As a result, the hub is where planning efforts have to begin since decisions made here have a major impact on a lab's ability to meet its goals. The choice of a cloud-based system vs. an on-site system is just one factor to consider.
Historically, laboratories have begun by putting the data- and information-generating capability in place first. That’s not unusual, since new companies need data and information to drive their business development. What they really need, however, is a mechanism for managing the data and information. Today, the development of a laboratory electronic infrastructure needs to begin with the systems that are used to collect and manage data and information. Then we can put in place the data and information generators. Doing it the other way around is a bit like starting a production line without considering what you’re going to do with all the material you’re producing.
The types of data and information generators can vary greatly. Examples include:
- a human-based reading that is recorded manually;
- a reading recorded by an instrument with limited storage and communication abilities, e.g., balances and pH meters;
- a reading recorded by a limited-functionality device, where data is recorded and stored but must be transmitted out of the machine to be analyzed; and
- a reading recorded by a combination instrument-computer, which has the ability to record, store, analyze, and output data in various forms.
The issue we need to deal with for each of those generators is how to plan for where the output should be stored so that it is accessible and useful. That was the problem with earlier thinking. We focused too much on where the K/I/D could be stored and maintained over the long term, but not enough on its ability to be used and managed. Once the analysis was done, we recognized the need to have access to the backup data to support the results, but not the ability to work with it. The next section will look at some of the ramifications of planning for those data types.
Planning the integration of your data generators
There are several ramifications of planning for your data generators that need to be discussed. Before we begin, though, we need to add two additional criteria for the planning stage:
- You should avoid duplication of K/I/D unless there is a clear need for it (e.g., a backup).
- In the progression from sample preparation to sample processing, to measurement, to analysis, and then to reporting, there should not be any question about either the provenance or the location of the K/I/D generated from that progression.
That said, let's look at each generator in greater detail to better understand how we plan for their integration and the harmonization of their resulting K/I/D.
1. A human-based reading that is recorded manually, or recorded from an instrument with limited storage and communication abilities
Examples: The types of devices we are looking at for these generators are balances, pH meters, volt meters, single-reading spectrophotometers, etc.
Method of connection: Both the manual and digital generators don’t leave much choice: the integration is direct data entry into a hub system unless they are being used as part of a LES. Manual modes mean typing (with data entry verification), while digital systems provide for an electronic transfer as the method of integration.
Issues: There are problems with these generators being directly tied to a hub component (see “A,” Figure 22). Each device or version has its own communications protocol, and the programming is specific to that model. If the item is replaced, the data transfer protocols may differ and the programming has to change. These devices are often used for single readings or weighing a sample, with the result stored in the hub, to be used in later calculations or reporting. However, even though these are single-reading instruments, things can get complicated. Problems may crop up if their measurements are part of a time series, require a weight measurement at specific time intervals, are used for measuring the weights of similar items in medical tablet uniformity testing, or are used for measuring pH during a titration. Those applications wouldn’t work well with a direct connection to a hub and would be better served through a separate processor (see “B,” Figure 22) that builds a file of measurements that could be processed, with the results sent to the hub. This creates a need to manage the file and its link to the transmitted results. An SDMS would work well, but the hub system just became more complex. Instead of a device being directly connected to a hub, we have an intermediate system connected to an SDMS and the hub. Integration is still easily feasible, but more planning is required. Should the intermediate system take on the role of an SDMS (all the files are stored in its file structure), you would also have to provide backup and security facilities to ensure that the files are protected against both tampering and loss. The SDMS would be responsible for entering the results of the work into the hub. (Remember that access to the files is needed to support any questions about the results; printed versions would require re-entering the data to show that the calculations were done properly, which is time-consuming and requires verification.)
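As a rough illustration of the intermediate processor described above (see “B,” Figure 22), the following Python sketch builds a measurement file from a series of balance readings, computes a summary, and prepares a record for the hub that points back to the file. Everything here is an assumption made for illustration: the file layout, the field names, the sample ID, and the use of a SHA-256 checksum as a simple way to detect later changes to the file. A real system would use the device's own protocol and the hub vendor's interface.

```python
import csv
import hashlib
import statistics
from pathlib import Path
from tempfile import TemporaryDirectory


def write_measurement_file(path, sample_id, readings):
    """Write (elapsed_seconds, weight_mg) readings for one sample to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample_id", "elapsed_s", "weight_mg"])
        for elapsed_s, weight_mg in readings:
            writer.writerow([sample_id, elapsed_s, weight_mg])


def file_checksum(path):
    """Return the SHA-256 digest of the file so later edits can be detected."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def summarize_for_hub(path, sample_id, readings):
    """Build the result record for the hub, linked back to the source file."""
    weights = [weight for _, weight in readings]
    return {
        "sample_id": sample_id,
        "mean_weight_mg": statistics.mean(weights),
        "n_readings": len(weights),
        "source_file": Path(path).name,          # pointer back to the raw file
        "source_sha256": file_checksum(path),    # hypothetical integrity field
    }


with TemporaryDirectory() as workdir:
    # Hypothetical readings: weight of one sample at one-minute intervals.
    readings = [(0, 250.1), (60, 249.8), (120, 249.6)]
    data_file = Path(workdir) / "S-0001_weights.csv"
    write_measurement_file(data_file, "S-0001", readings)
    hub_record = summarize_for_hub(data_file, "S-0001", readings)
    # Re-checking the digest later shows whether the file has been altered.
    checksum_still_matches = file_checksum(data_file) == hub_record["source_sha256"]
```

The point of the sketch is the link: the hub receives only the summary, while the file name and checksum let someone trace the result back to the stored measurements.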
2. A reading recorded by a limited-functionality device
Examples: Measurements are made on one or more samples, with results stored locally, which have to be transmitted to another computer for processing or viewing. Further use may be inhibited until that is done (e.g., the device may have to transmit one set of measurements before the next set can begin). Such devices include microplate readers and spectrophotometers. Some devices can be operated manually from front panel controls, or via network connections through a higher-level controller.
Method of connection: Some devices may retain the old RS-232/422 scheme for serial transmission of measurements and receiving commands, though most have transitioned to USB, Ethernet, or possibly wireless networking.
Issues: Most of these devices do not produce final calculated results; that work is left to an intermediate process placed between the device and the hub system (Figure 23). As a result, integration depends on those intermediate processes controlling one or more devices, sometimes coordinating with other equipment, calculating final results, and communicating them to the hub system. Measurement files need to be kept in electronic form to make them easier to back up, copy, transmit, and generally work with. If only printed output is available, it should be scanned and an equivalent machine-readable version created and verified. Each experimental run, which may include one or more samples, should have the associated files bundled together into an archive so that all data is maintained in one place. That may be the intermediate processor’s storage or an SDMS with the appropriate organization and indexing capabilities, including links back from the hub. The electronic files may be used to answer questions about how results were produced, re-run an analysis using the original or new set of algorithms, or have the results analyzed as part of a larger study.
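The bundling of files for one experimental run can be sketched as follows. This is a minimal example that assumes a ZIP archive with a JSON manifest of checksums is an acceptable container; the run ID, file names, and manifest layout are hypothetical, and an SDMS would normally provide its own packaging and indexing.

```python
import hashlib
import json
import zipfile
from pathlib import Path
from tempfile import TemporaryDirectory


def bundle_run(run_id, files, archive_path):
    """Zip the files for one run together with a manifest of SHA-256 checksums."""
    manifest = {"run_id": run_id, "files": {}}
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in files:
            file_path = Path(file_path)
            data = file_path.read_bytes()
            manifest["files"][file_path.name] = hashlib.sha256(data).hexdigest()
            archive.writestr(file_path.name, data)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest


def verify_bundle(archive_path):
    """Return True if every archived file still matches its manifest checksum."""
    with zipfile.ZipFile(archive_path) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        return all(
            hashlib.sha256(archive.read(name)).hexdigest() == digest
            for name, digest in manifest["files"].items()
        )


with TemporaryDirectory() as workdir:
    workdir = Path(workdir)
    # Hypothetical output from a microplate reader and its settings file.
    plate = workdir / "plate_01.csv"
    plate.write_text("well,absorbance\nA1,0.512\nA2,0.498\n")
    settings = workdir / "reader_settings.txt"
    settings.write_text("wavelength_nm=450\n")
    archive_path = workdir / "RUN-0042.zip"
    manifest = bundle_run("RUN-0042", [plate, settings], archive_path)
    bundle_ok = verify_bundle(archive_path)
    archived_names = sorted(manifest["files"])
```

Keeping the manifest inside the archive means the bundle can be copied, backed up, or handed to another group and still be checked on its own.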
3. A reading recorded by a combination instrument-computer
Examples: These generators include one-to-one instrument-to-instrument data systems (IDS), many-to-one instrument-to-computer systems, NMR spectroscopy, chromatography, mass spectrometry, thermal analysis, spectrophotometers, etc.
Method of connection: There are several means of connection: 1) detector to computer (A/D), 2) computer to instrument control and accessory devices such as autosamplers (via, e.g., digital I/O, USB), and 3) computer to centralized hub systems (via, e.g., USB, Ethernet, wireless networks). Integration between the IDS is accomplished through vendor-supported application programming interfaces (APIs) on both sides of the connection.
Issues: The primary issue with these generators is managing the data structures that consist of captured detector output files, partially processed data (e.g., descriptors such as peak size, width, area, etc.), computed results, sample information, worklists, and processing algorithms. Some of this material will get transmitted to the hub, but the hub isn't generally designed to incorporate all of it. A portion of it (the content of a printed report, for example) could be sent to an SDMS. However, the bulk of it has to stay with the IDS since the software needs to interpret and present the sample data contents; the data files by themselves are useless without software to unpack and make sense of them.
From a planning standpoint, you want to reduce the number of IDSs as much as possible. While chromatography is presumably the only technique that offers a choice between one-to-one and many-to-one instrument-to-computer configurations, hopefully over time that list will expand to provide better data management. Consider three chromatographs, each with its own IDS. If you are looking for data, you have three systems to check, and hopefully each has its own series of unique sample IDs. Three instruments on one IDS are a lot easier to manage and search. You also have to consider backups, upgrades, general maintenance, and cost.
Moving the instrument data files to an SDMS may not be effective unless the vendor has made provision for it. The problem is data integrity. If you have the ability to move data out of the system and then re-import it, you open up the possibility of importing data that has been edited. Some vendors prohibit this sort of activity.
The above looked at generator types in isolation; however, in reality, devices and instruments are used in combinations, each producing results that have to be maintained and organized. We must look at the data sets that are generated in the course of executing an experiment or method.
The procedures associated with an experiment or method can be executed three ways: manually, using a LES, or using a robotics implementation. The real world isn’t so neatly separated; manual and LES implementations may have some steps that use automated tools. The issue we need to address in planning is the creation of an “experiment data set” that brings all the results produced into one package. Should questions arise about an experiment, you have a data set that can be used as a reference. That “package” may be pages in a notebook, a word processor file, or some other log. It should contain all data recorded during a procedure, or, in the case of IDS capture data, file references or pointers to that instrument data or information. You want to be able to pull up that record and be able to answer any questions that may arise about the work.
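One possible shape for such an experiment data set is sketched below in Python. It assumes the package holds directly recorded values plus pointers to data that stays inside an IDS; the class names, field names, and identifiers are all hypothetical and are not taken from any particular LIMS, ELN, or LES product.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class InstrumentDataPointer:
    """Reference to data that stays inside an IDS rather than being copied."""
    system: str      # hypothetical IDS name
    record_id: str   # hypothetical identifier within that IDS


@dataclass
class ExperimentDataSet:
    """All values recorded during one procedure, plus pointers to IDS data."""
    experiment_id: str
    method: str
    recorded_values: dict = field(default_factory=dict)
    instrument_data: list = field(default_factory=list)

    def add_value(self, name, value, units):
        self.recorded_values[name] = {"value": value, "units": units}

    def add_pointer(self, system, record_id):
        self.instrument_data.append(InstrumentDataPointer(system, record_id))

    def to_json(self):
        # asdict() also converts the nested pointer objects to plain dicts.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


package = ExperimentDataSet("EXP-2020-117", "assay by HPLC")
package.add_value("sample_weight", 250.1, "mg")
package.add_value("buffer_pH", 6.8, "pH")
package.add_pointer("CDS-1", "injection-000981")

# Round-trip through JSON to show the package can be stored and read back.
restored = json.loads(package.to_json())
```

Storing pointers rather than copies keeps the instrument data in the system that can interpret it, while the package still answers the question of where everything for the experiment lives.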
All of that may seem pretty obvious, but there is one point that needs to be addressed: the database structures, including those of hub systems, IDS file structures, and SDMSs, all have to be well defined before you accumulate a number of experiment packages. You don’t want to find yourself in a situation where you have a working system of data or information storage and then have to make significant changes to it. That could mean that all previous packages have to be updated to reflect the new system, or, worse, that you have to deal with both an “old” and a “new” system of managing experimental work.
LES systems come in two forms: stand-alone software packages, and script-based systems that are part of a LIMS or ELN. The stand-alone systems should produce the experiment record automatically with all data and pointers to IDS captured data or information. For script-based systems, the programming for the LES function has to take that into account. As for laboratory robotics, they can be viewed as an extension of a LES: instead of a person following instructions, a robot or a collection of robotic components follows its programming to carry out a process. Developing an experimental record is part of that process.
The bottom line in all of this is simple: the management architecture for your K/I/D has to be designed deliberately and put in place early in a lab's development. If it is allowed to be created on an as-needed basis, the resulting collection of computers and storage will be difficult to maintain, manage, and expand in an orderly fashion. At some point, someone is going to have to reorganize it, and that will be an expensive and perhaps painful process.
Harmonization is a companion goal to integration. Approached with the right mindset, it can reduce:
- installation costs,
- support costs,
- education and training requirements, and
- development effort.
Harmonization efforts, if used inappropriately, can create strife, increasing inter-departmental friction and conflict. The general idea of harmonization is to use common hardware and software platforms to implement laboratory systems, while ensuring that the move toward commonality doesn’t force people to use products that are force-fits, that don’t really meet the lab's needs, but serve some other agenda. The purpose of computing systems is to help people get their work done; if they need a specific product to do it, end of story. If it can be provided using common hardware and software platforms, great, but that should not be a limiting factor. If used as a guide in the development of database systems, harmonization can make it easier to access laboratory K/I/D across labs. It may slow down implementations because more people’s opinions have to be taken into account, but the end result will be the ability to gain more use out of your K/I/D.
Figure 25 shows a common LIMS server supporting three different labs. Each lab has its own database structure, avoiding conflicts and unnecessary compromises in the conduct of lab work, while still benefiting from reduced implementation and support costs. While some vendors support this, others may not; see if they are willing to work out a deal since there are multiple lab systems involved. If we couple this with the common structure definitions of K/I/D noted earlier, accessing information across labs will be more productive.
An alternative is to force everyone into one data structure, usually to reduce costs. Savings on licensing costs may be offset by development delays as multiple labs resolve conflicts in database organization, security, access control, etc. In short, keep it simple; things will work more smoothly and in the long run be less costly from an implementation, maintenance, and support perspective. If there is a need or desire to go through the databases for accounting purposes or other organizational requirements, the necessary material can be exported into another file structure that can be analyzed as needed. This provides a layer of security between the lab and the rest of the organization. It’s basically a matter of planning how database contents are being managed within the lab and what has to be accessed from other parts of the organization.
Part of the harmonization planning process involves examining how computers are paired with instruments. You may not have multiple instances of higher-priced equipment such as mass spectrometers, NMRs, or other instruments, and having a computer dedicated to each device makes sense. However, there is one instrument that you may have several of: chromatographs. You can purchase a computer for each instrument, but in this case, most CDS can support multiple instruments (Figure 26).
There are advantages to having multiple instruments on one computer:
- You have only one system to support, maintain, and back up.
- All the K/I/D is in one system.
- The qualification or validation process is performed once, rather than having it repeated for each system.
- Overall cost is reduced.
This "multiple instruments to one computer" configuration is the result of a low data collection rate, the modest computing requirements needed to process the instrument data, and user demands on the vendors. Given the developments in computing power and distributed data acquisition and control, this many-to-one configuration should be extended to other instrument techniques, reducing costs and bringing more efficiency to the management of K/I/D.
Regarding computer systems...
Harmonization doesn't mean that everything should run the same OS or the same version of the OS. It means doing it where possible, but not at the expense of doing lab work effectively.
With the wide diversity of products in the laboratory market, you’re going to find a mix of large and small vendors. Some may be small, growing companies that are managed by a few people, and as a result, keeping up with the latest versions of operating systems and underlying software may not be critical if it doesn’t affect their product's usability or performance. Their product certification on the latest version of a software platform may lag behind that of larger vendors. That means that requiring all systems to be at the same operating system level isn’t realistic. Upgrading the OS may disable the software that lab personnel depend upon.
Regarding the data...
On November 19-20, 2019, Pharma IQ’s Laboratory Informatics Summit held a meeting on "Data Standardization for Lab Informatics." The meeting highlighted the emerging FAIR Guiding Principles, which state that K/I/D should be findable, accessible, interoperable, and reusable (FAIR). The point of mentioning this is to highlight the growing, industry-wide importance of protecting the value of the K/I/D that you are collecting. No matter how much it costs to produce, if you can’t find the K/I/D you need, it has no value because it isn’t usable. The same holds true if the data supporting that information can’t be found.
Utilization is at the core of much of what we’ve been discussing. Supporting the FAIR Guiding Principles should be part of every discussion about products and what they produce, how the database is designed, and what the interoperability between labs in your organization looks like.
Another aspect of this subject is harmonizing data definitions across your organization. The same set of terms should be used to describe an object or aspect, and their database representation should be compatible, etc. The point is to make it easier to find something and make use of it.
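As a small illustration of harmonized data definitions, the Python sketch below maps the field names used by two labs onto one shared vocabulary. The canonical terms, the local field names, and the sample records are all invented for this example; in practice the vocabulary would come from an agreed organizational standard.

```python
# Hypothetical shared vocabulary: canonical term -> local names used by each lab.
CANONICAL_TERMS = {
    "sample_id": {"sample_id", "SampleNo", "specimen"},
    "result_value": {"result_value", "Result", "value"},
    "units": {"units", "UoM"},
}

# Invert the vocabulary so each local name points to its canonical term.
LOCAL_TO_CANONICAL = {
    local: canonical
    for canonical, local_names in CANONICAL_TERMS.items()
    for local in local_names
}


def harmonize(record):
    """Rename a lab record's fields to canonical terms; collect unknown fields."""
    harmonized, unmapped = {}, []
    for name, value in record.items():
        canonical = LOCAL_TO_CANONICAL.get(name)
        if canonical is None:
            unmapped.append(name)   # flag for review rather than silently drop
        else:
            harmonized[canonical] = value
    return harmonized, unmapped


# Two labs describing the same kind of result with different field names.
lab_a = {"SampleNo": "S-0001", "Result": 7.02, "UoM": "pH"}
lab_b = {"specimen": "S-0002", "value": 6.95, "units": "pH", "analyst": "jd"}

record_a, unmapped_a = harmonize(lab_a)
record_b, unmapped_b = harmonize(lab_b)
```

After harmonization both records use the same keys, so a search across labs needs only one set of terms; fields without an agreed definition are reported instead of being guessed at.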
Putting this all to use
How do you apply all of this to your new lab (an easier task) or existing lab (more challenging)? This is going to be a broad-brush discussion since every lab has its own way of handling things, from its overall mission to its equipment and procedures, so you’re going to have to take these points and adjust them to fit your requirements.
To start, assume you have a hub system (either a LIMS or ELN) as the center of gravity for all your K/I/D collection. You build your lab's K/I/D management infrastructure from this center of gravity outward; effectively everything revolves around the hub and radiates out from it.[l]
For each K/I/D generator, ask:
- What does it produce, which of the K/I/D generator types noted earlier matches, and which model is appropriate?
- Does it generate a file that has to be processed, or is it the final measurement?
- Does the device or the system supporting it have all the information needed to move it on to the next phase of the process? For example, if the device is a pH meter, what is going to key the result into the next step? It will need a sample or experiment reference ID so that it knows where the result should go.
For each device output, ask:
- What happens to the generated K/I/D and how is it used? And remember, nothing should ever get deleted.
- Is the device output a single measurement or part of a set? Will it be combined with measurements from other devices, sample IDs, and calibration information?
- Where is the best place to put it? In an intermediate server, SDMS, or hub?
- If it is a final result of that process, should it be in the hub?
- If it is an intermediate file or result, then where?
- How might it be used in the future and what is the best way to prepare for that? Files may need to be examined in audits, transferred to another group or organization[m], or recalculated with new algorithms. Does your system provide trace-back from final results to source data?
For all devices, ask:
- Does every device that has storage and communications capability have backup procedures put in place?
Depending on your point of view, whether it is focused on the science, laboratory operations management, or lab administration, your interest in lab computing may range from “necessary evil” to “makes life easier” to “needed to make the lab function,” or some other perspective. You may be of the opinion that all this is interesting but not your responsibility. If not yours, then whose? That topic will be covered in the next section.
Laboratory systems engineers
If you believe that the technology planning and management considerations noted so far in this guide are important to your laboratory, it's time to ask upon whom that responsibility falls.
The purpose of this guide has been to highlight that the practice of science has changed, become more complex, and become more dependent on technologies that demand a lot of attention. Those technologies are not only the digital systems we’ve covered, but also the scientific methodologies and instrumentation whose effective use can take, through increasing specialization and depth of material, an entire career to learn and apply. The user of scientific computing typically views it as a tool for getting work done and not another career. Once upon a time, the scientist knowledgeable in both laboratory work and computing was necessary; if you wanted to use computers you had to understand how they worked. Today, if you tried to do that, you’d find yourself spread thin across your workload, with developments happening faster in science and computers than a single individual can keep up with.
Let's look at what a laboratory scientist would need to be able to do in order to also support their laboratory's scientific computing needs, in addition to their normal tasks (Figure 27).
On top of those tasks, the lone scientist would also have to have the following technological knowledge and personal capabilities (Figure 28):
Looking at these two figures, we're realistically considering two levels of expertise: a high, overview level that can look at the broader issues and see how architectures can be constructed and applied, and specialists in areas such as robotics. However, the current state of undergraduate (and, to a lesser extent, graduate) education doesn’t typically have room for the depth of course work needed to cover the material noted above. Expanding your knowledge base into something that is synergistic with your current course work is straightforward; doing it with something that is from a separate discipline creates difficulties. Where do digital systems fit into your life and laboratory career?
Let's look at what the average laboratory scientist does today. Figure 29 shows the tasks that are found in modern laboratory operations in both research and testing facilities.
Now let's compare that set of tasks in Figure 29 with the task emphasis provided in today's undergraduate laboratory science courses and technology courses (Figures 30 and 31).
In cases where science students have access to instrumentation-computer systems, the computers are treated as “black boxes” that acquire the data (data capture), process it (data processing), and report it. How those things happen is rarely if ever discussed, with no mention of analog-digital converters, sampling rates, analysis algorithms, etc. “Stuff happens,” yet that “stuff,” if not properly run with tested parameters, can turn good bench science into junk data. How would they know? Students may or may not get exposure to LIMS or ELN systems. It would be useful for them to capture and work with their lab results in such systems, but schools may not be willing to invest in them.
IT students will be exposed to data and information management through database courses, but not at the level that LIMS and ELNs require (e.g., instrument communications and control); the rest of the tasks in Figure 31 are practically unknown to them. They’d be happy to work on the computer in Figure 31, but the instrument and the instrument connections (the things that justify the computer's role) aren’t something they’d be exposed to.
What we need are people with a foot in both fields, able to understand and be conversant in both the laboratory science and IT worlds, relating them to each other to the benefit of lab operation effectiveness while guiding IT in performing their roles. We need “laboratory systems engineers” (LSEs).
People in this role were previously referred to as "laboratory automation engineers" (LAEs) and "LAB-IT" specialists, but we now realize both titles fall short of the mark. "Laboratory automation engineer" emphasizes automation too strongly when the work is much broader than that. And "LAB-IT" is a way of nudging IT personnel into lab-related work without really addressing the full scope of systems that exist in labs, including robotics and data acquisition and control.
Laboratory information technology support differs considerably from classical IT work (Figure 32). The differences are primarily two-fold. First, the technologies used in lab work, including those in which instruments are attached to computers and robotics, are different than those commonly encountered in classical IT. The computers are the same, but the added interface and communications requirements imposed by instrument-computer connections change the nature of the work. When troubleshooting, it can be difficult to separate computer issues from those resulting from the connection to instruments and digital control systems. Second, the typical IT specialist, maybe straight out of school, doesn’t have a frame of reference for understanding what they are dealing with in a laboratory setting. The work is foreign, the discussions involve terminology they may not understand, and there may be no common ground for discussing problems. In classical IT, the IT personnel may be using the same office software as the people they support, but they can't say the same for the laboratory software used by scientists.
Having noted the differences between classic IT and laboratory IT, as well as the growing need for competent LSEs, we need to take a closer look at some of the roles that classic IT and LSE personnel can take. Figure 33 provides a sub-set of the items from Figure 32 and reflects tasks that IT groups could be comfortable with. Typical IT backgrounds with no lab tech familiarity won’t get you beyond the basic level of support. To be effective, IT personnel need to become familiar with the lab environment, the applications and technologies used, and the language of laboratory work. It isn’t necessary for IT support to become experts in instrumental techniques, but they should understand the basic "instrument to control system to computer" model as well as the related database applications, to the point where they can provide support, advise people on product selections, etc. We need people who can straddle the IT-laboratory application environment. They could be lab people with an interest in computing or IT people with a strong interest in science.
There are ways of bridging that education gap (Figure 34), but today they depend upon individual initiative more than corporate direction to educate people to the level needed. On-the-job training is not an effective substitute for real education; on the surface it is cheaper, but you lose out in the long run because people really don’t understand what is going on, which limits their effectiveness and prevents them from being innovative or even catching problems in the early stages before they become serious. A big issue is this: due to a lack of education, are people developing bad K/I/D without being aware of it? The problem isn’t limited to the level of systems we are talking about here. It also extends to techniques such as pipetting.
It is also a matter of getting people to understand the breadth of material they have to be familiar with. In 2018, a webinar series was created (Figure 35) to educate management on the planning requirements for implementing lab systems. The live sessions were well attended. The chart shows the viewing rate for the individual topics through early December 2020. Note that the highest viewed items were technology-specific; people wanted to know about LIMS, ELN, etc. The details about planning, education, support, etc. haven’t received nearly the amount of attention they need. People want to know about product classes but aren’t willing to learn about what it takes to be successful. Even if you are relying on vendors or consultants, lab management is still accountable for the success of planning, implementation, and effectiveness of lab systems.
Prior to the COVID-19 pandemic of 2020, undergraduate education depended on the standard model of in-person instruction. With the challenges of COVID-19 spreading, online learning took on a new importance and stronger acceptance, building on the ground established by online universities and university programs. This gives us an acceptable model for two types of course development: a fully-dedicated LSE program or an expanded program that would broaden students' backgrounds in both the laboratory sciences and IT. One issue that would need to be addressed, however, is bridging the gap between presentation material and hands-on experience with lab systems. Videos and evaluation tests will only get you so far; you need the hands-on experience to make it real and provide the confidence that what you've learned can be effectively applied.
There are several steps that can be taken to build an LSE program. The first is to develop a definition of a common set of skills and knowledge that an LSE should have, recognizing that people will come from two different backgrounds (i.e., laboratory science and IT), and those have to be built up to reach a common balanced knowledge base. Those with a strong laboratory science background need to add information technology experience, while those from IT will need to gain an understanding of how laboratory science is done. Remember, however, that those with IT backgrounds don’t need to be educated in chemistry, biology, physics, etc. After all, they aren’t going to be developing methods; they will be helping to implement them. There are things common to all sciences that they need to understand, such as record keeping, the workflow models of testing and research, data acquisition and processes, instrumentation, and so on. That curriculum should also help people who want to specialize in particular subject areas such as laboratory database systems, robotics, etc. The second step is to build a curriculum that allows students to meet those requirements. This requires solid forethought in the development and curation of course materials. A lot of material already exists and is spread over the internet on university, government, and company web sites. A good first step would be to collect and organize those references into a single site (the actual courses need not be moved to the site, just their descriptions, access requirements, and links). Presentation and organization of the content is also important. Someone visiting the site will need a guide to what LSE is about, how to find material appropriate for different subject areas, and how to get access to it. Think of your site's audience as a visitor who knows nothing about the field: where do they start and how do we facilitate their progress? Providing clear orientation and direction is key.
First give them an understanding of what LSE is all about, and then a map to whatever interests them. With the curriculum built, you can then identify areas that need more material and then move to further develop the program. Of course, you'll also want to make it possible to take advantage of online demonstration systems and simulators to give people a feel for working with the various laboratory systems. This is a half-step to what is needed: there’s no substitute for hands-on work with equipment.
As it stands today, we’ve seemingly progressed from manual methods, to computer-assisted methods, and then to automated systems in the course of developing laboratory technologies over the years, and yet our educational programs are a patchwork of courses largely driven by individual needs. We need to take a new look at lab technologies and their use and how best to prepare people for their work with solid educational opportunities.
This guide has addressed the following:
- Why technology planning and management needs to be addressed: Because integrated systems need attention in their application and management to protect electronic laboratory K/I/D, ensure that it can be effectively used, and ensure that the systems and products put in place are both the right ones, and that they fully contribute to improvements in lab operations.
- What's changed about that planning and management since the introduction of computers in the lab: As technology in the lab expanded, we lost the basic understanding of what the new computer and instrument systems were and what they did, that they had faults, and that if we didn’t plan for their effective use and counter those faults, we were opening ourselves to unpleasant surprises. The consequences at times were system crashes, lost data, and a lack of a real understanding of how the output of an instrument was transformed into a set of numbers, which meant we couldn’t completely account for the results we were reporting. A more purposeful set of planning and management activities, at the earliest point possible, has become increasingly important.
- Why developing an environment that fosters productivity and innovation is important: Innovation doesn’t happen in a highly structured environment: you need the freedom to question, challenge, etc. You also need the tools to work with. The inspiration that leads to innovation can happen anywhere, anytime. All of a sudden all the pieces fit. This requires flexibility and trust in people, an important part of corporate culture.
- Why developing high-quality K/I/D is desirable: There are different types of data structures that are used in lab work, and careful attention is needed to work with and manage them. This includes the effective management of K/I/D, putting it in a structure that encourages its use and protects its value. When methods are proven and you have documented evidence that they were executed by properly educated personnel using qualified reagents, instruments, and methods, you should then have high-quality K/I/D to support each sample result and any other information gleaned from that data.
- Why fostering a culture around data integrity is important to lab operations, addressing both technical and personnel issues: Positive outcomes will come from your data integrity efforts: your work will be easier and protected from loss, results will be easier to organize and analyze, and you’ll have a better functioning lab. You’ll also have fewer unpleasant surprises when technology changes occur and you need to transition from one way of doing things to another.
- How to address digital, facility, and backup security: Preventing unauthorized electronic and physical intrusion is critical to data integrity and meeting regulatory requirements. It also ensures that access to K/I/D is protected against loss from a wide variety of threats to the organization's facilities, all while securing your ability to work. This includes addressing power backup, continuity of operations, systems backup, and more.
- How to acquire and develop "products" that support regulatory requirements: Careful engineering and well-planned and -documented internal processes are needed to ensure that the systems and methods being used can remain in use and be supported over the life span of a lab. This means recognizing that the initial design and planning of processes and methods has to be done well to yield a supportable product, and keeping in mind the potential for future process review and modification even as the initial process or method is being developed. The lab must also recognize the complete product life cycle and how that affects the supportability of systems and methods.
- The importance of system integrations and the harmonization of K/I/D: Integrated systems can benefit a lab's operations, but planning is needed to work with different types of database systems as the results of lab work become more concentrated in LIMS and ELNs, including decisions about how K/I/D is stored and distributed over multiple databases. At the same time, harmonization efforts that use common hardware and software platforms to implement laboratory systems are important, but those efforts must also ensure that the move toward commonality doesn’t push people into products that are forced fits, ones that don’t really meet the lab's needs but serve some other agenda.
- Why the development of comprehensive higher-education courses dedicated to the laboratory systems engineer or lab science-IT hybrid is a must with today's laboratory technology: In today's world, typical IT backgrounds with no lab tech familiarity, or typical laboratory science backgrounds with no IT familiarity, won’t get you beyond the basic level of support for your laboratory systems. To be effective, IT personnel need to become familiar with the lab environment, the applications and technologies used, and the language of laboratory work, while scientists must become more familiar with the management of K/I/D from the technical perspective. This gap must be closed through new and improved higher-education programs.
And thus we return to the start to close this guide. First, there's a definite need for better planning and management of laboratory technologies. Careful attention is required to protect electronic laboratory knowledge, information, and data (K/I/D), ensure that it can be effectively used, and ensure that the systems and products put in place are the right ones and fully contribute to improvements in lab operations. Second, seven clear goals highlight this need for laboratory technology planning and management and point to ways of improving how it's performed. From supporting an environment that fosters productivity and innovation all the way to ensuring proper systems integration and harmonization, planning and management is a multi-step process with many clear benefits. And finally, there's a definite need for more laboratory systems engineers (LSEs) who have the education and skills needed to accomplish all that planning and management in an effective manner, from the very start. This will require a more concerted effort in academia, and perhaps even among professional organizations catering to laboratories. All of this together hopefully means a more thoughtful, modern, and deliberate approach to implementing laboratory technologies in your lab.
Abbreviations, acronyms, and initialisms
AI: Artificial intelligence
ALCOA: Attributable, legible, contemporaneous, original, and accurate
API: Application programming interface
CDS: Chromatography data system
CPU: Central processing unit
ELN: Electronic laboratory notebook
EPA: Environmental Protection Agency
FAIR: Findable, accessible, interoperable, and reusable
FDA: Food and Drug Administration
FRB: Fast radio bursts
ISO: International Organization for Standardization
IT: Information technology
K/I/D: Knowledge, information, and data
LAB-IT: Laboratory information technology support staff
LAE: Laboratory automation engineering (or engineer)
LES: Laboratory execution system
LIMS: Laboratory information management system
LIS: Laboratory information system
LOF: Laboratory of the future
LSE: Laboratory systems engineer
ML: Machine learning
OS: Operating system
QA/QC: Quality assurance/quality control
ROI: Return on investment
SDMS: Scientific data management system
SOP: Standard operating procedure
TPM: Technology planning and management
Footnotes
- See Elements of Laboratory Technology Management and the LSE material in this document.
- See the "Scientific Manufacturing" section of Elements of Laboratory Technology Management.
- By “general systems” I’m not simply referring to computer systems, but to the models and systems found under “general systems theory” in mathematics.
- Regarding LAB-IT and LAEs, my thinking about these titles has changed over time; the last section of this document, “Laboratory systems engineers,” goes into more detail.
- “Real-time” has a different meaning inside the laboratory than it does in office applications. Instead of a delay of a couple of seconds between an action and its response, lab “real-time” often requires millisecond or better precision; missing a single sampling time out of thousands can invalidate an entire sample analysis.
- See Notes on Instrument Data Systems for more on this topic.
- For a more detailed description of the K/I/D model, please refer to Computerized Systems in the Modern Laboratory: A Practical Guide.
- For more detailed discussion on this, see Notes on Instrument Data Systems.
- For more information on virtualization, particularly if the subject is new to you, look at Next-Gen Virtualization for Dummies. The for Dummies series is designed to educate people new to a topic, getting away from jargon and presenting material in clear, easy-to-understand language. This book is particularly good at that.
- One good reference on this subject is the presentation Building A Data Integrity Strategy To Accompany Your Digital Enablement by Julie Spirk Russom of BioTherapeutics Pharmaceutical Science.
- Though it may not see significant updates, consider reading the Comprehensive Guide to Developing and Implementing a Cybersecurity Plan for a much more thorough look at security in the lab.
- Yes, the scientific work you do is essential to the lab’s purpose, but our focus is on one element of the lab’s operations: what happens after the scientific work is done.
- For example, a product line is sold to another company or transferred to another division, and they then want copies of all relevant information. Meeting regulatory requirements is another example.
About the author
Initially educated as a chemist, author Joe Liscouski (joe dot liscouski at gmail dot com) is a laboratory automation/computing professional with over forty years of experience in the field, including the design and development of automation systems (both custom and commercial systems), LIMS, robotics, and data interchange standards. He also consults on the use of computing in laboratory work. He has held symposia on validation and presented technical material and short courses on laboratory automation and computing in the U.S., Europe, and Japan. He has worked/consulted in pharmaceutical, biotech, polymer, medical, and government laboratories. His current work centers on working with companies to establish planning programs for lab systems, developing effective support groups, and helping people with the application of automation and information technologies in research and quality control environments.
References
- Bourne, D. (2013). "My boss the robot". Scientific American 308 (5): 38–41. doi:10.1038/scientificamerican0513-38. PMID 23627215.
- Cook, B. (2020). "Collaborative Robots: Mobile and Adaptable Labmates". Lab Manager 15 (11): 10–13. https://www.labmanager.com/laboratory-technology/collaborative-robots-mobile-and-adaptable-labmates-24474.
- Hsu, J. (24 September 2018). "Is it aliens? Scientists detect more mysterious radio signals from distant galaxy". NBC News MACH. https://www.nbcnews.com/mach/science/it-aliens-scientists-detect-more-mysterious-radio-signals-distant-galaxy-ncna912586. Retrieved 04 February 2021.
- Timmer, J. (18 July 2018). "AI plus a chemistry robot finds all the reactions that will work". Ars Technica. https://arstechnica.com/science/2018/07/ai-plus-a-chemistry-robot-finds-all-the-reactions-that-will-work/5/. Retrieved 04 February 2021.
- "HelixAI - Voice Powered Digital Laboratory Assistants for Scientific Laboratories". HelixAI. http://www.askhelix.io/. Retrieved 04 February 2021.
- Liscouski, J.G. (2006). "Are You a Laboratory Automation Engineer?". SLAS Technology 11 (3): 157–162. doi:10.1016/j.jala.2006.04.002.
- Liscouski, J. (2015). "Which Laboratory Software Is the Right One for Your Lab?". PDA Letter (November/December 2015): 38–41. https://www.researchgate.net/publication/291971749_Which_Laboratory_Software_is_the_Right_One_for_Your_Lab.
- "Data integrity". Wikipedia. 03 February 2021. https://en.wikipedia.org/wiki/Data_integrity. Retrieved 07 February 2021.
- Harmon, C. (20 November 2020). "What Is Data Integrity?". Technology Networks. https://www.technologynetworks.com/informatics/articles/what-is-data-integrity-343068. Retrieved 07 February 2021.
- Chrobak, U. (17 August 2020). "The US has more power outages than any other developed country. Here’s why". Popular Science. https://www.popsci.com/story/environment/why-us-lose-power-storms/. Retrieved 08 February 2021.
- Tulsi, B.B. (04 September 2019). "Greater Awareness and Vigilance in Laboratory Data Security". Lab Manager. https://www.labmanager.com/business-management/greater-awareness-and-vigilance-in-laboratory-data-security-776. Retrieved 09 February 2021.
- Riley, D. (21 May 2020). "GitLab runs phishing test against employees – and 20% handed over credentials". Silicon Angle. https://siliconangle.com/2020/05/21/gitlab-runs-phishing-test-employees-20-handing-credentials/. Retrieved 09 February 2021.
- "Automated Sample Preparation". Fluid Management Systems, Inc. https://www.fms-inc.com/sample-prep/. Retrieved 09 February 2021.
- "PAL Auto Sampler Systems". Agilent Technologies, Inc. https://www.agilent.com/en/product/gas-chromatography/gc-sample-preparation-introduction/pal-auto-sampler-systems. Retrieved 09 February 2021.
- Bradshaw, J.T. (30 May 2012). "The Importance of Liquid Handling Details and Their Impact on Your Assays" (PDF). European Lab Automation Conference 2012. Artel, Inc. https://d1wfu1xu79s6d2.cloudfront.net/wp-content/uploads/2013/10/The-Importance-of-Liquid-Handling-Details-and-Their-Impact-on-Your-Assays.pdf. Retrieved 11 February 2021.