Difference between revisions of "Journal:Evaluating health information systems using ontologies"

From LIMSWiki
(Added content. Saving and adding more.)


===The UVON Method for unifying the evaluation aspects===
Methodical capture of a local ontology<ref name="UscholdCreat00">{{cite book |chapter=Creating, integrating and maintaining local and global ontologies |title=Proceedings of the First Workshop on Ontology Learning (OL-2000) |author=Uschold, M. |publisher=CEUR Proceedings |volume=31 |year=2000 |location=Berlin}}</ref> from the quality attributes (that is, an evaluation aspect ontology), and reaching unification through the nature of its tree structure, is the primary strategy behind our method. Therefore, the UVON method is introduced, so named to underline "Unified eValuation" of aspects as the target and "ONtology" construction or integration as the core algorithm. The ontology construction method presented in this paper is a simple, semiautomated method, configured and tested against FI-STAR project use cases. The UVON method does not try to introduce a new way of ontology construction; rather, it focuses on how to form a local ontology<ref name="UscholdCreat00" /><ref name="ChoiASurv06">{{cite journal |title=A survey on ontology mapping |journal=ACM SIGMOD Record |author=Choi, N.; Song, I.-Y.; Han, H. |volume=35 |issue=3 |pages=34–41 |year=2006 |doi=10.1145/1168092.1168097}}</ref> out of the quality attributes of a system and use it for the purpose of finding out what to evaluate. In this regard, the ontology construction in the UVON method is a reorganization of common practices, such as those introduced in earlier work.<ref name="NoyOnto05" />
 
The ontology structure, in its tree form, is the backbone of the UVON method. Modern ontology definition languages can show different types of relations, but for the sake of our method here, we only use the "is of type" relation, which can also describe pairs such as parent and child, superclass and subclass, or general and specific relations. This kind of relation creates a directed acyclic graph structure, which is or can be converted to a tree form. In this tree, the terms and concepts are nodes of the tree. The branches consist of those nodes connected by "is of type" relations. The tree has a root, which is the superclass, parent, or the general form of all other nodes. Traditionally, this node has been called the "thing."<ref name="NoyOnto05" />
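As a rough illustration of the structure described above, the following Python sketch models a tree whose only relation is "is of type" and whose root is the "thing" node. The class name <code>Node</code>, the helper <code>path_to</code>, and the sample quality attributes ("usability," "ease of use") are assumptions made for this example and do not come from the original paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One term or concept; each child "is of type" its parent."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, name: str) -> "Node":
        """Attach a more specific concept below this one and return it."""
        child = Node(name)
        self.children.append(child)
        return child


def path_to(root: Node, name: str, trail: Tuple[str, ...] = ()) -> Optional[Tuple[str, ...]]:
    """Return the chain of names from the root down to `name`, if present."""
    trail = trail + (root.name,)
    if root.name == name:
        return trail
    for child in root.children:
        found = path_to(child, name, trail)
        if found is not None:
            return found
    return None


# Hypothetical attributes, for illustration only.
thing = Node("thing")               # root: the general form of all nodes
usability = thing.add("usability")  # "usability" is of type "thing"
usability.add("ease of use")        # "ease of use" is of type "usability"

print(path_to(thing, "ease of use"))
```

Reading the returned path from right to left gives the chain of increasingly general concepts, ending at the root.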
 
Figure 1 is an example of how this ontology structure can look. All the nodes in this picture are quality attributes, except the leaf nodes at the bottom, which are instances of health information systems. While going up to the top layers in the ontology, the quality attributes become more generic, at the same time aggregating and unifying their child nodes.
 
[[File:Fig1 Eivazzadeh JMIRMedInformatics2016 4-2.png|700px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="700px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Fig. 1''' An example snapshot of the output ontology while running the UVON method</blockquote>
|-
|}
|}
 
The UVON method is composed of three phases: α, β, and γ (Figure 2). In the first phase (α), all quality attributes elicited by the requirement engineering process are collected in an unstructured set, which is accordingly called the α set. In the next phase (β), the UVON method develops an ontology from the α set, which is called the β (beta) ontology. In the final phase (γ), if the ontology is extended by an external evaluation framework (as discussed in the method), then it is called the γ (gamma) ontology.
 
[[File:Fig2 Eivazzadeh JMIRMedInformatics2016 4-2.png|900px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="900px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Fig. 2''' Ontology construction for a health information system</blockquote>
|-
|}
|}
 
The β ontology construction begins with a special initial node (ie, quality attribute) that is called "thing." All the collected quality attributes are going to begin a journey to find their position in the ontology structure, beginning from the "thing" node and going down the ontology structure to certain points specified by the algorithm. This journey is actually a depth-first tree traversal algorithm<ref name="TarjanDepth72">{{cite journal |title=Depth-First Search and Linear Graph Algorithms |journal=SIAM Journal on Computing |author=Tarjan, R. |volume=1 |issue=2 |pages=146–160 |year=1972 |doi=10.1137/0201010}}</ref> with some modifications. To avoid confusion in the course of this algorithm, a quality attribute that seeks to find its position is called a "traveling quality attribute" or Q_t.
 
The first quality attribute simply needs to add itself as the child of the "thing" root node. For the remaining quality attributes, each checks to see if there exists any child of the "thing" node, where the child is a superclass (superset, super concept, general concept, more abstract form, etc) with regard to the traveling quality attribute (Q_t). If such a child node (quality attribute) exists (let’s say Q_n), then the journey continues by taking the route through that child node. The algorithm then examines the children of Q_n (if any exist) to see whether Q_t is a subclass of any of them (that is, whether any of them is a superclass of Q_t).
 
The journey ends at some point in one of the following situations. If the new root quality attribute (Q_n) has no children, then the traveling quality attribute (Q_t) is added as its child and the journey ends. The same applies if Q_n has children but none of them is either a superclass or a subclass of the traveling quality attribute. Besides these two situations, it is possible that no child is a superclass, but one or more of them are subclasses of the traveling quality attribute (Q_t). In this situation, the traveling quality attribute (Q_t) itself becomes a child of that new root quality attribute, and those child quality attributes move down to become children of the traveling quality attribute (Q_t).
 
To keep the ontology as a tree, if a traveling quality attribute (Q_t) finds more than one superclass child of itself in a given situation, then it should replicate (fork) itself into instances, as many as the number of those children, and go through each branch separately. It is important to note that, logically, this replication cannot happen over two disjoint (mutually exclusive) branches. It is also possible to inject new quality attributes in between a parent node and children, but only if it does not break subclass or superclass relations. This injection can help to create ontologies in which the nodes at each level of the tree have a similar degree of generality, and each branch of the tree grows from generic nodes to more specific ones.
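The placement rules described in the preceding paragraphs can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the human decision about superclass and subclass relations is replaced by a lookup in a hand-written set (<code>RELATIONS</code>), the sample quality attributes are invented for the example, and the optional injection of intermediate nodes is left out.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []


# Stand-in for the human decision: (general, specific) pairs.
# These pairs are hypothetical and exist only for this example.
RELATIONS = {
    ("usability", "ease of use"),
    ("usability", "learnability"),
}


def is_superclass(general, specific):
    """True if `general` has been judged a superclass of `specific`."""
    return (general, specific) in RELATIONS


def insert(root, q_t):
    """Place the traveling quality attribute q_t somewhere below `root`."""
    supers = [c for c in root.children if is_superclass(c.name, q_t)]
    if supers:
        # Continue the journey; with several superclass children,
        # q_t forks and travels down each branch separately.
        for child in supers:
            insert(child, q_t)
        return
    # No superclass child: q_t settles here as a child of `root` ...
    node = Node(q_t)
    for child in [c for c in root.children if is_superclass(q_t, c.name)]:
        # ... and any subclass children move down below q_t.
        root.children.remove(child)
        node.children.append(child)
    root.children.append(node)


def as_tuple(node):
    """Nested (name, children) view of the tree, convenient for inspection."""
    return (node.name, [as_tuple(c) for c in node.children])


thing = Node("thing")
for attribute in ["ease of use", "safety", "usability", "learnability"]:
    insert(thing, attribute)

print(as_tuple(thing))
```

In this run, "usability" arrives after "ease of use" and adopts it as a child, while "learnability" travels through "usability" before settling. Passing the relation test as a separate function keeps the traversal itself mechanical and isolates the step that, in the method as described, depends on human judgment.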
 
This customized depth-first tree traversal algorithm, which actually constructs a tree-style ontology instead of just traversing one, is considered semiautomated, as it relies on human decision in two cases. The first case is when it is needed to consider the superclass to subclass relations between two quality attributes. The gradual development of the ontology through the UVON method spreads the decision about superclass to subclass relations across the course of ontology construction. The unification of heterogeneous quality attributes (nodes) is the result of accumulating these distributed decisions, which are embodied as superclass to subclass relations. Each of these relations (i.e. decisions) makes at least two separate quality attributes closer together by representing them through more generic quality attributes.
 
The second case is the injection of new quality attributes: one can inject a new quality attribute into the ontology tree even though that quality attribute is not explicitly mentioned in the requirement documents. This injection is only allowed when that quality attribute summarizes or equals a single or a few sibling quality attributes that are already in the ontology. The injection can improve clarity of the ontology. It can also help adjust the branches of the ontology tree to grow to a certain height, which can be helpful when a specific level of the tree is going to be considered as the base for creating a questionnaire. This adjustment of branch height might be needed if a branch is not tall enough to reach a specific level, meaning none of the quality attributes in that branch gets presented in the questionnaire. In addition, if a quality attribute is very specific compared with other quality attributes in that level of the tree, the questions in the questionnaire become inconsistent in their degree of generality. This inconsistency can be handled by injecting more generic quality attributes above the existing leaf node in the branch. All the previously mentioned benefits come with the cost of subjectivity in introducing a new quality attribute.
 
The γ phase ontology is constructed the same as the β phase, but it adds materials (quality attributes) from external sources. In this sense, the quality attributes specified in an external evaluation framework, probably a model-based one, should be extracted first. Those quality attributes should be fed into the β ontology the same as other quality attributes during the β phase. The UVON method does not discriminate between quality attributes by their origin, but it might be a good practice to mark those quality attributes originally from the external evaluation framework if we need later to make sure they are used by their original names in the summarizing level (to be discussed in the following paragraphs).
 
Each level of the resulting ontology tree(s), except those that are deeper than the length of the shortest branch, represents or summarizes quality attributes of the whole system in some degree of generality or specificity. The root node is the most general quality attribute, which is too general to be useful for any evaluation; each of the levels below gives a view of the quality attributes in the whole system. As each parent node represents a general form of its children, each level summarizes the level below. We refer to one of these levels of the ontology tree that is considered for creating a questionnaire as the "summarizing level."
 
The quality attributes in each of the other levels (such as L_1 in Figure 3) can be evaluation aspects (ie, the answer to "what to evaluate") that can be measured by a questionnaire or other measurement methods. In addition, depending on the measuring method, the level below the summarizing level can be used to give details for each of the evaluation aspects. The practicalities of measurement in a case determine which summarizing level to choose. Levels closer to the root can be too abstract, whereas deeper levels can be too detailed. In addition, the number of quality attributes in a level can impact which level is appropriate. In the FI-STAR project, the limitation on the number of questions in the questionnaire was a determinant for selecting the summarizing level, where only level two fit the project limitations (although level three helped to make each question more detailed). It is possible to grow a short branch by adding a chain of children that are the same as their parents to make the branch reach a specific level, thereby making that level selectable as a summarizing level.
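The level-based selection described above can be illustrated with a short Python sketch. It assumes a tree stored as nested dictionaries, a hypothetical limit on the number of questionnaire items, and invented quality attributes; the function names <code>level</code>, <code>pad</code>, and <code>fitting_levels</code> are likewise assumptions for this example rather than part of the UVON method as published.

```python
def level(tree, depth):
    """Names of the nodes at `depth` below the root (the root is depth 0)."""
    nodes = [tree]
    for _ in range(depth):
        nodes = [child for node in nodes for child in node["children"]]
    return [node["name"] for node in nodes]


def pad(tree, depth):
    """Grow short branches with copies of their leaf until they reach `depth`."""
    if depth == 0:
        return
    if not tree["children"]:
        # A child that is the same as its parent extends the branch.
        tree["children"].append({"name": tree["name"], "children": []})
    for child in tree["children"]:
        pad(child, depth - 1)


def fitting_levels(tree, max_items, max_depth):
    """Depths whose number of quality attributes fits within `max_items`."""
    return [d for d in range(1, max_depth + 1)
            if 0 < len(level(tree, d)) <= max_items]


# Hypothetical ontology: "safety" is a short branch that stops at level 1.
tree = {"name": "thing", "children": [
    {"name": "usability", "children": [
        {"name": "ease of use", "children": []},
        {"name": "learnability", "children": []},
    ]},
    {"name": "safety", "children": []},
]}

before = level(tree, 2)   # "safety" is not represented at this level
pad(tree, 2)
after = level(tree, 2)    # "safety" now reaches level 2 through its copy
print(before, after, fitting_levels(tree, max_items=2, max_depth=2))
```

Which of the fitting levels becomes the summarizing level remains a practical decision for the evaluation case; the sketch only lists the candidates that satisfy a size limit.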
 
[[File:Fig3 Eivazzadeh JMIRMedInformatics2016 4-2.png|700px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="700px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Fig. 3''' More details can be evaluated by looking at deeper nodes in the ontology structure.</blockquote>
|-
|}
|}


==References==


==Notes==
This presentation is faithful to the original, with only a few minor changes to presentation. In several cases the PubMed ID was missing and was added to make the reference more useful. The URL to the Health Information Technology Evaluation Toolkit was dead and not archived; an alternative version of it was found on the AHRQ site and the URL substituted. Figure 2 has been moved up closer to its reference.


Per the distribution agreement, the following copyright information is also being added:  

Revision as of 21:31, 20 June 2016

Full article title Evaluating health information systems using ontologies
Journal JMIR Medical Informatics
Author(s) Eivazzadeh, Shahryar; Anderberg, Peter; Larsson, Tobias C.; Fricker, Samuel A.; Berglund, Johan
Author affiliation(s) Blekinge Institute of Technology; University of Applied Sciences and Arts Northwestern Switzerland
Primary contact Email: shahryar.eivazzadeh [at] bth.se; Phone: 46 765628829
Editors Eysenbach, G.
Year published 2016
Volume and issue 4 (2)
Page(s) e20
DOI 10.2196/medinform.5185
ISSN 2291-9694
Distribution license Creative Commons Attribution 2.0
Website http://medinform.jmir.org/2016/2/e20/
Download http://medinform.jmir.org/2016/2/e20/pdf (PDF)

Abstract

Background: There are several frameworks that attempt to address the challenges of evaluation of health information systems by offering models, methods, and guidelines about what to evaluate, how to evaluate, and how to report the evaluation results. Model-based evaluation frameworks usually suggest universally applicable evaluation aspects but do not consider case-specific aspects. On the other hand, evaluation frameworks that are case specific, by eliciting user requirements, limit their output to the evaluation aspects suggested by the users in the early phases of system development. In addition, these case-specific approaches extract different sets of evaluation aspects from each case, making it challenging to collectively compare, unify, or aggregate the evaluation of a set of heterogeneous health information systems.

Objective: The aim of this paper is to find a method capable of suggesting evaluation aspects for a set of one or more health information systems — whether similar or heterogeneous — by organizing, unifying, and aggregating the quality attributes extracted from those systems and from an external evaluation framework.

Methods: On the basis of the available literature in semantic networks and ontologies, a method (called Unified eValuation using Ontology; UVON) was developed that can organize, unify, and aggregate the quality attributes of several health information systems into a tree-style ontology structure. The method was extended to integrate its generated ontology with the evaluation aspects suggested by model-based evaluation frameworks. An approach was developed to extract evaluation aspects from the ontology that also considers evaluation case practicalities such as the maximum number of evaluation aspects to be measured or their required degree of specificity. The method was applied and tested in Future Internet Social and Technological Alignment Research (FI-STAR), a project of seven cloud-based eHealth applications that were developed and deployed across European Union countries.

Results: The relevance of the evaluation aspects created by the UVON method for the FI-STAR project was validated by the corresponding stakeholders of each case. These evaluation aspects were extracted from a UVON-generated ontology structure that reflects both the internally declared required quality attributes in the seven eHealth applications of the FI-STAR project and the evaluation aspects recommended by the Model for ASsessment of Telemedicine applications (MAST) evaluation framework. The extracted evaluation aspects were used to create questionnaires (for the corresponding patients and health professionals) to evaluate each individual case and the whole of the FI-STAR project.

Conclusions: The UVON method can provide a relevant set of evaluation aspects for a heterogeneous set of health information systems by organizing, unifying, and aggregating the quality attributes through ontological structures. Those quality attributes can be either suggested by evaluation models or elicited from the stakeholders of those systems in the form of system requirements. The method continues to be systematic, context-sensitive, and relevant across a heterogeneous set of health information systems.

Keywords: health information systems; ontologies; evaluation; technology assessment; biomedical

Introduction

In one aspect at least, the evaluation of health information systems matches well with their implementation: they both fail very often.[1][2][3] Consequently, in the absence of an evaluation that could deliver insight about the impacts, an implementation cannot gain the necessary accreditation to join the club of successful implementations. Beyond the reports in the literature on the frequent accounts of this kind of failure[3], the reported gaps in the literature[4], and newly emerging papers that introduce new ways of doing health information system evaluation[5], including this paper, can be interpreted as a supporting indicator that the attrition war on the complexity and failure-proneness of health information systems is still ongoing.[6] Doing battle with the complexity and failure-proneness of evaluation are models, methods, and frameworks that try to address what to evaluate, how to evaluate, or how to report the result of an evaluation. On this front, this paper tries to contribute to the answer of what to evaluate.

Standing as a cornerstone for evaluation is our interpretation of what things constitute success in health information systems. A body of literature has developed concerning the definition and criteria of a successful health technology, in which the criteria for success go beyond the functionalities of the system.[7][8] Models similar to the Technology Acceptance Model (TAM), when applied to health technology context, define this success as the end-users’ acceptance of a health technology system.[9] The success of a system, and hence, the acceptance of a health information system, can be considered the use of that system when using it is voluntary or it can be considered the overall user acceptance when using it is mandatory.[10][11]

To map the definition of success of health information systems onto real-world cases, certain evaluation frameworks have emerged.[12][6] These frameworks, with their models, methods, taxonomies, and guidelines, are intended to capture parts of our knowledge about health information systems. This knowledge enables us to evaluate those systems, and it allows for the enlisting and highlighting of the elements of evaluation processes that are more effective, more efficient, or less prone to failure. Evaluation frameworks, specifically in their summative approach, might address what to evaluate, when to evaluate, or how to evaluate.[6] These frameworks might also elaborate on evaluation design, the way to measure the evaluation aspects, or how to compile, interpret, and report the results.[13]

Evaluation frameworks offer a wide range of components for designing, implementing, and reporting an evaluation, among which are suggestions or guidelines for finding out the answer to "what to evaluate." The answer to what to evaluate can range from the impact on structural or procedural qualities to more direct outcomes such as the overall impact on patient care.[14] For example, in the STARE-HI statement, which provides guidelines for the components of a final evaluation report of health informatics, the "outcome measures or evaluation criteria" parallel the what to evaluate question.[13]

To identify evaluation aspects, evaluation frameworks can take two approaches: top-down or bottom-up. Frameworks that take a top-down approach try to specify the evaluation aspects through instantiating a model in the context of an evaluation case. Frameworks that focus on finding, selecting, and aggregating evaluation aspects through interacting with users, that is, so-called user-centered frameworks, take a bottom-up approach.

In the model-based category, TAM and TAM2 have wide application in different disciplines including health care.[7] Beginning from a unique dimension of behavioral intention to use (acceptance), as a determinant of success or failure, the models go on to expand it to perceived usefulness and perceived ease of use[15][7], where these two latter dimensions can become the basic constructs of the evaluation aspects. The Unified Theory of Acceptance and Use of Technology (UTAUT) framework introduces 4 other determinants: performance expectancy, effort expectancy, social influence, and facilitating conditions.[7] Of these, the first two can become basic elements for evaluation aspects, but the last two might need more adaptation to be considered as aspects of evaluation for a health information system.

Some model-based frameworks extend further by taking into consideration the relations between the elements in the model. The Fit between Individuals, Task and Technology model includes the "task" element beside the "technology" and "individual" elements. It then goes on to create a triangle of "fitting" relations between these three elements. In this triangle, each of the elements or the interaction between each pair of elements is a determinant of success or failure[11]; therefore, each of those six can construct an aspect for evaluation. The Human, Organization, and Technology Fit (HOT-fit) model builds upon the DeLone and McLean Information Systems Success Model[16] and extends further by including the "organization" element beside the "technology" and "human" elements.[5] This model also creates a triangle of "fitting" relations between those three elements.

Outcome-based evaluation models, such as the Health IT Evaluation Toolkit provided by the Agency for Healthcare Research and Quality, consider very specific evaluation measures for evaluation. For example, in the previously mentioned toolkit, measures are grouped in domains, such as "efficiency," and there are suggestions or examples for possible measures for each domain, such as "percent of practices or patient units that have gone paperless."[17]

In contrast to model-based approaches, bottom-up approaches are less detailed about the evaluation aspects landscape; instead, they form this landscape by what they elicit from stakeholders. Requirement engineering, as a practice in system engineering and software engineering disciplines, is expected to capture and document, in a systematic way, user needs for a to-be-produced system.[18] The requirements specified by requirement documents, as a reflection of user needs, determine to a considerable extent what things need to be evaluated at the end of the system deployment and usage phase, in a summative evaluation approach. Some requirement engineering strategies apply generic patterns and models to extract requirements[18], thereby showing some similarity, in this regard, to model-based methods.

The advantages of elicitation-based approaches, such as requirement engineering, result from an ability to directly reflect the case-specific user needs in terms of functionalities and qualities. Elicitation-based approaches enumerate and detail the aspects that need to be evaluated, all from the user perspective. Evaluation aspects that are specified through the requirement engineering process can be dynamically added, removed, or changed due to additional interaction with users or other stakeholders at any time. The adjustments made, such as getting more detailed or more generic, are the result of new findings and insights, new priorities, or the limitations that arise in the implementation of the evaluation.

The advantages in the requirement engineering approach come at a cost of certain limitations compared with model-based methods. Most of the requirement elicitation activities are accomplished in the early stages of system development, when the users do not have a clear image of what they want or do not want in the final system.[19] However, a model-based approach goes beyond the requirements expressed by the users of a specific case by presenting models that are summaries of past experiences in a wide range of similar cases and studies.

Being case-specific by using requirement engineering processes has a side effect: the different sets of evaluation aspects elicited from each case, which can even be mutually heterogeneous. Model-based approaches might perform more uniformly in this regard, as they try to enumerate and unify the possible evaluation aspects through their models imposing a kind of unification from the beginning. However, there still exists a group of studies asking for measures to reduce the heterogeneity of evaluation aspects in these approaches.[12]

Heterogeneity makes evaluation of multiple cases or aggregation of individual evaluations a challenge. In a normative evaluation, comparability is the cornerstone of evaluation[20], in the sense that things are supposed to be better or worse than one another or than a common benchmark, standard, norm, average, or mode, in some specific aspects. Without comparability, the evaluation subjects can, at best, only be compared with themselves in the course of their different stages of life (longitudinal study).

In health technology, the challenge of heterogeneity for comparing and evaluation can be more intense. The health technology assessment literature applies a very inclusive definition of health technology, which results in a heterogeneous evaluation landscape. The heterogeneity of evaluation aspects is not limited to the heterogeneity of actors and their responses in a health setting; rather, it also includes the heterogeneity of health information technology itself. For example, the glossary of health technology assessment by the International Network of Agencies for Health Technology Assessment (INAHTA) describes health technology as the "pharmaceuticals, devices, procedures, and organizational systems used in health care."[21] This description conveys how intervention is packaged in chemicals, supported by devices, organized as procedures running over time, or structured or supported by structures in organizational systems. Similarly, inclusive and comprehensive definitions can be found in other studies.[22][23] This heterogeneous evaluation context can create problems for any evaluation framework that tries to stretch to accommodate a diverse set of health technology implementations. This heterogeneity can present challenges for an evaluation framework in comparing evaluation aspects[24] and, consequently, in summing up reports[25] as well as in the creation of unified evaluation guidelines, and even in the evaluation of the evaluation process.

By extracting the lowest common denominators from among evaluation subjects, thereby creating a uniform context for comparison and evaluation, we can tackle the challenge of heterogeneity via elicitation-based evaluation approaches. Vice versa, the evaluation aspects in an evaluation framework suggest the common denominators between different elements. The lowest common denominator, as its mathematical concept suggests, expands to include elements from all parties, where the expansion has been kept to the lowest possible degree.

Usually, there are tradeoffs and challenges around the universality of an evaluation aspect related to how common it is and its relativeness (i.e. how low and close to the original elements it lies). When the scopes differ, their non-overlapped areas might be considerable, making it a challenge to find the common evaluation aspects. Furthermore, the same concepts might be perceived or presented differently by different stakeholders.[26] In addition, different approaches usually target different aspects to be evaluated, as a matter of focus or preference.

It is possible to merge the results of model-centered and elicitation-centered approaches. The merged output provides the advantages of both approaches while allowing the approaches to mutually cover for some of their challenges and shortcomings.

The aim of this paper is to address the question of "what to evaluate" in a health information system by proposing a method (called Unified eValuation using Ontology; UVON) which constructs evaluation aspects by organizing quality attributes in ontological structures. The method deals with the challenges of model-based evaluation frameworks by eliciting case-specific evaluation aspects, adapting and integrating evaluation aspects from some model-based evaluation frameworks and accommodating new cases that show up over time. The method can address heterogeneity by unifying different quality attributes that are extracted from one or more evaluation cases. This unification is possible with some arbitrary degree of balance between similarities and differences with respect to the needs of evaluation implementation. As a proof of the applicability of the proposed method, it has been instantiated and used in a real-world case for evaluating health information systems.

The structure of the rest of this paper is as follows. The research method that resulted in the UVON method is described in the Methods section. The result, that is, the UVON method, is covered in the section The UVON Method for Unifying the Evaluation Aspects, whereas its application in the context project is covered in the section Result of the UVON Method Application in the FI-STAR Project. The rationale behind the method is discussed in the Discussion section, and the possible extensions and limitations are found in the sections Extending the Evaluation Using the Ontology and Limitations of the UVON Method. The Conclusions section summarizes the conclusions of the paper.

Methods

The FI-STAR case

The FI-STAR project is a pilot project in eHealth systems funded by the European Union (EU). The evaluation of the FI-STAR project has been the major motive, the empirical basis, and the test bed for our proposed evaluation method, that is, the UVON method (to be described in the Results section). FI-STAR is a project within the Future Internet Public-Private Partnership Programme (FI-PPP) and relates to the Future Internet (FI) series of technology platforms. The project consists of seven different eHealth cloud-based applications being developed and deployed in seven pilots across Europe. Each of these applications serves a different community of patients and health professionals[27] and has different expected clinical outcomes. FI-STAR and its seven pilot projects faced the challenge of finding an evaluation mechanism that can be used both to evaluate each project and to aggregate the results of those evaluations into an evaluation of the whole FI-STAR project.

Research method

A general review of the existing evaluation frameworks was done. Existing model-based evaluation frameworks, which usually suggest universal quality attributes for evaluation, could not cover all the quality attributes (i.e. evaluation aspects) reflected by the requirement documents of the pilot projects in FI-STAR. Even if there had been good coverage of the demanded evaluation aspects, there was still no guarantee that the frameworks could maintain the same degree of coverage for future expansions of the FI-STAR project. On the other hand, the requirement documents from the FI-STAR project were not expected to be the ultimate sources for identifying those quality attributes. It was speculated that there could exist other relevant quality attributes that were captured in the related literature or embedded in other, mostly model-based, health information system evaluation frameworks. For these reasons, it was decided to combine quality attributes from both the FI-STAR sources and a relevant external evaluation framework. To find other relevant evaluation aspects, a more specific review of the current literature was performed, focused on finding an evaluation framework of health information systems that sufficiently matched the specifications of the FI-STAR project. The review considered the MAST framework[28] as a candidate evaluation framework. This evaluation framework was expected to cover the quality attributes that were not indicated in the FI-STAR requirement documents but that were considered necessary to evaluate in similar projects. These extra quality attributes are suggested by expert opinions and background studies.[28] Nevertheless, it was necessary to integrate the quality attributes extracted from this framework with the quality attributes extracted from the FI-STAR requirement documents.

Given the heterogeneity of FI-STAR’s seven pilot projects, an evaluation mechanism was needed to extract common qualities from different requirement declarations and unify them. A review of the related literature showed that the literature on ontologies refers to the same functionalities, that is, capturing the concepts (quality attributes in our case) and their relations in a domain.[29] It was considered that subclass and superclass relations, and the way they are represented in an ontology, could unify the heterogeneous quality attributes that exist in our evaluation case. For the purposes of possible future expansions of the FI-STAR project, this utilization of ontological structures needed to be systematic and easily repeatable.

Results

A method was developed to organize and unify the quality attributes captured via requirement engineering into a tree-style ontology structure and to integrate that structure with the recommended evaluation aspects from another evaluation framework. The method was applied to the seven pilots of the FI-STAR project, which resulted in a tree-style ontology of the quality attributes mentioned in the project requirement documents and the MAST evaluation framework. The top 10 nodes of the tree-style ontology were chosen as the 10 aspects of evaluation relevant to the FI-STAR project and its pilot cases.

The UVON Method for unifying the evaluation aspects

The primary strategy behind our method is the methodical capture of a local ontology[30] from the quality attributes (that is, an evaluation aspect ontology) and reaching unification through the nature of its tree structure. Therefore, the UVON method is introduced, so named to underline "Unified eValuation" of aspects as the target and "ONtology" construction or integration as the core algorithm. The ontology construction method presented in this paper is a simple, semiautomated method, configured and tested against FI-STAR project use cases. The UVON method does not try to introduce a new way of ontology construction; rather, it focuses on how to form a local ontology[30][31] out of the quality attributes of a system and use it for the purpose of finding out what to evaluate. In this regard, the ontology construction in the UVON method is a reorganization of common practices, such as those introduced by Noy.[29]

The ontology structure, in its tree form, is the backbone of the UVON method. Modern ontology definition languages can show different types of relations, but for the sake of our method here, we only use the "is of type" relation, which can also describe pairs such as parent and child, superclass and subclass, or general and specific relations. This kind of relation creates a directed acyclic graph structure, which is or can be converted to a tree form. In this tree, the terms and concepts are nodes of the tree. The branches consist of those nodes connected by "is of type" relations. The tree has a root, which is the superclass, parent, or the general form of all other nodes. Traditionally, this node has been called the "thing."[29]
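
As an illustration only, the tree described above can be sketched as a small data structure in which every node keeps a link to the more general node it "is of type." The class name Node, its methods, and the quality attribute names ("usability," "ease of learning") are assumptions made for this sketch and are not taken from the FI-STAR material.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A quality attribute; `parent` is the more general node it "is of type"."""

    name: str
    parent: Node | None = field(default=None, repr=False)
    children: list[Node] = field(default_factory=list)

    def add_child(self, name: str) -> Node:
        """Attach a more specific quality attribute below this node."""
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path_to_root(self) -> list[str]:
        """Follow the "is of type" links upward, from specific to general."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path


# Hypothetical example: the root is always the "thing" node.
root = Node("thing")
usability = root.add_child("usability")
learnability = usability.add_child("ease of learning")

print(learnability.path_to_root())
# ['ease of learning', 'usability', 'thing']
```

Reading the path from left to right moves from the most specific quality attribute to the most generic one, which is the direction in which the ontology aggregates and unifies its nodes.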

Figure 1 is an example of how this ontology structure can look. All the nodes in this picture are quality attributes, except the leaf nodes at the bottom, which are instances of health information systems. While going up to the top layers in the ontology, the quality attributes become more generic, at the same time aggregating and unifying their child nodes.

Fig. 1 An example snapshot of the output ontology while running the UVON method

The UVON method is composed of three phases: α, β, and γ (Figure 2). In the first phase (α), all quality attributes elicited by the requirement engineering process are collected in an unstructured set, called the α (alpha) set. In the next phase (β), an ontology is developed from the α set by the UVON method; this is called the β (beta) ontology. In the final phase (γ), if the ontology is extended by an external evaluation framework (as discussed in the Methods section), the result is called the γ (gamma) ontology.

Fig. 2 Ontology construction for a health information system

The β ontology construction begins with a special initial node (i.e. a quality attribute) that is called "thing." All the collected quality attributes then begin a journey to find their position in the ontology structure, starting from the "thing" node and going down the ontology structure to certain points specified by the algorithm. This journey is actually a depth-first tree traversal algorithm[32] with some modifications. To avoid confusion in the course of this algorithm, a quality attribute that seeks to find its position is called a "traveling quality attribute" or Q_t.

The first quality attribute simply needs to add itself as the child of the "thing" root node. For the remaining quality attributes, each checks to see if there exists any child of the "thing" node, where the child is a superclass (superset, super concept, general concept, more abstract form, etc.) with regard to the traveling quality attribute (Q_t). If such a child node (quality attribute) exists (let’s say Q_n), then the journey continues by taking the route through that child node. The algorithm examines the children of Q_n (if any exist) to see if Q_t is a subclass of any of them (i.e. whether any of them is a superclass of Q_t).

The journey ends at some point in one of the following situations. If there is no child for a new root quality attribute (Q_n), then the traveling quality attribute (Q_t) should be added as a child to it and its journey ends. The same applies if the new root quality attribute (Q_n) has children, but none of them is either a superclass or a subclass of the traveling quality attribute. Besides these two situations, it is possible that no child is a superclass, but one or more of the children are subclasses of the traveling quality attribute (Q_t). In this situation, the traveling quality attribute (Q_t) itself becomes a child of that new root quality attribute, and those child quality attributes move down to become children of the traveling quality attribute (Q_t).

To keep the ontology as a tree, if a traveling quality attribute (Q_t) finds more than one superclass child of itself in a given situation, then it should replicate (fork) itself into instances, as many as the number of those children, and go through each branch separately. It is important to note that, logically, this replication cannot happen over two disjoint (mutually exclusive) branches. It is also possible to inject new quality attributes in between a parent node and children, but only if it does not break subclass or superclass relations. This injection can help to create ontologies in which the nodes at each level of the tree have a similar degree of generality, and each branch of the tree grows from generic nodes to more specific ones.
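
The journey of a traveling quality attribute, including the forking rule, can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function names place and outline, the lookup table GENERAL_FORMS that stands in for the human judgment about superclass relations, and all quality attribute names are hypothetical. The sketch also assumes that the collected quality attributes have distinct names and that forks never span disjoint branches.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass(eq=False)
class Node:
    name: str
    children: list[Node] = field(default_factory=list)


# is_super(general, specific) is True when `general` is judged a superclass of `specific`.
IsSuper = Callable[[str, str], bool]


def place(parent: Node, qt: str, is_super: IsSuper) -> None:
    """Let the traveling quality attribute `qt` find its position below `parent`."""
    supers = [c for c in parent.children if is_super(c.name, qt)]
    if supers:
        # The journey continues; with several superclass children, qt forks
        # into one copy per branch so that the structure stays a tree.
        for child in supers:
            place(child, qt, is_super)
        return
    # The journey ends here: qt becomes a child of `parent` and adopts every
    # existing child that is one of its subclasses.
    node = Node(qt)
    for child in [c for c in parent.children if is_super(qt, c.name)]:
        parent.children.remove(child)
        node.children.append(child)
    parent.children.append(node)


def outline(node: Node, depth: int = 0) -> list[str]:
    """Render the tree as indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical human judgments: each attribute mapped to its more general forms.
GENERAL_FORMS = {
    "ease of learning": {"usability"},
    "readable fonts": {"usability", "accessibility"},
}


def is_super(general: str, specific: str) -> bool:
    return general in GENERAL_FORMS.get(specific, set())


root = Node("thing")
for attribute in ["ease of learning", "usability", "accessibility", "readable fonts"]:
    place(root, attribute, is_super)

print("\n".join(outline(root)))
```

In this run, "usability" arrives after "ease of learning" and adopts it as a child, while "readable fonts" has two superclass children under "thing" and is therefore replicated into both branches.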

This customized depth-first tree traversal algorithm, which actually constructs a tree-style ontology instead of just traversing one, is considered semiautomated, as it relies on human decision in two cases. The first case is when the superclass to subclass relation between two quality attributes needs to be judged. The gradual development of the ontology through the UVON method spreads the decisions about superclass to subclass relations across the course of ontology construction. The unification of heterogeneous quality attributes (nodes) is the result of accumulating these distributed decisions, which are embodied as superclass to subclass relations. Each of these relations (i.e. decisions) brings at least two separate quality attributes closer together by representing them through more generic quality attributes.
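
Because the superclass to subclass judgments are made by a person and are spread over the whole construction, it can help to record each judgment once and reuse it. The wrapper below is one possible way to do that, sketched under assumptions: the class RecordedJudgments and the function scripted_expert (a stand-in for asking a human evaluator) are hypothetical and are not part of the UVON method itself.

```python
from __future__ import annotations

from typing import Callable


class RecordedJudgments:
    """Cache each answer to "is `general` a superclass of `specific`?"."""

    def __init__(self, ask: Callable[[str, str], bool]) -> None:
        self._ask = ask
        self.log: dict[tuple[str, str], bool] = {}

    def __call__(self, general: str, specific: str) -> bool:
        key = (general, specific)
        if key not in self.log:
            # Only consult the human the first time a pair is seen.
            self.log[key] = self._ask(general, specific)
        return self.log[key]


asked = []  # every question that actually reached the (simulated) expert


def scripted_expert(general: str, specific: str) -> bool:
    """Hypothetical stand-in for a human decision."""
    asked.append((general, specific))
    return (general, specific) == ("usability", "ease of learning")


is_super = RecordedJudgments(scripted_expert)
first = is_super("usability", "ease of learning")
again = is_super("usability", "ease of learning")  # answered from the log
other = is_super("ease of learning", "usability")

print(first, again, other, len(asked))
# True True False 2
```

The recorded log doubles as documentation of the subjective decisions that shaped the ontology, which may be useful when the construction has to be repeated for new cases.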

The second case of human decision is the injection of a new quality attribute into the ontology tree, even though that quality attribute is not explicitly mentioned in the requirement documents. This injection is only allowed when that quality attribute summarizes or equals one or a few sibling quality attributes that are already in the ontology. The injection can improve the clarity of the ontology. It can also help adjust the branches of the ontology tree to grow to a certain height, which can be helpful when a specific level of the tree is going to be considered as the base for creating a questionnaire. This adjustment of branch height might be needed if a branch is not tall enough to reach a specific level, meaning none of the quality attributes in that branch gets presented in the questionnaire. In addition, if a quality attribute is very specific compared with other quality attributes in that level of the tree, the questions in the questionnaire become inconsistent in their degree of generality. This inconsistency can be handled by injecting more generic quality attributes above the existing leaf node in the branch. All the previously mentioned benefits come at the cost of subjectivity in introducing a new quality attribute.
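
The injection step can be sketched as a small tree operation that places a new, more generic node between a parent and some of its existing children. The function names inject and depth_of and the quality attribute names below are assumptions for illustration; whether the injected attribute really summarizes the chosen siblings remains a human decision that the code does not check.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    name: str
    children: list[Node] = field(default_factory=list)


def inject(parent: Node, new_name: str, sibling_names: set[str]) -> Node:
    """Insert `new_name` between `parent` and the named sibling children."""
    moved = [c for c in parent.children if c.name in sibling_names]
    if not moved:
        raise ValueError("an injected attribute must summarize existing siblings")
    kept = [c for c in parent.children if c.name not in sibling_names]
    new_node = Node(new_name, children=moved)
    parent.children = kept + [new_node]
    return new_node


def depth_of(node: Node, name: str, depth: int = 0) -> int | None:
    """Depth of the first node called `name`, or None if it is absent."""
    if node.name == name:
        return depth
    for child in node.children:
        found = depth_of(child, name, depth + 1)
        if found is not None:
            return found
    return None


# Hypothetical ontology in which two very specific attributes sit directly under the root.
root = Node("thing", [Node("efficiency"), Node("short waiting time"), Node("fast login")])

before = depth_of(root, "fast login")
inject(root, "timeliness", {"short waiting time", "fast login"})
after = depth_of(root, "fast login")

print(before, after)
# 1 2
```

After the injection, the two specific attributes sit one level deeper, and the level directly below the root contains only attributes of a comparable degree of generality.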

The γ phase ontology is constructed in the same way as the β phase ontology, but it adds materials (quality attributes) from external sources. To this end, the quality attributes specified in an external evaluation framework, probably a model-based one, should be extracted first. Those quality attributes should then be fed into the β ontology in the same way as the other quality attributes were during the β phase. The UVON method does not discriminate between quality attributes by their origin, but it might be a good practice to mark those quality attributes that originally come from the external evaluation framework, in case we later need to make sure they are used by their original names in the summarizing level (to be discussed in the following paragraphs).
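
The suggestion to mark externally sourced quality attributes can be sketched by carrying an origin label on each node while reusing the same placement procedure for both phases. Everything named here is an assumption for illustration: the origin labels, the functions place and external_nodes, the judgment table GENERAL_FORMS, and the quality attribute names, which are not taken from MAST or from the FI-STAR requirement documents.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass(eq=False)
class Node:
    name: str
    origin: str  # hypothetical labels: "root", "requirements", or "external"
    children: list[Node] = field(default_factory=list)


def place(parent: Node, name: str, origin: str,
          is_super: Callable[[str, str], bool]) -> None:
    """Same placement rule in both phases; only the origin label differs."""
    supers = [c for c in parent.children if is_super(c.name, name)]
    if supers:
        for child in supers:
            place(child, name, origin, is_super)
        return
    node = Node(name, origin)
    node.children = [c for c in parent.children if is_super(name, c.name)]
    parent.children = [c for c in parent.children if c not in node.children]
    parent.children.append(node)


def external_nodes(node: Node) -> list[str]:
    """Names of all nodes that came from the external framework."""
    found = [node.name] if node.origin == "external" else []
    for child in node.children:
        found.extend(external_nodes(child))
    return found


# Hypothetical human judgments about more general forms.
GENERAL_FORMS = {"avoiding dosing errors": {"safety"}}


def is_super(general: str, specific: str) -> bool:
    return general in GENERAL_FORMS.get(specific, set())


root = Node("thing", "root")
for name in ["usability", "avoiding dosing errors"]:  # beta phase input
    place(root, name, "requirements", is_super)
for name in ["safety"]:  # gamma phase input from an external framework
    place(root, name, "external", is_super)

print([c.name for c in root.children], external_nodes(root))
# ['usability', 'safety'] ['safety']
```

In this run the external attribute adopts an attribute that came from the requirements, and the origin label makes it possible to keep its original name when the summarizing level is later chosen.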

Each level of the resulting ontology tree(s), except those that are deeper than the length of the shortest branch, represents or summarizes the quality attributes of the whole system at some degree of generality or specificity. The root node is the most general quality attribute, which is too general to be useful for any evaluation; each of the levels below gives a view of the quality attributes in the whole system. As each parent node represents a general form of its children, each level summarizes the level below. We refer to the level of the ontology tree that is chosen for creating a questionnaire as the "summarizing level."
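
The relation between levels and the shortest branch can be made concrete with a small sketch. It assumes, for brevity, that the tree is stored as a mapping from a node name to the names of its children and that names are unique (so no forked nodes); the functions levels and shortest_branch and the quality attribute names are hypothetical.

```python
from __future__ import annotations


def levels(tree: dict[str, list[str]], root: str = "thing") -> list[list[str]]:
    """List the node names level by level, starting with the root at level 0."""
    result, frontier = [], [root]
    while frontier:
        result.append(frontier)
        frontier = [child for node in frontier for child in tree.get(node, [])]
    return result


def shortest_branch(tree: dict[str, list[str]], node: str = "thing") -> int:
    """Number of edges from `node` down to its nearest leaf."""
    children = tree.get(node, [])
    if not children:
        return 0
    return 1 + min(shortest_branch(tree, child) for child in children)


# Hypothetical ontology: the "efficiency" branch stops at level 1.
tree = {
    "thing": ["usability", "efficiency"],
    "usability": ["ease of learning", "readable fonts"],
}

by_level = levels(tree)
limit = shortest_branch(tree)
# Levels below the root that still cover every branch of the tree.
whole_system_levels = by_level[1 : limit + 1]

print(limit, whole_system_levels)
# 1 [['usability', 'efficiency']]
```

Level 2 of this example contains only descendants of "usability," so it does not summarize the whole system unless the shorter branch is extended.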

The quality attributes in each of the other levels (such as L_1 in Figure 3) can be evaluation aspects (i.e. the answer to "what to evaluate") that can be measured by a questionnaire or other measurement methods. In addition, depending on the measuring method, the level below the summarizing level can be used to give details for each of the evaluation aspects. The practicalities of measurement in a case determine which summarizing level to choose. Levels closer to the root can be too abstract, whereas deeper levels can be too detailed. In addition, the number of quality attributes in a level can impact which level is appropriate. In the FI-STAR project, the limitation on the number of questions in the questionnaire was a determinant for selecting the summarizing level, where only level two fit the project limitations (although level three helped to make each question more detailed). It is possible to grow a short branch by adding a chain of children that are the same as their parents to make the branch reach a specific level, thereby making that level selectable as a summarizing level.
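
Growing short branches and choosing a summarizing level under a question budget can be sketched as follows. The selection rule used here (take the deepest level whose number of quality attributes still fits the budget) is an assumption made for the example, not a rule stated by the method; the functions grow, level_names, and pick_level and the quality attribute names are likewise hypothetical. A tree is written as a nested pair of a name and a list of child trees.

```python
from __future__ import annotations


def grow(node: tuple, target: int, depth: int = 0) -> tuple:
    """Copy the tree, extending short branches with children equal to their parents."""
    name, children = node
    if not children and depth < target:
        children = [(name, [])]  # repeat the leaf one level further down
    return (name, [grow(child, target, depth + 1) for child in children])


def level_names(node: tuple, target: int, depth: int = 0) -> list[str]:
    """Names of all nodes that sit exactly at level `target`."""
    name, children = node
    if depth == target:
        return [name]
    return [n for child in children for n in level_names(child, target, depth + 1)]


def pick_level(tree: tuple, max_questions: int, max_depth: int) -> int | None:
    """Deepest level (after growing) whose size fits within the question budget."""
    best = None
    for level in range(1, max_depth + 1):
        if len(level_names(grow(tree, level), level)) <= max_questions:
            best = level
    return best


# Hypothetical ontology with one short branch ("efficiency").
tree = ("thing", [
    ("usability", [("ease of learning", []), ("readable fonts", [])]),
    ("efficiency", []),
])

print(level_names(grow(tree, 2), 2))
# ['ease of learning', 'readable fonts', 'efficiency']
print(pick_level(tree, max_questions=2, max_depth=2))
# 1
print(pick_level(tree, max_questions=3, max_depth=2))
# 2
```

With a budget of two questions only level 1 fits, whereas a budget of three allows level 2, where the grown copy of "efficiency" keeps that branch represented in the questionnaire.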

Fig. 3 More details can be evaluated by looking at deeper nodes in the ontology structure.

References

  1. Littlejohn, P.; Wyatt, J.C.; Garvican, L. (2003). "Evaluating computerised health information systems: Hard lessons still to be learnt". BMJ 326 (7394): 860–3. doi:10.1136/bmj.326.7394.860. PMC153476. PMID 12702622. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC153476. 
  2. Kreps, D.; Richardson, H. (2007). "IS Success and Failure: The Problem of Scale". The Political Quarterly 78 (3): 439–46. doi:10.1111/j.1467-923X.2007.00871.x. 
  3. Greenhalgh, T.; Russell, J. (2010). "Why do evaluations of eHealth programs fail? An alternative set of guiding principles". PLoS Medicine 7 (11): e1000360. doi:10.1371/journal.pmed.1000360. PMC2970573. PMID 21072245. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2970573. 
  4. Chaudhry, B.; Wang, J.; Wu, S. et al. (2006). "Systematic review: Impact of health information technology on quality, efficiency, and costs of medical care". Annals of Internal Medicine 144 (10): 742–52. doi:10.7326/0003-4819-144-10-200605160-00125. PMID 16702590. 
  5. Yusof, M.M.; Kuljis, J.; Papazafeiropoulou, A.; Stergioulas, L.K. (2008). "An evaluation framework for health information systems: Human, organization and technology-fit factors (HOT-fit)". International Journal of Medical Informatics 77 (6): 386–98. doi:10.1016/j.ijmedinf.2007.08.011. PMID 17964851. 
  6. Yusof, M.M.; Papazafeiropoulou, A.; Paul, R.J.; Stergioulas, L.K. (2008). "Investigating evaluation frameworks for health information systems". International Journal of Medical Informatics 77 (6): 377–85. doi:10.1016/j.ijmedinf.2007.08.004. PMID 17904898. 
  7. Holden, R.J.; Karsh, B.T. (2010). "The technology acceptance model: Its past and its future in health care". Journal of Biomedical Informatics 43 (1): 159–72. doi:10.1016/j.jbi.2009.07.002. PMC2814963. PMID 19615467. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2814963. 
  8. Berg, M. (2001). "Implementing information systems in health care organizations: Myths and challenges". International Journal of Medical Informatics 64 (2–3): 143–56. doi:10.1016/S1386-5056(01)00200-3. PMID 11734382. 
  9. Hu, P.J.; Chau, P.Y.K.; Liu Sheng, O.R.; Tam, K.Y. (1999). "Examining the Technology Acceptance Model Using Physician Acceptance of Telemedicine Technology". Journal of Management Information Systems 16 (2): 91–112. doi:10.1080/07421222.1999.11518247. 
  10. Goodhue, D.L.; Thompson, R.L. (1995). "Task-Technology Fit and Individual Performance". MIS Quarterly 19 (2): 213–236. doi:10.2307/249689. 
  11. Ammenwerth, E.; Iller, C.; Mahler, C. (2006). "IT-adoption and the interaction of task, technology and individuals: A fit framework and a case study". BMC Medical Informatics and Decision Making 6: 3. doi:10.1186/1472-6947-6-3. PMC1352353. PMID 16401336. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1352353. 
  12. Ekeland, A.G.; Bowes, A.; Flottorp, S. (2012). "Methodologies for assessing telemedicine: A systematic review of reviews". International Journal of Medical Informatics 81 (1): 1–11. doi:10.1016/j.ijmedinf.2011.10.009. PMID 22104370. 
  13. Talmon, J.; Ammenwerth, E.; Brender, J. et al. (2009). "STARE-HI: Statement on reporting of evaluation studies in Health Informatics". International Journal of Medical Informatics 78 (1): 1–9. doi:10.1016/j.ijmedinf.2008.09.002. PMID 18930696. 
  14. Ammenwerth, E.; Brender, J.; Nykänen, P. et al. (2004). "Visions and strategies to improve evaluation of health information systems: Reflections and lessons based on the HIS-EVAL workshop in Innsbruck". International Journal of Medical Informatics 73 (6): 479–91. doi:10.1016/j.ijmedinf.2004.04.004. PMID 15171977. 
  15. Davis, F.D. (1989). "Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology". MIS Quarterly 13 (3): 319–340. doi:10.2307/249008. 
  16. DeLone, W.H.; McLean, E.R. (2004). "Measuring e-Commerce Success: Applying the DeLone & McLean Information Systems Success Model". International Journal of Electronic Commerce 9 (1): 31–47. doi:10.1080/10864415.2004.11044317. 
  17. Cusack, C.M.; Byrne, C.M.; Hook, J.M. et al. (June 2009). "Health Information Technology Evaluation Toolkit: 2009 Update" (PDF). Agency for Healthcare Research and Quality, HHS. https://healthit.ahrq.gov/sites/default/files/docs/page/health-information-technology-evaluation-toolkit-2009-update.pdf. Retrieved 01 April 2016. 
  18. Cheng, B.H.C.; Atlee, J.M. (2007). "Research Directions in Requirements Engineering". FOSE '07: Future of Software Engineering: 285–383. doi:10.1109/FOSE.2007.17. 
  19. Friedman, C.P.; Wyatt, J. (2006). Evaluation Methods in Biomedical Informatics. Springer-Verlag New York. pp. 386. ISBN 9781441920720. 
  20. Bürkle, T.; Ammenwerth, E.; Prokosch, H.U.; Dudeck, J. (2001). "Evaluation of clinical information systems: What can be evaluated and what cannot?". Journal of Evaluation in Clinical Practice: 373–85. doi:10.1046/j.1365-2753.2001.00291.x. PMID 11737529. 
  21. "Welcome to the HTA Glossary". Institut national d’excellence en santé et en services sociaux (INESSS). http://htaglossary.net/HomePage. Retrieved 30 September 2015. 
  22. Kristensen, F.B. (2009). "Health technology assessment in Europe". Scandinavian Journal of Public Health 37 (4): 335–9. doi:10.1177/1403494809105860. PMID 19493989. 
  23. Draborg, E.; Gyrd-Hansen, D.; Poulsen, P.B.; Horder, M. (2005). "International comparison of the definition and the practical application of health technology assessment". International Journal of Technology Assessment in Health Care 21 (1): 89–95. doi:10.1017/S0266462305050117. PMID 15736519. 
  24. Busse, R.; Orvain, J.; Velasco, M. et al. (2002). "Best practice in undertaking and reporting health technology assessments". International Journal of Technology Assessment in Health Care 18 (2): 361–422. doi:10.1017/S0266462302000284. PMID 12053427. 
  25. Lampe, K.; Mäkelä, M.; Garrido, M.V. et al. (2009). "The HTA core model: A novel method for producing and reporting health technology assessments". International Journal of Technology Assessment in Health Care 25 (S2): 9–20. doi:10.1017/S0266462309990638. PMID 20030886. 
  26. Ammenwerth, E.; Gräber, S.; Herrmann, G. et al. (2003). "Evaluation of health information systems: Problems and challenges". International Journal of Medical Informatics 71 (2–3): 125–35. doi:10.1016/S1386-5056(03)00131-X. PMID 14519405. 
  27. "About FI-STAR". Eurescom GmbH. https://www.fi-star.eu/about-fi-star.html. Retrieved 29 September 2015. 
  28. Kidholm, K.; Ekeland, A.G.; Jensen, L.K. et al. (2012). "A model for assessment of telemedicine applications: MAST". International Journal of Technology Assessment in Health Care 28 (1): 44–51. doi:10.1017/S0266462311000638. PMID 22617736. 
  29. Noy, N. (18 July 2005). "Ontology Development 101" (PDF). Stanford University. http://bmir-stage.stanford.edu/conference/2005/slides/T1_Noy_Ontology101.pdf. 
  30. Uschold, M. (2000). "Creating, integrating and maintaining local and global ontologies". Proceedings of the First Workshop on Ontology Learning (OL-2000). 31. Berlin: CEUR Proceedings. 
  31. Choi, N.; Song, I.-Y.; Han, H. (2006). "A survey on ontology mapping". ACM SIGMOD Record 35 (3): 34–41. doi:10.1145/1168092.1168097. 
  32. Tarjan, R. (1972). "Depth-First Search and Linear Graph Algorithms". SIAM Journal on Computing 1 (2): 146–160. doi:10.1137/0201010. 

Abbreviations

EU: European Union

FI: Future Internet

FI-STAR: Future Internet Social and Technological Alignment Research

FI-PPP: Future Internet Public-Private Partnership Programme

FITT: Fit between Individuals, Task and Technology

HOT-fit: Human, Organization, and Technology Fit

INAHTA: International Network of Agencies for Health Technology Assessment

MAST: Model for Assessment of Telemedicine applications

OWL: Web Ontology Language

STARE-HI: Statement on the Reporting of Evaluation studies in Health Informatics

TAM: Technology Acceptance Model

TAM2: Technology Acceptance Model 2

UTAUT: Unified Theory of Acceptance and Use of Technology

UVON: Unified eValuation using Ontology

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In several cases the PubMed ID was missing and was added to make the reference more useful. The URL to the Health Information Technology Evaluation Toolkit was dead and not archived; an alternative version of it was found on the AHRQ site and the URL substituted. Figure 2 has been moved up closer to its reference.

Per the distribution agreement, the following copyright information is also being added:

©Shahryar Eivazzadeh, Peter Anderberg, Tobias C. Larsson, Samuel A. Fricker, Johan Berglund. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 16.06.2016.