Journal:DAQUA-MASS: An ISO 8000-61-based data quality management methodology for sensor data

From LIMSWiki
Revision as of 23:11, 1 April 2019 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title: DAQUA-MASS: An ISO 8000-61-based data quality management methodology for sensor data
Journal: Sensors
Author(s): Perez-Castillo, Ricardo; Carretero, Ana G.; Caballero, Ismael; Rodriguez, Moises; Piattini, Mario; Mate, Alejandro; Kim, Sunho; Lee, Dongwoo
Author affiliation(s): University of Castilla-La Mancha, AQC Lab, University of Alicante, Myongji University, GTOne
Primary contact: Email: ricardo dot pdelcastillo @ uclm dot es
Year published: 2018
Volume and issue: 18(9)
Page(s): 3105
DOI: 10.3390/s18093105
ISSN: 1424-8220
Distribution license: Creative Commons Attribution 4.0 International
Website: https://www.mdpi.com/1424-8220/18/9/3105/htm
Download: https://www.mdpi.com/1424-8220/18/9/3105/pdf (PDF)

Abstract

The internet of things (IoT) introduces several technical and managerial challenges when it comes to the use of data generated and exchanged by and between the various smart, connected products (SCPs) that are part of an IoT system (i.e., physical, intelligent devices with sensors and actuators). Given the volume of data and the heterogeneous ways in which it is exchanged and consumed, it is paramount to ensure that data quality levels are maintained at every step of the data chain/lifecycle. Otherwise, the system may fail to perform its expected function. While data quality (DQ) is a mature field, existing solutions are highly heterogeneous. Therefore, we propose that companies, developers, and vendors align their data quality management mechanisms and artifacts with well-known best practices and standards, such as those provided by ISO 8000-61. This standard enables a process approach to data quality management, overcoming the difficulties of isolated data quality activities. This paper introduces DAQUA-MASS, a methodology based on ISO 8000-61 for data quality management in sensor networks. The methodology consists of four steps that follow Deming's Plan-Do-Check-Act cycle.

Keywords: data quality; data quality management processes; ISO 8000-61; data quality in sensors; internet of things; IoT; smart, connected products; SCPs

Introduction

“Our economy, society, and survival aren’t based on ideas or information—they’re based on things.”[1] This is one of the core foundations of the internet of things (IoT) as stated by Ashton, who coined the term. IoT is an emerging global internet-based information architecture facilitating the exchange of goods and services.[2] IoT systems are inherently built on data gathered from heterogeneous sources, in which the volume, variety, and velocity of data generation, exchange, and processing are dramatically increasing.[3] Furthermore, an emerging semantic-oriented vision of IoT needs ways to represent and manipulate the vast amount of raw data expected to be generated by and exchanged between the “things.”[4]

The vast amount of data in IoT environments, gathered from a global-scale deployment of smart things, is the basis for making intelligent decisions and providing better services (e.g., smart mobility, as presented by Zhang et al.[5]). In other words, data represents the bridge that connects the cyber and physical worlds. Despite its tremendous relevance, if data are of inadequate quality, decisions made by both humans and other devices are likely to be unsound.[6][7] As a consequence, data quality (DQ) has become one of the key aspects of IoT.[6][8][9][10] IoT devices, and in particular smart, connected products (SCPs), have concrete characteristics that favor the appearance of problems due to inadequate levels of data quality. Mühlhäuser[11] defines SCPs as “entities (tangible object, software, or service) designed and made for self-organized embedding into different (smart) environments in the course of its lifecycle, providing improved simplicity and openness through improved connections.” While some of the SCP-related characteristics might be considered omnipresent (i.e., uncertain, erroneous, noisy, distributed, and voluminous), other characteristics are more specific and highly dependent on the context and monitored phenomena (i.e., smooth variation, continuous, correlation, periodicity, or Markovian behavior).[6]

DQ has also been broadly studied outside of the IoT research area in recent years, and it has become a mature research area that captures the growing interest of industry due to the different types of value that companies can extract from data.[12] This is reflected in standardization efforts like the ISO/IEC 25000 series addressing systems and software quality requirements and evaluation (SQuaRE)[13] processes, and specific techniques for managing data concerns. We posit that such standards can be tailored and used within the IoT context, and that doing so brings several benefits: solutions are standardized and communication between partners improves; the number of problems and system failures in the IoT environment is reduced; better decisions can be made thanks to better data quality; all stakeholders are aligned and can benefit from advances in the standard used; and data quality solutions are easier to apply globally because heterogeneity is reduced.

Due to the youth of IoT, and despite the DQ standards, frameworks, management techniques, and tools proposed in the literature, DQ for IoT has not yet been widely studied. Prior to this line of research, some works have addressed DQ concerns in wireless sensor networks[8][14] or in data streaming[15][16], among other proposals.[6] However, these works have not considered the management of DQ in a holistic way, in line with existing DQ-related standards. In our attempt to align the study of DQ in IoT with international standards, this paper provides practitioners and researchers with DAQUA-MASS, a methodology for managing data quality in SCP environments, which considers some of the DQ best practices for improving the quality of data in SCP environments, aligned with ISO 8000-61.[17] Due to the intrinsically distributed nature of IoT systems, using such standards will enable the various organizations to be aligned to the same foundations and, in the end, to work in a seamless way, which will undoubtedly improve the performance of their business processes.

The remainder of this paper is organized as follows: the next section presents the most challenging data quality management concerns in the context of SCP environments; afterwards, related work is explored. Then the data quality model on which our methodology is based is presented. The last two sections propose a methodology for managing data quality in SCP environments and discuss the conclusions and implications of this work.

References

  1. Ashton, K. (2009). "That 'Internet of Things' Thing". RFID Journal 22: 97–114. 
  2. Weber, R.H. (2013). "Internet of things – Governance quo vadis?". Computer Law & Security Review 29 (4): 341–7. doi:10.1016/j.clsr.2013.05.010. 
  3. Hassanein, H.S.; Oteafy, S.M.A. (2017). "Big Sensed Data Challenges in the Internet of Things". Proceedings from the 13th International Conference on Distributed Computing in Sensor Systems: 207–8. doi:10.1109/DCOSS.2017.35. 
  4. Atzori, L.; Iera, A.; Morabito, G. (2010). "The Internet of Things: A survey". Computer Networks 54 (15): 2787–805. doi:10.1016/j.comnet.2010.05.010. 
  5. Zhang, W.; Zhang, Z.; Chao, H.-C. (2017). "Cooperative Fog Computing for Dealing with Big Data in the Internet of Vehicles: Architecture and Hierarchical Resource Management". IEEE Communications Magazine 55 (12): 60–7. doi:10.1109/MCOM.2017.1700208. 
  6. Karkouch, A.; Mousannif, H.; Al Moatassime, H. et al. (2016). "Data quality in internet of things: A state-of-the-art survey". Journal of Network and Computer Applications 73: 57–81. doi:10.1016/j.jnca.2016.08.002. 
  7. Merino, J.; Caballero, I.; Rivas, B. et al. (2016). "A Data Quality in Use model for Big Data". Future Generation Computer Systems 63: 123–30. doi:10.1016/j.future.2015.11.024. 
  8. Jesus, G.; Casimiro, A.; Oliveira, A. (2017). "A Survey on Data Quality for Dependable Monitoring in Wireless Sensor Networks". Sensors 17 (9): E2010. doi:10.3390/s17092010. PMC 5620495. PMID 28869505. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5620495. 
  9. Rodríguez, C.C.G.; Servigne, S. (2013). "Managing Sensor Data Uncertainty: A Data Quality Approach". International Journal of Agricultural and Environmental Information Systems 4 (1): 3. doi:10.4018/jaeis.2013010103. 
  10. Klein, A.; Hackenbroich, G.; Lehner, W. (2009). "How to Screen a Data Stream - Quality-Driven Load Shedding in Sensor Data Streams". Proceedings of the 2009 International Conference on Information Quality: 1–15. http://mitiq.mit.edu/iciqpapers.aspx?iciqyear=2009. 
  11. Mühlhäuser, M. (2007). "Smart Products: An Introduction". Proceedings from the 2007 European Conference on Ambient Intelligence: 158–64. doi:10.1007/978-3-540-85379-4_20. 
  12. Laney, Douglas B. (2017). Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage. Routledge. ISBN 9781138090385. 
  13. "ISO/IEC 25000:2014". International Organization for Standardization. March 2014. https://www.iso.org/standard/64764.html. Retrieved 13 September 2018. 
  14. Qin, Z.; Han, Q.; Mehrotra, S. et al. (2014). "Quality-Aware Sensor Data Management". In Ammari, H.M. The Art of Wireless Sensor Networks. Springer. pp. 429–64. ISBN 9783642400094. 
  15. Campbell, J.L.; Rustad, L.E.; Porter, J.H. et al. (2013). "Quantity is Nothing without Quality: Automated QA/QC for Streaming Environmental Sensor Data". BioScience 63 (7): 574–85. doi:10.1525/bio.2013.63.7.10. 
  16. Klein, A.; Lehner, W. (2009). "Representing Data Quality in Sensor Data Streaming Environments". Journal of Data and Information Quality 1 (2): 10. doi:10.1145/1577840.1577845. 
  17. "ISO 8000-61:2016". International Organization for Standardization. November 2016. https://www.iso.org/standard/63086.html. Retrieved 13 September 2018. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. Grammar was cleaned up for smoother reading. In some cases important information was missing from the references, and that information was added.