Journal:Security architecture and protocol for trust verifications regarding the integrity of files stored in cloud services

From LIMSWiki
Revision as of 23:01, 26 March 2019 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title Security architecture and protocol for trust verifications regarding the integrity of files stored in cloud services
Journal Sensors
Author(s) Pinheiro, Alexandre; Canedo, Edna Dias; De Sousa Junior, Rafael Timoteo;
De Oliveira Albuquerque, Robson; Villalba, Luis Javier Garcia; Kim, Tai-Hoon
Author affiliation(s) University of Brasília, Universidad Complutense de Madrid, Sungshin Women’s University
Primary contact Email: javiergv at fdi dot ucm dot es
Year published 2018
Volume and issue 18(3)
Page(s) 753
DOI 10.3390/s18030753
ISSN 1424-8220
Distribution license Creative Commons Attribution 4.0 International
Website https://www.mdpi.com/1424-8220/18/3/753/htm
Download https://www.mdpi.com/1424-8220/18/3/753/pdf (PDF)

Abstract

Cloud computing is considered an interesting paradigm due to its scalability, availability, and virtually unlimited storage capacity. However, it is challenging to organize a cloud storage service (CSS) that is safe from the client point-of-view and to implement this CSS in public clouds since it is not advisable to blindly consider this configuration as fully trustworthy. Ideally, owners of large amounts of data should trust their data to be in the cloud for a long period of time, without the burden of keeping copies of the original data, nor of accessing the whole content for verification regarding data preservation. Due to these requirements, integrity, availability, privacy, and trust are still challenging issues for the adoption of cloud storage services, especially when losing or leaking information can bring significant damage, be it legal or business-related. With such concerns in mind, this paper proposes an architecture for periodically monitoring both the information stored in the cloud and the service provider behavior. The architecture operates with a proposed protocol based on trust and encryption concepts to ensure cloud data integrity without compromising confidentiality and without overloading storage services. Extensive tests and simulations of the proposed architecture and protocol validate their functional behavior and performance.

Keywords: cloud computing; cloud data storage; proof of integrity; services monitoring; trust

Introduction

Companies, institutions, and government agencies generate large amounts of digital information every day, such as documents, projects, and transaction records. For legal or business reasons, this information needs to remain stored for long periods of time.

Due to the popularization of cloud computing (CC), its cost reduction, and an ever-growing supply of cloud storage services (CSS), many companies are choosing these services to store their sensitive information. Cloud computing’s advantages include scalability, availability, and virtually unlimited storage capacity. However, it is a challenge to build safe storage services, mainly when these services run in public cloud infrastructures and are managed by service providers under conditions that are not fully trustworthy.

Data owners often need to keep their stored data for a long time, even though they may rarely have to access it. Furthermore, some data could be stored in a CSS without its owner having to keep the original copy. In these situations, however, the reliability of the storage service must be considered, because even the best services sometimes fail.[1] Since the loss or leakage of these data can cause significant business or legal damage, the issues of integrity, availability, privacy, and trust need to be addressed before a CSS is adopted.

Data integrity is defined as the accuracy and consistency of stored data. These two properties indicate that the data have not changed and have not been broken.[2] Moreover, besides data integrity, a considerable number of organizations consider both confidentiality and privacy requirements as the main obstacles to the acceptance of public cloud services.[2] Hence, to fulfill these requirements, a CSS should provide mechanisms to confirm data integrity, while still ensuring user privacy and data confidentiality.

Considering these requirements, this paper proposes an architecture for periodically monitoring both the information stored in the cloud infrastructure and the contracted storage service behavior. The architecture is based on the operation of a proposed protocol that uses a third party and applies trust and encryption means to verify both the existence and the integrity of data stored in the cloud infrastructure without compromising these data’s confidentiality. Furthermore, the protocol was designed to minimize the overload that it imposes on the cloud storage service.

To validate the proposed architecture and its supporting protocol, a corresponding prototype was developed and implemented. Then, this prototype was submitted to testing and simulations by means of which we verified its functional characteristics and its performance.

The remainder of this paper is structured as follows. The "Background" section reviews the concepts and definitions of cloud computing, encryption, and trust, and the "Related work" section presents works related to data integrity in the cloud. We then describe the proposed architecture, and its implementation is discussed in the following section. Afterwards, the "Experimental validation" section is devoted to the experiments and their results, followed by a discussion of the main differences between related works and the proposed architecture. The paper ends with our conclusions and an outline of future work.

Background

Cloud computing (CC) is a model that allows convenient and on-demand network access to a shared set of configurable computational resources. These resources can be quickly provisioned with minimal management effort and without the service provider’s intervention.[3] Since it constitutes a flexible and reliable computing environment, CC is being gradually adopted in different business scenarios using several available supporting solutions.

Relying on different technologies (e.g., virtualization, utility computing, grid computing, and service-oriented architecture) and proposing a new computational services paradigm, CC requires high-level management activities, which include: (a) selection of the service provider, (b) selection of virtualization technology, (c) virtual resources’ allocation, and (d) monitoring and auditing procedures to comply with service level agreements (SLAs).[4]

A particular CC solution comprises several components such as client modules, data centers, and distributed servers. These elements form the three parts of the cloud solution[4][5], each one with a specific purpose and specific role in delivering working applications based on the cloud.

The CC architecture is basically structured into two main layers: a lower and a higher resource layer, each one dealing with a particular aspect of making application resources available. The lower layer comprises the physical infrastructure, and it is responsible for the virtualization of storage and computational resources. The higher layer provides specific services, such as software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). Each of these layers may have its own management and monitoring systems, independent of one another, thus improving flexibility, reuse, and scalability.[6][7]

Since CC provides access to a shared pool of configurable computing resources, its provisioning mode can be classified by the intended access methods and coverage of services’ availability, which yields different models of CC services’ deployment, ranging from private clouds, in which resources are shared within an owner organization, to public clouds, in which cloud providers possess the resources that are consumed by other organizations based on contracts, but also including hybrid cloud environments and community clouds.[8]

The central concept of this paper’s proposal is the verification by the cloud service user that a particular property, in our case the integrity of files, is fulfilled by the cloud service provider, regardless of the mode of a service's provision and deployment, either in the form of private, public, or hybrid clouds.

The verification of file integrity is performed by means of a protocol that uses contemporary computational encryption, specifically public key encryption and hashes, which together provide authentication of messages and compact integrity verification sequences that are unequivocally bound to each verified file (signed file hashes).

This proposed protocol is conceived to allow the user of cloud services to check whether the services provider is indeed acting as expected in regard to maintaining the integrity of the user files, which corresponds to the idea of the user monitoring the provider to acquire and maintain trust in the provider behavior in this circumstance.

Some specific aspects of trust, encryption, and hashes that are considered as useful for this paper’s comprehension are briefly reviewed in the subsections below.

Trust

Trust is a common reasoning process for humans to face the world’s complexities and to think sensibly about everyday life possibilities. Trust is strongly linked to expectations about something, which implies a degree of uncertainty and optimism. It is the choice of putting something in another’s hands, considering the other’s behavior to determine how to act in a given situation.[9]

Trust can be considered as a particular level of subjective probability in which an agent believes that another agent will perform a certain action, which is subject to monitoring.[10] Furthermore, trust can be represented as an opinion so that situations involving trust and trust relationships can be modeled. Thus, positive and negative feedback on a specific entity can be accumulated and used to calculate its future behavior.[11] This opinion may result from direct experience or may come from a recommendation from another entity.[12]

According to Adnane et al.[13] and De Sousa, Jr. and Puttini[14], trust, trust models, and trust management have been the subject of various research works demonstrating that the conceptualization of computational trust allows a computing entity to reason with and about trust, and to make decisions regarding other entities. Indeed, since the initial works on the subject by the likes of Marsh[9] and Yahalom et al.[15], computational trust is recognized as an important aspect for decision-making in distributed and auto-organized applications, and its expression allows formalizing and clarifying trust aspects in communication protocols.

Yahalom et al.[15], for instance, find the notion of "trust" to mean that if an entity A trusts an entity B in some respect, this means that A believes that B will behave in a certain way and will perform some action under certain specific circumstances. This leads to the possibility of conducting a protocol operation (action) that is evaluated by the entity A on the basis of what A knows about the entity B and the circumstances of the operation. This accurately corresponds to the protocol relationship established between a CC service consumer and a CC service provider, which is the focus of the present paper.

Thus, in our proposal, trust is used in the context of a cloud computing service as a means to verify specific actions performed by the participating entities in this context. Using the definitions by Yahalom et al.[15] and Grandison and Sloman[16], we can state that in a CC service, one entity, the CC service consumer, may trust another one, the CC service provider, for actions such as providing identification to the other entity, not interfering in the other entity sessions, neither passively by reading secret messages, nor actively by impersonating other parties. Furthermore, the CC service provider will grant access to resources or services, as well as make decisions on behalf of the other entity, with respect to a resource or service that this entity owns or controls.

In these trust verifications, it is required to ensure some properties such as the secrecy and integrity of stored files, authentication of message sources, and the freshness of the presented proofs, avoiding proof replays. It is required as well to present reduced overhead in cloud computing protocol operations and services. In our proposal, these requirements are fulfilled with modern robust public key encryption involving hashes, as discussed hereafter, considering that these means are adequately and easily deployed in current CC service provider and consumer situations.

Encryption

Encryption is a process of converting (or ciphering) a plaintext message into a ciphertext that can be deciphered back to the original message. An encryption algorithm, along with one or more keys, is used either in the encryption or the decryption operation.

The number, type, and length of the keys used depend on the encryption algorithm, the choice of which is a consequence of the security level needed. In conventional symmetric encryption, a single key is used, and with this key the sender can encrypt a message, and a recipient can decrypt the ciphered message. However, key security becomes an issue since at least two copies of the key exist, one at the sender and another at the recipient.

Conversely, in asymmetric encryption, the encryption key and the decryption key are correlated but different. One is the public key of the recipient, which the sender can use to encrypt the message, while the other is the related private key of the recipient, which allows the recipient to decrypt the message.[17] The private key can also be used by its owner to send messages that are considered signed by the owner, since every entity can use the corresponding public key to verify whether a message comes from the owner of the private key.

These properties of asymmetric encryption are useful for the trust verifications that in our proposal are designed for checking the integrity of files stored in cloud services. Indeed, our proposal uses encryption of hashes as the principal means to fulfill the trust requirements in these operations.

Hashes

A hash value, hash code, or simply hash is the result of applying a mathematical one-way function that takes a string of any size as the data source and returns a relatively small and fixed-length string. A modification of any bit in the source string dramatically alters the resulting hash code after executing the hash function.[18] These one-way functions are designed to make it very difficult to deduce from a hash value the source string that was used to calculate this hash. Furthermore, it is required that it should be extremely difficult to find two source strings whose hash codes are the same, i.e., a hash collision.
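The two properties described above (fixed-length output and a drastic change in the hash after a single-bit modification) can be observed with a minimal sketch. It uses SHA-256 from Python's standard `hashlib` module purely for illustration; the helper names `digest_as_int` and `differing_bits` are hypothetical and not part of the paper.

```python
import hashlib

def digest_as_int(data: bytes) -> int:
    """Return the SHA-256 digest of data as an integer (illustrative helper)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the digests of a and b differ."""
    return bin(digest_as_int(a) ^ digest_as_int(b)).count("1")

# Fixed-length output: a 5-byte and a 500,000-byte input both give 32 bytes.
short_msg = b"cloud"
long_msg = b"cloud" * 100_000
print(len(hashlib.sha256(short_msg).digest()))  # 32
print(len(hashlib.sha256(long_msg).digest()))   # 32

# Flip a single bit of the source string and compare the two digests.
original = b"file contents stored in the CSS"
tampered = bytearray(original)
tampered[0] ^= 0x01
print(differing_bits(original, bytes(tampered)), "of 256 digest bits changed")
```

For a well-designed hash function, roughly half of the digest bits are expected to change, which is why comparing hashes is enough to detect even a minimal modification of the source.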

Over the years, many cryptographic hash algorithms have been developed, among which the Message-Digest algorithm 5 (MD5) and the Secure Hash Algorithm (SHA) family can be highlighted due to their wide use in the most diverse information security software packages. MD5 is a very fast cryptographic algorithm that receives as input a message of arbitrary size and produces as output a fixed-length hash of 128 bits.[19]

The SHA family is composed of algorithms named SHA-1, SHA-256, and SHA-512, which differ in their respective security levels and in their output hash lengths, which vary from 160 to 512 bits. The SHA-3 algorithm was chosen by the National Institute of Standards and Technology (NIST) in an international competition that aimed to replace all of the SHA family of algorithms.[20]

The Blake2 algorithm is an improved version of the hash cryptographic algorithm called “Blake,” a finalist of the SHA-3 selection competition that is optimized for software applications. Blake2 can generate hash values from eight to 512 bits. The main Blake2 characteristics are: the memory consumption reduction by 32% compared to other SHA algorithms, the processing speed being greater than that of MD5 on 64-bit platforms, direct parallelism support without overhead, and faster hash generation on multicore processors.[21] In our proposed validation prototype, the Blake2 algorithm was considered as a good choice due to its combined characteristics of speed, security, and simplicity.
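The output lengths mentioned above can be checked with Python's standard `hashlib` module. The sketch below is illustrative only: it assumes the `blake2b` variant (the text says only "Blake2"), and the helper `blake2_digest` is a hypothetical name, not taken from the prototype.

```python
import hashlib

data = b"encrypted file chunk"

# Output lengths of the SHA algorithms named in the text, in bits.
for name in ("sha1", "sha256", "sha512"):
    print(name, hashlib.new(name, data).digest_size * 8, "bits")

def blake2_digest(payload: bytes, size_bytes: int) -> bytes:
    """Hash payload with BLAKE2b, asking for a digest of size_bytes bytes.

    hashlib.blake2b accepts digest sizes from 1 to 64 bytes,
    i.e., from 8 to 512 bits.
    """
    return hashlib.blake2b(payload, digest_size=size_bytes).digest()

for size in (1, 32, 64):
    print("blake2b", len(blake2_digest(data, size)) * 8, "bits")
```

Being able to choose the digest size lets an implementation trade the storage cost of each hash against its collision resistance.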

Related work

This section presents a brief review of papers regarding the themes of computational trust applications, privacy guarantees, data integrity verification, services management, and monitoring, all of them applicable to cloud computing environments.

Computational trust applications

Depending on the used approach, trust can either be directly measured by one entity based on its own experiences or can be evaluated through the use of third-party opinions and recommendations.

Tahta et al.[22] propose a trust model for peer-to-peer (P2P) systems called “GenTrust,” in which genetic algorithms are used to recognize several types of attacks and to help a well-behaved node find other trusted nodes. GenTrust uses extracted features (number of interactions, number of successful interactions, the average size of downloaded files, the average time between two interactions, etc.) that result from a node’s own interactions. However, when there is not enough information for a node to consider, recommendations from other nodes are used. Then, the genetic algorithm selects which characteristics, when evaluated together and in a given context, present the best result to identify the most trustful nodes.

Another approach is presented by Gholami and Arani[23], proposing a trust model named “Turnaround_Trust” aimed at helping clients to find cloud services that can serve them based on service quality requirements. The Turnaround_Trust model considers service quality criteria such as cost, response time, bandwidth, and processor speed, to select the most trustful service among those available in the cloud.

Our approach in this paper differs from these related works since we use trust metrics that are directly related to the stored files in CC and that are paired to the cryptographic proof of these files' integrity.

Canedo[24] bases the proposed trust model on concepts such as direct trust, trust recommendation, indirect trust, situational trust, and reputation to allow a node selection for trustful file exchange in a private cloud. For the sake of trust calculation, the processing capacity of a node, its storage capacity, and operating system—as well as the link capacity—are adopted as trust metrics that compose a set representative of the node availability. Concerning reputation, the calculation considers the satisfactory and unsatisfactory experiences with the referred node informed by other nodes. The proposed model calculates trust and reputation scores for a node based on previously-collected information, i.e., either information requested from other nodes in the network or information that is directly collected from interactions with the node being evaluated.

In the present paper, our approach is applied to both private and public CC services, with the development of the necessary architecture and secure protocol for trust verification regarding the integrity of files in CC services.

Integrity verification and privacy guarantee

In their effort to guarantee the integrity of data stored in cloud services, many research works present proposals in the domain analyzed in this paper.

A protocol is proposed by Juels and Kaliski, Jr.[25] to enable a cloud storage service to prove that a file subjected to verification is not corrupted. To that end, a formal and secure definition of proof of retrievability is presented, and the paper introduces the use of sentinels, which are special blocks hidden in the original file prior to encryption to be afterward used to challenge the cloud service. Based on Juels and Kaliski, Jr.'s work[25], Kumar and Saxena[26] present another scheme where one does not need to encrypt all the data, but only a few bits per data block.

George and Sabitha[27] propose a bipartite solution to improve privacy and integrity. The first part, called “anonymization,” initially recognizes fields in records that could identify their owners and then uses techniques such as generalization, suppression, obfuscation, and the addition of anonymous records to enhance data privacy. The second part, called “integrity checking,” uses public and private key encryption techniques to generate a tag for each record on a table. Both parts are executed with the help of a trusted third party called the “enclave” that saves all generated data that will be used by the de-anonymization and integrity verification processes.

An encryption-based integrity verification method is proposed by Kavuri et al.[28] The proposed method uses a new hash algorithm, the dynamic user policy-based hash algorithm, to calculate hashes of data for each authorized cloud user. For data encryption, an improved attribute-based encryption algorithm is used. The encrypted data and corresponding hash value are saved separately in cloud storage. Data integrity can be verified only by an authorized user and requires the retrieval of all the encrypted data and corresponding hash.

Al-Jaberi and Zainal[29] propose another way to simultaneously achieve data integrity verification and privacy preservation, using two encryption algorithms for every data upload or download transaction. The Advanced Encryption Standard (AES) algorithm is used to encrypt client data, which will be saved in a CSS, and an RSA-based partial homomorphic encryption technique is used to encrypt AES encryption keys that will be saved in a third-party entity together with a hash of the file. Data integrity is verified only when a client downloads a file.

Kai et al.[30] propose a data integrity auditing protocol to allow the fast identification of corrupted data using homomorphic cipher-text verification and a recoverable coding methodology. Checking the integrity of outsourced data is done periodically by either a trusted or untrusted entity. The adopted methodology aims at reducing the total auditing time and the communication cost.

The work of Wang et al.[31] presents a security model for public verification and assurance of stored file correctness that supports dynamic data operation. The model guarantees that no challenged file blocks should be retrieved by the verifier during the verification process, and no state information should be stored at the verifier side between audits. A Merkle hash tree (MHT) is used to save the authentic data value hashes, and both the values and positions of data blocks are authenticated by the verifier.

Our proposal in this paper differs from these described proposals since we introduce the idea of trust resulting from file integrity verification as an aggregate concept to evaluate the long-term behavior of a CSS, while including most of the requirements specified in these other proposals, such as hashes of file blocks, freshness of verifications, and integrated support for auditing by an independent party. Further discussion of these differences is presented in the "Discussion" section based on the results coming from the validation of our proposal.

Management and monitoring of CSS

Some other research works were reviewed since their purpose is to provide management tools to ensure better use of the services offered by CSS providers, as well as monitoring functions regarding the quality of these services, thus allowing one to generate a ranking of these providers.

Pflanzner et al.[32] present an approach to autonomous data management within CSS. This approach proposes a high-level service that helps users to better manage data distributed in multiple CSS. The proposed solution is composed of a framework that consists of three components named MeasureTool, DistributeTool, and CollectTool. Each component is respectively responsible for performing monitoring processes for measuring the performance, splitting, and distributing file chunks between different CSS and retrieving split parts of a required file. Both historical performance and latest performance values are used for CSS selection and to define the number of file chunks that will be stored in each CSS.

Furthermore, they propose the use of cloud infrastructure services to execute applications on mobile data stored in CSS.[32] In this proposal, the services for data management are run in one or more IaaS systems that keep track of the user storage area in CSS and execute the data manipulation processes when new files appear. The service running on an IaaS cloud downloads the user data files from the CSS, executes the necessary application on these files, and uploads the modified data to the CSS. This approach permits overcoming the computing capacity limitations of mobile devices.

The quality of service (QoS) provided by some commercial CSS is analyzed by Gracia-Tinedo et al.[33] For this, a measurement study is presented where important aspects such as transfer speed (upload/download), behavior according to client geographic location, failure rate, and service variability related to file size, time, and account load are broadly explored. To perform the measurement, two platforms are employed, one with homogeneous and dedicated machines and the other with shared and heterogeneous machines distributed in different geographic locations. Furthermore, the measurement is executed using each CSS's own REST interface, mainly the PUT and GET methods, which are used to upload and download files, respectively. The applied measurement methodology is demonstrated to be efficient and permits one to learn important characteristics about the analyzed CSS.

Our contributions in this paper comprise the periodic monitoring of files stored in the cloud, performed by an integrity checking service that is defined as an abstract role so that it can operate independently of either the CSS provider or its consumer, preserving the privacy of stored file contents, and operating according to a new verification protocol. Both the tripartite architecture and the proposed protocol are described hereafter in this paper.

Proposed architecture and protocol

This section presents the proposed architecture that defines roles that work together to enable periodic monitoring of files stored in the cloud. Furthermore, the companion protocol that regulates how these roles interact with one another is detailed and discussed.

The architecture is composed of three roles: (i) Client, (ii) Cloud Storage Service (CSS), and (iii) Integrity Check Service (ICS). The Client represents the owner of files that will be stored by the cloud provider and is responsible for generating the needed information that is stored specifically for the purpose of file integrity monitoring. The CSS role represents the entity responsible for receiving and storing the client’s files, as well as receiving and responding to challenges regarding file integrity that come from the ICS role. The ICS interfaces with both the Client and the CSS. It is the role responsible for holding the information regarding the Client files that are stored by the CSS, and it uses this information to constantly monitor the Client files’ integrity by submitting challenges to the CSS and later validating the responses of the CSS to each verification challenge.

The proposed protocol

The trust-oriented protocol for continuous monitoring of stored files in the cloud (TOPMCloud) was initially proposed by Pinheiro et al.[34][35] Then, it was further developed and tested, giving way to the results presented in this paper.

The objective of TOPMCloud is to make it possible to use an outsourced service that allows clients to constantly monitor the integrity of their files stored in a CSS without having to keep copies of the original files or reveal the contents of these files.

From another point of view, the primary requirement for the proposed TOPMCloud is to prevent the CSS provider from offering to and charging a client for a storage service that in practice is not being provided. Complementary requirements comprise low bandwidth consumption, minimal CSS overloading, rapid identification of a misbehaving service, strong defenses against fraud, stored data confidentiality, and utmost predictability for the ICS.

To respond to the specified requirements, TOPMCloud is designed with two distinct and correlated execution processes that are shown together in Figure 1. The first one is called “File Storage Process” and runs on demand from the Client, which is the entity that starts this process. The second is the “Verification Process,” which is instantiated by an ICS and is continuously executed to verify a CSS. An ICS can simultaneously verify more than one CSS by means of parallel instances of the Verification Process.



Fig. 1 Trust-oriented protocol for continuous monitoring of stored files in the cloud (TOPMCloud) processes

The File Storage Process starts in the Client with the encryption of the file to be stored in the CSS. This first step, which is performed under the control of the file owner, is followed by the division of the encrypted file into 4096 chunks. These chunks are randomly permuted and are selected to be grouped into data blocks, each one with 16 distinct file chunks, and the position or address of each chunk is memorized. Then, hashes are generated from these data blocks. Each hash together with the set of its respective chunk addresses are used to build a data structure named the Information Table, which is sent to the ICS.

The selection and distribution of chunks used to assemble the data blocks are done in cycles. The number of cycles will vary according to the file storage period. Each cycle generates 256 data blocks without repeating chunks. The data blocks generated in each cycle contain all of the chunks of the encrypted file (256 * 16 = 4096).
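The File Storage Process steps described above (chunking, random permutation, grouping into data blocks, and hashing) can be sketched as follows. This is a minimal illustration under several assumptions not stated in the text: the file is already encrypted, chunk addresses are simple indices from 0 to 4095, the chunk size is the file length divided by 4096 and rounded up (so trailing chunks of small files may be empty), a data block is the concatenation of its chunks, and BLAKE2b with a 32-byte digest is the hash. The names `read_chunk`, `block_hash`, and `build_information_table` are hypothetical.

```python
import hashlib
import math
import random

NUM_CHUNKS = 4096        # chunks per encrypted file
CHUNKS_PER_BLOCK = 16    # chunks grouped into one data block
BLOCKS_PER_CYCLE = NUM_CHUNKS // CHUNKS_PER_BLOCK  # 256

def read_chunk(data: bytes, address: int) -> bytes:
    """Return the chunk at the given address (assumed: equal-sized slices)."""
    size = max(1, math.ceil(len(data) / NUM_CHUNKS))
    return data[address * size:(address + 1) * size]

def block_hash(data: bytes, addresses) -> str:
    """Hash the data block assembled from the chunks at the given addresses."""
    h = hashlib.blake2b(digest_size=32)
    for address in addresses:
        h.update(read_chunk(data, address))
    return h.hexdigest()

def build_information_table(encrypted: bytes, cycles: int, rng=None):
    """Build one table entry (chunk addresses + hash) per data block."""
    if rng is None:
        rng = random.SystemRandom()  # OS-provided randomness
    table = []
    for cycle in range(cycles):
        order = list(range(NUM_CHUNKS))
        rng.shuffle(order)  # random permutation of all chunk addresses
        for b in range(BLOCKS_PER_CYCLE):
            start = b * CHUNKS_PER_BLOCK
            addresses = order[start:start + CHUNKS_PER_BLOCK]
            table.append({
                "cycle": cycle,
                "addresses": addresses,
                "hash": block_hash(encrypted, addresses),
            })
    return table

# Stand-in for an encrypted file; two cycles give 2 * 256 = 512 entries.
table = build_information_table(bytes(range(256)) * 40, cycles=2)
print(len(table))  # 512
```

Because each cycle starts from a fresh permutation of all 4096 addresses, every chunk appears exactly once per cycle, which matches the statement that a cycle's data blocks contain all of the chunks without repetition.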

The chosen values 4096, 16, and 256 come from a compromise involving the analysis of the protocol in the next subsections and the experimental evaluation that is presented in the "Experimental validation" section of this paper. Therefore, these values represent choices that were made considering the freshness of the information regarding the trust credited to a CSS, the time for the whole architecture to react to file storage faults, the required number of verifications to hold the trust in a CSS for a certain period of time, as well as the expected performance and the optimization of computational resources and network capacity consumption. The chosen values are indeed parameters in our prototype code, so they can evolve if the protocol requirements change.

The Verification Process in the ICS starts with the computation of how many files should be verified and how many challenges should be sent to a CSS, both numbers being calculated according to the trust level assigned to the CSS. Each stored hash and its corresponding chunk addresses will be used only once by the ICS to send an integrity verification challenge to the CSS provider.

In the CSS, the stored file will be used to respond to the challenges coming from the ICS. On receiving a challenge with a set of chunk addresses, the CSS reads the chunks from the stored file, assembles the data block, generates a hash from this data block, and sends this hash as the challenge answer to the ICS.

To finalize the verification by the ICS, the hash received in the challenge answer is compared to the original hash of the corresponding data block, as stored in the Information Table, and the result activates the trust level classification process. If the compared hashes are equal, the verified chunks are intact in the file stored in the CSS.
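The challenge and answer exchange between the ICS and the CSS can be illustrated with a short sketch. It is not the prototype's code: it assumes index-based chunk addresses, equal-sized chunks, concatenation of chunks into a data block, and BLAKE2b as the hash, and the names `read_chunk`, `answer_challenge`, and `verify_answer` are hypothetical.

```python
import hashlib
import math
from typing import Sequence

NUM_CHUNKS = 4096

def read_chunk(stored: bytes, address: int) -> bytes:
    """CSS side: read one chunk of the stored file (assumed equal-sized slices)."""
    size = max(1, math.ceil(len(stored) / NUM_CHUNKS))
    return stored[address * size:(address + 1) * size]

def answer_challenge(stored: bytes, addresses: Sequence[int]) -> str:
    """CSS side: assemble the data block from the challenged chunks and hash it."""
    block = b"".join(read_chunk(stored, a) for a in addresses)
    return hashlib.blake2b(block, digest_size=32).hexdigest()

def verify_answer(expected_hash: str, answer: str) -> bool:
    """ICS side: compare the answer with the hash recorded at storage time."""
    return expected_hash == answer

# 16384 bytes -> 4-byte chunks; the challenge lists 16 chunk addresses.
original = bytes(range(256)) * 64
challenge = list(range(0, NUM_CHUNKS, 256))
expected = answer_challenge(original, challenge)  # recorded by the Client

# Corrupt one byte inside chunk 256, which is part of the challenge.
corrupted = bytearray(original)
corrupted[256 * 4] ^= 0xFF

print(verify_answer(expected, answer_challenge(original, challenge)))          # True
print(verify_answer(expected, answer_challenge(bytes(corrupted), challenge)))  # False
```

Only 16 chunk addresses and one hash travel over the network per challenge, which is consistent with the stated goals of low bandwidth consumption and minimal CSS overloading.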

Trust level classification process

The trust level is evaluated as a real value in the range (−1, +1), with values from −1, meaning the most untrustful, to +1, meaning the most trustful, thus constituting the classification level that is attributed by the ICS to the CSS provider.

In the ICS, whenever a file hash verification process fails, the trust level of the verified CSS is downgraded, according to the following rules: when the current trust level value is greater than zero, it is set to zero (the ICS reacts quickly to a misbehavior from a CSS that was considered up to the moment as trustful); when the trust value is in the range between zero and −0.5, it is reduced by 15%; otherwise, the ICS calculates the value of 2.5% from the difference between the current trust level value and −1, and the result is subtracted from the trust level value (the ICS continuously downgrades a CSS that is still considered untrustful). These calculations are shown in Algorithm 1.



Alg. 1 Pseudocode for computing the TrustLevel in the case of hash verification failures
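A minimal Python sketch of the downgrade rules stated in the text follows. It is not the authors' pseudocode: the function name `downgrade` is hypothetical, and "reduced by 15%" is read here as moving the value 15% further from zero (multiplying by 1.15), which is an assumption.

```python
def downgrade(trust: float) -> float:
    """Return the new trust level after a failed hash verification."""
    if trust > 0:
        # A CSS considered trustful so far drops straight to zero.
        return 0.0
    if trust >= -0.5:
        # Assumed reading of "reduced by 15%": 15% further from zero.
        return trust * 1.15
    # Subtract 2.5% of the distance between the current value and -1.
    return trust - 0.025 * (trust - (-1.0))


# Repeated failures push the value toward -1 without reaching it.
level = -0.1
for _ in range(200):
    level = downgrade(level)
```

Under this reading a value of exactly zero is left unchanged by the second rule, which is consistent with the later requirement that a fixed value be assigned whenever the trust value equals zero.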

Conversely, whenever a checking cycle is completed without failures (all data blocks of a file have been checked without errors), the trust level assigned to a CSS is raised. If the current trust level value is less than 0.5, then the trust level value is raised by 2.5%. Otherwise, the ICS calculates the value of 0.5% from the difference between one and the current trust level value, and the result is added to the trust level value. These calculations are shown in Algorithm 2. This means that initially we softly redeem an untrustful CSS, while we exponentially upgrade a redeemed CSS and a still trustful CSS.



Alg. 2 Pseudocode for computing the TrustLevel in the case of checking cycles completed without failures
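The upgrade rules stated in the text can be sketched in the same way. The function name `upgrade` is hypothetical, and "raised by 2.5%" is read as adding 2.5% of the value's magnitude; how negative values are treated is an assumption that the text does not settle.

```python
def upgrade(trust: float) -> float:
    """Return the new trust level after a checking cycle without failures."""
    if trust < 0.5:
        # Assumed reading of "raised by 2.5%": add 2.5% of the magnitude.
        return trust + 0.025 * abs(trust)
    # Add 0.5% of the distance between the current value and +1.
    return trust + 0.005 * (1.0 - trust)


# Repeated clean cycles move the value toward +1 without reaching it.
level = 0.1
for _ in range(400):
    level = upgrade(level)
```

Under this reading a negative value approaches zero but does not cross it, so the behavior for distrusted services depends on the interpretation chosen.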

Again, these chosen thresholds and downgrading/upgrading values come from the experimental evaluation that is presented in the "Experimental validation" section, based on performance and applicability criteria. They are indeed parameters in our prototype code, so they can evolve if the protocol requirements change.

Freshness of the trust verification process

Since it is important to update the perception that a Client has about a CSS provider, the observed values of trust regarding a CSS are also used to determine the rhythm or intensity of verifications to be performed for this CSS.

Thus, the freshness of results from the trust verification process is assured by updating in the ICS the minimum percentage values of the number of stored files to be verified in a CSS, as well as the minimum percentages of data blocks that should be checked. We choose to present these updates by day, though again, this is a parameter in our implemented prototype.

Consequently, according to the observed trust level for a CSS, the number of files and the percentage of these file contents checked in this CSS are set as specified in Table 1. In this table, the extreme values one and −1 should respectively represent blind trust and complete distrust, but they are not considered as valid for our classification purposes, since we expect trust to be an ever-changing variable, including the idea of redemption.



Table 1 Classification of the trust levels for updating purposes

Whenever the trust value equals zero, as a means to have a decidable system, a fixed value must be artificially assigned to it to preserve the dynamics of evaluations. Thus, if the last verified result is a positive assessment, the value +0.1 is assigned to the observed trust; otherwise, if a verification fault has been observed, the assigned value is −0.1.
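The rule for a trust value of zero is small enough to state directly in code. The helper name `resolve_zero` and its boolean argument are hypothetical; only the values +0.1 and −0.1 come from the text.

```python
def resolve_zero(trust: float, last_check_ok: bool) -> float:
    """Replace an exact zero by +0.1 or -0.1, depending on the last result."""
    if trust == 0.0:
        return 0.1 if last_check_ok else -0.1
    return trust
```

Any nonzero value passes through unchanged, so the helper can be applied after every update of the trust level.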

Variation of the trust level assigned to the cloud storage service

According to the TOPMCloud definition, the trust level assigned to a CSS always grows when a file-checking cycle is finished without the ICS detecting any verification failures during this cycle. Considering this rule, the first simulations regarding the evolution of trust in the ICS were used to determine the maximum number of days needed for the ICS to finish a checking cycle for a file stored in a CSS. The conclusion of a checking cycle indicates that each of the 4096 file chunks was validated as a part of one of the data blocks that are checked by means of the 256 challenges submitted by the ICS to the CSS.

The projected time for our algorithm to finish a file-checking cycle can vary between a minimum and a maximum value depending on the number of files simultaneously monitored by the ICS on a CSS. However, the checked file size should not significantly influence this time because the daily number of checked data blocks on a file is a percentage of the file size, as defined previously in Table 1.

By means of mathematical calculations, it is possible to determine that in a CSS classified with a “very high distrust” level, i.e., the worst trust level, the maximum time to finish a checking cycle is 38 days. Comparatively, in a CSS classified with a “very high trust” level, i.e., the best trust level, the time to finish a checking cycle can reach 1792 days. Figure 2 shows the maximum and the minimum number of days required to finish a file-checking cycle for each trust level proposed in TOPMCloud.



Fig. 2 Time required to complete a file-checking cycle

Notwithstanding the mathematical calculations regarding the proposed protocol’s maximum time required to finish a file-checking cycle, it is noticeable that this time can increase if the ICS or the CSS servers do not have enough computational capacity to respectively generate or to answer the necessary protocol challenges for each day. Furthermore, the file-checking cycle depends on the available network bandwidth and can worsen if the network does not support the generated packet traffic. This situation can occur when the number of CSS stored files is very large.

The variation of the time to conclude the checking cycle, according to the trust level assigned to the CSS, comes from the different number of data blocks verified per day. This variation aims to reward cloud storage services that historically have no faults, thus minimizing the consumption of resources such as processing capacity and network bandwidth. Moreover, this feature allows our proposed architecture to prioritize the checking of files stored in CSS providers that have already presented faults. Consequently, this feature reduces the time required to determine whether other files were lost or corrupted.

Another interesting characteristic of the proposed protocol was analyzed with calculations performed to determine the number of file cycles that must be concluded without identifying any fault for the trust level assigned to a CSS to rise to the highest trust level foreseen in Table 1, “very high trust.” Figure 3 presents the results of this analysis using as a starting point the “not evaluated” situation, which corresponds to a trust level equal to zero assigned to a CSS.



Fig. 3 Expected best performing trust level evolution for a CSS.

From the analysis of the results shown in Figure 2 and Figure 3, it could be concluded that the time required for a CSS to obtain the maximum trust level is so large that it will be practically impossible to reach this level. This conclusion is easily obtained using the maximum number of days needed to finish a checking cycle for the “high trust” level (896) multiplied by the number of successfully concluded cycles to reach the level of “very high trust” (384 − 202 = 182). The result of this calculation is 163,072 days (182 * 896), which is approximately 447 years.

Although this is mathematically correct, in practice, this situation would never occur. The simple explanation for this fact is related to the number of files that are simultaneously monitored by the ICS in the CSS. The maximum expected time for the file-checking cycle conclusion only occurs when the number of monitored files in a CSS, classified with the level “high trust,” is equal to 25 or a multiple of this value. According to Table 1, this is due to the fact that, at the “high trust” level, it is required that 16% of the file content be checked per day. The maximum time spent in file checking only occurs when the result of this file percentage calculation is equal to an integer value. Otherwise, the result is rounded up, thus increasing the percentage of files effectively checked.

Indeed, if the ICS is monitoring exactly 25 files in a CSS that is classified with the “high trust” level, and supposing that these files were submitted to the CSS on the same day, the checking cycles for this set of files will finish in 896 days. Since in a period of 896 days there are 25 concluded cycles, about 20 years are needed for the CSS to attain the 182 cycles required for reaching the next level, “very high trust.” However, this situation worsens if the number of considered files decreases. For instance, considering the “high trust” level, if there are only six files being monitored, then the time to attain the next level exceeds 65 years.

Figure 4 presents a comparative view of the time required to upgrade to the next trust level according to the number of monitored files. In general, less time will be required to increase the trust level if there are more monitored files.



Fig. 4 Time to upgrade the trust level according to the number of monitored files

As can be seen in Figure 4, the best case is obtained when the number of monitored files is equal to the required number of successfully concluded cycles to upgrade to the next trust level. For this number of files, the time required to increase the trust level is always equal to the time needed to conclude one checking cycle.
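The arithmetic behind these examples can be sketched as follows, under the simplifying assumptions that all monitored files are submitted on the same day and complete their checking cycles together. The function name `days_to_next_level` is hypothetical, and the cycle length is passed in by the caller because, per the text, it depends on the trust level and on the number of monitored files.

```python
import math


def days_to_next_level(cycles_needed: int, monitored_files: int, cycle_days: int) -> int:
    """Days until enough fault-free cycles have been concluded.

    Each period of cycle_days concludes one cycle per monitored file, so
    the number of periods is cycles_needed / monitored_files, rounded up.
    """
    periods = math.ceil(cycles_needed / monitored_files)
    return periods * cycle_days


single_file = days_to_next_level(182, 1, 896)     # 182 * 896 days
twenty_five = days_to_next_level(182, 25, 896)    # 8 periods of 896 days
best_case = days_to_next_level(182, 182, 896)     # one checking cycle
```

For 25 files this gives 7168 days, or about 19.6 years, in line with the "about 20 years" quoted above, and when the number of files equals the number of required cycles the result is a single checking cycle.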

Opposite to the trust level raising curve that reflects a slow and gradual process, the trust level reduction is designed as a very fast process. The trust value assigned to the CSS always decreases when a challenge result indicates a fault in a checked file.

To evaluate the proposed process for downgrading the measured trust level, calculations were performed aiming to determine how many file-checking failures are needed for a CSS to reach the maximum distrust level. Any trust level between “very high trust” and “low trust” could be used as the starting point to these calculations. Then, when a challenge-response failure is identified, the trust value is changed to zero and the CSS is immediately reclassified to the “low distrust” level. From this level to the “very high distrust” level, the number of file-checking failures required to reach each next distrust level is shown in Figure 5.



Fig. 5 Number of file-checking failures needed to downgrade to each distrust level

Similarly to the trust level raising process, the required minimum time to downgrade to a distrust level is determined by the number of simultaneously-monitored files. Figure 6 presents a comparative view of the required minimum time to downgrade a CSS considering that all monitored files are corrupted and that failures will be identified upon the ICS receiving the first unsuccessful challenge response from the CSS.


References

  1. Tandel, S.T.; Shah, V.K.; Hiranwal, S. (2013). "An implementation of effective XML based dynamic data integrity audit service in cloud". International Journal of Societal Applications of Computer Science 2 (8): 449–553. https://web.archive.org/web/20150118081656/http://ijsacs.org/previous.html. 
  2. 2.0 2.1 Dabas, P.; Wadhwa, D. (2014). "A Recapitulation of Data Auditing Approaches for Cloud Data". International Journal of Computer Applications Technology and Research 3 (6): 329–32. doi:10.7753/IJCATR0306.1002. https://ijcat.com/archieve/volume3/issue6/ijcatr03061002. 
  3. Mell, P.; Grance, T. (September 2011). "The NIST Definition of Cloud Computing". Computer Security Resource Center. https://csrc.nist.gov/publications/detail/sp/800-145/final. 
  4. 4.0 4.1 Miller, M. (2008). Cloud Computing: Web-Based Applications That Change the Way You Work and Collaborate Online. Que Publishing. ISBN 9780789738035. 
  5. Velte, T.; Velte, A.; Elsenpeter, R.C. (2009). Cloud Computing: A Practical Approach. McGraw-Hill Education. ISBN 9780071626941. 
  6. Zhou, M.; Zhang, R.; Zeng, D.; Qian, W. (2010). "Services in the Cloud Computing era: A survey". Proceedings from the 4th International Universal Communication Symposium: 40–46. doi:10.1109/IUCS.2010.5666772. 
  7. Jing, X.; Jian-Jun, Z. (2010). "A Brief Survey on the Security Model of Cloud Computing". Proceedings from the Ninth International Symposium on Distributed Computing and Applications to Business, Engineering and Science: 475–8. doi:10.1109/DCABES.2010.103. 
  8. Mell, P.; Grance, T. (October 2009). "The NIST Definition of Cloud Computing" (PDF). https://www.nist.gov/sites/default/files/documents/itl/cloud/cloud-def-v15.pdf. 
  9. 9.0 9.1 Marsh, S.P. (April 1994). "Formalising Trust as a Computational Concept" (PDF). University of Stirling. http://stephenmarsh.wdfiles.com/local--files/start/TrustThesis.pdf. 
  10. Gambetta, D. (1990). "Can We Trust Trust?". In Gambetta, D. (PDF). Trust: Making and Breaking Cooperative Relations (2008 Scanned Digital Copy). ISBN 0631155066. https://www.nuffield.ox.ac.uk/media/1779/gambetta-trust_making-and-breaking-cooperative-relations.pdf. 
  11. Jøsang, A.; Knapskog, S.J. (2011). "A Metric for Trusted Systems" (PDF). Proceedings from the 21st National Information Systems Security Conference. https://csrc.nist.gov/csrc/media/publications/conference-paper/1998/10/08/proceedings-of-the-21st-nissc-1998/documents/papera2.pdf. 
  12. Victor, P.; De Cock, M.; Cornelis, C. (2011). "Trust and Recommendations". In Ricci, F.; Rokach, L.; Shapira, B.; Kantor, P.B.. Recommender Systems Handbook. Springer. pp. 645–75. ISBN 9780387858197. 
  13. Adnane, A.; Bidan, C.; de Sousa Júnior, R.T. (2013). "Trust-based security for the OLSR routing protocol". Computer Communications 36 (10–11): 1159-71. doi:10.1016/j.comcom.2013.04.003. 
  14. De Sousa Jr., R.T.; Puttini, R.S. (2010). "Trust Management in Ad Hoc Networks". In Yan, Z.. Trust Modeling and Management in Digital Environments: From Social Concept to System Development. IGI Global. pp. 224–49. ISBN 9781615206827. 
  15. 15.0 15.1 15.2 Yahalom, R.; Klein, B.; Beth, T. (1993). "Trust relationships in secure systems-a distributed authentication perspective". Proceedings from the 1993 IEEE Computer Society Symposium on Research in Security and Privacy: 150–64. doi:10.1109/RISP.1993.287635. 
  16. Grandison, T.; Sloman, M. (2000). "A survey of trust in internet applications". IEEE Communications Surveys & Tutorials 3 (4): 2–16. doi:10.1109/COMST.2000.5340804. 
  17. Bellare, M.; Boldyreva, A.; Micali, S. (2000). "Public-Key Encryption in a Multi-user Setting: Security Proofs and Improvements". Proceedings from Advances in Cryptology — EUROCRYPT 2000: 259–74. doi:10.1109/COMST.2000.5340804. 
  18. Bose, R. (2008). Information Theory, Coding and Cryptography (2nd ed.). Mcgraw Hill Education. pp. 297–8. ISBN 9780070669017. 
  19. Rivest, R. (April 1992). "The MD5 Message-Digest Algorithm". ietf.org. https://tools.ietf.org/html/rfc1321. Retrieved 25 June 2016. 
  20. Dworkin, M.J. (4 August 2015). "SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions". NIST. https://www.nist.gov/publications/sha-3-standard-permutation-based-hash-and-extendable-output-functions. 
  21. Aumasson, J.-P.; Neves, S.; W.-O.; Winnerlein, C. (2013). "BLAKE2: Simpler, Smaller, Fast as MD5". Proceedings from the 2013 International Conference on Applied Cryptography and Network Security: 119–35. doi:10.1007/978-3-642-38980-1_8. 
  22. Tahta, U.E.; Sen, S.; Can, A.B. (2015). "GenTrust: A genetic trust management model for peer-to-peer systems". Applied Soft Computing 34: 693–704. doi:10.1016/j.asoc.2015.04.053. 
  23. Gholami, A.; Arani, M.G. (2015). "A Trust Model Based on Quality of Service in Cloud Computing Environment". International Journal of Database Theory and Application 8 (5): 161–70. https://pdfs.semanticscholar.org/487e/11b3605276b5ff66de363d4e735bcdd740c3.pdf?_ga=2.219348827.37751313.1553622532-1472248397.1551840079. 
  24. Canedo, E.D. (30 January 2013). "Modelo de confiança para a troca de arquivos em uma nuvem privada - Tese (Doutorado em Engenharia Elétrica)". Universidade de Brasília. http://repositorio.unb.br/handle/10482/11987. 
  25. 25.0 25.1 Juels, A.; Kaliski, Jr., B.S. (2007). "PORs: Proofs of retrievability for large files". Proceedings of the 14th ACM Conference on Computer and Communications Security: 584–97. doi:10.1145/1315245.1315317. 
  26. Kumar, R.S.; Saxena, A. (2011). "Data integrity proofs in cloud storage". Proceedings of the Third International Conference on Communication Systems and Networks: 1–4. doi:10.1109/COMSNETS.2011.5716422. 
  27. George, R.S.; Sabitha, S. (2013). "Data anonymization and integrity checking in cloud computing". Proceedings of the Fourth International Conference on Computing, Communications and Networking Technologies: 1–5. doi:10.1109/ICCCNT.2013.6726813. 
  28. Kavuri, S.K.S.V.A.; Kancherla, G.R.; Bobba, B.R. (2014). "Data authentication and integrity verification techniques for trusted/untrusted cloud servers". Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics: 2590-2596. doi:10.1109/ICACCI.2014.6968657. 
  29. Al-Jaberi, M.F.; Zainal, A. (2014). "Data integrity and privacy model in cloud computing". Proceedings of the 2014 International Symposium on Biometrics and Security Technologies: 280-284. doi:10.1109/ISBAST.2014.7013135. 
  30. Kai, H.; Chuanhe, H.; Jinhai, W. et al. (2013). "An Efficient Public Batch Auditing Protocol for Data Security in Multi-cloud Storage". Proceedings of the 8th ChinaGrid Annual Conference: 51-56. doi:10.1109/ChinaGrid.2013.13. 
  31. Wang, Q.; Wang, C.; Li, J. et al. (2009). "Enabling public verifiability and data dynamics for storage security in cloud computing". Proceedings of the 14th European conference on Research in computer security: 355–70. doi:10.1007/978-3-642-04444-1_22. 
  32. 32.0 32.1 Pflanzner, T.; Tornyai, R.; Kertesz, A. (2016). "Towards Enabling Clouds for IoT: Interoperable Data Management Approaches by Multi-clouds". In Mahmood, Z. publisher=Springer. Connectivity Frameworks for Smart Devices. pp. 187–207. doi:10.1007/978-3-319-33124-9_8. ISBN 9783319331225. 
  33. Gracia-Tinedo, R.; Artigas, M.S.; Moreno-Martinez, A. et al. (2013). "Actively Measuring Personal Cloud Storage". Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing: 301-308. doi:10.1109/CLOUD.2013.25. 
  34. Pinheiro, A.; Canedo, E.D.; De Sousa, Jr.; R.T. et al. (2016). "A Proposed Protocol for Periodic Monitoring of Cloud Storage Services Using Trust and Encryption". Proceedings of the 2016 International Conference on Computational Science and Its Applications: 45–59. doi:10.1007/978-3-319-42108-7_4. 
  35. Pinheiro, A.; Canedo, E.D.; De Sousa, Jr.; R.T. et al. (2016). "Trust-Oriented Protocol for Continuous Monitoring of Stored Files in Cloud". Proceedings of the Eleventh International Conference on Software Engineering Advances: 295–301. https://thinkmind.org/index.php?view=article&articleid=icsea_2016_13_20_10164. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation, grammar, and punctuation. In some cases important information was missing from the references, and that information was added.