Pervasive Computing: Next Generation Platforms for Intelligent Data Collection
Ebook · 1,017 pages · 11 hours

About this ebook

Pervasive Computing: Next Generation Platforms for Intelligent Data Collection presents current advances and state-of-the-art work on methods, techniques, and algorithms designed to support pervasive collection of data under ubiquitous networks of devices able to intelligently collaborate towards common goals.

Using numerous illustrative examples and drawing on both theoretical and practical results, the authors discuss: a coherent and realistic picture of today’s architectures, techniques, protocols, components, orchestration, choreography, and developments related to pervasive computing components for intelligently collecting data; resource and data management issues; the importance of data security and privacy in the era of big data; and the benefits of pervasive computing and the development process for scientific and commercial applications and platforms that support them in this field.

Pervasive computing has developed technology that allows sensing, computing, and wireless communication to be embedded in everyday objects, from cell phones to running shoes, enabling a range of context-aware applications. It is supported by technology able to acquire and make use of the ubiquitous data sensed or produced by the many sensors blended into our environment, designed to make available a wide range of new context-aware applications and systems. While such applications and systems are useful, the time has come to develop the next generation of pervasive computing systems. Future systems will be data oriented and will need to support quality data in terms of accuracy, latency, and availability.

Pervasive Computing is intended as a platform for the dissemination of research efforts and presentation of advances in the pervasive computing area, and constitutes a flagship driver towards presenting and supporting advanced research in this area.

Indexing: The books of this series are submitted to EI-Compendex and SCOPUS

  • Offers a coherent and realistic image of today’s architectures, techniques, protocols, components, orchestration, choreography, and development related to pervasive computing
  • Explains the state-of-the-art technological solutions necessary for the development of next-generation pervasive data systems, including: components for intelligently collecting data, resource and data management issues, fault tolerance, data security, monitoring and controlling big data, and applications for pervasive context-aware processing
  • Presents the benefits of pervasive computing, and the development process of scientific and commercial applications and platforms to support them in this field
  • Provides numerous illustrative examples and follows both theoretical and practical results to serve as a platform for the dissemination of research advances in the pervasive computing area
Language: English
Release date: May 6, 2016
ISBN: 9780128037027

Book preview

Pervasive Computing - Ciprian Dobre


Part I

Automated Capture of Experiences with Easy Access

Chapter 1

On preserving privacy in cloud computing using ToR

A. Carnielli (a); M. Aiash (a); M. Alazab (b)

(a) Middlesex University, London, United Kingdom
(b) Macquarie University, Sydney, NSW, Australia

Abstract

The Internet allows for an efficient, inexpensive collection of information without surfers’ consent. This includes surfers’ preferences, interests, or even credit card information. Furthermore, it is well known that some oppressive governments filter the information that can digitally reach their countries. There is also evidence that certain governments spy on citizens and organizations by intercepting phone calls and monitoring activities on social media and blogs. For Internet browsing, several privacy-preserving methods and anonymizers have been developed. An example of such a technology is The Onion Router (ToR) network, which has been proven to guarantee users’ anonymity and privacy while they are browsing the Internet. However, with the wide deployment of virtualization solutions and the tremendous migration to cloud-based services, many online services have moved onto the cloud. This situation complicates the issue of preserving a client’s privacy and anonymity and calls into question the efficiency of current anonymizers in this new environment. To the best of our knowledge, few studies have been conducted to analyze the compatibility between the ToR network and this new emerging infrastructure. This paper presents a qualitative analysis of the feasibility and efficiency of ToR in cloud infrastructures and the possibility of integrating these technologies.

Keywords

Privacy; Cloud computing; ToR; Anonymity; Compatibility analysis

1 Introduction

Over the past two decades, threats to users’ privacy and identities have been of rising concern for Internet surfers. According to a report on anonymity, privacy, and security online (Dingledine et al., 2004), most Internet users would like to be anonymous online, at least occasionally, but many think it is not possible to be completely anonymous. The reasons behind such a belief vary from the increasing number of data leakage scandals to surfers’ growing demand not to be observed by specific people, organizations, or governments. The report in Dingledine et al. (2004) shows that 55% of Internet users have taken steps online to remove or mask their digital footprints and to avoid having their activities monitored and traced. Such steps range from clearing cookies to encrypting email and using virtual networks that mask their Internet Protocol (IP) address. This attitude is ascribed to the fact that certain authoritative bodies and governments tend to misuse their power and spy on their own people. Several governments also filter specific websites through their border gateways in order to prevent flows of information from reaching their countries (as is the case for the People’s Republic of China, North Korea, Bahrain, Iran, and Vietnam (Rininsland, 2012), or, most recently, Venezuela (Bajak, 2014)).

Governments typically justify these wire-tapping and interception activities as measures to ensure national security. Nevertheless, because of the invasive nature of such methods, Internet users began to design and develop software in the form of browsing anonymizers and communication obfuscation techniques to maintain their privacy and anonymity in the cyber world. Furthermore, considering the situation and events in some parts of the world, such as the Arab Spring in the Middle East, hiding one’s identity while browsing the Internet might be crucial for personal safety in these oppressive countries.

This situation highlights the huge demand placed on security mechanisms for protecting users’ privacy and anonymity on the Internet. An example of such a mechanism that has been made freely available to users is called The Onion Router (ToR) (Fang et al., 2000; Danezis et al., 2010; Michael et al., 2010). ToR has been designed to make it possible for users to surf the Internet anonymously, so that their activities and locations cannot be discovered by government agencies, corporations, or anyone else. Compared with other anonymizers such as Invisible Internet Project (I2P) (Wang et al., 2011), ToR is more popular; has more visibility in the academic and hacker communities; and benefits from formal studies of anonymity, resistance, and performance.

ToR and other anonymizers were initially conceived to run over the traditional Internet. However, the emergence and wide adoption of cloud-based services raises huge doubts about the efficiency of these anonymizers if used with these new technologies. Furthermore, customers of cloud computing have more concerns about the secrecy of their data, due to the fact that such data are being moved from the client’s local devices to third party online storage. The purpose of this research is to analyze the feasibility of running the ToR network on top of an infrastructure embracing both the traditional and the cloud computing models.

The rest of the paper is organized as follows. Section 2 gives an overview of cloud computing architecture and sheds light on issues such as privacy and anonymity in cloud computing. Section 3 describes the ToR network. Our experiments to deploy ToR with cloud-based services are described in Section 4.

2 Overview of cloud computing

Arguably, there is no single, unified definition of cloud computing. Being a relatively new concept, different institutions and researchers define cloud computing in their own way; obviously, the key features of the cloud remain the same in each definition. Foster et al. (in Zhao et al., 2008) give a precise and perhaps complicated delineation of this new technology. They define cloud computing as “a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.” In this research, when referring to cloud computing, we avail ourselves of the following simpler description, given by Danish et al. (in Wang et al., 2011): “Cloud computing technology is a new concept of providing dramatically scalable and virtualized resources, bandwidth, software and hardware on demand to consumers. Consumers can typically request cloud services via a web browser or web service.”

The statement is self-explanatory. In simple terms, cloud computing is a revolutionary way of providing users (who may be single persons or even medium-size companies) with all the kinds of IT resources they may need in order to fulfill their IT infrastructure requirements. In a cloud computing environment, there are two main players: the cloud provider (CP), who owns a very large data center and leases hardware and software resources, and the cloud consumer, who pays a fee to use such resources. Three main categories (models) exist for cloud deployment: public cloud, private cloud, and hybrid cloud, as shown in Fig. 1.

Fig. 1 Types of cloud computing. From Ribeiro, M., 2010. Thoughts on Information Technology. https://itechthoughts.wordpress.com/.

When the hardware and software resources of a CP’s data center are available as a pay-as-you-go service to the general public, the infrastructure is called a public cloud. When, on the other hand, the data center’s resources are only available to a group of users confined within the perimeter of a company’s internal network (such as its intranet), the infrastructure is called a private cloud. A hybrid cloud arises when a large corporation or institution that runs a private cloud also has to use some public cloud resources in order to accomplish tasks that were meant to be executed within the private cloud environment. The latter situation can occur when the amount of work to be accomplished in a specific time frame is so high that the private cloud’s infrastructure is not sufficient.

2.1 Cloud Computing Reference Model

In cloud computing, thanks to improvements in virtualization techniques, the availability of hardware and software components is extremely high; resources therefore appear infinite to consumers, who do not have to worry about under-provisioning. All resources are, in fact, used almost ceaselessly because they are virtualized (estimates of the average server utilization in conventional data centers range from 5% to 20% (Rivlin, 2008)), and hence their throughput is maximized. Apart from under-provisioning, users do not need to worry about over-provisioning either, because cloud consumers pay only for the resources they need on a very short-term basis (for certain resources, such as CPUs, the granularity is as fine as an hourly fee).

To appreciate the benefits of cloud computing for online service providers, consider a photo-sharing website: during peak seasons the website will most likely experience heavy demand on its resources, while outside these seasons demand will be minimal. By migrating the service to the cloud, the company managing the website need not own any of the equipment needed to run its business, and all hardware and software may be kept and managed remotely by the CP. The great advantage of this approach is that the company benefits from the elasticity of cloud computing: the company requests and pays for more resources from the CP when needed, and these resources are released once they are no longer required after the end of the high season. This example highlights the importance of cloud computing for flourishing new businesses. One major technology that enables this flexibility in cloud computing is virtualization. With virtualization, a CP uses a virtual machine monitor to manage and share the hardware resources among different virtual machines and can also seamlessly set up and run more virtual machines to accommodate customers’ demands. This concept is shown in Fig. 2.

Fig. 2 VM architecture. From Walker, G., 2012. Cloud Computing Fundamentals. https://www.ibm.com/developerworks/cloud/library/cl-cloudintro/.

As shown in Fig. 3, cloud computing comprises the following computing services at both hardware and software levels (Aiash et al., 2015).

Fig. 3 Cloud computing architecture. From Aiash, M., et al., 2015. Introducing a hybrid infrastructure and information-centric approach for secure cloud computing. In: Proceedings of IEEE 29th International Conference on Advanced Information Networking and Applications Workshops (WAINA).

• Infrastructure as a Service (IaaS) provides the infrastructural components in terms of processing, storage, and networking. It uses virtualization techniques to provide multi-tenancy, scalability, and isolation; different virtual machines can be allocated to a single physical machine, known as the host. Examples of such services are Amazon S3 and EC2, Mosso, and OpenNebula (Varia et al., 2014; Llorente, 2014); a minimal provisioning sketch follows this list.

• Platform as a Service (PaaS) provides the service of running applications without the hassle of maintaining the hardware and software infrastructure of the IaaS. Google App Engine and Microsoft Azure (Tulloch, 2013) are examples of PaaS.

• Software as a Service (SaaS) is a model of software deployment that enables end-users to run their software and applications on-demand. Examples of SaaS are Salesforce.com and Clarizen.com.
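
As a concrete illustration of the on-demand nature of IaaS, the following sketch requests and then releases a single virtual machine through Amazon EC2’s boto3 SDK. The region, machine image, and instance type are placeholders chosen for illustration, not values taken from this chapter.

```python
# Illustrative sketch: requesting an IaaS compute instance on demand via
# Amazon EC2's boto3 SDK. The AMI ID and instance type are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Ask the provider for one virtual machine; billing is per unit of time,
# reflecting the pay-as-you-go model described above.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder machine image
    InstanceType="t3.micro",           # small, hourly-billed instance
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print("Provisioned instance:", instance_id)

# When the workload ends, the instance is released and billing stops.
ec2.terminate_instances(InstanceIds=[instance_id])
```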

2.2 Privacy in the Cloud

“A significant barrier to the adoption of cloud services is customer fear of privacy loss in the cloud” (Khan et al., 2012). The statement highlights one of the major problems in cloud computing. According to a statistic from the Fujitsu Research Institute (Fujitsu Research Institute, 2010), 88% of potential cloud customers are hesitant about moving to the cloud because of concerns about the privacy of their data. Due to the intrinsic nature of this new technology and to the concerns shown by customers, privacy-preservability is considered a fundamental requirement for cloud computing. When customers decide to entrust their data to a third-party corporation, they have to be aware that the data might be physically stored together with sensitive information belonging to other companies. This implies that an efficient access control system must be put in place in order to avoid privacy violations. To clarify this important concept, let us provide an example. Suppose that two banks, A and B, rely on the same CP and use the same model (IaaS). Each bank holds very confidential information about its clients (account numbers, balances, last transactions, etc.). Due to the CP’s use of virtualization, the sensitive information from A is likely to physically reside together with other information not pertaining to A. Indeed, data belonging to A may be contiguous in the memory device to data belonging to B. This means that if a very efficient access control system is not in place, the two competitors may be able to access each other’s data. One of the various solutions to this problem is an access control policy called the Chinese wall model, which prevents two competitors who have a conflict of interest from accessing information that does not belong to them. The Chinese wall model is illustrated in Fig. 4.

Fig. 4 Chinese wall model. From David, F., et al., 2013. The Chinese Wall Security Policy. http://www.gammassl.co.uk/research/chinesewall.php.

In simple terms, the figure shows that if bank A shares the same CP as bank B, then the data from A will not be readable by B, and vice versa. This is possible thanks to the property of the Chinese wall model that states that “a subject (s) is permitted write access to an object only if (s) has no read access to any object o’, which is in a different company dataset and is unsanitized.” In this case, (s) is bank B, (o’) is the information from bank A, and A and B have two different company datasets. If B tries to access the information from A, the operation will be denied. The same happens if A tries to access the information from B. This example is used only to highlight the severity of the data privacy issue in cloud computing. This is a major issue and has attracted huge research efforts. Discussing these efforts is beyond the scope of this paper; nevertheless, more information about the Chinese wall model can be found in Brewer et al. (2013).
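
To make the policy concrete, the following minimal Python sketch implements the read side of a Chinese-wall check: a subject that has already read data from one bank’s dataset is denied read access to a competitor’s dataset in the same conflict-of-interest class. The write rule quoted above adds a further constraint on top of this. Dataset names, conflict classes, and subjects here are hypothetical.

```python
# Minimal sketch of a Chinese-wall style read check. Each object belongs to a
# company dataset, and each dataset to a conflict-of-interest class.
DATASET_OF = {"a_accounts": "BankA", "b_accounts": "BankB"}
CONFLICT_CLASS = {"BankA": "banking", "BankB": "banking"}

history = {}  # datasets each subject has already read

def may_read(subject, obj):
    """Allow read only if the subject has not read a competing dataset."""
    dataset = DATASET_OF[obj]
    for seen in history.get(subject, set()):
        if seen != dataset and CONFLICT_CLASS[seen] == CONFLICT_CLASS[dataset]:
            return False  # conflict of interest: the wall goes up
    return True

def read(subject, obj):
    if not may_read(subject, obj):
        raise PermissionError(f"{subject} may not read {obj}")
    history.setdefault(subject, set()).add(DATASET_OF[obj])
    return f"{subject} read {obj}"

print(read("analyst", "a_accounts"))      # first access: allowed
try:
    print(read("analyst", "b_accounts"))  # competing dataset in same class
except PermissionError as err:
    print("Denied:", err)
```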

Another example that highlights the importance of privacy in a shared infrastructure such as that of a public cloud is the one of medical data. Imagine if it was possible for users of a cloud to access all the information in that cloud. More technically, imagine if a customer renting a pool of shared memory devices from a CP could access all the information on those devices (even though it did not belong to the customer). The consequences would be dramatic, especially if non-authorized users had access to medical records of the patients of a health institution.

Besides access-control mechanisms, there are other ways to enforce privacy preservation. Confidentiality and integrity together provide a form of data privacy. With confidentiality, unauthorized users are unable to understand the data. With integrity, any attempt to modify the data is detected and the data owner alerted (via a checksum, Cyclic Redundancy Check, or any other integrity-checking algorithm). As Xiao et al. discuss in Xiao and Xiao (2013), “in some sense, privacy-preservability is a stricter form of confidentiality,” since both aim to prevent information leakage. We share the view that confidentiality is an essential factor in enforcing privacy. If data are encrypted and the decryption key is known only to the designated parties, the data can be read but not understood by anyone who does not hold the decryption key, hence guaranteeing privacy. Providing integrity for the encrypted data further ensures that the data have not been altered, which again supports privacy. Unfortunately, encrypting data with an algorithm that is both strong and efficient is a very expensive operation and is not yet feasible without heavily affecting performance. In an environment such as the cloud, where a tsunami of data is processed every second, the number of encryption/decryption operations per second would be extremely high. A solution adopted by experts and advanced users is encryption before submission. In this approach, data and processes are encrypted directly by the user before being submitted to the cloud, where they are handled in encrypted form. This solution has several advantages but has not been widely deployed yet because most ordinary users lack the necessary knowledge. There is another property which is necessary and is argued to enhance privacy, but which may actually undermine it: accountability. Accountability is the process of monitoring the resources used by a customer and charging him or her accordingly. This property is at the center of a battle of opinions, because some researchers claim that accountability and confidentiality are in conflict: availing yourself of one means automatically renouncing the other. We argue that accountability and confidentiality are two distinct properties that can work very well together if the underlying system is designed and configured appropriately.
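
A minimal sketch of encryption before submission follows, assuming the widely used Python cryptography package: the client encrypts its record with AES-GCM (which provides both confidentiality and an integrity check) before anything leaves its device, so the cloud only ever stores ciphertext. The record contents and the upload step are placeholders.

```python
# Sketch of "encryption before submission": the client encrypts and
# integrity-protects its data locally; the cloud stores only ciphertext.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # stays on the client, never uploaded
aesgcm = AESGCM(key)

record = b"account=12345; balance=1000"     # placeholder sensitive record
nonce = os.urandom(12)                      # unique per message
ciphertext = aesgcm.encrypt(nonce, record, None)

# cloud_store(object_id, nonce + ciphertext)  # hypothetical upload call

# On retrieval, decryption fails loudly if the ciphertext was tampered with,
# which provides the integrity check mentioned above.
plaintext = aesgcm.decrypt(nonce, ciphertext, None)
assert plaintext == record
```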

Although a full privacy-preserving solution for cloud computing has not yet been standardized, studies and attempts have been made to integrate such a service into the cloud environment. In Craig (2009), Gentry proposed Fully Homomorphic Encryption (FHE), which works in principle but is still too inefficient for practical use, since the cost of the involved operations far exceeds what average CPUs can handle today. A possible solution to this problem might be a reduced form of homomorphic encryption; an example is the one presented by Naehrig et al. (in Craig, 2009). Itani et al. (2009) presented Privacy-as-a-Service, which exploits the tamper-resistant characteristics of cryptographic coprocessors in order to enable secure storage and data privacy. This scheme could work well if the protected environment in which the coprocessors reside within the cloud infrastructure belongs to a trusted party and is constantly monitored against tampering. Pearson et al. (in Pearson et al., 2009; Miranda and Siani, 2009) discussed a privacy manager that uses obfuscation and de-obfuscation techniques so that data are processed only in encrypted form. In this case, data are obfuscated with a user-chosen key before being sent to the cloud, in such a way that the cloud cannot de-obfuscate them. This guarantees absolute secrecy. Most recently, Jin Li et al. considered the challenge of data privacy in the cloud. In Li et al. (2015a), they proposed L-EncDB, a novel lightweight encryption mechanism for databases, which (i) keeps the database structure and (ii) supports efficient SQL-based queries. Furthermore, in Li et al. (2015b), they considered the problem of ensuring the integrity of data storage in cloud computing. The main aim was to reduce the computational cost to the user during the integrity verification of their data, especially in the case of power/resource-constrained users. To tackle this challenge, they proposed OPoR, a new cloud storage scheme involving a cloud storage server and a cloud audit server.
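
Fully homomorphic encryption is far beyond a short listing, but the underlying idea of computing on data that the cloud cannot read can be illustrated with the much older, additively homomorphic Paillier scheme. The sketch below uses deliberately tiny, insecure parameters and is only a conceptual illustration, not any of the schemes cited above.

```python
# Toy sketch of an *additively* homomorphic scheme (Paillier): multiplying
# ciphertexts adds the underlying plaintexts, so a server can compute on data
# it cannot read. Tiny primes for demonstration only; never use in practice.
import math, random

p, q = 2003, 2011
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
g = n + 1                       # standard simplification g = n + 1
mu = pow(lam, -1, n)            # inverse of L(g^lam mod n^2) = lam

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

c1, c2 = encrypt(20), encrypt(22)
# The product of two ciphertexts decrypts to the sum of the plaintexts.
assert decrypt((c1 * c2) % n2) == 42
```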

2.3 Anonymity in the Cloud

Related to the problem of data privacy is the issue of preserving users’ anonymity online. Arguably, anonymity has long been considered a complementary security feature. However, as described in Section 1, anonymity and privacy are becoming fundamental services that should be available to online customers.

In this regard, Jensen et al. (2010) propose a possible method to guarantee anonymity in the cloud. However, the method cannot work properly unless the resource billing process is flat rate. The solution they propose involves ring and group signatures, with the constraint that the cloud customer uses a flat-rate billing system instead of the usual pay-as-you-go system.

  • In ring signatures, the cloud customers are part of a large group (L) of users. The bigger the group (L), the greater the guaranteed anonymity. Customers wishing to request a cloud resource have to sign the request anonymously with respect to (L). The cloud (which acts as the verifier) will be able to determine that the customer belongs to (L), but will not be able to decide which member of (L) produced the signature. This way, anonymity is guaranteed for requests. However, in the case of a two-way interaction where the customer needs an output, additional anonymity-preserving measures have to be deployed. In addition, because the cloud cannot identify the source of a request under ring signatures, this scheme works only if the users of (L) use a flat-rate billing procedure, by which they pay an initial fee and are then allowed to use resources for a determined time frame. It is not possible to charge using the traditional pay-as-you-go method.

  • In group signatures, on the other hand, cloud customers have to register with a group manager that provides each user with a sort of public key infrastructure (PKI) certificate. Every request from a customer is signed with the certificate, and since the group manager is supposed to be a trusted entity, the identity of the requester will not be disclosed. In the case of policy violations (e.g., the cloud customer uploads illegal material to a website), however, the group manager is authorized to reveal the identity of the lawbreaker to the authorities in charge.

However, for the time being, it is not yet possible to provide a full anonymity solution unless we rely on external anonymizing platforms such as ToR, as described in the following sections.

3 An overview of ToR

ToR is a low-latency, circuit-based, privacy-preserving anonymizing platform and network. It is one of several systems that have been developed to provide Internet users with a high level of privacy and anonymity, in order to cope with censorship measures taken by authorities and to protect against the constantly increasing threats to these two key security properties. ToR achieves its goals by creating an overlay network composed of relays (nodes) that randomly forward users’ data between the originator (source) and the destination. ToR, therefore, operates over the traditional TCP/IP network, sets up an overlay network that hides the identities of both the source and destination nodes, and preserves the confidentiality of the traversing packets. It is worth mentioning that the words node and relay will be used interchangeably throughout the rest of the paper.

3.1 The ToR Network

The ToR network is composed of the ToR-client, an entry/guard node, several relays, and the exit node.

• The ToR-client: Is a piece of software, installed on each ToR user’s device. It enables the user to create a ToR anonymizing circuit and to handle all the cryptographic keys needed to communicate with all the nodes within the circuit.

• The Entry Node: Is the first node in the circuit that receives the client request and forwards it to the second relay in the network.

• The Exit Node: Is the last ToR-relay in the circuit.

Once the connection request leaves the entry node, it will be bounced among all the relays in the circuit until it reaches the exit node. The latter receives the request and relays it to the final destination.

As shown in Fig. 5, the connections in the ToR network between the entry and exit nodes are encrypted using the Advanced Encryption Standard (AES). However, the connection between the exit node and the final destination is not encrypted by ToR. This implies that if the session between the client and the destination is not encrypted by a higher-layer security protocol, such as HTTPS, an attacker residing near the destination will be able to read the data.

Fig. 5 The ToR network. From ToR, 2015a. ToRChat. https://github.com/prof7bit/TorChat.

3.2 Connection/Circuit Setup in ToR

As mentioned previously, ToR uses AES to encrypt the connections between the relays. However, AES is a symmetric encryption algorithm in which the same key is used for encryption and decryption. This means that for the ToR-client to use AES, it needs to share encryption keys with the relaying nodes. For sharing these keys, ToR uses asymmetric encryption as part of the Transport Layer Security (TLS) protocol.

Each connection from one ToR-relay to another within the ToR network is protected by the TLS protocol. When building the circuit, the ToR-client needs to gather the public keys of all the nodes in the circuit and has to establish a connection to each of them; these connections are used to exchange the relevant symmetric keys. To agree on a shared key with a relay, the ToR-client starts a Diffie-Hellman key exchange (Chaum, 1998). Initially, the ToR-client encrypts the Diffie-Hellman challenge with the public key of the receiver; it then encrypts the resulting message with the public key of the ToR-relay preceding the receiver in the circuit. This new message is further encrypted with the public keys of all the remaining relays in the circuit, working backwards in their order of encounter. Eventually, the multi-encapsulated message is sent. This procedure is shown in Fig. 6.

Fig. 6 The ToR crypto-encapsulation. From ToR, 2015b. What is the ToR Browser? https://www.torproject.org/projects/torbrowser.html.en.
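
The per-hop key agreement can be sketched as follows, assuming the Python cryptography package. The sketch shows only the unauthenticated core of a Diffie-Hellman exchange over Curve25519 plus a key-derivation step; the actual ToR handshake additionally authenticates the relay and binds the exchange to the circuit.

```python
# Sketch of the per-hop key agreement idea: client and relay each contribute a
# Diffie-Hellman key pair and derive the same symmetric key for AES.
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

client_priv = X25519PrivateKey.generate()
relay_priv = X25519PrivateKey.generate()

# Each side combines its private key with the other's public key...
client_shared = client_priv.exchange(relay_priv.public_key())
relay_shared = relay_priv.exchange(client_priv.public_key())
assert client_shared == relay_shared

# ...and both derive the same per-hop symmetric key.
def derive_key(shared):
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"per-hop key").derive(shared)

assert derive_key(client_shared) == derive_key(relay_shared)
```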

When the entry node receives the message sent from the ToR-client, it peels off the first layer of encryption using its private key and finds another encrypted message enclosed. The relay cannot decipher the enclosed message, since it is encrypted with the public key of the next relay. However, within the content it can understand, it receives its personal Diffie-Hellman challenge and the instruction to forward the rest of the message to the next relay. When the next relay in the circuit receives the message, it peels off another layer of the multi-encapsulated message (hence the name ToR) and, similarly, cannot understand the inner content except for the forwarding instruction and its own challenge. It is important to note that the routing information of each message is also encrypted with each relay’s public key (which is why this is an encapsulation technique and not simple encryption). The multi-encapsulation guarantees that the intermediate relays are anonymized; each receiver cannot know the identity of the relay that comes after the next one in the circuit. Eventually, when the message reaches the exit relay, the latter removes the final encryption layer using its own private key and retrieves its Diffie-Hellman challenge.
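
The layering and peeling just described can be sketched as follows, assuming per-hop symmetric keys have already been agreed (e.g., via the exchange above) and reducing routing headers to the ordering of the circuit itself. The relay names are hypothetical, and Fernet stands in for the AES-based relay encryption.

```python
# Minimal sketch of onion-style multi-encapsulation: the client wraps the
# payload in one encryption layer per relay (exit layer innermost, entry layer
# outermost), and each relay peels exactly one layer.
from cryptography.fernet import Fernet

circuit = ["entry", "middle", "exit"]                 # hypothetical relay names
keys = {relay: Fernet.generate_key() for relay in circuit}

def build_onion(payload: bytes) -> bytes:
    message = payload
    # Encrypt backwards: exit node's layer first, entry node's layer last.
    for relay in reversed(circuit):
        message = Fernet(keys[relay]).encrypt(message)
    return message

def peel(relay: str, message: bytes) -> bytes:
    # Each relay removes only the layer encrypted for it.
    return Fernet(keys[relay]).decrypt(message)

onion = build_onion(b"GET http://destination.example/ HTTP/1.1")
for relay in circuit:
    onion = peel(relay, onion)          # entry, then middle, then exit

print(onion)   # the exit node recovers the original request
```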
