Palanisamy, B and Singh, A and Mandagere, N and Alatorre, G and Liu, L
(2014)
MapReduce analysis for cloud-archived data.
Proceedings - 14th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2014.
51 - 60.
Abstract
Public storage clouds have become a popular choice for archiving certain classes of enterprise data - for example, application and infrastructure logs. These logs contain sensitive information such as IP addresses or user logins, so regulatory and security requirements often require the data to be encrypted before it is moved to the cloud. In order to leverage such data for any business value, analytics systems (e.g. Hadoop/MapReduce) first download data from these public clouds, decrypt it and then process it at the secure enterprise site. We propose VNCache: an efficient solution for MapReduce analysis of such cloud-archived log data without requiring an a priori data transfer and loading into the local Hadoop cluster. VNCache dynamically integrates cloud-archived data into a virtual namespace at the enterprise Hadoop cluster. Through a seamless data streaming and prefetching model, Hadoop jobs can begin execution as soon as they are launched without requiring any a priori downloading. With VNCache's accurate prefetching and caching, jobs often run on a local cached copy of the data block, significantly improving performance. When no longer needed, data is safely evicted from the enterprise cluster, reducing the total storage footprint. Uniquely, VNCache is implemented with NO changes to the Hadoop application stack. © 2014 IEEE.
Details
Item Type: Article
Status: Published
Creators/Authors:
  Palanisamy, B (Email: BPALAN@pitt.edu; Pitt Username: BPALAN)
  Singh, A
  Mandagere, N
  Alatorre, G
  Liu, L
Date: 1 January 2014
Date Type: Publication
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Journal or Publication Title: Proceedings - 14th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2014
Page Range: 51 - 60
Event Type: Conference
DOI or Unique Handle: 10.1109/ccgrid.2014.13
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Information Science
Refereed: Yes
ISBN: 9781479927838
Date Deposited: 24 Jun 2014 20:09
Last Modified: 02 Feb 2019 16:55
URI: http://d-scholarship.pitt.edu/id/eprint/22063