Optimization Strategies for A/B Testing on HADOOP

Cherniak, Andrii and Zaidi, Huma and Zadorozhny, Vladimir (2013) Optimization Strategies for A/B Testing on HADOOP. In: International Conference on Very Large Data Bases (VLDB'13), Riva del Garda, Trento, Italy.

Abstract

In this work, we present a set of techniques that considerably improve the performance of executing concurrent MapReduce jobs. Our proposed solution relies on proper resource allocation for concurrent Hive jobs based on data dependency, inter-query optimization, and modeling of Hadoop cluster load. To the best of our knowledge, this is the first work on Hive/MapReduce job optimization that takes Hadoop cluster load into consideration. We perform an experimental study that demonstrates a 233% reduction in execution time for the concurrent vs. sequential execution schema. We report up to 40% extra reduction in execution time for concurrent job execution after resource usage optimization. The results reported in this paper were obtained in a pilot project to assess the feasibility of migrating A/B testing from a Teradata + SAS analytics infrastructure to Hadoop. This work was performed on an eBay production Hadoop cluster that uses the capacity scheduler. Our analytics jobs were implemented using Apache Hive.


Details

Item Type: Conference or Workshop Item (Paper)
Status: Published
Creators/Authors: Cherniak, Andrii; Zaidi, Huma; Zadorozhny, Vladimir
Date: 26 August 2013
Date Type: Publication
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Journal or Publication Title: Proceedings of 39th International Conference on Very Large Data Bases (VLDB'13)
Event Title: International Conference on Very Large Data Bases (VLDB'13)
Event Type: Conference
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Information Science
Refereed: Yes
Date Deposited: 15 Jul 2013 14:53
Last Modified: 25 Aug 2017 05:03
URI: http://d-scholarship.pitt.edu/id/eprint/19291
