
SCALABLE PROCESSING OF MULTIPLE AGGREGATE CONTINUOUS QUERIES

Guirguis, Shenoda (2012) SCALABLE PROCESSING OF MULTIPLE AGGREGATE CONTINUOUS QUERIES. Doctoral Dissertation, University of Pittsburgh.

PDF (reviewed and approved by dean's office) - Accepted Version

    Abstract

    Data Stream Management Systems (DSMSs) were developed to be at the heart of every monitoring application. Monitoring applications typically register hundreds of Continuous Queries (CQs) in DSMSs in order to continuously process unbounded data streams to detect events of interest. DSMSs must be designed to efficiently handle unbounded streams with large volumes of data and large numbers of CQs, i.e., exhibit scalability. This need for scalability means that the underlying processing techniques a DSMS adopts should be optimized for high throughput (i.e., tuple output rate). Towards this, two main approaches were proposed in the literature: (1) Multiple Query Optimization (MQO) and (2) Scheduling. In this dissertation we focus on optimizing the processing of multiple Aggregate Continuous Queries (ACQs), given their high processing cost and popularity in all monitoring applications.

    Specifically, in this dissertation, we explore shared processing of ACQs and introduce the concept of 'Weaveability' as an indicator of the potential gains of sharing the processing of ACQs. We develop Weave Share, a multiple ACQs optimizer that considers the different uncorrelated factors of the processing cost, such as the input rate and ACQs’ specifications. In order to fully reap the benefits of the new weave-based optimization techniques, we conceptualize a new underlying aggregate operator implementation and realize it in the TriOps framework. TriOps enables adaptive sharing of multiple ACQs that have different window specification, predicates and group-by attributes. The properties of the proposed techniques are studied analytically and their performance advantages are experimentally evaluated using simulation and in the context of the AQSIOS DSMS prototype.



    Details

    Item Type: University of Pittsburgh ETD
    ETD Committee:
    ETD Committee Type    Committee Member         Email                   ORCID
    Committee Chair       Chrysanthis, Panos K.    panos@cs.pitt.edu
    Committee CoChair     Labrinidis, Alexandros   labrinid@cs.pitt.edu
    Committee Member      Pruhs, Kirk              kirk@cs.pitt.edu
    Committee Member      Mokbel, Mohamed          mokbel@cs.umn.edu
    Committee Member      Sharaf, Mohamed          m.sharaf@uq.edu.au
    Title: SCALABLE PROCESSING OF MULTIPLE AGGREGATE CONTINUOUS QUERIES
    Status: Published
    Date: 01 February 2012
    Date Type: Publication
    Defense Date: 23 August 2011
    Approval Date: 01 February 2012
    Submission Date: 06 November 2011
    Release Date: 01 February 2012
    Access Restriction: No restriction; The work is available for access worldwide immediately.
    Patent pending: No
    Number of Pages: 130
    Institution: University of Pittsburgh
    Thesis Type: Doctoral Dissertation
    Refereed: Yes
    Degree: PhD - Doctor of Philosophy
    Additional Information: secondary email: shenoda.work@gmail.com
    Uncontrolled Keywords: Data Streams Management Systems, Continuous Queries, Query Optimization, Scalable Processing, Aggregation, AQSIOS, Weaveability.
    Schools and Programs: Dietrich School of Arts and Sciences > Computer Science
    Date Deposited: 01 Feb 2012 07:20
    Last Modified: 16 Jul 2014 17:02
