Pitt Logo LinkContact Us

Metrics and Algorithms for Processing Multiple Continuous Queries

Sharaf, Mohamed (2007) Metrics and Algorithms for Processing Multiple Continuous Queries. Doctoral Dissertation, University of Pittsburgh.

[img]
Preview
PDF - Primary Text
Download (855Kb) | Preview

    Abstract

    Data streams processing is an emerging research area that is driven by the growing need for monitoring applications. A monitoring application continuously processes streams of data for interesting, significant, or anomalous events. Such applications include tracking the stock market, real-time detection of diseaseoutbreaks, and environmental monitoring via sensor networks.Efficient employment of those monitoring applications requires advanced data processing techniques that can support the continuous processing of unbounded rapid data streams. Such techniques go beyond the capabilities of the traditional store-then-query Data BaseManagement Systems. This need has led to a new data processing paradigm and created a new generation of data processing systems,supporting continuous queries (CQ) on data streams.Primary emphasis in the development of first generation Data Stream Management Systems (DSMSs) was given to basic functionality. However, in order to support large-scale heterogeneous applications that are envisioned for subsequent generations of DSMSs, greater attention willhave to be paid to performance issues. Towards this, this thesis introduces new algorithms and metrics to the current design of DSMSs.This thesis identifies a collection of quality ofservice (QoS) and quality of data (QoD) metrics that are suitable for a wide range of monitoring applications. The establishment of well-defined metrics aids in the development of novel algorithms that are optimal with respect to a particular metric. Our proposed algorithms exploit the valuable chances for optimization that arise in the presence of multiple applications. Additionally, they aim to balance the trade-off between the DSMS's overall performance and the performance perceived by individual applications. Furthermore, we provide efficient implementations of the proposed algorithms and we also extend them to exploit sharing in optimized multi-query plans and multi-stream CQs. Finally, we experimentally show that our algorithms consistently outperform the current state of the art.


    Share

    Citation/Export:
    Social Networking:

    Details

    Item Type: University of Pittsburgh ETD
    ETD Committee:
    ETD Committee TypeCommittee MemberEmail
    Committee ChairChrysanthis , Panos Kpanos@cs.pitt.edu
    Committee MemberLabrinidis, Alexandroslabrinid@cs.pitt.edu
    Committee MemberFaloutsos, Christoschristos@cs.cmu.edu
    Committee MemberPruhs, Kirkkirk@cs.pitt.edu
    Committee MemberAref, Walidaref@cs.purdue.edu
    Title: Metrics and Algorithms for Processing Multiple Continuous Queries
    Status: Unpublished
    Abstract: Data streams processing is an emerging research area that is driven by the growing need for monitoring applications. A monitoring application continuously processes streams of data for interesting, significant, or anomalous events. Such applications include tracking the stock market, real-time detection of diseaseoutbreaks, and environmental monitoring via sensor networks.Efficient employment of those monitoring applications requires advanced data processing techniques that can support the continuous processing of unbounded rapid data streams. Such techniques go beyond the capabilities of the traditional store-then-query Data BaseManagement Systems. This need has led to a new data processing paradigm and created a new generation of data processing systems,supporting continuous queries (CQ) on data streams.Primary emphasis in the development of first generation Data Stream Management Systems (DSMSs) was given to basic functionality. However, in order to support large-scale heterogeneous applications that are envisioned for subsequent generations of DSMSs, greater attention willhave to be paid to performance issues. Towards this, this thesis introduces new algorithms and metrics to the current design of DSMSs.This thesis identifies a collection of quality ofservice (QoS) and quality of data (QoD) metrics that are suitable for a wide range of monitoring applications. The establishment of well-defined metrics aids in the development of novel algorithms that are optimal with respect to a particular metric. Our proposed algorithms exploit the valuable chances for optimization that arise in the presence of multiple applications. Additionally, they aim to balance the trade-off between the DSMS's overall performance and the performance perceived by individual applications. Furthermore, we provide efficient implementations of the proposed algorithms and we also extend them to exploit sharing in optimized multi-query plans and multi-stream CQs. Finally, we experimentally show that our algorithms consistently outperform the current state of the art.
    Date: 27 September 2007
    Date Type: Completion
    Defense Date: 22 June 2007
    Approval Date: 27 September 2007
    Submission Date: 09 August 2007
    Access Restriction: No restriction; The work is available for access worldwide immediately.
    Patent pending: No
    Institution: University of Pittsburgh
    Thesis Type: Doctoral Dissertation
    Refereed: Yes
    Degree: PhD - Doctor of Philosophy
    URN: etd-08092007-134505
    Uncontrolled Keywords: algorithms; continuous queries; data streams; databases; scheduling
    Schools and Programs: Dietrich School of Arts and Sciences > Computer Science
    Date Deposited: 10 Nov 2011 14:58
    Last Modified: 13 Apr 2012 16:38
    Other ID: http://etd.library.pitt.edu/ETD/available/etd-08092007-134505/, etd-08092007-134505

    Actions (login required)

    View Item

    Document Downloads