Liu, Jiwei
(2018)
Efficient Synchronization for GPGPU.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
High-performance General Purpose Graphics processing units (GPGPUs) have exposed bottlenecks in synchronizations of threads and cores. The massively parallel computing cores and complex hierarchies of threads present new challenges for synchronizations at different granularities. Performance of GPU is hindered by inefficient global and local synchronizations. I propose hardware-software cooperative frameworks for efficient synchronization of GPGPU to address the following issues.
To provide efficient global synchronization (Gsync), an API with direct hardware support is proposed. The GPU cores are synchronized by an on-chip Gsync controller. Partial context switch is employed to guarantee deadlock-free execution. The proposed Gsync avoids expensive API calls and alleviates data thrashing. Prioritized warp scheduling is used to increase the overlap of context switch with kernel execution.
To efficiently exploit the inherent parallelism of producer-consumer problems, a flexible wait-signal scheme is proposed at thread-block level. I propose dedicated APIs to express fine-grained static and dynamic dependencies with hardware support. The proposed scheme can accelerate wavefront, graph and machine learning applications. The architectural design of on-chip wait-signal controller eliminates busy wait loop and long-latency memory operations. I also propose thread block dispatch scheduling to address the problem of load imbalance and large context switch overhead.
To reduce stall due to synchronizations, a synchronization-aware warp scheduling is proposed to coordinate multiple warp schedulers upon synchronization events. Both performance and hardware utilization are improved by resolving the barrier sooner.
Share
Citation/Export: |
|
Social Networking: |
|
Details
Item Type: |
University of Pittsburgh ETD
|
Status: |
Unpublished |
Creators/Authors: |
|
ETD Committee: |
|
Date: |
25 September 2018 |
Date Type: |
Publication |
Defense Date: |
12 June 2018 |
Approval Date: |
25 September 2018 |
Submission Date: |
20 July 2018 |
Access Restriction: |
No restriction; Release the ETD for access worldwide immediately. |
Number of Pages: |
134 |
Institution: |
University of Pittsburgh |
Schools and Programs: |
Swanson School of Engineering > Electrical Engineering |
Degree: |
PhD - Doctor of Philosophy |
Thesis Type: |
Doctoral Dissertation |
Refereed: |
Yes |
Uncontrolled Keywords: |
GPU Synchronization |
Date Deposited: |
25 Sep 2018 15:55 |
Last Modified: |
25 Sep 2018 15:55 |
URI: |
http://d-scholarship.pitt.edu/id/eprint/34943 |
Metrics
Monthly Views for the past 3 years
Plum Analytics
Actions (login required)
|
View Item |