Hardware-Oriented Cache Management for Large-Scale Chip Multiprocessors

Hammoud, Mohammad Hussein (2010) Hardware-Oriented Cache Management for Large-Scale Chip Multiprocessors. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


Abstract

One of the key requirements for obtaining high performance from chip multiprocessors (CMPs) is to effectively manage the limited on-chip cache resources shared among co-scheduled threads/processes. This thesis proposes new hardware-oriented solutions for distributed CMP caches. Computer architects face growing challenges when designing cache systems for CMPs. These challenges result from non-uniform access latencies, interference misses, the bandwidth wall problem, and diverse workload characteristics. Our exploration of the CMP cache management problem suggests a CMP caching framework (CC-FR) that defines three main approaches to solving the problem: (1) data placement, (2) data retention, and (3) data relocation. We implement CC-FR's components by proposing and evaluating multiple cache management mechanisms.

Pressure and Distance Aware Placement (PDA) decouples the physical locations of cache blocks from their addresses in order to reduce misses caused by destructive interference. Flexible Set Balancing (FSB), on the other hand, reduces interference misses by extending the lifetime of cache lines: it retains some fraction of the working set at underutilized local sets to satisfy far-flung reuses. PDA implements CC-FR's data placement and relocation components, and FSB applies CC-FR's retention approach.

To alleviate non-uniform access latencies and adapt to phase changes in programs, Adaptive Controlled Migration (ACM) dynamically and periodically promotes cache blocks towards L2 banks close to requesting cores. ACM falls under CC-FR's data relocation category. Dynamic Cache Clustering (DCC), on the other hand, addresses diverse workload characteristics and the growing challenge of non-uniform access latencies by constructing a cache cluster for each core and expanding/contracting all clusters synergistically to match each core's cache demand. DCC implements CC-FR's data placement and relocation approaches.

Lastly, Dynamic Pressure and Distance Aware Placement (DPDA) combines PDA and ACM to cooperatively mitigate interference misses and non-uniform access latencies. Dynamic Cache Clustering and Balancing (DCCB), on the other hand, combines DCC and FSB to employ all of CC-FR's categories and achieve higher system performance. Simulation results demonstrate the effectiveness of the proposed mechanisms and show that they compare favorably with related cache designs.


Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Hammoud, Mohammad Hussein	moh7@pitt.edu	MOH7

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Melhem, Rami	melhem@cs.pitt.edu	MELHEM
Committee CoChair	Cho, Sangyeun	cho@cs.pitt.edu	SANGYEUN
Committee Member	Childers, Bruce	childers@cs.pitt.edu	CHILDERS
Committee Member	Yang, Jun	juy9@pitt.edu	JUY9

Date:

30 September 2010

Date Type:

Completion

Defense Date:

1 July 2010

Approval Date:

30 September 2010

Submission Date:

6 July 2010

Access Restriction:

No restriction; Release the ETD for access worldwide immediately.

Institution:

University of Pittsburgh

Schools and Programs:

Dietrich School of Arts and Sciences > Computer Science

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

Data Placement; Data Relocation; Chip Multiprocessors; Data Retention

Other ID:

http://etd.library.pitt.edu/ETD/available/etd-07062010-215303/, etd-07062010-215303

Date Deposited:

10 Nov 2011 19:50

Last Modified:

15 Nov 2016 13:45

URI:

http://d-scholarship.pitt.edu/id/eprint/8284
