
Hardware-Oriented Cache Management for Large-Scale Chip Multiprocessors

Hammoud, Mohammad Hussein (2010) Hardware-Oriented Cache Management for Large-Scale Chip Multiprocessors. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



One of the key requirements for obtaining high performance from chip multiprocessors (CMPs) is to effectively manage the limited on-chip cache resources shared among co-scheduled threads/processes. This thesis proposes new hardware-oriented solutions for distributed CMP caches. Computer architects are faced with growing challenges when designing cache systems for CMPs. These challenges result from non-uniform access latencies, interference misses, the bandwidth wall problem, and diverse workload characteristics. Our exploration of the CMP cache management problem suggests a CMP caching framework (CC-FR) that defines three main approaches to solving the problem: (1) data placement, (2) data retention, and (3) data relocation. We implement CC-FR's components by proposing and evaluating multiple cache management mechanisms.

Pressure and Distance Aware Placement (PDA) decouples the physical locations of cache blocks from their addresses to reduce misses caused by destructive interference. Flexible Set Balancing (FSB), on the other hand, reduces interference misses by extending the lifetime of cache lines: it retains a fraction of the working set at underutilized local sets to satisfy far-flung reuses. PDA implements CC-FR's data placement and relocation components, and FSB applies CC-FR's retention approach.

To alleviate non-uniform access latencies and adapt to phase changes in programs, Adaptive Controlled Migration (ACM) dynamically and periodically promotes cache blocks towards L2 banks close to requesting cores. ACM falls under CC-FR's data relocation category. Dynamic Cache Clustering (DCC), on the other hand, addresses diverse workload characteristics and growing non-uniform access latencies by constructing a cache cluster for each core and expanding or contracting all clusters synergistically to match each core's cache demand. DCC implements CC-FR's data placement and relocation approaches.

Lastly, Dynamic Pressure and Distance Aware Placement (DPDA) combines PDA and ACM to cooperatively mitigate interference misses and non-uniform access latencies. Dynamic Cache Clustering and Balancing (DCCB), on the other hand, combines DCC and FSB to employ all of CC-FR's categories and achieve higher system performance. Simulation results demonstrate the effectiveness of the proposed mechanisms and show that they compare favorably with related cache designs.
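
The non-uniform access latency problem mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation of ACM or any other proposed mechanism. It assumes a hypothetical 4x4 tiled CMP with one core and one L2 bank per tile, a simple modulo mapping from block address to home bank, and hop count as a stand-in for latency. All names and parameters are invented for illustration.

```python
# Illustrative only: a 4x4 tiled CMP where each tile holds one core and one L2 bank.
# The mesh size, the modulo home-bank mapping, and all names are assumptions,
# not details taken from the thesis.
MESH_DIM = 4
NUM_BANKS = MESH_DIM * MESH_DIM


def tile_coords(tile_id):
    """Return the (row, col) of a tile in the mesh."""
    return divmod(tile_id, MESH_DIM)


def hop_distance(src_tile, dst_tile):
    """Manhattan distance in mesh hops between two tiles."""
    (r1, c1), (r2, c2) = tile_coords(src_tile), tile_coords(dst_tile)
    return abs(r1 - r2) + abs(c1 - c2)


def static_home_bank(block_addr):
    """Address-interleaved placement: the bank is fixed by the block address."""
    return block_addr % NUM_BANKS


def migrate_one_hop(current_bank, requester_tile):
    """Move a block one hop closer to the requesting core (row first, then column)."""
    (r, c), (tr, tc) = tile_coords(current_bank), tile_coords(requester_tile)
    if r != tr:
        r += 1 if tr > r else -1
    elif c != tc:
        c += 1 if tc > c else -1
    return r * MESH_DIM + c


if __name__ == "__main__":
    core = 0
    bank = static_home_bank(0x2F)  # 47 % 16 = 15, the far corner from core 0
    print("initial hops:", hop_distance(core, bank))
    while bank != core:
        bank = migrate_one_hop(bank, core)
    print("hops after repeated migration:", hop_distance(core, bank))
```

With a fixed address-to-bank mapping, a core may repeatedly pay the full mesh distance for a block it uses often. Moving the block toward the requester shortens that distance, which is the general intuition behind relocation schemes, under the simplifying assumptions stated above.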




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Hammoud, Mohammad Hussein; Email: moh7@pitt.edu; Pitt Username: MOH7
ETD Committee:
Committee Chair: Melhem, Rami; Email: melhem@cs.pitt.edu; Pitt Username: MELHEM
Committee CoChair: Cho, Sangyeun; Email: cho@cs.pitt.edu; Pitt Username: SANGYEUN
Committee Member: Childers, Bruce; Email: childers@cs.pitt.edu; Pitt Username: CHILDERS
Committee Member: Yang, Jun; Email: juy9@pitt.edu; Pitt Username: JUY9
Date: 30 September 2010
Date Type: Completion
Defense Date: 1 July 2010
Approval Date: 30 September 2010
Submission Date: 6 July 2010
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Computer Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Data Placement; Data Relocation; Chip Multiprocessors; Data Retention
Other ID: etd-07062010-215303
Date Deposited: 10 Nov 2011 19:50
Last Modified: 15 Nov 2016 13:45

