Maintaining Communication at Scale with OpenSHMEM

Abidi, Collin (2022) Maintaining Communication at Scale with OpenSHMEM. Master's Thesis, University of Pittsburgh. (Unpublished)

Abstract

As the exascale era dawns, high-performance computing (HPC) researchers continue to seek parallel-communication models that perform well on increasingly large distributed systems. The SHMEM (Symmetric Hierarchical Memory) family of parallel programming libraries has been under development for the last three decades by a community of researchers, government organizations, and corporations. SHMEM has a variety of implementations that have recently been extended to distributed-memory parallel-computing clusters. The OpenSHMEM project is one of these efforts and has emerged as a standardized application-programming interface designed for portability and for support of the partitioned global address space (PGAS) model. To investigate the performance characteristics of SHMEM, this research focuses on developing, deploying, and collecting metrics from two variants of the 2D fast Fourier transform (FFT) algorithm, as well as a modified version of the Horovod framework for distributed machine learning. A comparison to Open MPI's Message Passing Interface (MPI) methods is conducted as a point of reference. We show that in a 2D FFT application that is communication-bound by its transpose stage, OpenSHMEM's collective operations outperform those of MPI RMA. On this 2D FFT application, we demonstrate efficiencies of 0.81, 0.80, and 0.93 at the largest node counts on PSC Regular Memory, PSC Extreme Memory, and NERSC Perlmutter, respectively.


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
Creators | Email | Pitt Username | ORCID
Abidi, Collin | cba15@pitt.edu | cba15 | 0000-0003-3612-0496
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Thesis Advisor | Alan, George | alan.george@pitt.edu | alan.george |
Committee Member | Mao, Zhi-Hong | zhm4@pitt.edu | zhm4 |
Committee Member | Dickerson, Samuel | dickerson@pitt.edu | dickerson |
Date: 6 September 2022
Date Type: Publication
Defense Date: 20 July 2022
Approval Date: 6 September 2022
Submission Date: 11 July 2022
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 50
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: hpc, shmem, parallel computing, horovod
Date Deposited: 06 Sep 2022 16:34
Last Modified: 06 Sep 2022 16:34
URI: http://d-scholarship.pitt.edu/id/eprint/43291
