
An Examination of a Symmetric Memory Model's Impact on Performance in a Distributed Graph Algorithm

Ing, Michael C (2021) An Examination of a Symmetric Memory Model's Impact on Performance in a Distributed Graph Algorithm. Master's Thesis, University of Pittsburgh. (Unpublished)



Over the last few decades, the Message Passing Interface (MPI) has become the parallel-communication standard for distributed algorithms on high-performance computing platforms. MPI's minimal setup overhead and simple API calls give it a low barrier of entry, while still providing support for more complex communication patterns. Communication schemes that use physically or logically shared memory provide a number of improvements to HPC-algorithm parallelization. These models prioritize the reduction of synchronization calls between processors and the overlapping of communication and computation via strategic programming techniques. The OpenSHMEM specification, developed in the last decade, applies these benefits to distributed-memory computing systems by leveraging a Partitioned Global Address Space (PGAS) model and remote memory access (RMA) operations. Paired with non-blocking communication patterns, these technologies enable increased parallelization of existing applications.

This research studies the impact of these techniques on the Multi-Node Parallel Boruvka's Minimum Spanning Tree Algorithm (MND-MST), which uses distributed programming for inter-processor communication. This research also provides a foundation for applying complex communication libraries like OpenSHMEM to large-scale parallel applications. To provide further context for the comparison of MPI to OpenSHMEM, this work presents a baseline comparison of relevant API calls as well as a productivity analysis for both implementations of the MST algorithm. Through experiments performed on systems at the National Energy Research Scientific Computing Center (NERSC), it is found that the OpenSHMEM-based application achieves an average improvement of 33.9% in overall execution time when scaled up to 16 nodes and 64 processes. The program complexity, measured as a combination of lines of code and API calls, increases from the MPI to the OpenSHMEM implementation by approximately 25%. These findings encourage further study into the use of distributed symmetric-memory architectures and RMA-communication models applied to scalable HPC applications.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Ing, Michael C (Email: mci10@pitt.edu; Pitt Username: mci10)
ETD Committee:
Thesis Advisor: George, Alan
Committee Member: Dickerson, Samuel
Committee Member: Gao,
Date: 13 June 2021
Date Type: Publication
Defense Date: 31 March 2021
Approval Date: 13 June 2021
Submission Date: 19 March 2021
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 50
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: MPI, RMA, OpenSHMEM, PGAS, HPC, MST, Distributed Algorithm, Parallel Communication, Graph Processing
Date Deposited: 13 Jun 2021 18:35
Last Modified: 13 Jun 2022 05:15

