Boothe, Jefferson
(2024)
Evaluating One-Sided Communication for Graph Analytics with MPI-RMA and OpenSHMEM.
Master's Thesis, University of Pittsburgh.
(Unpublished)
This is the latest version of this item.
Abstract
The message passing interface (MPI) remains the primary parallel-programming library for developing and running code on massively parallel distributed systems. While traditionally used for two-sided and collective communication, the latest MPI standards support remote memory access (RMA) between processes to enable one-sided communication. This paradigm is associated with fine-grained communication and irregular memory accesses, commonly found in graph analytics. This research develops and evaluates distributed implementations of betweenness centrality, a staple graph analysis algorithm, using traditional MPI, MPI-RMA, and OpenSHMEM. The performance of all three versions is found to be nearly identical due to the algorithm's trivially parallel nature, making it a poor fit for evaluating the impact of each communication library on distributed computing performance. The study then focuses on Graph500, a popular benchmark built upon breadth-first search (BFS) on an undirected graph, known for its sparse data accesses and fine-grained communication. The scalability of multiple BFS implementations using MPI-RMA is analyzed and compared against existing OpenSHMEM-based implementations optimized to maximize the benefits of one-sided communication. Additionally, we evaluate these implementations against the state-of-the-art MPI reference code using different numbers of processing elements and various problem sizes. Our experimental evaluation shows consistently improved performance with MPI-RMA over the best OpenSHMEM implementation on Graph500's BFS kernel with scales up to 32 nodes on the Pittsburgh Supercomputing Center Bridges-2 Regular Memory partition and University of Pittsburgh Center for Research Computing (Pitt CRC) MPI Cluster. However, we were unable to outperform the MPI reference implementation in most scenarios with one-sided communication.
While we demonstrate MPI-RMA to achieve ∼1.8× better performance over the MPI reference implementation on 32 nodes when only using 4 cores per node, the reference version was more performant in the majority of configurations tested. While one-sided communication has shown promising performance on some large-scale computing tasks, the difficulty of designing and developing applications to efficiently leverage one-sided communication remains a challenge. Additionally, we show that system architecture and library choices can significantly impact the expected performance of one-sided communication.
Details
| Item Type: | University of Pittsburgh ETD |
| Status: | Unpublished |
| Creators/Authors: | Boothe, Jefferson |
| ETD Committee: | |
| Date: | 3 June 2024 |
| Date Type: | Publication |
| Defense Date: | 27 March 2024 |
| Approval Date: | 3 June 2024 |
| Submission Date: | 28 March 2024 |
| Access Restriction: | 2 year -- Restrict access to University of Pittsburgh for a period of 2 years. |
| Number of Pages: | 47 |
| Institution: | University of Pittsburgh |
| Schools and Programs: | Swanson School of Engineering > Electrical and Computer Engineering |
| Degree: | MS - Master of Science |
| Thesis Type: | Master's Thesis |
| Refereed: | Yes |
| Uncontrolled Keywords: | high-performance computing, Distributed processing |
| Date Deposited: | 03 Jun 2024 14:42 |
| Last Modified: | 03 Jun 2024 14:42 |
| URI: | http://d-scholarship.pitt.edu/id/eprint/46071 |