
Performance and Productivity Evaluation of HPC Communication Libraries and Programming Models

Johnson, Alex (2021) Performance and Productivity Evaluation of HPC Communication Libraries and Programming Models. Master's Thesis, University of Pittsburgh. (Unpublished)



To reach exascale performance, data centers must scale their systems, increasing the number of nodes and equipping them with high-performance network interconnects. Orchestration of the communication between nodes is one of the most performance-critical aspects of highly distributed application development. While the standard for HPC communication is two-sided communication as represented by the Message Passing Interface (MPI), two-sided communication may not effectively express certain communication patterns. It may also fail to take advantage of key performance-critical features supported by state-of-the-art interconnects, such as remote direct memory access (RDMA). By contrast, one-sided communication libraries such as MPI’s extensions for remote memory access (RMA) and OpenSHMEM can provide developers with the added flexibility of one-sided communication primitives and the capability to exploit RDMA. To investigate these approaches, this research provides a comparative performance and productivity analysis of two-sided MPI, one-sided MPI, and OpenSHMEM, using kernels that simulate communication and computation patterns representative of HPC applications. Performance is measured in terms of latency and achieved throughput using up to 320 nodes on the National Energy Research Scientific Computing Center (NERSC) Cori and Pittsburgh Supercomputing Center (PSC) Bridges-2 systems. Additionally, the productivity of the communication interfaces is analyzed quantitatively and qualitatively. RMA-based APIs are found to show lower latency and efficient scalability across the DAXPY, Cannon’s Algorithm Matrix Multiply, SUMMA Matrix Multiply, and Integer Sort kernels. Similarly, the RMA-based libraries achieve the best throughput, with OpenSHMEM achieving up to double the total concurrent data movement of MPI.
Conversely, MPI’s two-sided API produces the simplest programs in terms of lines of code and API calls, but it generally shows the highest latency across the evaluated kernels. The OpenSHMEM API achieves the highest performance on the four kernels and, by our productivity metrics, is simpler than one-sided MPI for RMA-optimized codes. Despite these findings, two-sided MPI remains a strong library for HPC communication due to its robust set of API calls and optimized collective performance.
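The distinction the abstract draws between two-sided and one-sided communication can be illustrated with a minimal MPI sketch. This is not code from the thesis; it is a generic two-rank example (buffer names and sizes are illustrative) showing that a two-sided transfer requires a matching call on both ranks, while an RMA `MPI_Put` lets the origin rank write directly into the target's exposed memory window, with only fence synchronization on the target side:

```c
/* Illustrative sketch (not from the thesis): two-sided MPI_Send/MPI_Recv
 * versus one-sided MPI_Put into an RMA window. Run with 2 ranks, e.g.
 * mpicc demo.c -o demo && mpirun -np 2 ./demo */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double buf[4] = {0};

    /* Two-sided: both ranks actively participate in the transfer. */
    if (rank == 0) {
        double data[4] = {1, 2, 3, 4};
        MPI_Send(data, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    /* One-sided (RMA): rank 0 puts data directly into rank 1's window;
     * rank 1 issues no matching receive, only the collective fences. */
    MPI_Win win;
    MPI_Win_create(buf, sizeof buf, sizeof(double), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &win);
    MPI_Win_fence(0, win);
    if (rank == 0) {
        double data[4] = {5, 6, 7, 8};
        MPI_Put(data, 4, MPI_DOUBLE, /*target=*/1, /*disp=*/0,
                4, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);
    MPI_Win_free(&win);

    if (rank == 1)
        printf("rank 1 holds %g..%g after MPI_Put\n", buf[0], buf[3]);

    MPI_Finalize();
    return 0;
}
```

On RDMA-capable interconnects, the `MPI_Put` path can map onto the hardware's remote-write primitive, which is the capability the thesis's RMA-based measurements exercise.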




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Johnson, Alex (Email: amj92@pitt.edu; Pitt Username: amj92)
ETD Committee:
Committee Chair: George, Alan
Committee Member: Dallal,
Committee Member: Kerestes,
Date: 13 June 2021
Date Type: Publication
Defense Date: 1 April 2021
Approval Date: 13 June 2021
Submission Date: 19 March 2021
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 57
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: RDMA, MPI, OpenSHMEM, HPC, Supercomputing
Date Deposited: 13 Jun 2021 18:36
Last Modified: 13 Jun 2021 18:36

