
In Pursuit of Graph Analysis for Neural-Network Performance Evaluation

Langerman, David (2022) In Pursuit of Graph Analysis for Neural-Network Performance Evaluation. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Primary Text: PDF (Langerman 2022 Dissertation Final Draft), 4 MB.
Restricted to University of Pittsburgh users only until 10 June 2024.


High-level deep-learning frameworks such as TensorFlow and PyTorch abstract computation and data movement away from neural-network model designers, boosting productivity and enabling deep-learning models to grow ever larger and more complex in pursuit of superhuman accuracies. Some of the largest models can even require multi-node clusters to train and deploy efficiently. When these models are published, often only the total floating-point operations (FLOPs) and the parameter count are given as proxies for performance relative to other architectures. The widespread use of GPUs to execute these network models calls into question the validity of using purely computational measures to gauge algorithms that are not compute-bound. While FLOPs have traditionally been the de facto measure of computational cost, they ignore memory-access penalties, kernel-launch overheads, and data-movement costs.

This dissertation chronicles the journey of identifying and addressing this issue, starting with a low-level hardware accelerator. Even though the FLOPs of the algorithm did not change, the accelerator design alone was shown to have a large impact on scalability and performance. From there, a foray into deep learning (DL) began. An existing DL algorithm was augmented with a state-of-the-art backbone, resulting in a model with fewer FLOPs than the original. The goal was to boost the original network's performance; instead, performance was lost, puzzling the researchers and prompting a deeper analysis of the model itself. It was discovered that the diameter of the directed acyclic graph describing a neural-network model (the Critical Datapath Length) is highly correlated with execution time. This phenomenon was demonstrated across a set of 48 popular models running on multiple devices.

The suite of networks was expanded to include over 400 networks with a much wider variety of architectural features. These networks were analyzed with both graph- and compute-based metrics to form a dataset with a standard set of metrics: input Size, Parameter count, total Operations, and Critical Datapath Length. This suite of metrics was dubbed SPOC. Analyzed together, SPOC metrics give actionable performance intuition and show how graph metrics can explain the initially perplexing benchmarks that were collected when this voyage began.
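A SPOC record pairs each network with the four metrics named above. A minimal sketch of such a record (field names and example values are assumptions for illustration, not the dissertation's actual schema):

```python
from dataclasses import dataclass

@dataclass
class SPOCMetrics:
    """One row of a SPOC-style metric dataset (hypothetical schema)."""
    model: str
    input_size: int            # S: elements in the input tensor
    parameters: int            # P: learnable parameter count
    operations: int            # O: total ops, e.g. FLOPs
    critical_datapath: int     # C: longest dependency chain in the op DAG

# Illustrative (made-up) numbers for a ResNet-50-like model.
record = SPOCMetrics(
    model="resnet50-like",
    input_size=224 * 224 * 3,
    parameters=25_000_000,
    operations=4_000_000_000,
    critical_datapath=120,
)
print(record.critical_datapath)  # -> 120
```

The point of keeping all four fields side by side is that two models with similar `operations` can differ sharply in `critical_datapath`, and it is the latter that the dissertation finds correlated with execution time.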




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Langerman, David (Email: dal181@pitt.edu; Pitt Username: dal181; ORCID: 0000-0001-8777-4655)
ETD Committee:
Committee Chair: George,
Committee Member: Hu,
Committee Member: Dickerson,
Committee Member: Kubendran,
Committee Member: Kovashka,
Date: 10 June 2022
Date Type: Publication
Defense Date: 6 April 2022
Approval Date: 10 June 2022
Submission Date: 11 March 2022
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 135
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Neural Networks, Performance Prediction, Indirect Metrics
Date Deposited: 10 Jun 2022 18:55
Last Modified: 10 Jun 2022 18:55

