
Design Space Exploration of High-Throughput Graph- and Signal-Processing Architectures using High-Level Synthesis and FPGAs

Bickerstaff, James (2024) Design Space Exploration of High-Throughput Graph- and Signal-Processing Architectures using High-Level Synthesis and FPGAs. Master's Thesis, University of Pittsburgh. (Unpublished)



Data-intensive applications are becoming ever more prevalent due to the increasing amount of information available from sources such as social media and high-resolution sensors. The need to rapidly process this data and provide insights cannot be met easily through traditional computing methods. Accelerating applications through the use of custom hardware and specialized techniques is key for more efficient processing as datasets continue to grow in scale. This research focuses on creating high-throughput acceleration architectures for Intel FPGA devices using the oneAPI high-level synthesis (HLS) toolkit. We target two areas of research: graph processing and signal processing. The two chosen graph operations are breadth-first search (BFS) and minimum-spanning-tree (MST). The signal processing investigation focuses on accelerating the Fast Fourier Transform (FFT). Custom, partition-based methods are designed and developed for the acceleration of BFS and MST. Through design space exploration, we evaluate overall performance and productivity gains achieved by leveraging the oneAPI tools. Results showcase BFS performance of up to 75 million traversed edges per second, achieving up to 3.0× speedup over the Intel Xeon 6128 CPU baseline. Although this performance falls short of related hardware description language (HDL) research, the HLS designs use 5.85× fewer lines of code than the HDL implementations. MST designs exhibit speedups of ∼1.5× when compared to the CPU baseline. To accelerate FFT using oneAPI on an FPGA, a feedforward architecture is implemented and optimized. A design space exploration is performed to evaluate varying FFT resolutions, from 64k-point up to 512k-point in size. We find that a resolution of 256k-point provides a balance between resource utilization and performance; however, its performance lags behind that of the Fastest Fourier Transform in the West (FFTW) and Intel oneMKL libraries when executed in parallel on an eight-core Intel Xeon Platinum 8256 processor. Through the creation of these architectures, we demonstrate the high productivity available with the oneAPI toolkit by evaluating different configurations of the designs with only minor changes to the code base.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators | Email | Pitt Username | ORCID
Bickerstaff, James | james.bickerstaff@pitt.edu | jjb169 | 0009-0008-8239-9997
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | George,
Committee Member | Dickerson, Samuel | dickerson@pitt.edu | sjdst31@pitt.edu | 0000-0003-2281-5115
Committee Member | Zhou, Peipei | peipei.zhou@pitt.edu | pez41@pitt.edu | 0000-0002-0493-1844
Date: 11 January 2024
Date Type: Publication
Defense Date: 9 November 2023
Approval Date: 11 January 2024
Submission Date: 2 October 2023
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 58
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: FPGA, high-level synthesis (HLS), graph, breadth-first search (BFS), minimum-spanning-tree (MST), fast Fourier transform (FFT), million traversed edges per second (MTEPS), oneAPI
Date Deposited: 11 Jan 2024 19:33
Last Modified: 11 Jan 2024 19:33

