
Convolutional Neural Networks Accelerators on FPGA Integrated With Hybrid Memory Cube

Lanois, Thibaut (2018) Convolutional Neural Networks Accelerators on FPGA Integrated With Hybrid Memory Cube. Master's Thesis, University of Pittsburgh. (Unpublished)

This is the latest version of this item.

PDF (831kB)

Abstract

Convolutional Neural Networks (CNNs) are widely used for non-trivial tasks such as image classification and speech recognition. As CNNs become deeper (adding more and more layers), accuracy increases, but so do the computational cost and the model size. Because CNNs have so many applications, many accelerators have been proposed to handle their computation and memory-access patterns. These accelerators exploit the high parallelism of FPGAs, GPUs, or custom ASICs to compute CNNs efficiently. Although GPU throughput is very high, GPU power consumption can be a concern in some environments. FPGA accelerators offer a good balance between time to market and power consumption, but they face several challenges: the FPGA resources must be carefully matched to the network topology, in particular the number of DSPs and BRAMs and the external memory bandwidth. In this thesis, we present two new FPGA designs that use the Hybrid Memory Cube (HMC) as external memory to accelerate CNNs efficiently.
The first design is a 32-bit fixed-point design named the Memory Conscious CNN Accelerator. It uses one Convolution Layer Processor (CLP) per layer and exploits the parallelism of the HMC to supply data. To maximize HMC bandwidth, a new data layout is proposed that makes data requests sequential. With this layout, the accelerator runs at 300 MHz and reaches a throughput of 232 images/sec on AlexNet. The second design is a low-power DNN accelerator in which the data layout is performed before the Pipeline Execution (PE). During the layout step, Deep Compression techniques can be applied to improve performance. The PE design achieves 66 GOPS at only 1.3 W of power consumption.
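The abstract does not spell out the layout algorithm, but the stated goal (reordering feature-map data so that the accelerator's memory requests become sequential) is the same goal served by the well-known im2col transform. The sketch below is a generic im2col in Python, offered only as an illustration of the idea; it is not the thesis's HMC layout, and the function name and shapes are this sketch's own assumptions.

```python
import numpy as np

def im2col(x, k, stride=1):
    """Rearrange input feature maps so each kernel window is contiguous.

    Generic im2col sketch, NOT the layout from the thesis: it only
    illustrates how reordering data up front turns the strided window
    reads of a convolution into sequential ones.
    x: input of shape (channels, height, width); k: square kernel size.
    """
    c, h, w = x.shape
    oh = (h - k) // stride + 1  # output height
    ow = (w - k) // stride + 1  # output width
    cols = np.empty((oh * ow, c * k * k), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            # Copy one kernel window into a contiguous row.
            patch = x[:, i*stride:i*stride+k, j*stride:j*stride+k]
            cols[i * ow + j] = patch.ravel()
    return cols  # rows can now be streamed back sequentially
```

After this transform, the convolution reduces to a matrix multiply over `cols`, and each window's data arrives as one contiguous read, which is the memory-access pattern a bandwidth-bound accelerator wants.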



Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
Creator: Lanois, Thibaut | Email: thl26@pitt.edu | Pitt Username: thl26
ETD Committee:
Committee Chair: Yang, Jun
Committee Member: George, Alan D
Committee Member: Hu, Jingtong
Committee Member: Zhang, Youtao
Date: 24 January 2018
Date Type: Publication
Defense Date: 27 November 2017
Approval Date: 24 January 2018
Submission Date: 13 November 2017
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 65
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: FPGA, HMC, CNN
Date Deposited: 24 Jan 2018 19:17
Last Modified: 24 Jan 2019 06:15
URI: http://d-scholarship.pitt.edu/id/eprint/33526

Available Versions of this Item

  • Convolutional Neural Networks Accelerators on FPGA Integrated With Hybrid Memory Cube. (deposited 24 Jan 2018 19:17) [Currently Displayed]
