Secure accelerator design for deep neural networks

Zhao, Lei (2022) Secure accelerator design for deep neural networks. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract

Deep neural networks (DNNs) have recently gained popularity in a wide range of modern application domains due to their superior inference accuracy. With growing problem size and complexity, modern DNNs, e.g., CNNs (convolutional neural networks), contain a large number of weights, which require tremendous effort not only to prepare representative training data but also to train the network. There is an increasing demand to protect the DNN weights, an emerging form of intellectual property (IP) in the DNN field. This thesis proposes a line of solutions for protecting DNN weights deployed on domain-specific accelerators. First, I propose AEP, a DNN weight protection scheme for accelerators built on conventional CMOS technologies. Because of the extremely high memory bandwidth demand of DNN accelerators, conventional encryption-based approaches, which require the integration of expensive encryption engines, impose significant overheads on execution latency and energy consumption. Instead, AEP enables effective IP protection by utilizing fingerprints generated from hardware characteristics, eliminating the need for encryption. Adopting such hardware fingerprints achieves high inference accuracy only on the authorized device, while unauthorized devices cannot produce any useful results from the same set of weights. Second, as the size of DNNs keeps increasing rapidly, the large number of intermediate results (i.e., the outputs of the previous layer and the inputs to the current layer) cannot be held on-chip. These intermediate results also contain sensitive information about the DNN itself. In this part, I propose SCA, which securely off-loads data dynamically generated inside the accelerator chip to off-chip memories. SCA is a full DNN protection scheme that protects both the DNN weights and the intermediate results, and supports both training and inference on CMOS-based accelerators.
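The fingerprint-binding idea behind AEP can be illustrated with a toy sketch. All names here, the seeded-RNG "fingerprint", and the simple additive binding are my own assumptions for illustration; the thesis's actual transformation is derived from real hardware characteristics:

```python
import random

def device_fingerprint(device_id, n):
    # Hypothetical stand-in for a per-chip hardware fingerprint
    # (e.g. derived from process variation); here just a seeded RNG.
    rng = random.Random(device_id)
    return [rng.gauss(0.0, 0.05) for _ in range(n)]

def bind_weights(weights, fingerprint):
    # Store weights in fingerprint-offset form, so the stored values
    # alone are useless without the matching device -- no encryption
    # engine sits on the memory path.
    return [w - f for w, f in zip(weights, fingerprint)]

def apply_on_device(bound, fingerprint):
    # Only a device reproducing the same fingerprint recovers weights
    # close enough to the originals for accurate inference.
    return [b + f for b, f in zip(bound, fingerprint)]

weights = [0.3, -0.7, 1.2, 0.05]
bound = bind_weights(weights, device_fingerprint(42, 4))

authorized = apply_on_device(bound, device_fingerprint(42, 4))
unauthorized = apply_on_device(bound, device_fingerprint(7, 4))
```

The authorized device recovers the original weights, while any other device sees weights perturbed by the mismatch between the two fingerprints, degrading inference accuracy.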
Third, ReRAM-based accelerators introduce new challenges to DNN IP protection due to their crossbar structure and non-volatility. ReRAM's non-volatility retains data even after the system is powered off, making the stored DNN weights vulnerable to attacks that simply read out the ReRAM content. Because the crossbar structure can only compute on cleartext data, encrypting the ReRAM content is no longer a feasible solution in this scenario. To solve these issues, I propose SRA, a novel non-encryption-based protection method that still maintains ReRAM's in-memory computing capability. Lastly, although SRA provides security guarantees, the weights are represented in stochastic computing (SC) bit stream format, which induces a large storage overhead. However, conventional DNN model compression methods, such as pruning and quantization, are not applicable to ReRAM-based PIM accelerators. In this part, I propose BFlip, a novel DNN model compression scheme that shares crossbars among multiple bit matrices. BFlip not only reduces the storage overhead but also improves performance and energy efficiency.
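The storage overhead that BFlip targets stems from how stochastic computing trades bit-stream length for precision. A minimal sketch of standard unipolar SC encoding (stream lengths and RNG seeds here are arbitrary illustrations, not the thesis's parameters):

```python
import random

def to_sc_stream(value, length, seed):
    # Unipolar SC: a value in [0, 1] becomes a bit stream whose
    # fraction of 1s approximates the value.
    rng = random.Random(seed)
    return [1 if rng.random() < value else 0 for _ in range(length)]

def sc_multiply(a, b):
    # Multiplying independent unipolar streams is a bitwise AND,
    # which is why SC hardware is so cheap.
    return [x & y for x, y in zip(a, b)]

def from_sc_stream(stream):
    # Decode by counting the fraction of 1s.
    return sum(stream) / len(stream)

a = to_sc_stream(0.5, 2048, seed=1)
b = to_sc_stream(0.5, 2048, seed=2)
product = from_sc_stream(sc_multiply(a, b))  # approaches 0.5 * 0.5
```

A 2048-bit stream encodes roughly what a few fixed-point bits could, so storing entire DNN weight matrices in this format multiplies the memory footprint, which is the overhead that sharing crossbars among bit matrices reduces.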