Link to the University of Pittsburgh Homepage
Link to the University Library System Homepage Link to the Contact Us Form

THE VLIW-SUPERCISC COMPILER: EXPLOITINGPARALLELISM FROM C-BASED APPLICATIONS

Fazekas, Joshua David (2006) THE VLIW-SUPERCISC COMPILER: EXPLOITINGPARALLELISM FROM C-BASED APPLICATIONS. Master's Thesis, University of Pittsburgh. (Unpublished)

[img]
Preview
PDF
Primary Text

Download (2MB) | Preview

Abstract

A common approach to decreasing embedded application execution time is creating a homogeneous parallel processor architecture. The parallelism of any such architecture is limited to the number of instructions that can be scheduled in the same cycle. This number of instructions scheduled in a cycle, or instruction-level parallelism (ILP), is limited by the ability to extract parallelism from the application. Other techniques attempt to improve performance with hardware acceleration. Often, segments of highly computational extensive code are extracted and custom hardware is created to replace the software execution. This technique requires many resources and still does not address the segments of code outside of the computationally extensive kernel.To solve this problem, hardware acceleration for computationally intensive segments of code in addition to accelerating the entire application with very long instruction word, VLIW, techniques is proposed. (1) A compilation flow that targets a 4-wide VLIW processor architecture is presented. This system was used to investigate the available speed-up of VLIW architectures. The architecture was modified to combine the VLIW processor with the capability to execute application specific customized instructions. To create the custom instruction hardware, a control and data flow graph (CDFG) framework was created. The CDFG framework was created to provide a framework for compiler transformations and hardware generation. In order to remove control flow from segments of code selected for hardware generation, (2) the technique of hardware predication was developed. Hardware predication allows if-then and if-then-else control flow constructs to be transformed into strict data flow through the use of multiplexors. From the transformed CDFGs, (3) a VHDL generation pass was created that translates the compiler data structures into synthesizable VHDL. The resulting architecture contains the VLIW processor and tightly coupled application specific hardware. This architecture was analyzed for performance changes comparedto the initial VLIW architecture, and a traditional processor. Lastly, (4) the architecture was analyzed for power and energy savings. A post static timing pass was added to the compilation flow for the insertion of hardware to delay early switching of operations.By measuring only the execution of the hardware function and comparing the performance to the equivalent code executed in software, a performance multiplier of up to 322 times is seen when synthesized onto an Altera Stratix II ES2S180F1508C4 FPGA. The average performance increase seen was 63 times faster. For the entire application, the speedup reached nearly 30X and was on average 12X better than a single processor implementation. The power and energy required by the VLIW processor core and the hardware functions for the computational kernels after 160nm OKI standard cell ASIC synthesis show a maximum power savings of 417 times that of execution on the processor with an average of 133 times savings in power consumption. With the increased execution time and the savings in power the energy savings will see a multiplicative effect. The energy improvement is therefore several orders of magnitude for the hardware functions, the savings range from over 1,000X to approximately 60,000X.


Share

Citation/Export:
Social Networking:
Share |

Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
CreatorsEmailPitt UsernameORCID
Fazekas, Joshua Davidjdf7@pitt.eduJDF7
ETD Committee:
TitleMemberEmail AddressPitt UsernameORCID
Committee ChairJones, Alex Kakjones@ece.pitt.eduAKJONES
Committee MemberCain, JTcain@engr.pitt.eduJTC
Committee MemberHoare, RayHoare@engr.pitt.edu
Date: 27 September 2006
Date Type: Completion
Defense Date: 28 April 2006
Approval Date: 27 September 2006
Submission Date: 8 May 2006
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical Engineering
Degree: MSEE - Master of Science in Electrical Engineering
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: application specific; c/c++; compilation; superCISC; VLIW
Other ID: http://etd.library.pitt.edu/ETD/available/etd-05082006-141739/, etd-05082006-141739
Date Deposited: 10 Nov 2011 19:44
Last Modified: 15 Nov 2016 13:43
URI: http://d-scholarship.pitt.edu/id/eprint/7829

Metrics

Monthly Views for the past 3 years

Plum Analytics


Actions (login required)

View Item View Item