TY - GEN
T1 - Fast cycle-accurate compile based simulator for reconfigurable processor
AU - Miniskar, Narasinga Rao
AU - Gadde, Raj Narayana
AU - Cho, Young Chul Rams
AU - Kim, Sukjin
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/9/25
Y1 - 2017/9/25
AB - The Reconfigurable Processor (RP) offers flexible hardware reconfigurability through software for high-performance computing. The RP is used as a DSP in Samsung DTVs and cameras to run audio and video codecs and image-quality enhancement algorithms. It runs in two modes: VLIW (Very Long Instruction Word) and CGRA (Coarse-Grained Reconfigurable Array). To minimize product time-to-market, RP application developers require a fast, cycle-accurate, profiling-enabled simulator to verify application functionality and to optimize hot spots (performance-critical code). Fast simulation is also necessary to verify functionality for certification against multimedia standards (DivX, Dolby, etc.). State-of-the-art RP simulators are very slow, running at 2 MIPS (million instructions per second) in CGRA mode and 4 MIPS in VLIW mode on an x86 host processor at 3.4 GHz. We propose FastSim, a fast compile-based cycle-accurate simulator for the RP, which achieves simulation speeds of 900 MIPS in CGRA mode and 1600 MIPS in VLIW mode with > 99.5% core cycle accuracy. FastSim thus enables ∼400x faster simulation than existing RP simulators. In VLIW mode, FastSim is 2x faster than state-of-the-art functional simulators, and 16x faster than cycle-accurate simulators, of industry VLIW and RISC processors. FastSim's speed is also comparable to the application's running time on native x86. The speedup is achieved through an innovative maximal static analysis in both VLIW and CGRA modes.
UR - http://www.scopus.com/inward/record.url?scp=85032663626&partnerID=8YFLogxK
U2 - 10.1109/ISCAS.2017.8050318
DO - 10.1109/ISCAS.2017.8050318
M3 - Conference contribution
AN - SCOPUS:85032663626
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
BT - IEEE International Symposium on Circuits and Systems
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 50th IEEE International Symposium on Circuits and Systems, ISCAS 2017
Y2 - 28 May 2017 through 31 May 2017
ER -