TY - GEN
T1 - PinComm
T2 - 16th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2010
AU - Heirman, Wim
AU - Stroobandt, Dirk
AU - Miniskar, Narasinga Rao
AU - Wuyts, Roel
AU - Catthoor, Francky
PY - 2010
Y1 - 2010
AB - As the number of cores in both embedded Multi-Processor Systems-on-Chip and general purpose processors keeps rising, on-chip communication becomes more and more important. In order to write efficient programs for these architectures it is therefore necessary to have a good idea of the communication behavior of an application. We present a communication profiler that extracts this behavior from compiled, sequential or parallel C/C++ programs, and constructs a dynamic data-flow graph at the level of major functional blocks. In contrast to existing methods of measuring inter-program communication, our tool automatically generates the program's data-flow graph and is less demanding for the developer. It can also be used to view differences between program phases (such as different video frames), which allows both input- and phase-specific optimizations to be made. We will also describe briefly how this information can subsequently be used to guide the effort of parallelizing the application, to co-design the software, memory hierarchy and communication hardware, and to provide new sources of communication-related runtime optimizations.
KW - Communication
KW - Dynamic dataflow graph
KW - Network-on-chip
KW - Profiling
UR - http://www.scopus.com/inward/record.url?scp=79951751866&partnerID=8YFLogxK
U2 - 10.1109/ICPADS.2010.56
DO - 10.1109/ICPADS.2010.56
M3 - Conference contribution
AN - SCOPUS:79951751866
SN - 9780769543079
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 500
EP - 507
BT - Proceedings - 16th International Conference on Parallel and Distributed Systems, ICPADS 2010
Y2 - 8 December 2010 through 10 December 2010
ER -