TY - GEN
T1 - Feedback-directed thread scheduling with memory considerations
AU - Song, Fengguang
AU - Moore, Shirley
AU - Dongarra, Jack
PY - 2007
Y1 - 2007
N2 - This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentation tool to automatically acquire the memory sharing relationship between user-level threads by analyzing their memory trace. We introduce the concept of Affinity Graph to model the relationship. Expensive I/O for large trace files is completely eliminated by using an online graph creation scheme. We apply the technique of hierarchical graph partitioning and thread reordering to the affinity graph to determine an optimal thread schedule. We have performed experiments on an SGI Altix system. The experimental results show that our approach is able to reduce the total execution time by 10% to 38% for a variety of applications through the maximization of the data reuse within a single processor, minimization of the data sharing between processors, and a good load balance.
AB - This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentation tool to automatically acquire the memory sharing relationship between user-level threads by analyzing their memory trace. We introduce the concept of Affinity Graph to model the relationship. Expensive I/O for large trace files is completely eliminated by using an online graph creation scheme. We apply the technique of hierarchical graph partitioning and thread reordering to the affinity graph to determine an optimal thread schedule. We have performed experiments on an SGI Altix system. The experimental results show that our approach is able to reduce the total execution time by 10% to 38% for a variety of applications through the maximization of the data reuse within a single processor, minimization of the data sharing between processors, and a good load balance.
KW - Affinity graph
KW - Distributed shared memory
KW - Scientific applications
KW - Shared-memory programming
UR - http://www.scopus.com/inward/record.url?scp=34548088089&partnerID=8YFLogxK
U2 - 10.1145/1272366.1272380
DO - 10.1145/1272366.1272380
M3 - Conference contribution
AN - SCOPUS:34548088089
SN - 1595936734
SN - 9781595936738
T3 - Proceedings of the 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07
SP - 97
EP - 106
BT - Proceedings of the 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07
T2 - 16th International Symposium on High Performance Distributed Computing 2007, HPDC'07 and Co-Located Workshops
Y2 - 25 June 2007 through 29 June 2007
ER -