Article Catalog (Journal Indexing)


Title:
Four-ary tree-based barrier synchronization for 2D meshes without nonmember involvement
Authors:
Moh, S; Yu, CS; Lee, B; Youn, HY; Han, DS; Lee, D;
Addresses:
Informat & Commun Univ, Sch Engn, Taejon 305348, South Korea; Oregon State Univ, Dept Elect & Comp Engn, Corvallis, OR 97331 USA; Sung Kyun Kwan Univ, Sch Elect & Comp Engn, Suwon 440746, South Korea
Journal Title:
IEEE TRANSACTIONS ON COMPUTERS
issue: 8, volume: 50, year: 2001,
pages: 811 - 823
SICI:
0018-9340(200108)50:8<811:FTBSF2>2.0.ZU;2-C
Source:
ISI
Language:
ENG
Subject:
WORMHOLE; COMMUNICATION; IMPLEMENTATION; MULTICOMPUTERS; NETWORKS;
Keywords:
barrier synchronization; hardware-supported barriers; communication latency; wormhole routing; MPI;
Document type:
Article
Nature:
Periodical
Subject area:
Engineering, Computing & Technology
Citations:
28
Review:
Reprint addresses:
Address: Moh, S, Informat & Commun Univ, Sch Engn, 58-4 Hwa Am, Taejon 305348, South Korea
Citation:
S. Moh et al., "Four-ary tree-based barrier synchronization for 2D meshes without nonmember involvement", IEEE COMPUT, 50(8), 2001, pp. 811-823

Abstract

This paper proposes a Barrier Tree for Meshes (BTM) to minimize the barrier synchronization latency for two-dimensional (2D) meshes. The proposed BTM scheme has two distinguishing features. First, the synchronization tree is 4-ary. The synchronization latency of the BTM scheme is asymptotically Θ(log_4 n), while that of the fastest scheme reported in the literature is bounded between Ω(log_3 n) and O(n^(1/2)), where n is the number of member nodes. Second, nonmember nodes neither take part in the construction of a BTM nor actively participate in the synchronization operations, which avoids interference among different process groups during synchronization. This not only results in low setup overhead, but also reduces the synchronization latency. The low setup overhead is particularly effective for the dynamic process model provided in MPI-2. An extensive simulation study shows that, for up to 64 x 64 meshes, the BTM scheme results in about 40 to 70 percent shorter synchronization latency and is more scalable than conventional schemes.
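
To give a feel for the asymptotic bounds quoted in the abstract, the following minimal Python sketch compares the height of a complete k-ary tree over n member nodes with n^(1/2). It is not the BTM construction from the paper: it assumes a complete tree and one step per tree level, and it ignores mesh topology and wormhole routing. The function names kary_tree_height and barrier_steps are hypothetical and chosen only for this illustration.

```python
import math


def kary_tree_height(n: int, k: int) -> int:
    """Height (in edges) of a complete k-ary tree holding n nodes.

    Illustrative assumption: levels are filled completely from the
    root down, so the height grows as Theta(log_k n).
    """
    if n < 1 or k < 2:
        raise ValueError("need n >= 1 and k >= 2")
    height = 0
    capacity = 1    # nodes that fit in a tree of the current height
    level_size = 1  # nodes on the deepest level counted so far
    while capacity < n:
        level_size *= k
        capacity += level_size
        height += 1
    return height


def barrier_steps(n: int, k: int) -> int:
    """Steps for one barrier under the assumption of one step per level:
    a gather phase up the tree followed by a release phase down it."""
    return 2 * kary_tree_height(n, k)


if __name__ == "__main__":
    for side in (4, 16, 64):
        n = side * side  # every node of a side x side mesh is a member
        print(
            f"n={n:5d}  4-ary height={kary_tree_height(n, 4)}  "
            f"3-ary height={kary_tree_height(n, 3)}  "
            f"sqrt(n)={math.isqrt(n)}"
        )
```

Under these assumptions, a full 64 x 64 mesh (n = 4096) gives a height of 6 for a 4-ary tree and 8 for a 3-ary tree, against n^(1/2) = 64. The gap between the logarithmic terms and the square-root upper bound widens as n grows.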

ASDD (Area Sistemi Dipartimentali e Documentali), Università di Bologna, Catalog of journals and other periodicals