Publication Details
Polykarpos Thomadakis and Nikos Chrisochoides.
Submitted to The Journal of Supercomputing (Springer), January 2023
Abstract
This paper presents an effort to mitigate the overheads, latencies, and limitations observed in message-driven runtime frameworks by utilizing lightweight threads tightly integrated with message passing. It also introduces new abstractions and features for group communication, as well as fine-grained concurrency on top of remote method invocations, to improve workload balancing in shared and distributed memory. We observe up to 100% difference in performance behavior for task creation and for handling message passing. Evaluations on 1000 cores (25 nodes) of a distributed-memory machine showed that integrating fine-grained concurrency with the runtime achieves a performance improvement of 12% on a seismic wave simulation benchmark, as opposed to a 50% degradation with OpenMP. Moreover, a 3D mesh refinement application showed a 50% improvement by exploiting multi-grain parallelism at the data and task levels.