Preprints, Working Papers, ...

Simulation-based Optimization and Sensibility Analysis of MPI Applications: Variability Matters

Abstract : Finely tuning MPI applications and understanding the influence of key parameters (number of processes, granularity, collective operation algorithms, virtual topology, and process placement) is critical to obtaining good performance on supercomputers. Given the high resource consumption of running applications at scale, doing so solely to optimize their performance is particularly costly. Inexpensive but faithful predictions of expected performance could therefore be a great help for researchers and system administrators. The methodology we propose decouples the complexity of the platform, which is captured through statistical models of the performance of its main components (MPI communications, BLAS operations), from the complexity of adaptive applications, which is handled by emulating the application and skipping regular non-MPI parts of the code. We demonstrate the capability of our method with High-Performance Linpack (HPL), the benchmark used to rank supercomputers in the TOP500, which requires careful tuning. We briefly present (1) how the open-source version of HPL can be slightly modified to allow fast emulation, on a single commodity server, at the scale of a supercomputer. We then present (2) an extensive (in)validation study that compares simulation with real experiments and demonstrates our ability to consistently predict the performance of HPL within a few percent. This study allows us to identify the main modeling pitfalls (e.g., spatial and temporal node variability, or network heterogeneity and irregular behavior) that need to be considered. Finally, we show (3) how our "surrogate" allows studying several subtle HPL parameter optimization problems while accounting for uncertainty about the platform.
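To make the role of spatial and temporal node variability concrete, here is a minimal sketch. It is not the authors' simulator or their statistical models: the function names, the Gaussian noise model, and all numeric values are assumptions chosen for illustration only. It models a bulk-synchronous computation in which every step lasts as long as its slowest node, and compares a prediction that ignores variability with one that samples it.

```python
import random


def kernel_time(flops, rate, temporal_sd, rng):
    """Duration of one compute kernel on a node (hypothetical model).

    The nominal time flops / rate is inflated by a non-negative random
    factor that stands in for temporal (run-to-run) variability.
    """
    base = flops / rate
    return base * (1.0 + abs(rng.gauss(0.0, temporal_sd)))


def simulate_makespan(num_nodes, num_steps, flops_per_step, mean_rate,
                      spatial_sd, temporal_sd, seed=0):
    """Total time of a toy bulk-synchronous run: each step waits for the slowest node."""
    rng = random.Random(seed)
    # Spatial variability: each node gets its own sustained rate (assumed Gaussian,
    # clamped so that a rate can never become zero or negative).
    rates = [mean_rate * max(0.1, 1.0 + rng.gauss(0.0, spatial_sd))
             for _ in range(num_nodes)]
    makespan = 0.0
    for _ in range(num_steps):
        # A synchronization point ends the step, so the slowest node sets its length.
        makespan += max(kernel_time(flops_per_step, r, temporal_sd, rng)
                        for r in rates)
    return makespan


# Illustrative values only: 64 nodes, 100 steps, 1e9 flop per step, 1e10 flop/s per node.
ideal = simulate_makespan(64, 100, 1e9, 1e10, spatial_sd=0.0, temporal_sd=0.0)
noisy = simulate_makespan(64, 100, 1e9, 1e10, spatial_sd=0.05, temporal_sd=0.05)
print(f"no variability:   {ideal:.3f} s")
print(f"with variability: {noisy:.3f} s")
```

In this toy setting the prediction that ignores variability is optimistic, because the maximum over nodes is driven by the slowest node rather than by the average one. This is only an intuition for why a model built from average component performance can mispredict a synchronized application.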

https://hal.inria.fr/hal-03141988
Contributor : Tom Cornebize
Submitted on : Monday, February 15, 2021 - 4:27:18 PM
Last modification on : Monday, April 12, 2021 - 7:02:23 PM

Files

paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03141988, version 1
  • ARXIV : 2102.07674

Citation

Tom Cornebize, Arnaud Legrand. Simulation-based Optimization and Sensibility Analysis of MPI Applications: Variability Matters. 2021. ⟨hal-03141988⟩

Metrics

Record views : 94
Files downloads : 161