Adaptively Periodic I/O Scheduling for Concurrent HPC Applications

Benbo Zha, Hong Shen

Research output: Contribution to journal › Article › peer-review

Abstract

With the convergence of big data and high-performance computing (HPC), machine learning applications and traditional large-scale simulations with stochastically iterative I/O periodicity now run concurrently on HPC platforms, and the ever-growing demand for data transfer puts increasing pressure on the scarce shared I/O resources. Existing heuristic online and periodic offline I/O scheduling methods, designed for traditional HPC applications with a fixed I/O periodicity, are not suitable for applications with stochastically iterative I/O periodicities, yet such methods are needed to schedule the concurrent I/O of different applications under I/O congestion. In this work, we propose an adaptively periodic I/O scheduling (APIO) method that optimizes system efficiency and application dilation by taking the stochastically iterative I/O periodicity of the applications into account. We first build a periodic offline schedule within a specified duration to capture the iterative nature of the applications. APIO then adjusts the bandwidth allocation according to the actual length of each computing phase in order to cope with stochasticity. If the specified duration does not cover the actual running time, the period length is extended to fit the actual duration. Theoretical analysis and extensive simulations demonstrate the efficiency of the proposed I/O scheduling method over the existing online approach.

Original language: English
Article number: 1318
Journal: Electronics (Switzerland)
Volume: 11
Issue number: 9
Publication status: Published - 1 May 2022
Externally published: Yes

Keywords

  • I/O scheduling
  • high-performance computing
  • periodic I/O scheduling
  • stochastic iterative application
