Improved probabilistic I/O scheduling for limited-size Burst-Buffers deployed HPC

Benbo Zha, Hong Shen

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

The I/O bottleneck is a critical problem in current High Performance Computing (HPC) systems that hinders their performance scalability. Techniques such as I/O scheduling and Burst-Buffering have been proposed to accelerate data exchange between the compute and storage components of HPC platforms. Probabilistic I/O scheduling, a Markov-chain-based hybrid method that combines these two techniques, controls data transmission by considering the overall load state of the Burst-Buffers system, mitigating the I/O congestion caused by unpredictable concurrent I/O bursts. However, this method requires a large amount of computation to make online scheduling decisions, which wastes computing resources and lowers scheduling efficiency. In this paper, we first introduce the architecture of a Burst-Buffers deployed HPC platform, the probabilistic execution model of applications, and the basic probabilistic I/O scheduling method, with a proof of its efficiency based on the Markov-chain framework. Then, as the first improvement, we propose a modularization technique that reduces repeated computation by isolating the heuristic application selection module from the original method and reusing the application ranking result to adjust the I/O scheduling. Next, as the second improvement, we propose a thresholding technique that reduces the number of data transfers on the burst buffers by taking into account the write amplification characteristic of the underlying storage devices. Finally, we conduct extensive simulation experiments to show that our proposed I/O scheduling methods outperform existing I/O scheduling methods that neither incorporate burst-buffer states nor consider the characteristics of storage devices.

Original language: English
Article number: 102708
Journal: Parallel Computing
Volume: 101
Publication status: Published - Apr 2021
Externally published: Yes

Keywords

  • Burst-Buffers
  • High performance computing
  • I/O congestion
  • Probabilistic I/O scheduling
