I/O scheduling for limited-size burst-buffers deployed high performance computing

Benbo Zha, Hong Shen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

A burst-buffer is a high-throughput, small-capacity intermediate storage system placed between the compute nodes and the permanent storage system to mitigate the I/O bottleneck in modern High Performance Computing (HPC) platforms. Such a system, however, cannot effectively handle the variable-intensity I/O bursts that result from unpredictable concurrent accesses to the shared Parallel File System (PFS). In this paper, we introduce a probabilistic I/O scheduling method that takes into account the burst-buffer load state and the instantaneous I/O load distribution of the system, based on a probabilistic model of the applications, to relieve the I/O congestion that arises when dynamic application interference pushes the I/O load above the PFS bandwidth. The proposed method, designed for HPC platforms with limited-size burst-buffers, makes online decisions that probabilistically select each concurrent I/O request for passing through (to the PFS), buffering (to the burst-buffers), or declining, according to both the available I/O bandwidth and the current buffer state, in order to maximize system efficiency or minimize application dilation. Extensive experiments on synthetic data that reflects actual workload characteristics show that our method handles I/O congestion effectively.
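
The three-way decision described in the abstract (pass through to the PFS, buffer, or decline) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the selection probabilities below follow a simple proportional rule chosen only for illustration, and the names `Request`, `schedule`, `pfs_capacity`, and `buffer_free` are hypothetical. It assumes every request states how much data it wants to move in one scheduling step.

```python
import random
from dataclasses import dataclass

DIRECT, BUFFER, DECLINE = "direct", "buffer", "decline"


@dataclass
class Request:
    app: str       # application issuing the I/O request
    demand: float  # data to move in this step (hypothetical unit, e.g. GB)


def schedule(requests, pfs_capacity, buffer_free, rng=None):
    """Pick one action per request for a single scheduling step.

    Illustrative rule (an assumption, not taken from the paper): under
    congestion, a request goes directly to the PFS with probability equal to
    the fraction of the total load the PFS can absorb; the remaining
    probability is split between buffering and declining according to how
    much of the excess load fits in the free buffer space.
    """
    rng = rng or random.Random()
    total = sum(r.demand for r in requests)
    if total <= pfs_capacity:
        # No congestion: every request goes straight to the PFS.
        return {r.app: DIRECT for r in requests}

    excess = total - pfs_capacity  # strictly positive in this branch
    p_direct = pfs_capacity / total
    p_buffer = (1.0 - p_direct) * min(1.0, buffer_free / excess)

    decisions = {}
    pfs_left, buf_left = pfs_capacity, buffer_free
    for r in requests:
        u = rng.random()
        if u < p_direct:
            choice = DIRECT
        elif u < p_direct + p_buffer:
            choice = BUFFER
        else:
            choice = DECLINE
        # Enforce hard capacity limits by falling back to the next option.
        if choice == DIRECT and r.demand > pfs_left:
            choice = BUFFER
        if choice == BUFFER and r.demand > buf_left:
            choice = DECLINE
        if choice == DIRECT:
            pfs_left -= r.demand
        elif choice == BUFFER:
            buf_left -= r.demand
        decisions[r.app] = choice
    return decisions
```

Calling `schedule(requests, pfs_capacity=6.0, buffer_free=4.0)` with a total demand above 6.0 returns a mapping from application name to one of the three actions. The capacity checks guarantee that neither the PFS nor the buffer is oversubscribed, whatever the random draws are.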

Original language: English
Title of host publication: Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
Editors: Hui Tian, Hong Shen, Wee Lum Tan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 52-57
Number of pages: 6
ISBN (Electronic): 9781728126166
Publication status: Published - Dec 2019
Externally published: Yes
Event: 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019 - Gold Coast, Australia
Duration: 5 Dec 2019 → 7 Dec 2019

Publication series

Name: Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019

Conference

Conference: 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
Country/Territory: Australia
City: Gold Coast
Period: 5/12/19 → 7/12/19

Keywords

  • Burst-buffers
  • High performance computing
  • I/O congestion
  • I/O scheduling
