I/O scheduling for limited-size burst-buffers deployed high performance computing

Benbo Zha, Hong Shen

Research output: Conference contribution, peer-reviewed

2 citations (Scopus)

Abstract

A burst buffer is a high-throughput, small-capacity intermediate storage system placed between the compute nodes and the permanent storage system to mitigate the I/O bottleneck in modern High Performance Computing (HPC) platforms. On its own, however, it cannot effectively handle the variable-intensity I/O bursts that result from unpredictable concurrent accesses to the shared Parallel File System (PFS). In this paper, we introduce a probabilistic I/O scheduling method that relieves the I/O congestion arising when dynamic application interference pushes the I/O load above the PFS bandwidth. Based on a probabilistic model of the applications, the method takes into account both the burst-buffer load state and the instantaneous I/O load distribution of the system. Designed for HPC platforms with limited-size burst buffers, it makes online decisions that probabilistically select each concurrent I/O request for passing through (to the PFS), buffering (to the burst buffers), or declining, according to both the available I/O bandwidth and the current buffer state, in order to maximize system efficiency or minimize application dilation. Extensive experiments on synthetic data with realistic characteristics show that our method handles I/O congestion effectively.
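
The three-way decision described in the abstract (pass through to the PFS, buffer, or decline) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the selection probabilities, so the sketch assumes a simple rule in which, under congestion, each request is admitted to the PFS with probability equal to the ratio of PFS bandwidth to total demand. The names `Request`, `schedule`, `pfs_bandwidth`, and `buffer_free` are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    app: str          # application issuing the I/O request
    bandwidth: float  # requested I/O bandwidth (e.g. GB/s)
    volume: float     # data volume to transfer (e.g. GB)


def schedule(requests, pfs_bandwidth, buffer_free, rng):
    """Return a dict mapping app -> 'pfs' | 'buffer' | 'decline'.

    Assumed policy (illustrative only, not taken from the paper):
    without congestion every request goes to the PFS; under congestion
    a request is admitted to the PFS with probability
    pfs_bandwidth / total_demand, otherwise it is buffered if the
    burst buffer has room, otherwise it is declined.
    """
    total = sum(r.bandwidth for r in requests)
    if total <= pfs_bandwidth:
        # No congestion: the PFS can serve all concurrent requests.
        return {r.app: "pfs" for r in requests}

    p_pass = pfs_bandwidth / total  # assumed admission probability
    remaining_bw = pfs_bandwidth
    remaining_buf = buffer_free
    decisions = {}
    for r in requests:
        if r.bandwidth <= remaining_bw and rng.random() < p_pass:
            decisions[r.app] = "pfs"
            remaining_bw -= r.bandwidth
        elif r.volume <= remaining_buf:
            decisions[r.app] = "buffer"
            remaining_buf -= r.volume
        else:
            decisions[r.app] = "decline"
    return decisions


if __name__ == "__main__":
    reqs = [
        Request("a", 4.0, 10.0),
        Request("b", 3.0, 20.0),
        Request("c", 5.0, 15.0),
    ]
    print(schedule(reqs, pfs_bandwidth=6.0, buffer_free=25.0, rng=random.Random(1)))
```

Whatever the random draws, this sketch never grants more PFS bandwidth than is available and never places more data in the buffer than it has free space for; a policy following the paper would replace `p_pass` with probabilities derived from its application model and buffer load state.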

Original language: English
Title of host publication: Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
Editors: Hui Tian, Hong Shen, Wee Lum Tan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 52-57
Number of pages: 6
ISBN (electronic): 9781728126166
Publication status: Published - Dec 2019
Externally published
Event: 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019 - Gold Coast, Australia
Duration: 5 Dec 2019 to 7 Dec 2019

Publication series

Name: Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019

Conference

Conference: 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
Country/Territory: Australia
City: Gold Coast
Period: 5/12/19 to 7/12/19
