Clustering based Probabilistic I/O Scheduling for Burst-Buffers Equipped HPC

Benbo Zha, Hong Shen, Hankz Hankui Zhuo, Zhijian Luo

Research output: Conference contribution, peer-reviewed

Abstract

Modern High-Performance Computing (HPC) platforms usually include an intermediate high-throughput layer, Burst-Buffers (BBs), between the computing nodes and the underlying shared Parallel File System (PFS) to absorb the I/O bursts caused by concurrent I/O requests from different applications. As concurrent applications increase I/O demand, BBs may experience I/O contention due to their limited capacity. The existing probabilistic I/O scheduling method can schedule I/O under limited BB capacity by sensing BB congestion through a Markov-chain-based probability model. However, the probability model requires consistent I/O characteristics across applications, including similar I/O duration and longer application length, to obtain an accurate I/O load estimation. These consistency conditions often do not hold in realistic situations. In this paper, we propose a probabilistic I/O scheduling framework based on application clustering (PIOS) to eliminate the consistency requirement. The framework first clusters all applications by 1-D K-means according to their I/O phase length. Next, the expected I/O workload of each cluster is calculated, and the BBs' capacity is partitioned according to the expected I/O workload. Finally, probabilistic I/O scheduling is applied to each application cluster. The simulation results demonstrate that our framework can adapt to inconsistency and achieves higher efficiency.
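The first two steps of the framework (clustering by I/O phase length, then splitting BB capacity by expected I/O workload) can be illustrated with a minimal sketch. It is not the authors' implementation. The `App` fields, the expected-load formula (data volume per I/O phase divided by the length of one compute-plus-I/O period), and the proportional capacity split are all assumptions made for illustration, since the abstract does not define them. The final step, probabilistic scheduling within each cluster, is not shown.

```python
from dataclasses import dataclass


@dataclass
class App:
    # All fields are hypothetical; the abstract does not specify an application model.
    name: str
    io_len: float       # duration of one I/O phase (seconds)
    compute_len: float  # duration of one compute phase (seconds)
    io_volume: float    # data written per I/O phase (GB)


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic initialisation: spread the centres over the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def expected_load(app):
    # Assumed model: average data rate over one full compute + I/O period.
    return app.io_volume / (app.io_len + app.compute_len)


def partition_capacity(apps, k, bb_capacity):
    """Cluster apps by I/O phase length and split capacity by expected load."""
    labels = kmeans_1d([a.io_len for a in apps], k)
    loads = [0.0] * k
    for app, lab in zip(apps, labels):
        loads[lab] += expected_load(app)
    total = sum(loads)
    if total == 0:
        raise ValueError("total expected load is zero")
    # Assumed policy: each cluster gets capacity proportional to its load.
    shares = [bb_capacity * load / total for load in loads]
    return labels, shares


apps = [
    App("a", 1.0, 9.0, 10.0),
    App("b", 1.2, 8.8, 10.0),
    App("c", 10.0, 10.0, 40.0),
    App("d", 11.0, 9.0, 40.0),
]
labels, shares = partition_capacity(apps, k=2, bb_capacity=300.0)
print(labels, shares)
```

In this toy input the two short-I/O applications and the two long-I/O applications fall into separate clusters, and the cluster with the larger aggregate expected load receives the larger share of the buffer.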

Original language: English
Title of host publication: Proceedings - 2023 The 14th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2023
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350371024
Publication status: Published - 2023
Event: 14th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2023 - Beijing, China
Duration: 24 Nov 2023 to 26 Nov 2023

Publication series

Name: Proceedings - International Symposium on Parallel Architectures, Algorithms and Programming, PAAP
ISSN (Print): 2168-3034
ISSN (Electronic): 2168-3042

Conference

Conference: 14th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2023
Country/Territory: China
City: Beijing
Period: 24/11/23 to 26/11/23
