TY - GEN
T1 - Clustering based Probabilistic I/O Scheduling for Burst-Buffers Equipped HPC
AU - Zha, Benbo
AU - Shen, Hong
AU - Zhuo, Hankz Hankui
AU - Luo, Zhijian
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Modern High-Performance Computing (HPC) platforms usually include an intermediate high-throughput layer, Burst-Buffers (BBs), between the computing nodes and the underlying shared Parallel File System (PFS) to absorb the I/O bursts caused by concurrent I/O requests from different applications. As concurrent applications increase I/O demand, BBs may experience I/O contention due to their limited capacity. The existing probabilistic I/O scheduling method can schedule I/O under limited BB capacity by sensing BB congestion via a Markov-chain-based probability model. However, the probability model requires consistent I/O characteristics of applications, including similar I/O duration and longer application length, to obtain an accurate I/O load estimation. These consistency conditions often do not hold in realistic situations. In this paper, we propose a probabilistic I/O scheduling framework based on application clustering (PIOS) to eliminate the consistency requirement. The framework first clusters all applications by 1-D K-means according to their I/O phase length. Next, the expected I/O workload of each cluster is calculated, and the BBs' capacity is partitioned according to the expected I/O workload. Finally, probabilistic I/O scheduling is applied to each application cluster. Simulation results demonstrate that our framework adapts to inconsistent I/O characteristics and achieves higher efficiency.
AB - Modern High-Performance Computing (HPC) platforms usually include an intermediate high-throughput layer, Burst-Buffers (BBs), between the computing nodes and the underlying shared Parallel File System (PFS) to absorb the I/O bursts caused by concurrent I/O requests from different applications. As concurrent applications increase I/O demand, BBs may experience I/O contention due to their limited capacity. The existing probabilistic I/O scheduling method can schedule I/O under limited BB capacity by sensing BB congestion via a Markov-chain-based probability model. However, the probability model requires consistent I/O characteristics of applications, including similar I/O duration and longer application length, to obtain an accurate I/O load estimation. These consistency conditions often do not hold in realistic situations. In this paper, we propose a probabilistic I/O scheduling framework based on application clustering (PIOS) to eliminate the consistency requirement. The framework first clusters all applications by 1-D K-means according to their I/O phase length. Next, the expected I/O workload of each cluster is calculated, and the BBs' capacity is partitioned according to the expected I/O workload. Finally, probabilistic I/O scheduling is applied to each application cluster. Simulation results demonstrate that our framework adapts to inconsistent I/O characteristics and achieves higher efficiency.
KW - Application clustering
KW - Burst-buffering
KW - High-performance computing
KW - I/O scheduling
UR - http://www.scopus.com/inward/record.url?scp=85184372616&partnerID=8YFLogxK
U2 - 10.1109/PAAP60200.2023.10391426
DO - 10.1109/PAAP60200.2023.10391426
M3 - Conference contribution
AN - SCOPUS:85184372616
T3 - Proceedings - International Symposium on Parallel Architectures, Algorithms and Programming, PAAP
BT - Proceedings - 2023 The 14th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2023
PB - IEEE Computer Society
T2 - 14th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2023
Y2 - 24 November 2023 through 26 November 2023
ER -