Online scheduling of coflows by attention-empowered scalable deep reinforcement learning

  • Xin Wang
  • Hong Shen
  • Sun Yat-Sen University

Research output: Article, peer-reviewed

6 citations (Scopus)

Abstract

By abstracting a group of parallel data transmission flows as a coflow, data transmissions in large-scale computing jobs can be modeled by a coflow directed acyclic graph (coflow DAG) in which nodes are coflows and edges represent dependencies between coflows. Efficient scheduling of coflows on network links is crucial for reducing the overall communication time and job completion time. The best known coflow scheduling method based on deep reinforcement learning (DRL), DeepWeave (Sun et al., 2020), suffers from poor scalability because it requires a policy network of size O(dn) to process n coflows of d dimensions, which is difficult to train. This paper extends the directed acyclic graph neural network (DAGNN) to Pipelined-DAGNN, which embeds the features of different stages of input coflow DAGs in a pipeline to speed up the feature extraction process. To process the feature vectors of coflow DAGs of arbitrary size and shape without compromising scheduling accuracy (quality), we propose a novel self-attention empowered DRL coflow scheduling model to generate coflow scheduling policies. The model makes the size of the policy network depend only on the number of features (dimensions) rather than the number of coflows, without the need to pack all individual embedding vectors from Pipelined-DAGNN into a long flat vector. Our model reduces the size of the policy network in DRL from the previous O(dn) to O(d), achieving high scalability independent of the number of coflows. Simulation results on a Facebook trace show that our model reduces the average weighted job completion time by up to 33.88% compared with the state-of-the-art methods, in addition to being more scalable and robust.
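The scalability claim in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the function names, the random initialisation, and the single-head scaled dot-product attention are all assumptions chosen for illustration. It shows only that a self-attention scorer can be applied to any number n of coflow embeddings while its parameter count depends on d alone. This sketch uses full d×d projections, so its count is 3d² + d rather than the O(d) reported in the paper.

```python
import math
import random


def make_params(d, seed=0):
    """Hypothetical parameters: three d x d projections and one d-vector scorer."""
    rng = random.Random(seed)

    def mat():
        return [[rng.gauss(0.0, 1.0 / math.sqrt(d)) for _ in range(d)]
                for _ in range(d)]

    return {"Wq": mat(), "Wk": mat(), "Wv": mat(),
            "w": [rng.gauss(0.0, 1.0) for _ in range(d)]}


def param_count(params):
    """Number of scalar parameters; depends on d only, never on n."""
    mats = sum(len(row) for key in ("Wq", "Wk", "Wv") for row in params[key])
    return mats + len(params["w"])


def matvec(x, W):
    """Row vector x (length d) times matrix W (d x d)."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def schedule_probs(embeddings, params):
    """Map n coflow embeddings (each of length d) to n selection probabilities."""
    d = len(params["w"])
    Q = [matvec(x, params["Wq"]) for x in embeddings]
    K = [matvec(x, params["Wk"]) for x in embeddings]
    V = [matvec(x, params["Wv"]) for x in embeddings]
    scores = []
    for q in Q:
        # Scaled dot-product attention of this coflow over all coflows.
        attn = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                        for k in K])
        # Context vector: attention-weighted sum of value vectors.
        ctx = [sum(a * v[j] for a, v in zip(attn, V)) for j in range(d)]
        # One scalar score per coflow from the shared d-vector scorer.
        scores.append(sum(c * wj for c, wj in zip(ctx, params["w"])))
    return softmax(scores)


if __name__ == "__main__":
    params = make_params(d=4)
    for n in (3, 7):
        emb = [[0.1 * i + 0.01 * j for j in range(4)] for i in range(n)]
        probs = schedule_probs(emb, params)
        print(n, len(probs), param_count(params))
```

The same `params` object serves both the 3-coflow and the 7-coflow input, whereas a policy network fed with a flat concatenation of all embeddings would need an input layer sized for a fixed n.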

Original language: English
Pages (from-to): 195-206
Number of pages: 12
Journal: Future Generation Computer Systems
Volume: 146
Publication status: Published - Sep 2023
