
Recursive least-squares TD (λ) learning algorithm based on improved extreme learning machine

  • Yuan Xu
  • Bingming Huang
  • Yanlin He

Research output: Article › peer-review

5 Citations (Scopus)

Abstract

To meet the accuracy and computation-time requirements of value-function approximation, a recursive least-squares temporal-difference reinforcement learning algorithm with eligibility traces based on an improved extreme learning machine (RLSTD(λ)-IELM) was proposed. First, a recursive least-squares temporal-difference algorithm (RLSTD) was derived by introducing a recursive update into the least-squares temporal-difference algorithm (LSTD), eliminating the matrix inversion in the least-squares step and reducing the algorithm's complexity and computational cost. Then, an eligibility trace was introduced into RLSTD to form the recursive least-squares temporal-difference algorithm with eligibility traces (RLSTD(λ)), addressing the slow convergence of LSTD(0) and its inefficient use of experience. Furthermore, since the value function in most reinforcement learning problems is monotonic, a singly suppressed approximation, the Softplus function, was used to replace the sigmoid activation in the extreme learning machine network, reducing the computational load and improving computing speed. Experimental results on the generalized Hop-world problem demonstrate that the proposed RLSTD(λ)-IELM algorithm computes faster than the least-squares temporal-difference algorithm based on extreme learning machines (LSTD-ELM) and achieves better accuracy than the least-squares temporal-difference algorithm based on radial basis functions (LSTD-RBF).
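The combination the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Sherman-Morrison recursion is the standard way to make least-squares TD(λ) inversion-free, the random Softplus projection stands in for the "improved ELM" feature layer, and all names (`ELMFeatures`, `RLSTDLambda`) and hyperparameters are assumptions for the example.

```python
import numpy as np

def softplus(x):
    # Softplus activation log(1 + e^x): smooth and monotone, used here in
    # place of the sigmoid as the abstract suggests.
    return np.log1p(np.exp(x))

class ELMFeatures:
    """ELM-style random feature layer: hidden weights are drawn once at
    construction and never trained (illustrative stand-in for the paper's
    improved ELM network)."""
    def __init__(self, state_dim, n_hidden, rng):
        self.W = rng.normal(size=(n_hidden, state_dim))
        self.b = rng.normal(size=n_hidden)

    def __call__(self, s):
        return softplus(self.W @ s + self.b)

class RLSTDLambda:
    """Recursive LSTD(lambda): P tracks the inverse of the LSTD matrix via
    the Sherman-Morrison identity, so no matrix inversion is needed."""
    def __init__(self, n_features, gamma=0.95, lam=0.8, delta=1.0):
        self.gamma, self.lam = gamma, lam
        self.P = np.eye(n_features) / delta   # initial inverse estimate
        self.theta = np.zeros(n_features)     # value-function weights
        self.z = np.zeros(n_features)         # eligibility trace

    def update(self, phi, reward, phi_next, done):
        # Accumulate the eligibility trace, then form the TD feature
        # difference d = phi - gamma * phi_next (phi_next vanishes at
        # terminal states).
        self.z = self.gamma * self.lam * self.z + phi
        d = phi - (0.0 if done else self.gamma) * phi_next
        Pz = self.P @ self.z
        k = Pz / (1.0 + d @ Pz)               # gain (Sherman-Morrison)
        self.theta = self.theta + k * (reward - d @ self.theta)
        self.P = self.P - np.outer(k, d @ self.P)
        if done:
            self.z[:] = 0.0                   # reset trace between episodes

    def value(self, phi):
        return phi @ self.theta
```

Because each update costs O(n²) in the feature dimension rather than the O(n³) of a fresh matrix inversion, the recursion captures the computational saving the abstract attributes to RLSTD over plain LSTD.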

Original language: English
Pages (from-to): 916-924
Number of pages: 9
Journal: Huagong Xuebao/CIESC Journal
Volume: 68
Issue number: 3
DOIs
Publication status: Published - 1 Mar 2017
Externally published: Yes
