Title | Deterministic Policy Gradient: Convergence Analysis |
Author | |
Corresponding Author | Zhang,Wei |
Publication Years | 2022
|
Source Title | |
Pages | 2159-2169
|
Abstract | The deterministic policy gradient (DPG) method proposed in Silver et al. [2014] has been demonstrated to exhibit superior performance, particularly for applications with multi-dimensional and continuous action spaces. However, it remains unclear whether DPG converges, and if so, how fast it converges and whether it converges as efficiently as other PG methods. In this paper, we provide a theoretical analysis of DPG to answer those questions. We study the single timescale DPG (often the case in practice) in both on-policy and off-policy settings, and show that both algorithms attain an ε-accurate stationary policy up to a system error with a sample complexity of O(ε^{-2}). Moreover, we establish the convergence rate for DPG under Gaussian noise exploration, which is widely adopted in practice to improve the performance of DPG. To the best of our knowledge, this is the first non-asymptotic convergence characterization for DPG methods. |
SUSTech Authorship | Corresponding
|
Language | English
|
URL | [Source Record] |
Funding Project | Science and Technology Program of Jingdezhen City[JCYJ20200109141601708];
|
Scopus EID | 2-s2.0-85146148658
|
Data Source | Scopus
|
Document Type | Conference paper |
Identifier | http://kc.sustech.edu.cn/handle/2SGJ60CL/524336 |
Department | Department of Mechanical and Energy Engineering |
Affiliation | 1. Department of Electrical and Computer Engineering, The Ohio State University, Columbus, United States; 2. Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore; 3. Department of Mechanical and Energy Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, Guangdong, China |
Corresponding Author Affiliation | Department of Mechanical and Energy Engineering |
Recommended Citation GB/T 7714 |
Xiong,Huaqing,Xu,Tengyu,Zhao,Lin,et al. Deterministic Policy Gradient: Convergence Analysis[C],2022:2159-2169.
|