Hierarchical approximate policy iteration with binary-tree state space decomposition

IEEE Transactions on Neural Networks
Xin Xu, Dewen Hu

Abstract

In recent years, approximate policy iteration (API) methods, such as least-squares policy iteration (LSPI) and its kernelized version, the kernel-based LSPI algorithm, have attracted increasing attention in reinforcement learning (RL). However, it remains difficult for API algorithms to obtain near-optimal policies for Markov decision processes (MDPs) with large or continuous state spaces. To address this problem, this paper presents a hierarchical API (HAPI) method with binary-tree state space decomposition for RL in a class of absorbing MDPs, which can be formulated as time-optimal learning control tasks. In the proposed method, after samples are collected adaptively in the state space of the original MDP, a learning-based decomposition strategy for sample sets is designed to implement the binary-tree state space decomposition process. API algorithms are then applied to the sample subsets to approximate local optimal policies of the sub-MDPs. Because the original MDP is decomposed into a binary-tree structure of absorbing sub-MDPs constructed during the learning process, local near-optimal policies are approximated by API algorithms with reduced complexity and higher precision. Furthermore, because of the improved quality of local policies, the...
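As an illustration of the API building block named in the abstract, the following is a minimal sketch of least-squares policy iteration on a toy absorbing chain MDP with a cost of 1 per step, which makes it a small time-optimal task. It is not the HAPI method itself: the binary-tree decomposition and the adaptive sampling are not reproduced, and the chain task, the one-hot features, and all names (`step`, `phi`, `lstdq`, `lspi`, `collect_samples`) are assumptions made for this example only.

```python
import numpy as np

N_STATES = 5            # states 0..4 (assumed toy task, not from the paper)
N_ACTIONS = 2           # 0 = move left, 1 = move right
GOAL = N_STATES - 1     # state 4 is the absorbing goal
GAMMA = 0.9             # discount factor (assumed)
K = GOAL * N_ACTIONS    # one feature per non-goal (state, action) pair


def step(s, a):
    """Deterministic chain dynamics; every step costs 1 (reward -1)."""
    s_next = max(s - 1, 0) if a == 0 else s + 1
    return s_next, -1.0


def phi(s, a):
    """One-hot features over (state, action); all zeros at the absorbing goal."""
    f = np.zeros(K)
    if s != GOAL:
        f[s * N_ACTIONS + a] = 1.0
    return f


def collect_samples():
    """One transition (s, a, s_next, r) for every non-goal state and action."""
    return [(s, a, *step(s, a)) for s in range(GOAL) for a in range(N_ACTIONS)]


def lstdq(samples, policy):
    """Least-squares estimate of the Q-function weights for a fixed policy."""
    A = 1e-8 * np.eye(K)    # tiny ridge term guards against a singular A
    b = np.zeros(K)
    for s, a, s_next, r in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy[s_next])
        A += np.outer(f, f - GAMMA * f_next)
        b += f * r
    return np.linalg.solve(A, b)


def lspi(samples, max_iter=20):
    """Alternate policy evaluation (lstdq) and greedy policy improvement."""
    policy = np.zeros(N_STATES, dtype=int)    # start with "always left"
    for _ in range(max_iter):
        w = lstdq(samples, policy)
        new_policy = policy.copy()
        new_policy[:GOAL] = w.reshape(GOAL, N_ACTIONS).argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break                              # policy is stable
        policy = new_policy
    return policy, w


policy, w = lspi(collect_samples())
print(policy[:GOAL])
```

On this toy task the learned greedy policy moves right in every non-goal state. In a hierarchical scheme of the kind the abstract describes, a routine like `lspi` would presumably be run separately on each sample subset produced by the decomposition, so that each call works with a smaller local problem.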
