Integrating Temporal Abstraction And Intrinsic Motivation - ArXiv

arXiv:1604.06057 (cs)
[Submitted on 20 Apr 2016 (v1), last revised 31 May 2016 (this version, v2)]
Title: Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
Authors: Tejas D. Kulkarni, Karthik R. Narasimhan, Ardavan Saeedi, Joshua B. Tenenbaum
Abstract: Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. The primary difficulty arises due to insufficient exploration, resulting in an agent being unable to learn robust value functions. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve problems. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning. A top-level value function learns a policy over intrinsic goals, and a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse, delayed feedback: (1) a complex discrete stochastic decision process, and (2) the classic ATARI game 'Montezuma's Revenge'.
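The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the deep Q-networks for tabular Q-learning, uses a hypothetical six-state chain as the environment, and treats "reach state g" as the set of intrinsic goals. All names (`run_episode`, `intrinsic_reward`, `q_update`, and so on) and all hyperparameters are assumptions made for illustration, and the top-level update ignores the number of steps a goal took, which is a simplification.

```python
import random
from collections import defaultdict

# Hypothetical toy setup, not taken from the paper.
N_STATES = 6                      # chain of states 0..5, start at 0
ACTIONS = (-1, 1)                 # atomic actions: step left / step right
GOALS = tuple(range(N_STATES))    # intrinsic goals: "reach state g"


def intrinsic_reward(state, goal):
    """Internal critic: 1.0 when the lower level reaches the goal, else 0.0."""
    return 1.0 if state == goal else 0.0


def q_update(table, key, choice, target, lr=0.1):
    """One tabular Q-learning step: move Q(key, choice) toward target."""
    table[key][choice] += lr * (target - table[key][choice])


def epsilon_greedy(values, options, eps, rng):
    """Pick a random option with probability eps, else the greedy one."""
    if rng.random() < eps:
        return rng.choice(options)
    return max(options, key=lambda o: values[o])


def new_tables():
    """Top-level table over goals, lower-level table over atomic actions."""
    q_meta = defaultdict(lambda: {g: 0.0 for g in GOALS})
    q_ctrl = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    return q_meta, q_ctrl


def run_episode(q_meta, q_ctrl, rng, eps=0.2, gamma=0.9, max_steps=50):
    """Run one episode and return the total extrinsic reward collected."""
    state, steps, total = 0, 0, 0.0
    while steps < max_steps and state != N_STATES - 1:
        start = state
        # Top level: choose an intrinsic goal other than the current state.
        options = [g for g in GOALS if g != start]
        goal = epsilon_greedy(q_meta[start], options, eps, rng)
        extrinsic = 0.0
        # Lower level: act until the goal is reached or the episode ends.
        while steps < max_steps:
            action = epsilon_greedy(q_ctrl[(state, goal)], ACTIONS, eps, rng)
            nxt = min(max(state + action, 0), N_STATES - 1)
            # Sparse extrinsic reward: only at the right end of the chain.
            r_ext = 1.0 if nxt == N_STATES - 1 else 0.0
            r_int = intrinsic_reward(nxt, goal)
            done = r_int > 0 or r_ext > 0
            best = 0.0 if done else max(q_ctrl[(nxt, goal)].values())
            # The lower level learns from the intrinsic reward only.
            q_update(q_ctrl, (state, goal), action, r_int + gamma * best)
            state, steps, extrinsic = nxt, steps + 1, extrinsic + r_ext
            if done:
                break
        # The top level learns from extrinsic reward gathered under this goal.
        terminal = state == N_STATES - 1
        best = 0.0 if terminal else max(q_meta[state].values())
        q_update(q_meta, start, goal, extrinsic + gamma * best)
        total += extrinsic
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    q_meta, q_ctrl = new_tables()
    returns = [run_episode(q_meta, q_ctrl, rng) for _ in range(200)]
    print("mean extrinsic return over the last 50 episodes:",
          sum(returns[-50:]) / 50)
```

The point of the split is that the lower-level table is indexed by (state, goal) pairs and never sees the extrinsic reward, while the top-level table is indexed by state alone and is updated once per goal rather than once per atomic action, so the two levels operate at different temporal scales.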
Comments: 14 pages, 7 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
Cite as: arXiv:1604.06057 [cs.LG]
(or arXiv:1604.06057v2 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.1604.06057 (arXiv-issued DOI via DataCite)

Submission history

From: Karthik Narasimhan
[v1] Wed, 20 Apr 2016 18:47:48 UTC (1,158 KB)
[v2] Tue, 31 May 2016 14:45:58 UTC (1,171 KB)

