Örebro University Publications (oru.se)
KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies
Örebro University, School of Science and Technology. ORCID iD: 0000-0002-4245-6706
Örebro University, School of Science and Technology. ORCID iD: 0000-0001-8658-2985
Örebro University, School of Science and Technology. ORCID iD: 0000-0003-3958-6179
Örebro University, School of Science and Technology. ORCID iD: 0000-0002-6013-4874
2025 (English). Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

Soft Actor-Critic (SAC) has achieved notable success in continuous control tasks but struggles in sparse reward settings, where infrequent rewards make efficient exploration challenging. While novelty-based exploration methods address this issue by encouraging the agent to explore novel states, they are not trivial to apply to SAC. In particular, managing the interaction between novelty-based exploration and SAC’s stochastic policy can lead to inefficient exploration and redundant sample collection. In this paper, we propose KEA (Keeping Exploration Alive), which tackles the inefficiencies in balancing exploration strategies when combining SAC with novelty-based exploration. KEA integrates a novelty-augmented SAC agent with a standard SAC agent, proactively coordinated via a switching mechanism. This coordination allows the agent to maintain stochasticity in high-novelty regions, enhancing exploration efficiency and reducing repeated sample collection. We first analyze this potential issue in a 2D navigation task, and then evaluate KEA on the DeepSea hard-exploration benchmark as well as sparse reward control tasks from the DeepMind Control Suite. Compared to state-of-the-art novelty-based exploration baselines, our experiments show that KEA significantly improves learning efficiency and robustness in sparse reward setups.
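The switching idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the record: the count-based novelty score, the fixed threshold, the rule that control passes to the standard SAC agent when novelty is high, and the names `CountNovelty` and `choose_agent`. The abstract itself does not specify how the switch is computed.

```python
import math
from collections import defaultdict


class CountNovelty:
    """Count-based novelty score (illustrative stand-in, not from the source)."""

    def __init__(self):
        self.counts = defaultdict(int)

    def visit(self, state):
        # Record one more visit to this (hashable) state.
        self.counts[state] += 1

    def score(self, state):
        # Unvisited states score 1.0; the score decays as visits accumulate.
        return 1.0 / math.sqrt(1 + self.counts[state])


def choose_agent(novelty, threshold=0.5):
    """Hypothetical switching rule (an assumption, not the authors' rule).

    In high-novelty regions the standard SAC agent acts, so the behaviour
    stays stochastic; elsewhere the novelty-augmented agent acts.
    """
    return "standard_sac" if novelty > threshold else "novelty_sac"
```

Under these assumptions, a state that has never been visited would be handled by the standard SAC agent, and control would return to the novelty-augmented agent once the state has been visited often enough for its score to fall to the threshold or below.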

Place, publisher, year, edition, pages
2025.
Keywords [en]
Reinforcement Learning, Novelty-based Exploration, Soft Actor-Critic, Sparse reward
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:oru:diva-124954
OAI: oai:DiVA.org:oru-124954
DiVA, id: diva2:2013284
Conference
Forty-second International Conference on Machine Learning (ICML 2025), Vancouver, Canada, July 13-19, 2025
Projects
DARKO
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
EU, Horizon 2020
Note

This work has received funding from the EU’s Horizon 2020 research and innovation programme under grant agreement No 101017274, and was supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

Available from: 2025-11-12. Created: 2025-11-12. Last updated: 2025-11-13. Bibliographically approved.

Open Access in DiVA

KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies (9262 kB), 34 downloads
File information
File name: FULLTEXT01.pdf
File size: 9262 kB
Checksum (SHA-512): 2f3b63cb8c3de0e47df00e882c10e07716dc6a85a7c5255f140f6b86a60b468e6b829296f6ea66262f8a0bf42ea809b223ed69a94724f701bc3edec1173758ff
Type: fulltext
Mimetype: application/pdf

Authority records

Shih-Min, Yang; Magnusson, Martin; Stork, Johannes Andreas; Stoyanov, Todor
