Örebro University Publications (oru.se)
Publications (2 of 2)
Rietz, F., Magg, S., Heintz, F., Stoyanov, T., Wermter, S. & Stork, J. A. (2023). Hierarchical goals contextualize local reward decomposition explanations. Neural Computing & Applications, 35(23), 16693-16704
2023 (English). In: Neural Computing & Applications, ISSN 0941-0643, E-ISSN 1433-3058, Vol. 35, no 23, p. 16693-16704. Article in journal (Refereed). Published
Abstract [en]

One-step reinforcement learning explanation methods account for individual actions but fail to consider the agent's future behavior, which can make their interpretation ambiguous. We propose to address this limitation by providing hierarchical goals as context for one-step explanations. By considering the current hierarchical goal as a context, one-step explanations can be interpreted with higher certainty, as the agent's future behavior is more predictable. We combine reward decomposition with hierarchical reinforcement learning into a novel explainable reinforcement learning framework, which yields more interpretable, goal-contextualized one-step explanations. With a qualitative analysis of one-step reward decomposition explanations, we first show that their interpretability is indeed limited in scenarios with multiple, different optimal policies, a characteristic shared by other one-step explanation methods. Then, we show that our framework retains high interpretability in such cases, as the hierarchical goal can be considered as context for the explanation. To the best of our knowledge, our work is the first to investigate hierarchical goals not as an explanation directly but as additional context for one-step reinforcement learning explanations.
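As an illustration of what a one-step reward decomposition explanation looks like, the following is a minimal sketch, not the authors' implementation. It assumes the agent stores one Q-value per reward component and per action, and that these are kept separately for each hierarchical goal. The component names, goal names, and numbers are hypothetical and chosen only for readability.

```python
import numpy as np

# Hypothetical reward components (assumption, not taken from the paper).
COMPONENTS = ["reach_goal", "avoid_obstacle", "energy"]

# Hypothetical decomposed Q-values for ONE state, stored per hierarchical goal.
# Each array has shape (n_actions, n_components).
Q_BY_GOAL = {
    "goal_A": np.array([[2.0, -1.0, -0.5],
                        [1.0, -0.25, -0.5]]),
    "goal_B": np.array([[0.5, -1.0, -0.5],
                        [1.5, -0.25, -0.5]]),
}


def greedy_action(q_components):
    """Pick the action whose summed component Q-values are largest."""
    return int(np.argmax(q_components.sum(axis=1)))


def explain(goal, action_a, action_b):
    """Per-component advantage of action_a over action_b, labeled with the goal.

    The goal label is the added context: the same pair of actions can have a
    different explanation under a different hierarchical goal.
    """
    q = Q_BY_GOAL[goal]
    diff = q[action_a] - q[action_b]
    return {"goal": goal, "advantage": dict(zip(COMPONENTS, diff.tolist()))}


best_a = greedy_action(Q_BY_GOAL["goal_A"])
best_b = greedy_action(Q_BY_GOAL["goal_B"])
explanation_a = explain("goal_A", 0, 1)
explanation_b = explain("goal_B", 0, 1)
```

In this toy setup the greedy action differs between the two goals, so a comparison of the same two actions only becomes unambiguous once the active goal is stated alongside it.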

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Reinforcement learning, Explainable AI, Reward decomposition, Hierarchical goals, Local explanations
National Category
Computer Sciences
urn:nbn:se:oru:diva-99115 (URN)
10.1007/s00521-022-07280-8 (DOI)
000794083400001 ()
2-s2.0-85129803505 (Scopus ID)

Funding agencies:

Örebro University

Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Federal Ministry for Economic Affairs and Climate FKZ 20X1905A-D

Available from: 2022-05-23. Created: 2022-05-23. Last updated: 2023-11-28. Bibliographically approved
Rietz, F., Schaffernicht, E., Stoyanov, T. & Stork, J. A. (2022). Towards Task-Prioritized Policy Composition. Paper presented at 35th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 24-26, 2022.
2022 (English). Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Combining learned policies in a prioritized, ordered manner is desirable because it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized by null-space control, where low-priority control actions are projected into the null-space of high-priority control actions. Such a method is currently unavailable for Reinforcement Learning. We propose a novel, task-prioritized composition framework for Reinforcement Learning, which involves a novel concept: the indifferent-space of Reinforcement Learning policies. Our framework has the potential to facilitate knowledge transfer and modular design while greatly increasing data efficiency and data reuse for Reinforcement Learning agents. Further, our approach can ensure high-priority constraint satisfaction, which makes it promising for learning in safety-critical domains like robotics. Unlike null-space control, our approach allows learning globally optimal policies for the compound task by online learning in the indifference-space of higher-level policies after initial compound policy construction.
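The null-space control scheme that the abstract uses as its point of comparison can be sketched as follows. This is the standard two-task projection from control theory, not the paper's reinforcement learning framework. The Jacobian, the task velocity, and the low-priority command are hypothetical values for a three-joint example.

```python
import numpy as np


def prioritized_command(J_high, dx_high, u_low):
    """Combine two commands so that the low-priority one cannot disturb the
    high-priority task.

    J_high : (m, n) Jacobian of the high-priority task
    dx_high: (m,) desired high-priority task velocity
    u_low  : (n,) joint command preferred by the low-priority task
    """
    J_pinv = np.linalg.pinv(J_high)
    # Projector onto the null-space of J_high: motions it produces have no
    # effect on the high-priority task.
    N = np.eye(J_high.shape[1]) - J_pinv @ J_high
    return J_pinv @ dx_high + N @ u_low


# Hypothetical example: the high-priority task only controls the first joint.
J_high = np.array([[1.0, 0.0, 0.0]])
dx_high = np.array([0.5])
u_low = np.array([2.0, 1.0, -1.0])

u = prioritized_command(J_high, dx_high, u_low)
```

Here the first joint follows the high-priority command exactly, while the low-priority command survives only in the two joints the high-priority task leaves free.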

National Category
Computer Systems
urn:nbn:se:oru:diva-102120 (URN)
35th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 24-26, 2022
Available from: 2022-11-08. Created: 2022-11-08. Last updated: 2024-01-03. Bibliographically approved
