Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach
Örebro University, School of Science and Technology (MPI, Centre for Applied Autonomous Sensor Systems (AASS)). ORCID iD: 0000-0003-3422-2085
Örebro University, School of Science and Technology (Centre for Applied Autonomous Sensor Systems (AASS)). ORCID iD: 0000-0002-6860-6303
2023 (English). In: Machine Learning and Knowledge Discovery in Databases: Research Track: European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part IV / [ed] Danai Koutra; Claudia Plant; Manuel Gomez Rodriguez; Elena Baralis; Francesco Bonchi, Springer, 2023, Vol. 14172, p. 213-229. Conference paper, Published paper (Refereed).
Abstract [en]

Despite its successes, Deep Reinforcement Learning (DRL) yields non-interpretable policies. Moreover, since DRL does not exploit symbolic relational representations, it has difficulty coping with structural changes in its environment (such as increasing the number of objects). Meanwhile, Relational Reinforcement Learning inherits relational representations from symbolic planning to learn reusable policies. However, it has so far been unable to scale up and exploit the power of deep neural networks. We propose Deep Explainable Relational Reinforcement Learning (DERRL), a framework that exploits the best of both the neural and the symbolic worlds. By resorting to a neuro-symbolic approach, DERRL combines relational representations and constraints from symbolic planning with deep learning to extract interpretable policies. These policies take the form of logical rules that explain why each decision (or action) is arrived at. Through several experiments, in setups like the Countdown Game, Blocks World, Gridworld, Traffic, and Minigrid, we show that the policies learned by DERRL are adaptable to varying configurations and environmental changes.
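
As a purely illustrative sketch (not code from the paper), the snippet below shows what a policy expressed as an interpretable relational rule might look like in a Blocks World setting; the predicate names, the rule body, and the function names are hypothetical stand-ins for the kind of rules the abstract describes.

```python
# Illustrative only: a relational rule policy evaluated against a symbolic state.
# Rule (hypothetical): move_to_table(X) <- on(X, Y), clear(X).

def holds(state, atom):
    """Check whether a ground atom, e.g. ('clear', 'a'), is true in the state."""
    return atom in state

def applicable_moves(state, blocks):
    """Return every grounded action the rule licenses, together with the facts
    that fired the rule, so each chosen action carries its own explanation."""
    actions = []
    for x in blocks:
        for y in blocks:
            if x != y and holds(state, ('on', x, y)) and holds(state, ('clear', x)):
                actions.append((('move_to_table', x), [('on', x, y), ('clear', x)]))
    return actions

# Example: block a sits on b and a is clear, so move_to_table(a) is licensed.
state = {('on', 'a', 'b'), ('on', 'b', 'table'), ('clear', 'a')}
for action, reasons in applicable_moves(state, ['a', 'b']):
    print(action, 'because', reasons)
```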

Place, publisher, year, edition, pages
Springer, 2023. Vol. 14172, p. 213-229
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14172
Keywords [en]
Neuro-Symbolic AI, Relational Reinforcement Learning, Deep Reinforcement Learning, Explainability
National Category
Computer Sciences
Research subject
Computer and Systems Science; Computer Science
Identifiers
URN: urn:nbn:se:oru:diva-108100
DOI: 10.48550/arXiv.2304.08349
ISI: 001156141200013
ISBN: 9783031434204 (print)
ISBN: 9783031434211 (electronic)
OAI: oai:DiVA.org:oru-108100
DiVA, id: diva2:1794426
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023), Turin, Italy, September 18-22, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-09-05 Created: 2023-09-05 Last updated: 2025-09-01 Bibliographically approved
In thesis
1. Neurosymbolic Decision-Making with Large Language Models
2025 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Reasoning and decision-making are foundational challenges in artificial intelligence (AI). These processes are closely linked – an intelligent agent must reason about its environment and goals in order to make decisions and select actions. Two principal frameworks for sequential decision-making are AI planning and reinforcement learning (RL). Planning assumes access to a known model of the environment and uses symbolic representations to compute a sequence of actions that leads from an initial state to a desired goal. In contrast, RL focuses on learning behavior through interaction, enabling agents to develop policies that maximize long-term reward under uncertainty. Despite methodological differences, both approaches aim to generate intelligent, goal-directed action sequences.

The rise of Large Language Models (LLMs) has sparked significant interest in their potential to perform reasoning, planning, and decision-making tasks. Despite their impressive performance in natural language understanding and generalization, there is growing skepticism about whether LLMs genuinely reason or merely leverage statistical correlations. This dissertation investigates this question through a principled evaluation grounded in computational theory, using 3-SAT – the canonical NP-complete problem – as a testbed. The findings demonstrate that LLMs fail to exhibit sound and complete reasoning, especially on complex instances where shallow heuristics fail, and that their apparent reasoning abilities often stem from overfitting to statistical patterns.
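
As a minimal, hypothetical sketch of why 3-SAT is a convenient testbed: checking a candidate assignment against a 3-CNF formula takes only a few lines of code, while deciding satisfiability in general is NP-complete, so shallow heuristics cannot carry a model through hard instances. The encoding and function below are illustrative and not taken from the thesis.

```python
# Illustrative only: verify a candidate assignment for a 3-CNF formula.
# A clause is a tuple of three literals; a positive integer i means variable i,
# and -i means its negation.

def satisfies(clauses, assignment):
    """assignment maps variable index -> bool; returns True iff every clause
    contains at least one true literal."""
    def lit_true(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value
    return all(any(lit_true(lit) for lit in clause) for clause in clauses)

# (x1 or not x2 or x3) and (not x1 or x2 or x3)
clauses = [(1, -2, 3), (-1, 2, 3)]
print(satisfies(clauses, {1: True, 2: True, 3: False}))   # True
print(satisfies(clauses, {1: True, 2: False, 3: False}))  # False
```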

To address these limitations, this dissertation proposes a range of neurosymbolic architectures that combine the generative flexibility of LLMs with the rigor and reliability of symbolic methods. Empirical evaluations across planning, reward design, and plan verification tasks show that such integration yields systems that are more robust and accurate. This work advances our theoretical and practical understanding of LLM-based reasoning, provides concrete design principles for neurosymbolic systems, and charts a path toward AI agents that integrate world knowledge with logical precision.
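
A hedged sketch of the generate-and-verify pattern such neurosymbolic architectures follow: a generative model proposes candidates and a symbolic component accepts or rejects them. The functions below (propose_plan, verify, neurosymbolic_loop) are hypothetical stand-ins, not the systems evaluated in the thesis.

```python
# Illustrative only: LLM-style proposer paired with a symbolic verifier.

def propose_plan(goal, attempt):
    # Stand-in for a generative model: returns a candidate action sequence.
    candidates = [['pick(a)', 'place(a, table)'],
                  ['pick(a)', 'place(a, b)']]
    return candidates[attempt % len(candidates)]

def verify(plan, goal):
    # Trivial symbolic check: does the plan end by achieving the goal?
    return bool(plan) and plan[-1] == goal

def neurosymbolic_loop(goal, max_attempts=5):
    for attempt in range(max_attempts):
        plan = propose_plan(goal, attempt)
        if verify(plan, goal):
            return plan          # only verified plans are returned
    return None                  # give up after max_attempts

print(neurosymbolic_loop('place(a, table)'))
```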

Place, publisher, year, edition, pages
Örebro: Örebro University, 2025. p. 67
Series
Örebro Studies in Technology, ISSN 1650-8580 ; 106
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-122456
ISBN: 9789175296869
Public defence
2025-10-17, Örebro universitet, Långhuset, Hörsal L2, Fakultetsgatan 1, Örebro, 13:00 (English)
Available from: 2025-07-22 Created: 2025-07-22 Last updated: 2025-09-04 Bibliographically approved

Open Access in DiVA

Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach (1442 kB), 681 downloads
File information
File name: FULLTEXT01.pdf
File size: 1442 kB
Checksum SHA-512: 0878c53e3cfd1580a0f726b62233b72af66bfd23eb6adebc9cd1ba7d35717c486c1ce9ed10acce2eaa96a24479efaffbe63ca7dccf8a0d63f96cfcfc514a2e2d
Type: fulltext
Mimetype: application/pdf


Authority records

Hazra, Rishi; De Raedt, Luc

