Neurosymbolic Reinforcement Learning: Playing MiniHack With Probabilistic Logic Shields
2025 (English). In: Proceedings of the AAAI Conference on Artificial Intelligence / [ed] Walsh, T; Shah, J; Kolter, Z, AAAI Press, 2025, p. 29631-29633. Conference paper, Published paper (Refereed)
Abstract [en]
Probabilistic logic shields integrate deep reinforcement learning (RL) with probabilistic logic reasoning to train agents that operate in uncertain environments while giving strong guarantees with respect to logical constraints, such as safety properties. In this demo paper, we introduce a codebase that streamlines the design of custom MiniHack environments where neurosymbolic RL agents leverage probabilistic logic shields to learn safe and interpretable policies with strong guarantees. Our framework allows expert users to easily define and train agents that integrate deep neural policies with probabilistic logic in arbitrarily complex games: from simple exploration to planning and interacting with enemies. Additionally, we provide a web-based platform that showcases our application, offering an interactive interface for the broader community to experiment with and explore the capabilities of neurosymbolic reinforcement learning. This lowers the barrier to entry for researchers and developers, making safety-critical RL scenarios accessible to a wider audience.
Place, publisher, year, edition, pages
AAAI Press, 2025. p. 29631-29633
Series
Proceedings of the AAAI Conference on Artificial Intelligence, ISSN 2159-5399, E-ISSN 2374-3468 ; Vol. 39, no 28
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-122622
DOI: 10.1609/aaai.v39i28.35349
ISI: 001477477000212
ISBN: 9781577358978 (print)
OAI: oai:DiVA.org:oru-122622
DiVA, id: diva2:1986640
Conference
39th AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, February 25 - March 4, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
EU, Horizon Europe, 101142702
Note
DD is a fellow of the Research Foundation-Flanders (FWO-Vlaanderen, 1185125N). This research has also received funding from the 1185125N Leuven Research Funds (STG/22/021, CELSA/24/008, C14/24/092), from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme, from the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation, and from the European Research Council (ERC) under the European Union's Horizon Europe research and innovation programme (grant agreement no. 101142702).
Available from: 2025-08-01 Created: 2025-08-01 Last updated: 2025-08-01 Bibliographically approved