Inference and Learning in Dynamic Decision Networks Using Knowledge Compilation
2024 (English)
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence / [ed] Michael Wooldridge; Jennifer Dy; Sriraam Natarajan, AAAI Press, 2024, Vol. 38, p. 20567-20576
Conference paper, Published paper (Refereed)
Abstract [en]
Decision making under uncertainty in dynamic environments is a fundamental AI problem in which agents need to determine which decisions (or actions) to make at each time step to maximise their expected utility. Dynamic decision networks (DDNs) are an extension of dynamic Bayesian networks with decisions and utilities, and can be used to compactly represent Markov decision processes (MDPs). We propose a novel algorithm called mapl-cirup that leverages knowledge compilation techniques developed for (dynamic) Bayesian networks to perform inference and gradient-based learning in DDNs. Specifically, we knowledge-compile the Bellman update present in DDNs into dynamic decision circuits and evaluate them within an (algebraic) model counting framework. In contrast to other exact symbolic MDP approaches, we obtain differentiable circuits that enable gradient-based parameter learning.
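As a point of reference for the Bellman update mentioned in the abstract, the following is a minimal tabular sketch in Python. It is not mapl-cirup and does not use knowledge compilation or circuits; it only illustrates the standard update V(s) = max_a [R(s, a) + gamma * sum_s' P(s' | s, a) V(s')] on a hypothetical two-state MDP. All names and numbers (`STATES`, `ACTIONS`, `P`, `R`, `GAMMA`) are assumptions made for illustration.

```python
# Hypothetical two-state, two-action MDP; every number here is illustrative
# and is not taken from the paper.
STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]
GAMMA = 0.9  # discount factor (assumed)

# P[s][a] maps each next state to its transition probability.
P = {
    "s0": {"stay": {"s0": 1.0}, "move": {"s0": 0.2, "s1": 0.8}},
    "s1": {"stay": {"s1": 1.0}, "move": {"s0": 0.8, "s1": 0.2}},
}

# R[s][a] is the immediate utility of taking action a in state s.
R = {
    "s0": {"stay": 0.0, "move": -1.0},
    "s1": {"stay": 2.0, "move": -1.0},
}


def q_value(V, s, a):
    """Expected utility of action a in state s, given value estimates V."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a].items())


def bellman_update(V):
    """One synchronous Bellman update over all states."""
    return {s: max(q_value(V, s, a) for a in ACTIONS) for s in STATES}


def value_iteration(tol=1e-9, max_iters=10_000):
    """Repeat the Bellman update until the largest change falls below tol."""
    V = {s: 0.0 for s in STATES}
    for _ in range(max_iters):
        V_new = bellman_update(V)
        if max(abs(V_new[s] - V[s]) for s in STATES) < tol:
            return V_new
        V = V_new
    return V


def greedy_policy(V):
    """Pick, in each state, the action with the highest expected utility."""
    return {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
```

Calling `value_iteration()` and then `greedy_policy(V)` on this toy model yields a value per state and the maximising action per state. The approach described in the abstract differs in that the update is compiled into a differentiable circuit over a factored (DDN) representation rather than enumerated over a flat state table.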
Place, publisher, year, edition, pages
AAAI Press, 2024. Vol. 38, p. 20567-20576
Series
Proceedings of the AAAI Conference on Artificial Intelligence, ISSN 2159-5399, E-ISSN 2374-3468 ; 38:18
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-115497
DOI: 10.1609/aaai.v38i18.30042
ISI: 001241509500088
Scopus ID: 2-s2.0-85189535865
ISBN: 9781577358879 (print)
OAI: oai:DiVA.org:oru-115497
DiVA, id: diva2:1890935
Conference
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence, Vancouver, Canada, February 20-27, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Knut and Alice Wallenberg Foundation
Note
This work was supported by the KU Leuven Research Fund (C14/18/062), the Research Foundation-Flanders (FWO, 1SA5520N), the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme, the EU H2020 ICT48 project “TAILOR” under contract #952215, and the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
2024-08-21 2024-08-21 2024-08-21
Bibliographically approved