Learning Costs of Constrained Optimization Problems for Structured Inference
2026 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
For traditional learning algorithms, inferring structured outputs, such as decisions that must satisfy domain-specific constraints, is not straightforward. A common challenge arises when elements in the output vector are interdependent in ways that standard learning algorithms or neural network activation functions cannot easily enforce. On the other hand, the fields of Operations Research and Discrete Optimization offer a complementary approach with the use of specific solvers or algorithms with techniques to ensure constraint satisfaction, generally formulated as Constrained Optimization Problems (COPs).
This thesis explores how COPs with predefined constraints can be integrated into learning frameworks by learning missing components of their cost functions through gradient-based optimization to ensure constraint satisfaction in learning tasks. Learning those missing components of the COP cost allows the inference process to use a suitable solver under the learned COP that ensures structured outputs, i.e., outputs satisfying those predefined constraints. The key challenge in this approach is the learning process itself, namely the difficulty of propagating gradients through the solver, which is often nondifferentiable or discrete. This tension motivates the central question of the thesis: How to learn the costs of the COP such that the observed decisions can be approximately recovered when applying the corresponding COP solver?
To address this question, we develop new methods for estimating or approximating gradients through specific classes of solvers while also leveraging existing techniques from differentiable optimization. These methods underpin novel deep learning frameworks that achieve structured inference by treating solvers as implicit layers in the learning process. This thesis is grouped into three learning paradigms: i) decision-focused learning, ii) imitation learning, and iii) structured generative modeling. In each paradigm, a neural architecture infers the cost parameters of the underlying COP. Consequently, the proposed methods not only preserve the expressiveness needed to capture complex relationships but, more importantly, guarantee feasibility at inference time.
Applications examined in this thesis are relevant to real-world decisionmaking and include improving the understanding of uncertainty estimation in conditional allocations through Portfolio Optimization Problems and contextaware inventory decisions via Inventory Optimization Problems; enhancing scalability in recovering transition costs in graphs from observed paths and cycles using Discrete Routing Optimization Problems, where large-scale experiments are conducted with real taxi and ship trajectory data; and improving the quality of control inputs in autonomous racing with simulation and hardware-implemented Model Predictive Controllers. Across all these applications, the proposed methods were successful and demonstrated consistent improvements over established baselines.
Place, publisher, year, edition, pages
Örebro: Örebro University , 2026. , p. 161
Series
Örebro Studies in Technology, ISSN 1650-8580 ; 113
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-126481ISBN: 9789175297507 (print)OAI: oai:DiVA.org:oru-126481DiVA, id: diva2:2030588
Public defence
2026-03-09, Örebro universitet, Långhuset, Hörsal L2, Fakultetsgatan 1, Örebro, 09:00 (English)
Opponent
Supervisors
2026-01-212026-01-212026-03-18Bibliographically approved