Constrained reinforcement learning algorithms in various environments
Örebro University, School of Science and Technology.
2024 (English). Independent thesis, Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Reinforcement learning has shown promise in developing autonomous agents capable of complex decision-making. Nonetheless, traditional reinforcement learning methods often operate in environments without constraints, resulting in unsafe or trivial behavior in real-world scenarios. This thesis investigates the development and evaluation of constrained reinforcement learning algorithms to enhance safety, performance, and reliability. Agents were trained using Q-learning in the CliffWalking-v0 environment and a Deep Q-Network (DQN) in a customized CartPole-v1 environment, navigating both discrete and continuous action spaces. Q-learning employed reward penalties, while DQN used deep constrained Q-learning to avoid unsafe actions. Multiple runs with different seeds indicated that constrained agents had lower Q-value and temporal difference losses, as well as fewer constraint violations, than their unconstrained counterparts. Although constrained agents initially sacrificed some immediate rewards, they showed more consistent and safer behavior throughout training, ultimately achieving comparable or superior overall effectiveness.
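
The two constraint mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the 4x12 grid, the `penalty` parameter, the helper names (`move`, `step`, `safe_actions`, `constrained_greedy`, `train`) and all hyperparameters are assumptions chosen for illustration. The first mechanism adds a reward penalty when the agent enters the cliff; the second restricts greedy action selection to a set of allowed actions, in the spirit of deep constrained Q-learning but applied here to a plain table instead of a neural network.

```python
import random

# Hypothetical 4x12 cliff grid in the style of CliffWalking; not the thesis code.
ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def is_cliff(state):
    row, col = state
    return row == 3 and 0 < col < 11


def move(state, action):
    """Apply an action and clip the result to the grid."""
    d_row, d_col = MOVES[action]
    row = min(max(state[0] + d_row, 0), ROWS - 1)
    col = min(max(state[1] + d_col, 0), COLS - 1)
    return (row, col)


def step(state, action, penalty):
    """Return (next_state, reward, done, violated).

    Entering the cliff is treated as a constraint violation: the agent is
    sent back to START and `penalty` (an assumed value) is subtracted.
    """
    nxt = move(state, action)
    if is_cliff(nxt):
        return START, -1.0 - penalty, False, True
    return nxt, -1.0, nxt == GOAL, False


def safe_actions(state):
    """Actions that do not enter the cliff (assumes the constraint is known)."""
    return [a for a in range(len(MOVES)) if not is_cliff(move(state, a))]


def constrained_greedy(q_values, allowed):
    """Greedy action restricted to the allowed set."""
    return max(allowed, key=lambda a: q_values[a])


def train(penalty, episodes=500, alpha=0.5, gamma=0.99, epsilon=0.1, seed=0):
    """Tabular Q-learning with a reward penalty; returns (Q-table, violations)."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}
    violations = 0
    for _ in range(episodes):
        state = START
        for _ in range(200):  # step cap so every episode terminates
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                action = max(range(4), key=q[state].__getitem__)
            nxt, reward, done, violated = step(state, action, penalty)
            violations += int(violated)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q, violations
```

Calling `train(penalty=0.0)` and `train(penalty=100.0)` with several seeds and comparing the returned violation counts gives a rough, small-scale analogue of the constrained versus unconstrained comparison described above; the numbers it produces are not results from the thesis.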

Place, publisher, year, edition, pages
2024.
Keywords [en]
Reinforcement learning, Q-learning, Deep Q-Network, Deep Constrained Q-learning, constrained optimization, safety, OpenAI safety gym, CleanRL
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-115750
OAI: oai:DiVA.org:oru-115750
DiVA, id: diva2:1894330
Subject / course
Computer Engineering
Supervisors
Examiners
Available from: 2024-09-23 Created: 2024-09-03 Last updated: 2024-09-23 Bibliographically approved

Open Access in DiVA

Constrained reinforcement learning algorithms in various environments (687 kB), 214 downloads
File information
File name: FULLTEXT01.pdf
File size: 687 kB
Checksum (SHA-512): f0d74c160277553c9ba77fc4f5ff3d91fa6cb17d3c524fc25ee98d729c6216de6bd4e1d07bdff3987c6644f6173903870c1ebaede404500db5f39f1088fed171
Type: fulltext
Mimetype: application/pdf

Total: 214 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 214 hits