MPR-RL: Multi-Prior Regularized Reinforcement Learning for Knowledge Transfer
Örebro University, School of Science and Technology. (Autonomous Mobile Manipulation lab; Center for Applied Autonomous Sensor Systems (AASS)) ORCID iD: 0000-0001-5655-0990
Örebro University, School of Science and Technology. (Autonomous Mobile Manipulation lab; Center for Applied Autonomous Sensor Systems (AASS)) ORCID iD: 0000-0003-3958-6179
Örebro University, School of Science and Technology. Department of Computing and Software, McMaster University, Canada. (Autonomous Mobile Manipulation lab; Center for Applied Autonomous Sensor Systems (AASS)) ORCID iD: 0000-0002-6013-4874
2022 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 7, no 3, p. 7652-7659. Article in journal (Refereed), Published
Abstract [en]

In manufacturing, assembly tasks have been a challenge for learning algorithms because dynamics vary across environments. Reinforcement learning (RL) is a promising framework for learning these tasks automatically, yet it is still not easy to apply a learned policy or skill, that is, the ability to solve a task, to a similar environment, even if the deployment conditions are only slightly different. In this letter, we address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors. We propose to learn a prior distribution over the specific skill required to accomplish each task, and to compose the family of skill priors to guide policy learning for a new task by comparing the similarity between the target task and the prior ones. Our method learns a latent action space representing the skill embedding from demonstrated trajectories for each prior task. We evaluate our method on a task in simulation and on a set of peg-in-hole insertion tasks, and demonstrate better generalization to new tasks that were never encountered during training. Our Multi-Prior Regularized RL (MPR-RL) method is deployed directly on a real-world Franka Panda arm, requiring only a set of demonstrated trajectories from similar, but crucially not identical, problem instances.
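
The idea of composing several skill priors by task similarity can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that each skill prior and the policy are diagonal Gaussians over a latent action, that similarity scores are turned into weights with a softmax, and that the regularizer is the weighted sum of KL divergences from the policy to each prior. The names `softmax`, `kl_diag_gaussians` and `multi_prior_penalty` are hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw task-similarity scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_diag_gaussians(mu_q, std_q, mu_p, std_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    kl = 0.0
    for mq, sq, mp, sp in zip(mu_q, std_q, mu_p, std_p):
        kl += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return kl


def multi_prior_penalty(policy, priors, similarities, temperature=1.0):
    """Similarity-weighted KL penalty of a policy against several priors.

    policy: (mean, std) of the policy's latent-action distribution.
    priors: list of (mean, std), one per prior task (assumed form).
    similarities: one score per prior task; higher means more similar.
    Returns (penalty, weights).
    """
    weights = softmax(similarities, temperature)
    kls = [kl_diag_gaussians(policy[0], policy[1], mu, std) for mu, std in priors]
    penalty = sum(w * kl for w, kl in zip(weights, kls))
    return penalty, weights


# Two prior tasks judged equally similar to the target task.
policy = ([0.0, 0.0], [1.0, 1.0])
priors = [([0.0, 0.0], [1.0, 1.0]), ([2.0, 0.0], [1.0, 1.0])]
penalty, weights = multi_prior_penalty(policy, priors, [1.0, 1.0])
```

With equal similarity scores, both priors receive weight 0.5 and the penalty is 1.0. In a regularized RL objective of this kind, such a penalty would be subtracted from the reward with some coefficient, so that priors from more similar tasks pull the policy more strongly.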

Place, publisher, year, edition, pages
IEEE Press, 2022. Vol. 7, no 3, p. 7652-7659
Keywords [en]
Machine Learning for Robot Control, Reinforcement Learning, Transfer Learning
National Category
Robotics
Identifiers
URN: urn:nbn:se:oru:diva-99762; DOI: 10.1109/LRA.2022.3184805; ISI: 000818872000024; Scopus ID: 2-s2.0-85133574877; OAI: oai:DiVA.org:oru-99762; DiVA, id: diva2:1677520
Note

Funding agency:

Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2022-06-28. Created: 2022-06-28. Last updated: 2024-01-17. Bibliographically approved
In thesis
1. Robot Skill Acquisition through Prior-Conditioned Reinforcement Learning
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Advancements in robotics and artificial intelligence have paved the way for autonomous agents to perform complex tasks in various domains. A critical challenge in the field of robotics is enabling robots to acquire and refine skills efficiently, allowing them to adapt and excel in diverse environments. This thesis investigates the questions of how to acquire robot skills through prior-constrained machine learning and adapt these learned skills to novel environments safely and efficiently.

The thesis leverages the synergy between Reinforcement Learning (RL) and prior knowledge to facilitate skill acquisition in robots. It integrates existing task constraints, domain knowledge and contextual information into the learning process, enabling the robot to acquire new skills efficiently. The core idea behind our method is to exploit structured priors derived from both expert demonstrations and domain-specific information which guide the RL process to effectively explore and exploit the state-action space.

The first contribution guarantees the execution of safe actions and prevents constraint violations during the exploration phase of RL. By incorporating task-specific constraints, the robot avoids regions of the environment where potential risks or failures may occur. This approach allows for efficient exploration of the action space while maintaining safety, making it well suited to scenarios where continuous actions must adhere to specific constraints. The second contribution addresses the challenge of learning a policy on a real robot to accomplish contact-rich tasks by exploiting a set of pre-collected demonstrations. Specifically, a variable impedance action space is leveraged to enable the system to adapt its interactions effectively during contact-rich manipulation tasks. In the third contribution, the thesis explores the transferability of skills acquired across different tasks and domains, highlighting the framework's potential for building a repository of reusable skills. By comparing the similarity between the target task and the prior tasks, prior knowledge is combined to guide the policy learning process for new tasks. In the fourth contribution, we introduce a cycle generative model to transfer acquired skills across different robot platforms by learning from unstructured prior demonstrations. In summary, the thesis introduces a novel paradigm for advancing the field of robotic skill acquisition by synergizing prior knowledge with RL.
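
To make the notion of a variable impedance action space concrete, here is a minimal sketch of a per-axis spring-damper law in which the stiffness is supplied by the policy at each step. It is an illustration under stated assumptions, not the controller used in the thesis: the function name `impedance_force`, the unit-mass default and the critical-damping rule d = 2 * sqrt(mass * k) are all assumptions introduced here.

```python
import math


def impedance_force(stiffness, x, v, x_des, v_des=None, mass=1.0):
    """Per-axis spring-damper force for a policy-chosen stiffness.

    stiffness: per-axis stiffness values, e.g. output by an RL policy.
    x, v: current position and velocity per axis.
    x_des, v_des: desired position and velocity (velocity defaults to 0).
    """
    if v_des is None:
        v_des = [0.0] * len(x)
    force = []
    for k, xi, vi, xd, vd in zip(stiffness, x, v, x_des, v_des):
        d = 2.0 * math.sqrt(mass * k)  # critical damping (assumption)
        force.append(k * (xd - xi) + d * (vd - vi))
    return force


# A stiff axis and a compliant axis responding to the same 1 cm position error.
forces = impedance_force([400.0, 25.0], [0.0, 0.0], [0.0, 0.0], [0.01, 0.01])
```

The design point is that the policy chooses how stiff or compliant the arm is along each axis, so the same position error produces a large corrective force where precision matters and a small one where the robot should yield to contact.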

Place, publisher, year, edition, pages
Örebro: Örebro University, 2023. p. 66
Series
Örebro Studies in Technology, ISSN 1650-8580 ; 101
Keywords
Reinforcement Learning, Robot Manipulation, Transfer Learning, Safety Constraints, Prior Knowledge Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:oru:diva-108230 (URN); 9789175295251 (ISBN)
Public defence
2023-10-31, Örebro universitet, Forumhuset, Hörsal F, Fakultetsgatan 1, Örebro, 09:15 (English)
Available from: 2023-09-12. Created: 2023-09-12. Last updated: 2023-10-17. Bibliographically approved

Authority records

Yang, Quantao; Stork, Johannes A.; Stoyanov, Todor
