Örebro University Publications (oru.se)
Robot Skill Acquisition through Prior-Conditioned Reinforcement Learning
Örebro University, School of Science and Technology. ORCID iD: 0000-0001-5655-0990
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Advancements in robotics and artificial intelligence have paved the way for autonomous agents to perform complex tasks in various domains. A critical challenge in robotics is enabling robots to acquire and refine skills efficiently, allowing them to adapt and excel in diverse environments. This thesis investigates how to acquire robot skills through prior-constrained machine learning and how to adapt these learned skills to novel environments safely and efficiently.

The thesis leverages the synergy between Reinforcement Learning (RL) and prior knowledge to facilitate skill acquisition in robots. It integrates existing task constraints, domain knowledge, and contextual information into the learning process, enabling the robot to acquire new skills efficiently. The core idea behind our method is to exploit structured priors, derived from both expert demonstrations and domain-specific information, that guide the RL process to explore and exploit the state-action space effectively.

The first contribution lies in guaranteeing the execution of safe actions and preventing constraint violations during the exploration phase of RL. By incorporating task-specific constraints, the robot avoids entering regions of the environment where potential risks or failures may occur. This allows efficient exploration of the action space while maintaining safety, making the approach well-suited for scenarios where continuous actions need to adhere to specific constraints.

The second contribution addresses the challenge of learning a policy on a real robot to accomplish contact-rich tasks by exploiting a set of pre-collected demonstrations. Specifically, a variable impedance action space is leveraged to enable the system to adapt its interactions effectively during contact-rich manipulation tasks.

In the third contribution, the thesis explores the transferability of skills acquired across different tasks and domains, highlighting the framework’s potential for building a repository of reusable skills. By comparing the similarity between the target task and the prior tasks, prior knowledge is combined to guide the policy learning process for new tasks.

In the fourth contribution, we introduce a cycle generative model to transfer acquired skills across different robot platforms by learning from unstructured prior demonstrations.

In summary, the thesis introduces a novel paradigm for advancing the field of robotic skill acquisition by synergizing prior knowledge with RL.
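The core idea of guiding RL with structured priors can be illustrated with a minimal sketch (all names and numbers here are hypothetical, not the thesis implementation): the policy maximizes task reward while a KL penalty keeps its action distribution close to a Gaussian prior distilled from demonstrations.

```python
import numpy as np

def kl_gaussian(mu_p, std_p, mu_q, std_q):
    """KL(p || q) between diagonal Gaussian action distributions."""
    var_p, var_q = std_p ** 2, std_q ** 2
    return np.sum(np.log(std_q / std_p)
                  + (var_p + (mu_p - mu_q) ** 2) / (2 * var_q) - 0.5)

def prior_regularized_objective(reward, mu_pi, std_pi, mu_prior, std_prior, alpha=0.1):
    """Maximize task reward while staying close to the demonstration prior."""
    return reward - alpha * kl_gaussian(mu_pi, std_pi, mu_prior, std_prior)

mu_prior, std_prior = np.zeros(2), np.ones(2)
# A policy matching the prior pays no penalty; a divergent one does.
same = prior_regularized_objective(1.0, np.zeros(2), np.ones(2), mu_prior, std_prior)
far = prior_regularized_objective(1.0, 3 * np.ones(2), np.ones(2), mu_prior, std_prior)
```

Policies that match the prior pay no penalty, so demonstrations shape early exploration while the reward term can still pull the policy away as learning progresses.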

Place, publisher, year, edition, pages
Örebro: Örebro University, 2023, p. 66
Series
Örebro Studies in Technology, ISSN 1650-8580 ; 101
Keywords [en]
Reinforcement Learning, Robot Manipulation, Transfer Learning, Safety Constraints, Prior Knowledge Learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-108230
ISBN: 9789175295251 (print)
OAI: oai:DiVA.org:oru-108230
DiVA, id: diva2:1796327
Public defence
2023-10-31, Örebro universitet, Forumhuset, Hörsal F, Fakultetsgatan 1, Örebro, 09:15 (English)
Available from: 2023-09-12. Created: 2023-09-12. Last updated: 2023-10-17. Bibliographically approved.
List of papers
1. Null space based efficient reinforcement learning with hierarchical safety constraints
2021 (English). In: 2021 European Conference on Mobile Robots (ECMR), IEEE, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Reinforcement learning is inherently unsafe for use in physical systems, as learning by trial and error can cause harm to the environment or the robot itself. One way to avoid unpredictable exploration is to add constraints in the action space to restrict the robot's behavior. In this paper, we propose a null space based framework for integrating reinforcement learning methods in constrained continuous action spaces. We leverage a hierarchical control framework to decompose target robotic skills into higher-ranked tasks (e.g., joint limits and obstacle avoidance) and a lower-ranked reinforcement learning task. Safe exploration is guaranteed by only learning policies in the null space of higher-prioritized constraints. Meanwhile, multiple constraint phases for different operational spaces are constructed to guide the robot's exploration. We also add a penalty loss for violating higher-ranked constraints to accelerate the learning procedure. We have evaluated our method on different redundant robotic tasks in simulation and show that our null space based reinforcement learning method can explore and learn safely and efficiently.
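The null-space idea above can be sketched as follows (a toy numeric example, not the paper's code): the raw RL action is projected through N = I - pinv(J) J, so it cannot perturb the higher-ranked constraint task.

```python
import numpy as np

def null_space_projector(J):
    """N = I - pinv(J) @ J: joint velocities filtered through N leave the
    higher-ranked constraint task (with Jacobian J) unaffected."""
    return np.eye(J.shape[1]) - np.linalg.pinv(J) @ J

# Hypothetical 1x3 constraint Jacobian: the first joint drives the constraint.
J = np.array([[1.0, 0.0, 0.0]])
N = null_space_projector(J)

q_dot_rl = np.array([0.5, -0.2, 0.8])   # raw exploratory action from the RL policy
q_dot_safe = N @ q_dot_rl               # projected action, cannot violate the constraint
```

Because the projection happens before execution, exploration stays safe by construction rather than by penalty alone.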

Place, publisher, year, edition, pages
IEEE, 2021
National Category
Robotics
Identifiers
urn:nbn:se:oru:diva-95146 (URN), 10.1109/ECMR50962.2021.9568848 (DOI), 000810510000061 (), 9781665412131 (ISBN)
Conference
European Conference on Mobile Robots (ECMR 2021), Virtual meeting, August 31 - September 3, 2021
Note

Funding agency:

Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP)

Available from: 2021-10-21. Created: 2021-10-21. Last updated: 2023-10-06. Bibliographically approved.
2. Variable Impedance Skill Learning for Contact-Rich Manipulation
2022 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 7, no. 3, p. 8391-8398. Article in journal, Letter (Refereed). Published
Abstract [en]

Contact-rich manipulation tasks remain a hard problem in robotics, as they require interaction with unstructured environments. Reinforcement Learning (RL) is one potential solution to such problems, as it has been successfully demonstrated on complex continuous control tasks. Nevertheless, current state-of-the-art methods require policy training in simulation to prevent undesired behavior, followed by domain transfer, even for simple skills involving contact. In this paper, we address the problem of learning contact-rich manipulation policies by extending an existing skill-based RL framework with a variable impedance action space. Our method leverages a small set of suboptimal demonstration trajectories and learns not only from position information but also, crucially, from impedance-space information. We evaluate our method on a number of peg-in-hole task variants with a Franka Panda arm and demonstrate that learning variable impedance actions for RL in Cartesian space can be performed directly on the real robot, without resorting to learning in simulation.
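A variable impedance action space can be sketched like this (illustrative values only; the gains and action layout are assumptions, not the paper's): the policy outputs both a pose offset and per-axis stiffness, and a Cartesian impedance law turns them into contact forces.

```python
import numpy as np

def impedance_force(stiffness, pos_target, pos, vel):
    """Cartesian impedance law F = K (x_d - x) - D x_dot, with critical
    damping D = 2 sqrt(K) chosen per axis."""
    K = np.diag(stiffness)
    D = 2.0 * np.sqrt(K)
    return K @ (pos_target - pos) - D @ vel

# Hypothetical RL action: a pose offset plus per-axis stiffness (soft along z,
# so the peg can comply with the hole rather than jam).
delta_pos = np.array([0.01, 0.0, -0.02])
stiffness = np.array([400.0, 400.0, 100.0])

pos, vel = np.zeros(3), np.zeros(3)
force = impedance_force(stiffness, pos + delta_pos, pos, vel)
```

Letting the policy modulate stiffness, not just position, is what allows the same controller to push firmly in free space and yield softly on contact.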

Place, publisher, year, edition, pages
IEEE Press, 2022
Keywords
Machine learning for robot control, reinforcement learning, variable impedance control
National Category
Robotics
Research subject
Computer Science
Identifiers
urn:nbn:se:oru:diva-100386 (URN), 10.1109/LRA.2022.3187276 (DOI), 000838455200009 (), 2-s2.0-85133737407 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation
Available from: 2022-08-01. Created: 2022-08-01. Last updated: 2024-01-17
3. MPR-RL: Multi-Prior Regularized Reinforcement Learning for Knowledge Transfer
2022 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 7, no. 3, p. 7652-7659. Article in journal (Refereed). Published
Abstract [en]

In manufacturing, assembly tasks have been a challenge for learning algorithms due to the varying dynamics of different environments. Reinforcement learning (RL) is a promising framework for automatically learning these tasks, yet it is still not easy to apply a learned policy or skill (that is, the ability to solve a task) to a similar environment, even if the deployment conditions are only slightly different. In this letter, we address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors. We propose to learn a prior distribution over the specific skill required to accomplish each task and to compose this family of skill priors to guide policy learning for a new task by comparing the similarity between the target task and the prior ones. Our method learns a latent action space representing the skill embedding from demonstrated trajectories for each prior task. We have evaluated our method on a task in simulation and a set of peg-in-hole insertion tasks and demonstrate better generalization to new tasks that have never been encountered during training. Our Multi-Prior Regularized RL (MPR-RL) method is deployed directly on a real-world Franka Panda arm, requiring only a set of demonstrated trajectories from similar, but crucially not identical, problem instances.
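The similarity-based composition of skill priors can be sketched as follows (the embeddings and distance measure are hypothetical; the paper's actual similarity metric may differ): priors for tasks close to the target task in embedding space receive larger weights in the composed regularizer.

```python
import numpy as np

def prior_weights(target_emb, prior_embs, temperature=1.0):
    """Softmax over negative distances: priors for tasks similar to the
    target task dominate the composed regularizer."""
    dists = np.linalg.norm(prior_embs - target_emb, axis=1)
    logits = -dists / temperature
    exp = np.exp(logits - logits.max())   # subtract max for numerical stability
    return exp / exp.sum()

# Hypothetical task embeddings for three prior peg-in-hole variants.
priors = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
target = np.array([0.1, 0.1])

w = prior_weights(target, priors)  # weights favor the most similar prior task
```

Because the weights are a smooth function of task similarity, a new task that sits between two prior tasks blends both priors rather than committing to one.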

Place, publisher, year, edition, pages
IEEE Press, 2022
Keywords
Machine Learning for Robot Control, Reinforcement Learning, Transfer Learning
National Category
Robotics
Identifiers
urn:nbn:se:oru:diva-99762 (URN), 10.1109/LRA.2022.3184805 (DOI), 000818872000024 (), 2-s2.0-85133574877 (Scopus ID)
Note

Funding agency:

Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2022-06-28. Created: 2022-06-28. Last updated: 2024-01-17. Bibliographically approved.
4. Learn from Robot: Transferring Skills for Diverse Manipulation via Cycle Generative Networks
2023 (English). In: 2023 IEEE 19th International Conference on Automation Science and Engineering (CASE), IEEE conference proceedings, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Reinforcement learning (RL) has shown impressive results on a variety of robot tasks, but it requires a large amount of data to learn a single RL policy. In manufacturing, however, there is wide demand for reusing skills across different robots, and it is hard to transfer a learned policy to different hardware due to diverse robot body morphologies, kinematics, and dynamics. In this paper, we address the problem of transferring policies between different robot platforms. We learn a set of skills on each specific robot and represent them in a latent space. We propose to transfer skills between different robots by mapping latent action spaces through a cycle generative network in a supervised learning manner. We extend the policy model learned on one robot with a pre-trained generative network, enabling the robot to learn from the skills of another robot. We evaluate our method in several simulated experiments and demonstrate that our Learn from Robot (LfR) method accelerates new skill learning.
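The cycle-consistency objective behind such a generative mapping can be sketched as follows (simple linear maps stand in for the learned networks): translating a latent action from robot A's space to robot B's and back should recover the original latent action.

```python
import numpy as np

def cycle_consistency_loss(G, F, z_a):
    """|| F(G(z_a)) - z_a ||^2: mapping a latent action from robot A's space
    to robot B's and back should recover the original latent action."""
    return float(np.sum((F(G(z_a)) - z_a) ** 2))

# Linear maps stand in for the learned generator networks (hypothetical).
A = np.array([[2.0, 0.0],
              [0.0, 0.5]])

def G(z):  # robot A latent action -> robot B latent action
    return A @ z

def F(z):  # robot B latent action -> robot A latent action
    return np.linalg.inv(A) @ z

z = np.array([0.3, -0.7])
loss = cycle_consistency_loss(G, F, z)  # near zero: the cycle is consistent
```

Training both directions with this loss keeps the two latent action spaces aligned even when no paired action correspondences are available.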

Place, publisher, year, edition, pages
IEEE conference proceedings, 2023
Series
IEEE International Conference on Automation Science and Engineering, ISSN 2161-8070, E-ISSN 2161-8089
Keywords
Reinforcement Learning, Transfer Learning, Generative Models
National Category
Robotics
Identifiers
urn:nbn:se:oru:diva-108719 (URN), 10.1109/CASE56687.2023.10260484 (DOI), 9798350320701 (ISBN), 9798350320695 (ISBN)
Conference
19th International Conference on Automation Science and Engineering (IEEE CASE 2023), Cordis, Auckland, New Zealand, August 26-30, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-10-03. Created: 2023-10-03. Last updated: 2024-03-07. Bibliographically approved.

Open Access in DiVA

Cover (131 kB), 71 downloads
File name: COVER01.pdf. File size: 131 kB. Checksum (SHA-512):
1fd3ef83806eab5b619e3f8099162522fcae43ec5d7e2a96dcfb9cb9693d353ee29256eeedd13b4f00d65bcc5c2f4d80a26f852d7d20d1363f038eb61ba56bff
Type: cover. Mimetype: application/pdf

Robot Skill Acquisition through Prior-Conditioned Reinforcement Learning (1205 kB), 130 downloads
File name: FULLTEXT01.pdf. File size: 1205 kB. Checksum (SHA-512):
8d29635dfebaa3be282811990cc15b3aee8863f7cd4f114307b5a4d1bd46ab1961deb0075524b8c962a7e5566cd9afbfc7516aff209e5bf12b3fd8de59d1a8a2
Type: fulltext. Mimetype: application/pdf

Spikblad (126 kB), 47 downloads
File name: SPIKBLAD01.pdf. File size: 126 kB. Checksum (SHA-512):
64505493d6de5ed7979361cf4c9b296c48b1340f00787d7799cb90b9a01a320b22041e41243aded5ebbc7a896e8e4d3a329caf46c818412daefb3924e94375e4
Type: spikblad. Mimetype: application/pdf

Authority records

Yang, Quantao

Total: 130 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1793 hits