Safe-To-Explore State Spaces: Ensuring Safe Exploration in Policy Search with Hierarchical Task Optimization
2018 (English). In: IEEE-RAS Conference on Humanoid Robots / [ed] Asfour, T., IEEE, 2018, p. 132-138. Conference paper, published paper (refereed)
Abstract [en]
Policy search reinforcement learning allows robots to acquire skills by themselves. However, the learning procedure is inherently unsafe, as the robot has no a priori way to predict the consequences of its exploratory actions. Exploration can therefore lead to collisions with the potential to harm the robot and/or the environment. In this work we address the safety aspect by constraining exploration to safe-to-explore state spaces. These are formed by decomposing target skills (e.g., grasping) into higher-ranked sub-tasks (e.g., collision avoidance, joint-limit avoidance) and lower-ranked movement tasks (e.g., reaching). Sub-tasks are defined as concurrent controllers (policies) in different operational spaces, together with associated Jacobians representing their joint-space mapping. Safety is ensured by learning policies for lower-ranked sub-tasks only in the redundant null space of higher-ranked ones. As a side benefit, learning in sub-manifolds of the state space also improves sample efficiency. Reaching skills performed in simulation and grasping skills performed on a real robot validate the usefulness of the proposed approach.
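The core mechanism described in the abstract, restricting exploration to the null space of a higher-ranked task's Jacobian, can be sketched as follows. This is an illustrative reconstruction, not code from the paper: the function names, the use of the Moore-Penrose pseudoinverse, and the 7-DoF example dimensions are all assumptions for the sake of a minimal, runnable demonstration.

```python
import numpy as np

np.random.seed(0)

def null_space_projector(J_high):
    """Projector N = I - J+ J onto the null space of the higher-ranked
    task Jacobian J_high (hypothetical helper, not from the paper)."""
    n = J_high.shape[1]
    return np.eye(n) - np.linalg.pinv(J_high) @ J_high

def safe_joint_velocity(J_high, qdot_high, qdot_explore):
    """Combine the higher-ranked task command with an exploratory command
    projected into the null space, so exploration cannot produce any
    velocity in the higher-ranked task's operational space."""
    N = null_space_projector(J_high)
    return qdot_high + N @ qdot_explore

# Example: a 7-DoF arm where the higher-ranked task (e.g., collision
# avoidance) constrains 3 operational-space dimensions.
J_high = np.random.randn(3, 7)
qdot = safe_joint_velocity(J_high, np.zeros(7), np.random.randn(7))
```

Because the exploratory term is pre-multiplied by the projector, `J_high @ qdot` is (numerically) zero whenever the higher-ranked command is zero: the lower-ranked policy can explore freely without disturbing the higher-ranked sub-task.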
Place, publisher, year, edition, pages
IEEE, 2018. p. 132-138
Series
IEEE-RAS International Conference on Humanoid Robots, ISSN 2164-0572
Keywords [en]
Sensorimotor learning, Grasping and Manipulation, Concept and strategy learning
National Category
Computer Sciences; Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:oru:diva-71311
ISI: 000458689700019
OAI: oai:DiVA.org:oru-71311
DiVA, id: diva2:1277232
Conference
IEEE-RAS 18th Conference on Humanoid Robots (Humanoids 2018), Beijing, China, November 6-9, 2018
Funder
Swedish Foundation for Strategic Research
Note
Funding agency: Academy of Finland, grant 314180
Available from: 2019-01-09. Created: 2019-01-09. Last updated: 2024-01-03. Bibliographically approved.