Örebro University Publications (oru.se)
Stoyanov, Todor, Associate Prof.
ORCID iD: orcid.org/0000-0002-6013-4874
Publications (10 of 78)
Rietz, F., Magg, S., Heintz, F., Stoyanov, T., Wermter, S. & Stork, J. A. (2023). Hierarchical goals contextualize local reward decomposition explanations. Neural Computing & Applications, 35(23), 16693-16704
2023 (English) In: Neural Computing & Applications, ISSN 0941-0643, E-ISSN 1433-3058, Vol. 35, no 23, p. 16693-16704. Article in journal (Refereed), Published
Abstract [en]

One-step reinforcement learning explanation methods account for individual actions but fail to consider the agent's future behavior, which can make their interpretation ambiguous. We propose to address this limitation by providing hierarchical goals as context for one-step explanations. By considering the current hierarchical goal as a context, one-step explanations can be interpreted with higher certainty, as the agent's future behavior is more predictable. We combine reward decomposition with hierarchical reinforcement learning into a novel explainable reinforcement learning framework, which yields more interpretable, goal-contextualized one-step explanations. With a qualitative analysis of one-step reward decomposition explanations, we first show that their interpretability is indeed limited in scenarios with multiple, different optimal policies, a characteristic shared by other one-step explanation methods. Then, we show that our framework retains high interpretability in such cases, as the hierarchical goal can be considered as context for the explanation. To the best of our knowledge, our work is the first to investigate hierarchical goals not as an explanation directly but as additional context for one-step reinforcement learning explanations.

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Reinforcement learning, Explainable AI, Reward decomposition, Hierarchical goals, Local explanations
National Category
Computer Sciences
Identifiers
urn:nbn:se:oru:diva-99115 (URN)
10.1007/s00521-022-07280-8 (DOI)
000794083400001 ()
2-s2.0-85129803505 (Scopus ID)
Note

Funding agencies:

Örebro University

Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Federal Ministry for Economic Affairs and Climate FKZ 20X1905A-D

Available from: 2022-05-23 Created: 2022-05-23 Last updated: 2023-11-28 Bibliographically approved
Yang, Q., Stork, J. A. & Stoyanov, T. (2023). Learn from Robot: Transferring Skills for Diverse Manipulation via Cycle Generative Networks. In: 2023 IEEE 19th International Conference on Automation Science and Engineering (CASE). Paper presented at 19th International Conference on Automation Science and Engineering (IEEE CASE 2023), Cordis, Auckland, New Zealand, August 26-30, 2023. IEEE conference proceedings
2023 (English) In: 2023 IEEE 19th International Conference on Automation Science and Engineering (CASE), IEEE conference proceedings, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Reinforcement learning (RL) has shown impressive results on a variety of robot tasks, but it requires a large amount of data for learning a single RL policy. However, in manufacturing there is a wide demand for reusing skills from different robots and it is hard to transfer the learned policy to different hardware due to diverse robot body morphology, kinematics, and dynamics. In this paper, we address the problem of transferring policies between different robot platforms. We learn a set of skills on each specific robot and represent them in a latent space. We propose to transfer the skills between different robots by mapping latent action spaces through a cycle generative network in a supervised learning manner. We extend the policy model learned on one robot with a pre-trained generative network to enable the robot to learn from the skill of another robot. We evaluate our method on several simulated experiments and demonstrate that our Learn from Robot (LfR) method accelerates new skill learning.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2023
Series
IEEE International Conference on Automation Science and Engineering, ISSN 2161-8070, E-ISSN 2161-8089
Keywords
Reinforcement Learning, Transfer Learning, Generative Models
National Category
Robotics
Identifiers
urn:nbn:se:oru:diva-108719 (URN)
10.1109/CASE56687.2023.10260484 (DOI)
9798350320701 (ISBN)
9798350320695 (ISBN)
Conference
19th International Conference on Automation Science and Engineering (IEEE CASE 2023), Cordis, Auckland, New Zealand, August 26-30, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-10-03 Created: 2023-10-03 Last updated: 2024-03-07 Bibliographically approved
Dominguez, D. C., Iannotta, M., Stork, J. A., Schaffernicht, E. & Stoyanov, T. (2022). A Stack-of-Tasks Approach Combined With Behavior Trees: A New Framework for Robot Control. IEEE Robotics and Automation Letters, 7(4), 12110-12117
2022 (English) In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 7, no 4, p. 12110-12117. Article in journal (Refereed), Published
Abstract [en]

Stack-of-Tasks (SoT) control allows a robot to simultaneously fulfill a number of prioritized goals formulated in terms of (in)equality constraints in error space. Since this approach solves a sequence of Quadratic Programs (QP) at each time-step, without taking into account any temporal state evolution, it is suitable for dealing with local disturbances. However, its limitation lies in the handling of situations that require non-quadratic objectives to achieve a specific goal, as well as situations where countering the control disturbance would require a locally suboptimal action. Recent works address this shortcoming by exploiting Finite State Machines (FSMs) to compose the tasks in such a way that the robot does not get stuck in local minima. Nevertheless, the intrinsic trade-off between reactivity and modularity that characterizes FSMs makes them impractical for defining reactive behaviors in dynamic environments. In this letter, we combine the SoT control strategy with Behavior Trees (BTs), a task switching structure that addresses some of the limitations of the FSMs in terms of reactivity, modularity and re-usability. Experimental results on a Franka Emika Panda 7-DOF manipulator show the robustness of our framework, which allows the robot to benefit from the reactivity of both SoT and BTs.

Place, publisher, year, edition, pages
IEEE Press, 2022
Keywords
Behavior-based systems, control architectures and programming
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:oru:diva-101946 (URN)
10.1109/LRA.2022.3211481 (DOI)
000868319800006 ()
Funder
Knut and Alice Wallenberg Foundation
Note

Funding agencies:

Industrial Graduate School Collaborative AI & Robotics (CoAIRob)

General Electric Dnr:20190128

Available from: 2022-10-27 Created: 2022-10-27 Last updated: 2024-01-17 Bibliographically approved
Hoang, D.-C., Stork, J. A. & Stoyanov, T. (2022). Context-Aware Grasp Generation in Cluttered Scenes. In: 2022 International Conference on Robotics and Automation (ICRA). Paper presented at IEEE International Conference on Robotics and Automation (ICRA 2022), Philadelphia, USA, May 23-27, 2022 (pp. 1492-1498). IEEE
2022 (English) In: 2022 International Conference on Robotics and Automation (ICRA), IEEE, 2022, p. 1492-1498. Conference paper, Published paper (Refereed)
Abstract [en]

Conventional methods for autonomous grasping rely on a pre-computed database with known objects to synthesize grasps, which is not possible for novel objects. On the other hand, recently proposed deep learning-based approaches have demonstrated the ability to generalize grasps to unknown objects. However, grasp generation still remains a challenging problem, especially in cluttered environments under partial occlusion. In this work, we propose an end-to-end deep learning approach for generating 6-DOF collision-free grasps given a 3D scene point cloud. To build robustness to occlusion, the proposed model generates candidates by casting votes and accumulating evidence for feasible grasp configurations. We exploit contextual information by encoding the dependency of objects in the scene into features to boost the performance of grasp generation. The contextual information enables our model to increase the likelihood that the generated grasps are collision-free. Our experimental results confirm that the proposed system performs favorably in terms of predicting object grasps in cluttered environments in comparison to the current state-of-the-art methods.

Place, publisher, year, edition, pages
IEEE, 2022
National Category
Computer Sciences
Identifiers
urn:nbn:se:oru:diva-98437 (URN)
10.1109/ICRA46639.2022.9811371 (DOI)
000941265701005 ()
2-s2.0-85136323876 (Scopus ID)
9781728196824 (ISBN)
9781728196817 (ISBN)
Conference
IEEE International Conference on Robotics and Automation (ICRA 2022), Philadelphia, USA, May 23-27, 2022
Funder
EU, Horizon 2020, 101017274 (DARKO)
Available from: 2022-04-01 Created: 2022-04-01 Last updated: 2023-05-03 Bibliographically approved
Iannotta, M., Dominguez, D. C., Stork, J. A., Schaffernicht, E. & Stoyanov, T. (2022). Heterogeneous Full-body Control of a Mobile Manipulator with Behavior Trees. In: IROS 2022 Workshop on Mobile Manipulation and Embodied Intelligence (MOMA): Challenges and Opportunities. Paper presented at International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 23-27, 2022.
2022 (English) In: IROS 2022 Workshop on Mobile Manipulation and Embodied Intelligence (MOMA): Challenges and Opportunities, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Integrating the heterogeneous controllers of a complex mechanical system, such as a mobile manipulator, within the same structure and in a modular way is still challenging. In this work we extend our framework based on Behavior Trees for the control of a redundant mechanical system to the problem of commanding more complex systems that involve multiple low-level controllers. This allows the integrated systems to achieve non-trivial goals that require coordination among the sub-systems.

National Category
Robotics
Research subject
Computer Science
Identifiers
urn:nbn:se:oru:diva-102984 (URN)
10.48550/arXiv.2210.08600 (DOI)
Conference
International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 23-27, 2022
Funder
Knowledge Foundation
Available from: 2023-01-09 Created: 2023-01-09 Last updated: 2024-01-03 Bibliographically approved
Yang, Y., Stork, J. A. & Stoyanov, T. (2022). Learn to Predict Posterior Probability in Particle Filtering for Tracking Deformable Linear Objects. In: 3rd Workshop on Robotic Manipulation of Deformable Objects: Challenges in Perception, Planning and Control for Soft Interaction (ROMADO-SI), IROS 2022, Kyoto, Japan. Paper presented at 35th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 24-26, 2022.
2022 (English) In: 3rd Workshop on Robotic Manipulation of Deformable Objects: Challenges in Perception, Planning and Control for Soft Interaction (ROMADO-SI), IROS 2022, Kyoto, Japan, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Tracking deformable linear objects (DLOs) is a key element for applications where robots manipulate DLOs. However, the lack of distinctive features or appearance on the DLO and the object’s high-dimensional state space make tracking challenging and still an open question in robotics. In this paper, we propose a method for tracking the state of a DLO by applying a particle filter approach, where the posterior probability of each sample is estimated by a learned predictor. Our method can achieve accurate tracking even with no prerequisite segmentation which many related works require. Due to the differentiability of the posterior probability predictor, our method can leverage the gradients of posterior probabilities with respect to the latent states to improve the motion model in the particle filter. The preliminary experiments suggest that the proposed method can provide robust tracking results and the estimated DLO state converges quickly to the true state if the initial state is unknown.

National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:oru:diva-102743 (URN)
Conference
35th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 24-26, 2022
Funder
Vinnova, 2019-05175
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-01-27 Created: 2023-01-27 Last updated: 2023-09-18 Bibliographically approved
Yang, Y., Stork, J. A. & Stoyanov, T. (2022). Learning differentiable dynamics models for shape control of deformable linear objects. Robotics and Autonomous Systems, 158, Article ID 104258.
2022 (English) In: Robotics and Autonomous Systems, ISSN 0921-8890, E-ISSN 1872-793X, Vol. 158, article id 104258. Article in journal (Refereed), Published
Abstract [en]

Robots manipulating deformable linear objects (DLOs) – such as surgical sutures in medical robotics, or cables and hoses in industrial assembly – can benefit substantially from accurate and fast differentiable predictive models. However, the off-the-shelf analytic physics models fall short of differentiability. Recently, neural-network-based data-driven models have shown promising results in learning DLO dynamics. These models have additional advantages compared to analytic physics models, as they are differentiable and can be used in gradient-based trajectory planning. Still, the data-driven approaches demand a large amount of training data, which can be challenging for real-world applications. In this paper, we propose a framework for learning a differentiable data-driven model for DLO dynamics with a minimal set of real-world data. To learn DLO twisting and bending dynamics in a 3D environment, we first introduce a new suitable DLO representation. Next, we use a recurrent network module to propagate effects between different segments along a DLO, thereby addressing a critical limitation of current state-of-the-art methods. Then, we train a data-driven model on synthetic data generated in simulation, instead of foregoing the time-consuming and laborious data collection process for real-world applications. To achieve a good correspondence between real and simulated models, we choose a set of simulation model parameters through parameter identification with only a few trajectories of a real DLO required. We evaluate several optimization methods for parameter identification and demonstrate that the differential evolution algorithm is efficient and effective for parameter identification. In DLO shape control tasks with a model-based controller, the data-driven model trained on synthetic data generated by the resulting models performs on par with the ones trained with a comparable amount of real-world data which, however, would be intractable to collect.

Place, publisher, year, edition, pages
Elsevier, 2022
Keywords
Deformable linear object, Model learning, Parameter identification, Model predictive control
National Category
Computer Sciences
Identifiers
urn:nbn:se:oru:diva-101292 (URN)
10.1016/j.robot.2022.104258 (DOI)
000869528600006 ()
2-s2.0-85138188346 (Scopus ID)
Funder
Vinnova, 2019-05175
Knut and Alice Wallenberg Foundation
Available from: 2022-09-19 Created: 2022-09-19 Last updated: 2023-09-18 Bibliographically approved
Yang, Q., Stork, J. A. & Stoyanov, T. (2022). MPR-RL: Multi-Prior Regularized Reinforcement Learning for Knowledge Transfer. IEEE Robotics and Automation Letters, 7(3), 7652-7659
2022 (English) In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 7, no 3, p. 7652-7659. Article in journal (Refereed), Published
Abstract [en]

In manufacturing, assembly tasks have been a challenge for learning algorithms due to variant dynamics of different environments. Reinforcement learning (RL) is a promising framework to automatically learn these tasks, yet it is still not easy to apply a learned policy or skill, that is the ability of solving a task, to a similar environment even if the deployment conditions are only slightly different. In this letter, we address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors. We propose to learn prior distribution over the specific skill required to accomplish each task and compose the family of skill priors to guide learning the policy for a new task by comparing the similarity between the target task and the prior ones. Our method learns a latent action space representing the skill embedding from demonstrated trajectories for each prior task. We have evaluated our method on a task in simulation and a set of peg-in-hole insertion tasks and demonstrate better generalization to new tasks that have never been encountered during training. Our Multi-Prior Regularized RL (MPR-RL) method is deployed directly on a real world Franka Panda arm, requiring only a set of demonstrated trajectories from similar, but crucially not identical, problem instances.

Place, publisher, year, edition, pages
IEEE Press, 2022
Keywords
Machine Learning for Robot Control, Reinforcement Learning, Transfer Learning
National Category
Robotics
Identifiers
urn:nbn:se:oru:diva-99762 (URN)
10.1109/LRA.2022.3184805 (DOI)
000818872000024 ()
2-s2.0-85133574877 (Scopus ID)
Note

Funding agency:

Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2022-06-28 Created: 2022-06-28 Last updated: 2024-01-17 Bibliographically approved
Ivan, J.-P. A., Stoyanov, T. & Stork, J. A. (2022). Online Distance Field Priors for Gaussian Process Implicit Surfaces. IEEE Robotics and Automation Letters, 7(4), 8996-9003
2022 (English) In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 7, no 4, p. 8996-9003. Article in journal (Refereed), Published
Abstract [en]

Gaussian process (GP) implicit surface models provide environment and object representations which elegantly address noise and uncertainty while remaining sufficiently flexible to capture complex geometry. However, GP models quickly become intractable as the size of the observation set grows, a trait which is difficult to reconcile with the rate at which modern range sensors produce data. Furthermore, naive applications of GPs to implicit surface models allocate model resources uniformly, thus using precious resources to capture simple geometry. In contrast to prior work addressing these challenges through model sparsification, spatial partitioning, or ad-hoc filtering, we propose introducing model bias online through the GP's mean function. We achieve more accurate distance fields using smaller models by creating a distance field prior from features which are easy to extract and have analytic distance fields. In particular, we demonstrate this approach using linear features. We show the proposed distance field halves model size in a 2D mapping task using data from a SICK S300 sensor. When applied to a single 3D scene from the TUM RGB-D SLAM dataset, we achieve a fivefold reduction in model size. Our proposed prior results in more accurate GP implicit surfaces, while allowing existing models to function in larger environments or with larger spatial partitions due to reduced model size.

Place, publisher, year, edition, pages
IEEE, 2022
Keywords
Gaussian processes, machine learning, robot sensing systems, supervised learning
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:oru:diva-100884 (URN)
10.1109/LRA.2022.3189434 (DOI)
000838567100055 ()
2-s2.0-85134253745 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation
Available from: 2022-08-31 Created: 2022-08-31 Last updated: 2024-01-17 Bibliographically approved
Yang, Y., Stork, J. A. & Stoyanov, T. (2022). Online Model Learning for Shape Control of Deformable Linear Objects. In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Paper presented at 35th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 23-27, 2022 (pp. 4056-4062). IEEE
2022 (English) In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2022, p. 4056-4062. Conference paper, Published paper (Refereed)
Abstract [en]

Traditional approaches to manipulating the state of deformable linear objects (DLOs) - i.e., cables, ropes - rely on model-based planning. However, constructing an accurate dynamic model of a DLO is challenging due to the complexity of interactions and a high number of degrees of freedom. This renders the task of achieving a desired DLO shape particularly difficult and motivates the use of model-free alternatives, which while maintaining generality suffer from a high sample complexity. In this paper, we bridge the gap between these fundamentally different approaches and propose a framework that learns dynamic models of DLOs through trial-and-error interaction. Akin to model-based reinforcement learning (RL), we interleave learning and exploration to solve a 3D shape control task for a DLO. Our approach requires only a fraction of the interaction samples of the current state-of-the-art model-free RL alternatives to achieve superior shape control performance. Unlike offline model learning, our approach does not require expert knowledge for data collection, retains the ability to explore, and automatically selects relevant experience.

Place, publisher, year, edition, pages
IEEE, 2022
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:oru:diva-103194 (URN)
10.1109/IROS47612.2022.9981080 (DOI)
000908368203013 ()
9781665479271 (ISBN)
9781665479288 (ISBN)
Conference
35th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 23-27, 2022
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Vinnova, SIP-STRIM projects 2019-05175
Available from: 2023-01-16 Created: 2023-01-16 Last updated: 2023-09-18