Örebro University Publications (oru.se)
1–5 of 5 publications
  • 1.
    Hoang, Dinh-Cuong
    Örebro University, School of Science and Technology.
    Vision-based Perception For Autonomous Robotic Manipulation (2021). Doctoral thesis, monograph (Other academic).
    Abstract [en]

    In order to operate safely and effectively in real-world unstructured environments, where a priori knowledge of the surroundings is not available, robots must have adequate perceptual capabilities. This thesis is concerned with several important aspects of vision-based perception for autonomous robotic manipulation. Focusing on scene reconstruction, object pose estimation, and grasp configuration generation, we aim to help robots better understand their surroundings, avoid undesirable contacts with the environment, and accurately grasp selected objects.

    With the wide availability of affordable RGB-D cameras, research on visual SLAM (Simultaneous Localization and Mapping) and scene reconstruction has made giant strides. Registration is a key element of any RGB-D reconstruction system, and a large number of registration algorithms have been proposed in the context of RGB-D Tracking and Mapping (TAM). State-of-the-art methods rely on color and depth information to track camera poses. Besides depth and color images, semantic information is now often available thanks to advances in deep-learning-based image segmentation. We are interested in exploring to what extent semantic cues can increase the robustness of camera pose tracking. This leads to the first contribution of this dissertation: a method for reliable camera tracking using an objective function that combines geometric, appearance, and semantic cues with adaptive weights.

    Beyond the purely geometric model of the environment produced by classical reconstruction systems, including rich semantic information and the 6D poses of object instances within a dense map helps robots operate effectively and interact with objects. Therefore, the second contribution of this thesis is an approach for recognizing objects present in a scene and estimating their full pose by means of an accurate 3D semantic reconstruction. Our framework simultaneously deploys a 3D mapping algorithm to reconstruct a semantic model of the environment and an incremental 6D object pose recovery algorithm that carries out predictions using the reconstructed model. We demonstrate that we can exploit multiple viewpoints of the same object to achieve robust and stable 6D pose estimation in the presence of heavy clutter and occlusion.

    Methods taking RGB-D images as input have achieved state-of-the-art performance on the object pose estimation task. However, in a number of cases color information may not be available, for example when the input is point cloud data from laser range finders or industrial high-resolution 3D sensors. Therefore, besides methods using RGB-D images, methods for recovering the 6D pose of rigid objects from 3D point clouds containing only geometric information are necessary. The third contribution of this dissertation is a novel deep learning architecture for estimating the 6D pose of multiple rigid objects in a cluttered scene, using only a 3D point cloud of the scene as input. The proposed architecture pools geometric features together using a self-attention mechanism and adopts a deep Hough voting scheme for pose proposal generation (the voting idea is sketched after this entry). We show that by exploiting the correlation between the poses of object instances and object parts, we can improve the performance of object pose estimation.

    By applying a 6D object pose estimation algorithm, robots can grasp known objects for which a 3D model is available and a grasp database is pre-defined. What if we want to grasp novel objects? The fourth contribution of this thesis is a method for robust manipulation of novel objects in cluttered environments. We develop an end-to-end deep learning approach for generating grasp configurations for a two-finger parallel jaw gripper, based on 3D point cloud observations of the scene. The proposed model generates candidates by casting votes to accumulate evidence for feasible grasp configurations. We exploit contextual information by encoding the dependency of objects in the scene into features to boost the performance of grasp generation.

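    The deep Hough voting idea underlying the third and fourth contributions can be illustrated with a minimal sketch: every seed point on the point cloud regresses an offset toward a nearby object (or grasp) center, and co-located votes are clustered into proposals. The function and parameter names below are hypothetical, and the greedy clustering is a stand-in for the learned vote aggregation used in the thesis.

```python
import numpy as np

def cluster_center_votes(votes, radius=0.05, min_votes=10):
    """Greedily cluster 3D center votes into object-center proposals.

    votes: (N, 3) array of voted center positions, i.e. seed point
    coordinates plus network-regressed offsets. Hypothetical interface;
    the thesis uses a learned vote-aggregation module instead.
    """
    proposals = []
    remaining = votes
    while len(remaining) >= min_votes:
        # Gather all votes within `radius` of the first remaining vote.
        dists = np.linalg.norm(remaining - remaining[0], axis=1)
        members = remaining[dists < radius]
        if len(members) >= min_votes:
            proposals.append(members.mean(axis=0))  # refined center proposal
        remaining = remaining[dists >= radius]
    return proposals
```

    Each proposal would then be passed to a pose (or grasp) regression head; the vote-then-cluster step is what provides robustness to clutter and occlusion.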
  • 2.
    Hoang, Dinh-Cuong
    Örebro University, School of Science and Technology.
    Lilienthal, Achim
    Örebro University, School of Science and Technology.
    Stoyanov, Todor
    Örebro University, School of Science and Technology.
    Object-RPE: Dense 3D Reconstruction and Pose Estimation with Convolutional Neural Networks (2020). In: Robotics and Autonomous Systems, ISSN 0921-8890, E-ISSN 1872-793X, Vol. 133, article id 103632. Article in journal (Refereed).
    Abstract [en]

    We present an approach for recognizing objects present in a scene and estimating their full pose by means of an accurate 3D instance-aware semantic reconstruction. Our framework couples convolutional neural networks (CNNs) and a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion [1], to achieve both high-quality semantic reconstruction and robust 6D pose estimation for relevant objects. We leverage the pipeline of ElasticFusion as a backbone and propose a joint geometric and photometric error function with per-pixel adaptive weights (sketched schematically after this entry). While the main trend in CNN-based 6D pose estimation has been to infer an object's position and orientation from single views of the scene, our approach explores performing pose estimation from multiple viewpoints, under the conjecture that combining multiple predictions can improve the robustness of an object detection system. The resulting system is capable of producing high-quality instance-aware semantic reconstructions of room-sized environments, as well as accurately detecting objects and their 6D poses. The developed method has been verified through extensive experiments on different datasets. Experimental results confirm that the proposed system achieves improvements over state-of-the-art methods in terms of surface reconstruction and object pose prediction. Our code and video are available at https://sites.google.com/view/object-rpe.

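    As a rough illustration of a joint geometric and photometric error with per-pixel adaptive weights (the names and the exact weighting below are assumptions for illustration; the paper defines the actual objective):

```python
import numpy as np

def joint_rgbd_error(depth_residual, photo_residual, weights):
    """Combine per-pixel geometric and photometric residuals.

    depth_residual: (H, W) point-to-plane depth errors for a pose guess
    photo_residual: (H, W) intensity differences for the same pose guess
    weights:        (H, W) per-pixel weights in [0, 1], e.g. predicted
                    from the input images (assumed form, for illustration)
    The tracker would minimize this scalar over the camera pose.
    """
    return float(np.sum(weights * depth_residual**2
                        + (1.0 - weights) * photo_residual**2))
```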
  • 3.
    Hoang, Dinh-Cuong
    Örebro University, School of Science and Technology.
    Lilienthal, Achim
    Örebro University, School of Science and Technology.
    Stoyanov, Todor
    Örebro University, School of Science and Technology.
    Panoptic 3D Mapping and Object Pose Estimation Using Adaptively Weighted Semantic Information (2020). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 5, no 2, p. 1962-1969. Article in journal (Refereed).
    Abstract [en]

    We present a system capable of reconstructing highly detailed object-level models and estimating the 6D pose of objects by means of an RGB-D camera. In this work, we integrate deep-learning-based semantic segmentation, instance segmentation, and 6D object pose estimation into a state-of-the-art RGB-D mapping system. We leverage the pipeline of ElasticFusion as a backbone and propose modifications of the registration cost function to make full use of the semantic class labels in the process. The proposed objective function features tunable weights for the depth, appearance, and semantic information channels, which are learned from data (sketched schematically after this entry). A fast semantic segmentation and registration weight prediction convolutional neural network (Fast-RGBD-SSWP) suited to efficient computation is introduced. In addition, our approach explores performing 6D object pose estimation from multiple viewpoints, supported by the high-quality reconstruction system. The developed method has been verified through experimental validation on the YCB-Video dataset and a dataset of warehouse objects. Our results confirm that the proposed system performs favorably in terms of surface reconstruction, segmentation quality, and accurate object pose estimation in comparison to other state-of-the-art systems. Our code and video are available at https://sites.google.com/view/panoptic-mope.
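    Schematically, the registration objective described above can be written as a weighted sum over the three information channels, with the weights predicted from data (this paraphrases the abstract; the exact terms are defined in the paper):

```latex
E(\xi) = w_d\, E_{\mathrm{depth}}(\xi) + w_a\, E_{\mathrm{appearance}}(\xi) + w_s\, E_{\mathrm{semantic}}(\xi)
```

    where \xi is the camera pose and (w_d, w_a, w_s) are the tunable channel weights produced by the weight-prediction network.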

  • 4.
    Hoang, Dinh-Cuong
    Örebro University, School of Science and Technology. ICT Department, FPT University, Hanoi, Vietnam.
    Stork, Johannes Andreas
    Örebro University, School of Science and Technology.
    Stoyanov, Todor
    Örebro University, School of Science and Technology.
    Context-Aware Grasp Generation in Cluttered Scenes (2022). In: 2022 International Conference on Robotics and Automation (ICRA), IEEE, 2022, p. 1492-1498. Conference paper (Refereed).
    Abstract [en]

    Conventional approaches to autonomous grasping rely on a pre-computed database of known objects to synthesize grasps, which is not possible for novel objects. On the other hand, recently proposed deep-learning-based approaches have demonstrated the ability to generalize grasping to unknown objects. However, grasp generation remains a challenging problem, especially in cluttered environments under partial occlusion. In this work, we propose an end-to-end deep learning approach for generating 6-DOF collision-free grasps given a 3D scene point cloud. To build robustness to occlusion, the proposed model generates candidates by casting votes and accumulating evidence for feasible grasp configurations (the vote-accumulation step is sketched after this entry). We exploit contextual information by encoding the dependency of objects in the scene into features to boost the performance of grasp generation. The contextual information enables our model to increase the likelihood that the generated grasps are collision-free. Our experimental results confirm that the proposed system performs favorably in predicting object grasps in cluttered environments in comparison to current state-of-the-art methods.

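    A minimal sketch of the vote-accumulation step, under assumed tensor shapes (the network architecture and the context-encoding features are described in the paper; everything named here is illustrative):

```python
import numpy as np

def accumulate_grasp_votes(seeds, offsets, params, scores, radius=0.03):
    """Cluster grasp-center votes and average their grasp parameters.

    seeds:   (N, 3) seed points sampled from the scene point cloud
    offsets: (N, 3) regressed offsets toward nearby grasp centers
    params:  (N, D) per-vote grasp parameters (e.g. approach axis, width)
    scores:  (N,)   per-vote confidences
    """
    centers = seeds + offsets
    used = np.zeros(len(centers), dtype=bool)
    grasps = []
    for i in np.argsort(-scores):             # high-confidence votes first
        if used[i]:
            continue
        near = (np.linalg.norm(centers - centers[i], axis=1) < radius) & ~used
        w = scores[near] / scores[near].sum()  # confidence-weighted average
        grasps.append((w @ centers[near], w @ params[near], scores[near].max()))
        used |= near
    return grasps  # list of (center, averaged parameters, confidence)
```

    In the paper, context features encoding object-object dependencies additionally inform the prediction, which raises the likelihood that retained grasps are collision-free.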
  • 5.
    Hoang, Dinh-Cuong
    Örebro University, School of Science and Technology.
    Stoyanov, Todor
    Örebro University, School of Science and Technology.
    Lilienthal, Achim J.
    Örebro University, School of Science and Technology.
    Object-RPE: Dense 3D Reconstruction and Pose Estimation with Convolutional Neural Networks for Warehouse Robots (2019). In: 2019 European Conference on Mobile Robots, ECMR 2019: Proceedings, IEEE, 2019, article id 152970. Conference paper (Refereed).
    Abstract [en]

    We present a system for accurate 3D instance-aware semantic reconstruction and 6D pose estimation using an RGB-D camera. Our framework couples convolutional neural networks (CNNs) and a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion, to achieve both high-quality semantic reconstruction and robust 6D pose estimation for relevant objects. The method presented in this paper extends a high-quality instance-aware semantic 3D mapping system from previous work [1] by adding a 6D object pose estimator. While the main trend in CNN-based 6D pose estimation has been to infer an object's position and orientation from single views of the scene, our approach explores performing pose estimation from multiple viewpoints, under the conjecture that combining multiple predictions can improve the robustness of an object detection system (a possible fusion rule is sketched after this entry). The resulting system is capable of producing high-quality object-aware semantic reconstructions of room-sized environments, as well as accurately detecting objects and their 6D poses. The developed method has been verified through experimental validation on the YCB-Video dataset and a newly collected warehouse object dataset. Experimental results confirm that the proposed system achieves improvements over state-of-the-art methods in terms of surface reconstruction and object pose prediction. Our code and video are available at https://sites.google.com/view/object-rpe.

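    One plausible way to combine per-view pose predictions, once they are expressed in a common map frame, is confidence-weighted quaternion averaging (Markley-style eigen-averaging). This is an assumed fusion rule for illustration, not necessarily the one used in the paper:

```python
import numpy as np

def fuse_pose_hypotheses(quats, trans, confs):
    """Fuse per-view 6D pose predictions given in a common map frame.

    quats: (V, 4) unit quaternions (scalar part first),
    trans: (V, 3) translations, confs: (V,) per-view confidences.
    Rotation averaging uses the top eigenvector of the weighted
    quaternion outer-product matrix.
    """
    # Heuristic sign fix for the q / -q ambiguity via the scalar part.
    q = quats * np.where(quats[:, :1] >= 0.0, 1.0, -1.0)
    M = (confs[:, None, None] * q[:, :, None] * q[:, None, :]).sum(axis=0)
    q_avg = np.linalg.eigh(M)[1][:, -1]       # top eigenvector
    t_avg = (confs[:, None] * trans).sum(axis=0) / confs.sum()
    return q_avg / np.linalg.norm(q_avg), t_avg
```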