Visual-Tactile Multimodality for Following Deformable Linear Objects Using Reinforcement Learning

Leszek Pecyna, Siyuan Dong, Shan Luo*

*Corresponding author for this work

Engineering

University of Washington

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Manipulation of deformable objects is a challenging task for a robot. Relying on a single sensory input to track the behaviour of such objects is problematic: vision can be subject to occlusions, whereas tactile inputs cannot capture the global information that is useful for the task. In this paper, we study, for the first time, the problem of using vision and tactile inputs together to complete the task of following deformable linear objects. We create a Reinforcement Learning agent using different sensing modalities and investigate how its behaviour can be boosted using visual-tactile fusion, compared to using a single sensing modality. To this end, we developed a benchmark in simulation for manipulating deformable linear objects using multimodal sensing inputs. The policy of the agent uses distilled information, e.g., the pose of the object in both visual and tactile perspectives, instead of the raw sensing signals, so that it can be directly transferred to real environments. In this way, we disentangle the perception system and the learned control policy. Our extensive experiments show that the use of both vision and tactile inputs, together with proprioception, allows the agent to complete the task in up to 92% of cases, compared to 77% when only one of the signals is given. Our results can provide valuable insights for the future design of tactile sensors and for deformable object manipulation. Code and videos can be found at: https://github.com/lpecyna/SoftSlidingGym

Original language	English
Title of host publication	The 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)
Publication status	Published - 2022

Cite this

@inproceedings{6971c605035f46fe8ea11351c7bf5c5b,

title = "Visual-Tactile Multimodality for Following Deformable Linear Objects Using Reinforcement Learning",

author = "Leszek Pecyna and Siyuan Dong and Shan Luo",

year = "2022",

language = "English",

booktitle = "The 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)",

}

TY - CHAP

T1 - Visual-Tactile Multimodality for Following Deformable Linear Objects Using Reinforcement Learning

AU - Pecyna, Leszek

AU - Dong, Siyuan

AU - Luo, Shan

PY - 2022

Y1 - 2022

M3 - Conference paper

BT - The 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

ER -
