Spatio-semantic Task Recognition: Unsupervised Learning of Task-discriminative Features for Segmentation and Imitation
Abstract: Discovering task subsequences from a continuous video stream facilitates robot imitation of sequential tasks. In this research, we develop an unsupervised learning method for task subsequences that does not require a human teacher to provide subsequence labels. We propose a task-discriminative feature, in the form of sparsely activated cells called task capsules, that is self-trained to preserve the spatio-semantic information of a visual input. The task capsules are sparsely and exclusively activated according to the spatio-semantic context of the task subsequence: the type and location of the object. As a result, the purpose shared across multiple videos is discovered without supervision from this spatio-semantic context, and each demonstration is segmented into task subsequences in an object-centric way. Compared with existing studies on unsupervised task segmentation, our work makes the following distinct contributions: 1) a task provided as a video stream can be segmented without any pre-defined knowledge, and 2) the trained features preserve spatio-semantic information, so the segmentation is object-centric. Our experiments show that the recognized task subsequences can be applied to robot imitation of a sequential pick-and-place task by providing the semantic and location information of the object to be manipulated.
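To make the idea of sparse, exclusive capsule activation concrete, below is a minimal PyTorch-style sketch. It is an illustration under stated assumptions, not the paper's architecture: the module name TaskCapsuleLayer, the per-capsule linear encoders, the gating network, and the low-temperature softmax used to approximate exclusive activation are all hypothetical choices made for clarity.

# Minimal sketch of a sparsely activated "task capsule" layer (assumed design,
# not the paper's exact model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskCapsuleLayer(nn.Module):
    def __init__(self, in_dim: int, num_capsules: int, capsule_dim: int,
                 temperature: float = 0.1):
        super().__init__()
        # One small encoder per capsule; each capsule is meant to specialize
        # to one spatio-semantic context (object type and location).
        self.encoders = nn.ModuleList(
            [nn.Linear(in_dim, capsule_dim) for _ in range(num_capsules)]
        )
        # Gating network that scores which capsule should respond to the input.
        self.gate = nn.Linear(in_dim, num_capsules)
        self.temperature = temperature

    def forward(self, x: torch.Tensor):
        # x: (batch, in_dim) visual feature of a frame.
        logits = self.gate(x) / self.temperature
        # A low-temperature softmax yields a nearly one-hot weighting over
        # capsules, approximating the sparse, exclusive activation described
        # in the abstract.
        weights = F.softmax(logits, dim=-1)                       # (batch, K)
        caps = torch.stack([enc(x) for enc in self.encoders], 1)  # (batch, K, D)
        gated = weights.unsqueeze(-1) * caps                      # sparse capsule outputs
        return gated, weights

In such a sketch, the index of the dominant capsule (argmax over weights) per frame would identify the active task subsequence, and frames where that index changes would mark segmentation boundaries; the actual training objective and segmentation procedure are described in the paper.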
BibTeX
@article{park2021spatio,
  title={Spatio-semantic Task Recognition: Unsupervised Learning of Task-discriminative Features for Segmentation and Imitation},
  author={Park, J Hyeon and Kim, Jigang and Kim, H Jin},
  journal={International Journal of Control, Automation and Systems},
  volume={19},
  number={10},
  pages={3409--3418},
  year={2021},
  publisher={Springer}
}