Multisensory Learning Framework for Robot Drumming
Interest in sensorimotor learning is currently surging, driven by recent advances in deep learning. In this paper, we present an open-source framework for collecting large-scale, time-synchronised synthetic data from highly disparate sensory modalities, such as audio, video, and proprioception, for learning robot manipulation tasks. We demonstrate the learning of non-linear sensorimotor mappings for a humanoid drumming robot that generates novel motion sequences from desired audio data using cross-modal correspondences. We evaluate our system by the quality of its cross-modal retrieval, i.e. its ability to generate suitable motion sequences that match desired, unseen audio or video sequences.
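To illustrate the idea of cross-modal retrieval described above, here is a minimal, hypothetical sketch: it assumes learned encoders that map audio features and joint-trajectory (motion) features into a shared embedding space, and retrieves the motion whose embedding is closest to that of a query audio segment. The encoders, dimensions, and dataset below are placeholders, not the paper's actual model.

```python
import numpy as np

# Hypothetical encoders mapping each modality into a shared embedding space.
# In practice these would be learned networks; here they are random linear
# projections purely for illustration.
rng = np.random.default_rng(0)
AUDIO_DIM, MOTION_DIM, EMBED_DIM = 128, 42, 32
W_audio = rng.standard_normal((AUDIO_DIM, EMBED_DIM))
W_motion = rng.standard_normal((MOTION_DIM, EMBED_DIM))

def embed_audio(a):
    """Project an audio feature vector into the shared embedding space."""
    return a @ W_audio

def embed_motion(m):
    """Project a joint-trajectory feature vector into the same space."""
    return m @ W_motion

# A small synthetic, time-synchronised dataset: each entry pairs an audio
# segment with the proprioceptive (joint) trajectory recorded alongside it.
dataset = [
    {"audio": rng.standard_normal(AUDIO_DIM),
     "motion": rng.standard_normal(MOTION_DIM)}
    for _ in range(100)
]
motion_bank = np.stack([embed_motion(d["motion"]) for d in dataset])

def retrieve_motion(query_audio, k=1):
    """Return the k motion sequences whose embeddings are closest
    (by cosine similarity) to the embedding of the query audio."""
    q = embed_audio(query_audio)
    sims = motion_bank @ q / (
        np.linalg.norm(motion_bank, axis=1) * np.linalg.norm(q) + 1e-9)
    top = np.argsort(-sims)[:k]
    return [dataset[i]["motion"] for i in top]

# Usage: retrieve candidate drumming motions for an unseen audio segment.
unseen_audio = rng.standard_normal(AUDIO_DIM)
candidate_motions = retrieve_motion(unseen_audio, k=3)
```

The design choice here is nearest-neighbour retrieval in a shared embedding space, which is one common way to exploit cross-modal correspondences; the actual system may generate motions rather than simply retrieve them.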
Submitted to the Workshop on Crossmodal Learning for Intelligent Robotics (2nd Edition) at IROS 2018.
Authors: A. Barsky, C. Zito, H. Mori, T. Ogata and J.L. Wyatt.
The project is part of a collaboration between the IRLab at the University of Birmingham (UK) and the Department of Art and Media at Waseda University (JP).
Link to the workshop website:
https://www2.informatik.uni-hamburg.de/wtm/WorkshopCLIR18/index.php