We propose to make use of audio vibration sensing with a deep neural network named PouringNet to predict the liquid height from the audio fragment during the robotic pouring task. PouringNet is trained on our collected real-world pouring dataset with multimodal sensing data, which contains more than 3000 recordings of audio, force feedback, video and trajectory data of the human hand that performs the pouring task. Each record represents a complete pouring procedure. We conduct several evaluations on PouringNet with our dataset and robotic hardware. The results demonstrate that our PouringNet generalizes well across different liquid containers, positions of the audio receiver, initial liquid heights and types of liquid, and facilitates a more robust and accurate audio-based perception for robotic pouring.
- Project website: https://lianghongzhuo.github.io/AudioPouring/
- Preprint: https://arxiv.org/abs/1903.00650
- Video: https://www.youtube.com/watch?v=Za8dDjGFE1k
- Contact: [email protected], [email protected]
- Pouring with different cups. The cups are numbered #1 to #6 from left to right. Only cups #1, #2 and #3 are present in the dataset; the others are not included.
- Pouring with different initial heights on cup #3.
- Pouring with different microphone positions.
- Create the conda environment and install the Python dependencies:

      conda upgrade --all
      conda create -n pouring python=2.7 numpy ipython matplotlib mayavi yaml lxml seaborn
      conda activate pouring
      conda install -c conda-forge librosa trimesh pyglet
      pip install rospkg tensorboardx pyassimp==4.1.3
      # cpu:
      conda install tensorflow
      conda install pytorch-cpu torchvision-cpu -c pytorch
      # gpu:
      conda install tensorflow-gpu
      conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
- Clone and install this repository:

      git clone https://github.com/lianghongzhuo/AudioPouring.git
      cd AudioPouring
      AUDIO_POURING_DIR=${PWD}
      cd audio_pouring
      python setup.py develop
- Install dependencies:
  - Install the portaudio dependencies. Installing the packages in this order will not remove any other packages:

        sudo apt install libsndfile1-dev
        sudo apt install libjack-jackd2-dev
        sudo apt install portaudio19-dev

  - Make sure your current user name is in the `audio` group.
  - Other dependencies (only for the robot experiment):

        cd ${AUDIO_POURING_DIR}
        sh audio_pouring_install.sh
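The `audio` group check mentioned above can be scripted. The following is a minimal sketch using only the Python standard library on Linux; the helper name `user_in_group` is hypothetical and is not part of this repository.

```python
import grp
import pwd


def user_in_group(user, group):
    """Return True if `user` belongs to `group` (hypothetical helper, Unix only)."""
    try:
        group_entry = grp.getgrnam(group)
    except KeyError:
        return False  # the group does not exist on this machine
    # Supplementary members are listed explicitly in the group entry.
    if user in group_entry.gr_mem:
        return True
    # The group may also be the user's primary group.
    try:
        return pwd.getpwnam(user).pw_gid == group_entry.gr_gid
    except KeyError:
        return False  # the user does not exist on this machine
```

Calling `user_in_group(getpass.getuser(), "audio")` reports whether the current user is a member. If it returns `False`, on most Linux distributions `sudo usermod -a -G audio <user>` followed by logging out and back in adds the user to the group.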
- Install the following required ROS packages:
- Bring up the audio publishing node:

      roslaunch portaudio_transport publish.launch
- Bring up a scale to get the ground-truth height. If you do not have a ROS-based scale, go directly to step 4 (the normal-scale check further down this list).
- Run the demo code:

      cd ${AUDIO_POURING_DIR}/audio_pouring
      python demo.py --cuda --bottle=1 --cavity-height=50
- If a ROS-based scale is not available, you can use a normal scale instead and check the pouring result with the code below:

      from audio_pouring.utils.utils import weight2height
      print(weight2height(cup_id="1", cur_weight=0.02))
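To illustrate the idea behind converting a scale reading into a liquid height, here is a minimal sketch for an idealized cylindrical cup. It is not the repository's `weight2height`, which takes a `cup_id` and so presumably accounts for each cup's geometry. The function name, the kilogram unit, the 50 mm inner diameter and the water density are all assumptions made for illustration.

```python
import math


def cylinder_weight_to_height(weight_kg, inner_diameter_mm, density_kg_per_m3=1000.0):
    """Liquid height in mm inside an ideal cylinder (hypothetical helper)."""
    if inner_diameter_mm <= 0 or density_kg_per_m3 <= 0:
        raise ValueError("diameter and density must be positive")
    radius_m = inner_diameter_mm / 2000.0        # mm diameter -> m radius
    area_m2 = math.pi * radius_m ** 2            # cross-sectional area
    volume_m3 = weight_kg / density_kg_per_m3    # volume = mass / density
    return 1000.0 * volume_m3 / area_m2          # height in m -> mm


# 20 g of water in a cup with an assumed 50 mm inner diameter.
height_mm = cylinder_weight_to_height(weight_kg=0.02, inner_diameter_mm=50.0)
print(round(height_mm, 2))  # 10.19
```

Real cups are usually tapered, so a per-cup lookup or calibration would be more accurate than this closed-form estimate.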
- Data preparation: generate ~4 s segments from a whole pouring sequence (pickle files):

      cd ${AUDIO_POURING_DIR}/audio_pouring/model
      python long_preprocess.py train mt
      python long_preprocess.py test mt
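The segmentation step above can be pictured as sliding a fixed-length window over the full recording. The sketch below shows one way to do that; it is not the logic of `long_preprocess.py`. The function name, the 1 s hop and the toy sample rate are assumptions, and only the roughly 4 s segment length comes from the text above.

```python
def split_into_segments(samples, sample_rate, segment_seconds=4.0, hop_seconds=1.0):
    """Cut a long recording into fixed-length, possibly overlapping segments."""
    segment_len = int(round(segment_seconds * sample_rate))
    hop_len = int(round(hop_seconds * sample_rate))
    if segment_len <= 0 or hop_len <= 0:
        raise ValueError("segment and hop lengths must be positive")
    segments = []
    start = 0
    # Keep only full-length windows; a trailing partial window is dropped.
    while start + segment_len <= len(samples):
        segments.append(samples[start:start + segment_len])
        start += hop_len
    return segments


# Stand-in for 10 s of audio at a toy 100 Hz rate (real audio uses a far higher rate).
recording = list(range(1000))
segments = split_into_segments(recording, sample_rate=100)
print(len(segments), len(segments[0]))
```

The function works on any sliceable sequence, so a NumPy array loaded from a pickle file could be passed in the same way as the list used here.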
- Data preparation: generate the npy file list from those segments:

      cd ${AUDIO_POURING_DIR}/audio_pouring/utils
      python generate_npy_list.py
- Network training:

      cd ${AUDIO_POURING_DIR}/audio_pouring
      python main_lstm.py --fixed --cuda --gpu=0 --bottle-train=0 --lstm --bs=32
      # args:
      # --fixed        : the input audio length is fixed (must set)
      # --lstm         : set to use lstm or gru
      # --bs           : set batch size
      # --bottle-train : set bottle id, if set to 0, then all the data are used
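To make the flag semantics above concrete, the sketch below parses the same command line with `argparse`. It is a hypothetical stand-in, not the parser in `main_lstm.py`: the defaults, the argument types and the reading of `--lstm` as "LSTM when present, GRU when absent" are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags listed above; defaults are assumptions."""
    parser = argparse.ArgumentParser(description="training flags (sketch)")
    parser.add_argument("--fixed", action="store_true",
                        help="input audio length is fixed (must be set)")
    parser.add_argument("--cuda", action="store_true", help="train on a GPU")
    parser.add_argument("--gpu", type=int, default=0, help="GPU index")
    parser.add_argument("--lstm", action="store_true",
                        help="use an LSTM; assumed to fall back to a GRU when absent")
    parser.add_argument("--bs", type=int, default=32, help="batch size")
    parser.add_argument("--bottle-train", type=str, default="0",
                        help="bottle id; 0 means all data are used")
    return parser


def describe(args):
    """Summarize the parsed options in one line."""
    rnn = "lstm" if args.lstm else "gru"
    data = "all bottles" if args.bottle_train == "0" else "bottle " + args.bottle_train
    return "rnn=%s, batch_size=%d, data=%s" % (rnn, args.bs, data)


# Parse the exact flags from the training command shown above.
args = build_parser().parse_args(
    "--fixed --cuda --gpu=0 --bottle-train=0 --lstm --bs=32".split())
print(describe(args))  # rnn=lstm, batch_size=32, data=all bottles
```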
- Create a bottle config CSV file and put it in `${AUDIO_POURING_DIR}/audio_pouring/config/bottles`.
- Modify and run the code below:

      cd ${AUDIO_POURING_DIR}/audio_pouring/utils
      python generate_bottle_config.py
- The dataset contains video, audio, force/torque and position information collected during human pouring.
- We plan to release our dataset progressively. The audio part is currently available on Google Drive.
If you find this paper and code useful in your research, please consider citing:
@article{liang2019AudioPouring,
title={Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring},
author={Liang, Hongzhuo and Li, Shuang and Ma, Xiaojian and Hendrich, Norman and Gerkmann, Timo and Zhang, Jianwei},
journal={arXiv preprint arXiv:1903.00650},
year={2019}
}