This is the code repository for *Autonomous Character-Scene Interaction Synthesis from Text Instruction* (SIGGRAPH Asia 2024).
arXiv | Project Page | Dataset | Demo
Please download the LINGO dataset from Google Drive. The content at the download link will be updated continuously so that you always have access to the most recent data.
Explanation of the files and folders of the LINGO dataset:
- Scene (folder): This folder contains the occupancy grids for the indoor scenes in the LINGO dataset, indicated by each file name. The scenes are mirrored for augmentation.
- Scene_vis (folder): This folder contains the occupancy grids for another set of indoor scenes, which we used to test our model and visualize the motions.
- language_motion_dict (folder): This folder contains the wrapped information of each motion segment used to train our model.
- human_pose.npy: This file contains an (N x 63) array, where each row corresponds to the 63-dimensional SMPL-X body_pose parameter of one frame of MoCap data. The data is a concatenation of all motion segments.
- human_orient.npy: This file contains an (N x 3) array corresponding to the global_orient parameter of SMPL-X.
- transl_aligned.npy: This file contains an (N x 3) array corresponding to the transl parameter of SMPL-X.
- human_joints_aligned.npy: This file contains an (N x 28 x 3) array corresponding to the 3D locations (y-up) of selected SMPL-X joints.
- scene_name.pkl: This file contains an (N, ) list corresponding to the scene name of each frame.
- start_idx.npy: This file contains an (M x 3) array corresponding to the start frame index of each motion segment.
- end_idx.npy: This file contains an (M x 3) array corresponding to the end frame index of each motion segment.
- text_aug.pkl: This file contains an (M, ) list corresponding to the text annotations of each motion segment.
- left_hand_inter_frame.npy: This file contains an (M, ) array storing the frame IDs where left hand-object contact occurs; the value is -1 for motion segments with no left hand-object contact.
- right_hand_inter_frame.npy: This file contains an (M, ) array storing the frame IDs where right hand-object contact occurs; the value is -1 for motion segments with no right hand-object contact.
- clip_features.npy: This file contains the preprocessed CLIP features of the text annotations in the LINGO dataset.
- text2features_idx.pkl: This file stores a dictionary that maps text annotations to their corresponding CLIP feature vectors.
- norm_inter_and_loco__16frames.npy: This file is a (2, 3) array containing the range of joint coordinates along the x, y, and z axes, used for normalizing joint locations.
Note: N represents the total number of frames in the LINGO dataset, while M represents the number of motion segments. This dataset is provided in mirrored form.
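To make the layout above concrete, below is a minimal Python loading sketch, not an official loader. It assumes the `dataset` folder from `lingo_utils.zip` sits at the project root (see the setup steps below), that the first column of the (M x 3) `start_idx`/`end_idx` arrays holds the frame index, that the normalization array stores per-axis minima and maxima, and that `smpl_models` follows the folder layout expected by the `smplx` Python package; please verify these assumptions against the data itself.

```python
# Minimal sketch for loading LINGO motion segments (assumptions noted in comments).
import pickle
import numpy as np

root = "dataset"  # assumed location after extracting lingo_utils.zip

# Frame-level arrays (N frames total, concatenated over all segments).
pose   = np.load(f"{root}/human_pose.npy")              # (N, 63)   SMPL-X body_pose
orient = np.load(f"{root}/human_orient.npy")            # (N, 3)    SMPL-X global_orient
transl = np.load(f"{root}/transl_aligned.npy")          # (N, 3)    SMPL-X transl
joints = np.load(f"{root}/human_joints_aligned.npy")    # (N, 28, 3) selected joints, y-up
with open(f"{root}/scene_name.pkl", "rb") as f:
    scene_name = pickle.load(f)                         # length-N list of scene names

# Segment-level arrays (M segments).
start_idx = np.load(f"{root}/start_idx.npy")            # (M, 3), see docs above
end_idx   = np.load(f"{root}/end_idx.npy")              # (M, 3), see docs above
with open(f"{root}/text_aug.pkl", "rb") as f:
    texts = pickle.load(f)                              # length-M list of annotations

# Slice the i-th motion segment. Assumption: column 0 holds the frame index.
i = 0
s, e = int(start_idx[i, 0]), int(end_idx[i, 0])
seg_pose, seg_joints = pose[s:e], joints[s:e]
print(texts[i], scene_name[s], seg_pose.shape)

# Precomputed CLIP features and the text-to-feature lookup.
clip_features = np.load(f"{root}/clip_features.npy")
with open(f"{root}/text2features_idx.pkl", "rb") as f:
    text2feat = pickle.load(f)
text_feat = text2feat[texts[i]]                         # CLIP feature per the docs above

# Normalize joint coordinates. Assumption: row 0 = per-axis minima, row 1 = maxima.
norm = np.load(f"{root}/norm_inter_and_loco__16frames.npy")   # (2, 3)
seg_joints_norm = (seg_joints - norm[0]) / (norm[1] - norm[0])

# Optional: recover SMPL-X joints/vertices from the raw parameters with the
# smplx package (pip install smplx torch), assuming smpl_models follows the
# layout smplx expects (e.g. smpl_models/smplx/SMPLX_NEUTRAL.npz).
import torch
import smplx

T = seg_pose.shape[0]
body = smplx.create("smpl_models", model_type="smplx",
                    gender="neutral", batch_size=T)     # gender is an assumption
out = body(body_pose=torch.from_numpy(seg_pose).float(),
           global_orient=torch.from_numpy(orient[s:e]).float(),
           transl=torch.from_numpy(transl[s:e]).float())
print(out.joints.shape)                                 # (T, num_joints, 3)
```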
To run the code, you need to have the following installed:
- Python 3.8+
- Required Python packages (specified in `requirements.txt`)
- Clone the Repository:
  `git clone git@github.com:mileret/lingo-release.git`
- Download Checkpoints, Data, and SMPL-X Models:
  - Download the necessary files and folders from this link.
  - Extract `lingo_utils.zip` and place the four files and folders (`dataset`, `ckpts`, `smpl_models`, `vis.blend`) at the root of the project directory.
- Install Python Packages:
  `pip install -r requirements.txt`
- Install Blender:
  - We use Blender to visualize the results.
  - Please download Blender 3.6 from its official website.
  - (Optional) Download the SMPL-X Blender Add-on and activate it in Blender.
- Get Model Input:
  Open `vis.blend` with Blender. Change the `text`, `start_location`, `end_goal`, and `hand_goal`. Then run `get_input` in `vis.blend`.
- Inference:
  To synthesize human motions using our model, run
  `cd code`
  `python sample_lingo.py`
- Visualization in Blender:
  Run `vis_output` in `vis.blend`. The generated human motion will be displayed in Blender.
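If a script cannot find its inputs, a quick sanity check of the project layout can save time. The sketch below simply verifies the paths referenced in this README (the `code/...` entries come from the inference and training sections); adjust it if your checkout differs.

```python
# Sanity-check the expected project layout (a sketch; paths taken from this README).
from pathlib import Path

expected = [
    "dataset", "ckpts", "smpl_models", "vis.blend",          # extracted from lingo_utils.zip
    "code/sample_lingo.py", "code/train_lingo.py", "code/config",
]
missing = [p for p in expected if not Path(p).exists()]
print("All files in place." if not missing else f"Missing: {missing}")
```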
This section provides instructions for setting up and training our model using the LINGO dataset.
Before you begin, make sure the required Python packages are installed:
`pip install -r requirements.txt`
Navigate to the `code` directory:
`cd code`
To start training the model, run the training script from the command line:
`python train_lingo.py`
The training script will automatically load the dataset, set up the model, and start training using the configurations in the `./code/config` folder.
@inproceedings{jiang2024autonomous,
title={Autonomous character-scene interaction synthesis from text instruction},
author={Jiang, Nan and He, Zimo and Wang, Zi and Li, Hongjie and Chen, Yixin and Huang, Siyuan and Zhu, Yixin},
booktitle={SIGGRAPH Asia 2024 Conference Papers},
pages={1--11},
year={2024}
}