Xinyu Zhan* · Lixin Yang* · Yifei Zhao · Kangrui Mao · Hanlin Xu · Zenan Lin · Kailin Li · Cewu Lu†
This repo contains the OakInk2 dataset toolkit (`oakink2_toolkit`) -- a Python package that provides data loading, splitting, and visualization.
Download the tarballs from [Hugging Face](https://huggingface.co/datasets/kelvin34501/OakInk-v2).
You will need the data tarball and the preview-version annotation tarball for at least one sequence, plus the object_raw, object_repair, and program tarballs.
Organize these files as follows:
```
data
|-- data
| `-- scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS
|-- anno_preview
| `-- scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl
|-- object_raw
|-- object_repair
`-- program
```
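For a quick sanity check of the layout above, here is a small stdlib-only sketch; it only assumes the directory names shown in the tree:

```python
from pathlib import Path

root = Path("data")  # root of the tree shown above
for name in ["data", "anno_preview", "object_raw", "object_repair", "program"]:
    print(f"{'ok' if (root / name).is_dir() else 'MISSING':7s} {root / name}")

# match each downloaded sequence with its preview annotation
if (root / "data").is_dir():
    for seq_dir in sorted((root / "data").iterdir()):
        anno = root / "anno_preview" / f"{seq_dir.name}.pkl"
        print(seq_dir.name, "->", "annotation found" if anno.is_file() else "annotation missing")
```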
- Install the package.

  ```
  pip install .
  ```

  Optionally, install it in editable mode:

  ```
  pip install -e .
  ```

- Check the installation.

  ```
  python -c 'from oakink2_toolkit.dataset import OakInk2__Dataset'
  ```

  If the command runs without error, the installation is successful.
- Set up the environment.

  - Create a virtual environment with Python 3.10. This can be done with either `conda` or the Python `venv` module.

    - `conda` approach:

      ```
      conda create -p ./.conda python=3.10
      conda activate ./.conda
      ```

    - `venv` approach: first use `pyenv` or another tool to install a Python interpreter of version 3.10. Here 3.10.14 is used as an example:

      ```
      pyenv install 3.10.14
      pyenv shell 3.10.14
      ```

      Then create and activate a virtual environment:

      ```
      python -m venv .venv --prompt oakink2_preview
      . .venv/bin/activate
      ```
  - Install the dependencies.

    Make sure all bundled dependencies are there:

    ```
    git submodule update --init --recursive --progress
    ```

    Use `pip` to install the packages:

    ```
    pip install -r req_preview.txt
    ```
Download the SMPL-X model(version v1.1) and place the files at
asset/smplx_v1_1
.The directory structure should be like:
asset `-- smplx_v1_1 `-- models |-- SMPLX_NEUTRAL.npz `-- SMPLX_NEUTRAL.pkl
- Launch the preview tool:

  ```
  python -m launch.viz.gui --cfg config/gui__preview.yml
  ```

  Or use the shortcut:

  ```
  oakink2_viz_gui --cfg config/gui_preview.yml
  ```
- (Optional) Preview tasks in segments.

  - Download the MANO model (version v1.2) and place the files under `asset/mano_v1_2`. The directory structure should be:

    ```
    asset
    `-- mano_v1_2
        `-- models
            |-- MANO_LEFT.pkl
            `-- MANO_RIGHT.pkl
    ```

  - Launch the preview segment tool (press Enter to proceed). Note that `seq_key` should use '/' rather than '++' as the directory separator (see the sketch below).

    ```
    python -m oakink2_preview.launch.viz.seg_3d --seq_key scene_0x__y00z/00000000000000000000__YYYY-mm-dd-HH-MM-SS
    ```

    Or use the shortcut:

    ```
    oakink2_viz_seg3d --seq_key scene_0x__y00z/00000000000000000000__YYYY-mm-dd-HH-MM-SS
    ```
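    A one-line way to derive `seq_key` from a sequence directory name, following only the separator rule stated above (a sketch, not part of the toolkit):

    ```python
    # the '++' in the on-disk directory name becomes '/' in seq_key
    dir_name = "scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS"
    seq_key = dir_name.replace("++", "/", 1)
    print(seq_key)  # scene_0x__y00z/00000000000000000000__YYYY-mm-dd-HH-MM-SS
    ```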
- (Optional) View the introductory video on YouTube.
- `data/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS`

  This stores the captured multi-view image streams. Streams from different cameras are stored in separate subdirectories.

  ```
  scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS
  |-- <serial 0>
  |   |-- <frame id 0>.jpg
  |   |-- <frame id 1>.jpg
  |   |-- ...
  |   `-- <frame id N>.jpg
  |-- ...
  `-- <serial 3>
      |-- <frame id 0>.jpg
      |-- <frame id 1>.jpg
      |-- ...
      `-- <frame id N>.jpg
  ```
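  A short sketch for walking the streams, based only on the layout above (PIL is used here just to read a frame; any image reader works):

  ```python
  from pathlib import Path
  from PIL import Image

  seq_dir = Path("data/data/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS")
  for cam_dir in sorted(p for p in seq_dir.iterdir() if p.is_dir()):
      frames = sorted(cam_dir.glob("*.jpg"))
      print(cam_dir.name, "->", len(frames), "frames")
      if frames:
          print("  first frame size:", Image.open(frames[0]).size)
  ```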
- `anno/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl`

  This pickle stores a dictionary in the following format:

  ```
  {
      'cam_def': dict[str, str],                        # camera serial to name mapping
      'cam_selection': list[str],                       # selected camera names
      'frame_id_list': list[int],                       # image frame id list in current seq
      'cam_intr': dict[str, dict[int, np.ndarray]],     # camera intrinsic matrix [3, 3]
      'cam_extr': dict[str, dict[int, np.ndarray]],     # camera extrinsic matrix [4, 4]
      'mocap_frame_id_list': list[int],                 # mocap frame id list in current seq
      'obj_list': list[str],                            # object part id list in current seq
      'obj_transf': dict[str, dict[int, np.ndarray]],   # object transformation matrix [4, 4]
      'raw_smplx': dict[int, dict[str, torch.Tensor]],  # raw smplx data
      'raw_mano': dict[int, dict[str, torch.Tensor]],   # raw mano data
  }
  ```
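  A minimal loading sketch using only the keys listed above; whether `cam_intr`/`cam_extr` are keyed by camera name or serial, and whether `obj_transf` is indexed by image or mocap frame id, is an assumption here:

  ```python
  import pickle

  # preview layout from the tree at the top of this README
  anno_path = "data/anno_preview/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl"
  with open(anno_path, "rb") as f:
      anno = pickle.load(f)

  frame_id = anno["frame_id_list"][0]     # first image frame id
  cam = anno["cam_selection"][0]          # one of the selected cameras
  intr = anno["cam_intr"][cam][frame_id]  # 3x3 intrinsics (assuming keys are camera names)
  extr = anno["cam_extr"][cam][frame_id]  # 4x4 extrinsics

  obj_id = anno["obj_list"][0]
  obj_pose = anno["obj_transf"][obj_id][frame_id]  # 4x4 object transformation
  print(intr.shape, extr.shape, obj_pose.shape)
  ```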
  The raw smplx data is structured as follows:

  ```
  {
      'body_shape': torch.Tensor[1, 300],
      'expr_shape': torch.Tensor[1, 10],
      'jaw_pose': torch.Tensor[1, 1, 4],
      'leye_pose': torch.Tensor[1, 1, 4],
      'reye_pose': torch.Tensor[1, 1, 4],
      'world_rot': torch.Tensor[1, 4],
      'world_tsl': torch.Tensor[1, 3],
      'body_pose': torch.Tensor[1, 21, 4],
      'left_hand_pose': torch.Tensor[1, 15, 4],
      'right_hand_pose': torch.Tensor[1, 15, 4],
  }
  ```
  where `world_rot`, `body_pose`, and `{left,right}_hand_pose` are quaternions in `[w, x, y, z]` format. The lower-body components of `body_pose`, as well as `jaw_pose` and `{l,r}eye_pose`, are not used.

  The raw mano data is structured as follows:
  ```
  {
      'rh__pose_coeffs': torch.Tensor[1, 16, 4],
      'lh__pose_coeffs': torch.Tensor[1, 16, 4],
      'rh__tsl': torch.Tensor[1, 3],
      'lh__tsl': torch.Tensor[1, 3],
      'rh__betas': torch.Tensor[1, 10],
      'lh__betas': torch.Tensor[1, 10],
  }
  ```
  where `{lh,rh}__pose_coeffs` are quaternions in `[w, x, y, z]` format.
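  Since the pose entries above are scalar-first quaternions, converting them to rotation matrices with SciPy requires reordering to SciPy's scalar-last convention. A sketch continuing from the loading snippet above (indexing `raw_mano` by a mocap frame id is an assumption):

  ```python
  from scipy.spatial.transform import Rotation

  mocap_frame_id = anno["mocap_frame_id_list"][0]
  mano = anno["raw_mano"][mocap_frame_id]

  quat_wxyz = mano["rh__pose_coeffs"][0].numpy()        # [16, 4], scalar-first per joint
  quat_xyzw = quat_wxyz[:, [1, 2, 3, 0]]                # reorder to SciPy's scalar-last
  rot_mats = Rotation.from_quat(quat_xyzw).as_matrix()  # [16, 3, 3]
  print(rot_mats.shape)
  ```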
- `object_{raw,repair}/obj_desc.json`

  This stores the object descriptions in the following format:

  ```
  {
      obj_id: {
          "obj_id": str,
          "obj_name": str,
      }
  }
  ```
- `object_{raw,repair}/align_ds`

  This directory stores the object models.

  ```
  align_ds
  |-- obj_id
  |   |-- *.obj/ply
  |   |-- ...
  `-- ...
  ```
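  To pair an object model with its per-frame pose from the annotation pickle, something like the following works (`trimesh` is an extra dependency here, and the exact mesh filename under each `obj_id` folder varies, hence the glob):

  ```python
  from pathlib import Path
  import trimesh

  obj_id = anno["obj_list"][0]  # `anno` from the annotation-loading sketch above
  model_dir = Path("data/object_raw/align_ds") / obj_id
  mesh_path = sorted(list(model_dir.glob("*.obj")) + list(model_dir.glob("*.ply")))[0]

  mesh = trimesh.load(mesh_path, force="mesh")
  frame_id = anno["frame_id_list"][0]
  mesh.apply_transform(anno["obj_transf"][obj_id][frame_id])  # 4x4 pose for this frame
  print(mesh.vertices.shape)
  ```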
- `program/program_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  This stores the program information in the following format:

  ```
  {
      (str(lh_interval), str(rh_interval)): {
          "primitive": str,
          "obj_list": list[str],
          "interaction_mode": str,  # [lh_main, rh_main, bh_main]
          "primitive_lh": str,
          "primitive_rh": str,
          "obj_list_lh": list[str],
          "obj_list_rh": list[str],
      }
  }
  ```
  - {lh,rh}_interval: the interval of the primitive in the sequence. If `None`, the corresponding hand is not available (e.g. doing something else) in the current primitive.
  - primitive: the primitive id.
  - obj_list: the object list involved in the primitive.
  - interaction_mode: the interaction mode of the primitive. `lh_main` means the left hand is the main hand for affordance implementation. Similarly, `rh_main` means the right hand is the main hand, and `bh_main` means both hands are main hands.
  - primitive_{lh,rh}: the primitive id for the left/right hand.
  - obj_list_{lh,rh}: the object list involved for the left/right hand.
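  The JSON keys are string serializations of the interval pair. Assuming they can be parsed back with `ast.literal_eval` (the same key format is used by `desc_info` and `initial_condition_info` below), a reading sketch:

  ```python
  import ast
  import json

  program_path = "data/program/program_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json"
  with open(program_path) as f:
      program_info = json.load(f)

  for key, seg in program_info.items():
      # assumption: the key string literal-evals back to (lh_interval, rh_interval)
      lh_interval, rh_interval = ast.literal_eval(key)
      print(lh_interval, rh_interval, seg["interaction_mode"], seg["obj_list"])
  ```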
- `program/desc_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  ```
  {
      (str(lh_interval), str(rh_interval)): {
          "seg_desc": str,  # textual description of current primitive
      }
  }
  ```
- `program/initial_condition_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  ```
  {
      (str(lh_interval), str(rh_interval)): {
          "initial_condition": list[str],  # initial conditions for the complex task
          "recipe": list[str],             # requirements to complete for the complex task
      }
  }
  ```
- `program/pdg/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  ```
  {
      "id_map": dict[interval, int],  # map from interval to primitive id
      "v": list[int],                 # list of vertices
      "e": list[list[int]],           # list of edges
  }
  ```