Xinyu Zhan* · Lixin Yang* · Yifei Zhao · Kangrui Mao · Hanlin Xu · Zenan Lin · Kailin Li · Cewu Lu†
This repo contains the OakInk2 dataset toolkit (`oakink2_toolkit`) -- a Python package that provides data loading, splitting, and visualization.
Download the tarballs from [Hugging Face](https://huggingface.co/datasets/kelvin34501/OakInk-v2).
You will need the data tarball and the preview-version annotation tarball for at least one sequence, plus the object_raw, object_repair, and program tarballs.
Organize these files as follows:
```
data
|-- data
| `-- scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS
|-- anno_preview
| `-- scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl
|-- object_raw
|-- object_repair
`-- program
```
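For a quick sanity check of the layout above, here is a small stdlib-only sketch; it only assumes the directory names shown in the tree:

```python
from pathlib import Path

root = Path("data")  # root of the tree shown above
for name in ["data", "anno_preview", "object_raw", "object_repair", "program"]:
    print(f"{'ok' if (root / name).is_dir() else 'MISSING':7s} {root / name}")

# match each downloaded sequence with its preview annotation
if (root / "data").is_dir():
    for seq_dir in sorted((root / "data").iterdir()):
        anno = root / "anno_preview" / f"{seq_dir.name}.pkl"
        print(seq_dir.name, "->", "annotation found" if anno.is_file() else "annotation missing")
```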
- Install the package.

  ```
  pip install .
  ```

  Optionally, install it in editable mode:

  ```
  pip install -e .
  ```

- Check the installation.

  ```
  python -c 'from oakink2_toolkit.dataset import OakInk2__Dataset'
  ```

  If the command runs without error, the installation is successful.
- Set up the environment.

  - Create a virtual environment with Python 3.10. This can be done with either `conda` or the Python `venv` module.

    - `conda` approach:

      ```
      conda create -p ./.conda python=3.10
      conda activate ./.conda
      ```

    - `venv` approach: first use `pyenv` or another tool to install a Python interpreter of version 3.10. Here 3.10.14 is used as an example:

      ```
      pyenv install 3.10.14
      pyenv shell 3.10.14
      ```

      Then create and activate a virtual environment:

      ```
      python -m venv .venv --prompt oakink2_preview
      . .venv/bin/activate
      ```
  - Install the dependencies.

    Make sure all bundled dependencies are there:

    ```
    git submodule update --init --recursive --progress
    ```

    Use `pip` to install the packages:

    ```
    pip install -r req_preview.txt
    ```
Download the SMPL-X model(version v1.1) and place the files at
asset/smplx_v1_1
.The directory structure should be like:
asset `-- smplx_v1_1 `-- models |-- SMPLX_NEUTRAL.npz `-- SMPLX_NEUTRAL.pkl
- Launch the preview tool:

  ```
  python -m launch.viz.gui --cfg config/gui__preview.yml
  ```

  Or use the shortcut:

  ```
  oakink2_viz_gui --cfg config/gui_preview.yml
  ```
- (Optional) Preview tasks in segments.

  - Download the MANO model (version v1.2) and place the files under `asset/mano_v1_2`. The directory structure should be:

    ```
    asset
    `-- mano_v1_2
        `-- models
            |-- MANO_LEFT.pkl
            `-- MANO_RIGHT.pkl
    ```

  - Launch the preview segment tool (press Enter to proceed). Note that `seq_key` should use '/' rather than '++' as the directory separator (see the sketch below).

    ```
    python -m oakink2_preview.launch.viz.seg_3d --seq_key scene_0x__y00z/00000000000000000000__YYYY-mm-dd-HH-MM-SS
    ```

    Or use the shortcut:

    ```
    oakink2_viz_seg3d --seq_key scene_0x__y00z/00000000000000000000__YYYY-mm-dd-HH-MM-SS
    ```
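    A one-line way to derive `seq_key` from a sequence directory name, following only the separator rule stated above (a sketch, not part of the toolkit):

    ```python
    # the '++' in the on-disk directory name becomes '/' in seq_key
    dir_name = "scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS"
    seq_key = dir_name.replace("++", "/", 1)
    print(seq_key)  # scene_0x__y00z/00000000000000000000__YYYY-mm-dd-HH-MM-SS
    ```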
- (Optional) View the introductory video on YouTube.
- `data/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS`

  This stores the captured multi-view image streams. Streams from different cameras are stored in separate subdirectories.

  ```
  scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS
  |-- <serial 0>
  |   |-- <frame id 0>.jpg
  |   |-- <frame id 1>.jpg
  |   |-- ...
  |   `-- <frame id N>.jpg
  |-- ...
  `-- <serial 3>
      |-- <frame id 0>.jpg
      |-- <frame id 1>.jpg
      |-- ...
      `-- <frame id N>.jpg
  ```
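  A short sketch for walking the streams, based only on the layout above (PIL is used here just to read a frame; any image reader works):

  ```python
  from pathlib import Path
  from PIL import Image

  seq_dir = Path("data/data/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS")
  for cam_dir in sorted(p for p in seq_dir.iterdir() if p.is_dir()):
      frames = sorted(cam_dir.glob("*.jpg"))
      print(cam_dir.name, "->", len(frames), "frames")
      if frames:
          print("  first frame size:", Image.open(frames[0]).size)
  ```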
- `anno/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl`

  This pickle stores a dictionary in the following format:

  ```
  {
      'cam_def': dict[str, str],                        # camera serial to name mapping
      'cam_selection': list[str],                       # selected camera names
      'frame_id_list': list[int],                       # image frame id list in current seq
      'cam_intr': dict[str, dict[int, np.ndarray]],     # camera intrinsic matrix [3, 3]
      'cam_extr': dict[str, dict[int, np.ndarray]],     # camera extrinsic matrix [4, 4]
      'mocap_frame_id_list': list[int],                 # mocap frame id list in current seq
      'obj_list': list[str],                            # object part id list in current seq
      'obj_transf': dict[str, dict[int, np.ndarray]],   # object transformation matrix [4, 4]
      'raw_smplx': dict[int, dict[str, torch.Tensor]],  # raw smplx data
      'raw_mano': dict[int, dict[str, torch.Tensor]],   # raw mano data
  }
  ```
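  A minimal loading sketch using only the keys listed above; whether `cam_intr`/`cam_extr` are keyed by camera name or serial, and whether `obj_transf` is indexed by image or mocap frame id, is an assumption here:

  ```python
  import pickle

  # preview layout from the tree at the top of this README
  anno_path = "data/anno_preview/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl"
  with open(anno_path, "rb") as f:
      anno = pickle.load(f)

  frame_id = anno["frame_id_list"][0]     # first image frame id
  cam = anno["cam_selection"][0]          # one of the selected cameras
  intr = anno["cam_intr"][cam][frame_id]  # 3x3 intrinsics (assuming keys are camera names)
  extr = anno["cam_extr"][cam][frame_id]  # 4x4 extrinsics

  obj_id = anno["obj_list"][0]
  obj_pose = anno["obj_transf"][obj_id][frame_id]  # 4x4 object transformation
  print(intr.shape, extr.shape, obj_pose.shape)
  ```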
  The raw smplx data is structured as follows:

  ```
  {
      'body_shape': torch.Tensor[1, 300],
      'expr_shape': torch.Tensor[1, 10],
      'jaw_pose': torch.Tensor[1, 1, 4],
      'leye_pose': torch.Tensor[1, 1, 4],
      'reye_pose': torch.Tensor[1, 1, 4],
      'world_rot': torch.Tensor[1, 4],
      'world_tsl': torch.Tensor[1, 3],
      'body_pose': torch.Tensor[1, 21, 4],
      'left_hand_pose': torch.Tensor[1, 15, 4],
      'right_hand_pose': torch.Tensor[1, 15, 4],
  }
  ```
  where `world_rot`, `body_pose`, and `{left,right}_hand_pose` are quaternions in `[w, x, y, z]` format. The lower-body components of `body_pose`, as well as `jaw_pose` and `{l,r}eye_pose`, are not used.

  The raw mano data is structured as follows:
  ```
  {
      'rh__pose_coeffs': torch.Tensor[1, 16, 4],
      'lh__pose_coeffs': torch.Tensor[1, 16, 4],
      'rh__tsl': torch.Tensor[1, 3],
      'lh__tsl': torch.Tensor[1, 3],
      'rh__betas': torch.Tensor[1, 10],
      'lh__betas': torch.Tensor[1, 10],
  }
  ```
  where `{lh,rh}__pose_coeffs` are quaternions in `[w, x, y, z]` format.
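  Since the pose entries above are scalar-first quaternions, converting them to rotation matrices with SciPy requires reordering to SciPy's scalar-last convention. A sketch continuing from the loading snippet above (indexing `raw_mano` by a mocap frame id is an assumption):

  ```python
  from scipy.spatial.transform import Rotation

  mocap_frame_id = anno["mocap_frame_id_list"][0]
  mano = anno["raw_mano"][mocap_frame_id]

  quat_wxyz = mano["rh__pose_coeffs"][0].numpy()        # [16, 4], scalar-first per joint
  quat_xyzw = quat_wxyz[:, [1, 2, 3, 0]]                # reorder to SciPy's scalar-last
  rot_mats = Rotation.from_quat(quat_xyzw).as_matrix()  # [16, 3, 3]
  print(rot_mats.shape)
  ```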
- `object_{raw,repair}/obj_desc.json`

  This stores the object descriptions in the following format:

  ```
  {
      obj_id: {
          "obj_id": str,
          "obj_name": str,
      }
  }
  ```
- `object_{raw,repair}/align_ds`

  This directory stores the object models.

  ```
  align_ds
  |-- obj_id
  |   |-- *.obj/ply
  |   |-- ...
  `-- ...
  ```
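  To pair an object model with its per-frame pose from the annotation pickle, something like the following works (`trimesh` is an extra dependency here, and the exact mesh filename under each `obj_id` folder varies, hence the glob):

  ```python
  from pathlib import Path
  import trimesh

  obj_id = anno["obj_list"][0]  # `anno` from the annotation-loading sketch above
  model_dir = Path("data/object_raw/align_ds") / obj_id
  mesh_path = sorted(list(model_dir.glob("*.obj")) + list(model_dir.glob("*.ply")))[0]

  mesh = trimesh.load(mesh_path, force="mesh")
  frame_id = anno["frame_id_list"][0]
  mesh.apply_transform(anno["obj_transf"][obj_id][frame_id])  # 4x4 pose for this frame
  print(mesh.vertices.shape)
  ```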
- `program/program_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  This stores the program information in the following format:

  ```
  {
      (str(lh_interval), str(rh_interval)): {
          "primitive": str,
          "obj_list": list[str],
          "interaction_mode": str,  # [lh_main, rh_main, bh_main]
          "primitive_lh": str,
          "primitive_rh": str,
          "obj_list_lh": list[str],
          "obj_list_rh": list[str],
      }
  }
  ```
  - {lh,rh}_interval: the interval of the primitive in the sequence. If `None`, the corresponding hand is not available (e.g. doing something else) in the current primitive.
  - primitive: the primitive id.
  - obj_list: the object list involved in the primitive.
  - interaction_mode: the interaction mode of the primitive. `lh_main` means the left hand is the main hand for affordance implementation. Similarly, `rh_main` means the right hand is the main hand, and `bh_main` means both hands are main hands.
  - primitive_{lh,rh}: the primitive id for the left/right hand.
  - obj_list_{lh,rh}: the object list involved for the left/right hand.
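  The JSON keys are string serializations of the interval pair. Assuming they can be parsed back with `ast.literal_eval` (the same key format is used by `desc_info` and `initial_condition_info` below), a reading sketch:

  ```python
  import ast
  import json

  program_path = "data/program/program_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json"
  with open(program_path) as f:
      program_info = json.load(f)

  for key, seg in program_info.items():
      # assumption: the key string literal-evals back to (lh_interval, rh_interval)
      lh_interval, rh_interval = ast.literal_eval(key)
      print(lh_interval, rh_interval, seg["interaction_mode"], seg["obj_list"])
  ```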
- `program/desc_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  ```
  {
      (str(lh_interval), str(rh_interval)): {
          "seg_desc": str,  # textual description of current primitive
      }
  }
  ```
- `program/initial_condition_info/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  ```
  {
      (str(lh_interval), str(rh_interval)): {
          "initial_condition": list[str],  # initial conditions for the complex task
          "recipe": list[str],             # requirements to complete for the complex task
      }
  }
  ```
- `program/pdg/scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.json`

  ```
  {
      "id_map": dict[interval, int],  # map from interval to primitive id
      "v": list[int],                 # list of vertices
      "e": list[list[int]],           # list of edges
  }
  ```