This project provides an efficient pipeline for Action Recognition in Human-Robot Interaction.
The full 3D human pose is estimated and used to determine which action in the support set the human is performing. Actions can be added to or removed from the support set at any time. The Open-Set score confirms or rejects the Few-Shot prediction to avoid false positives. The Mutual Gaze Constraint can be added to an action as an additional filter.
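The decision logic described above can be sketched as follows. This is only an illustrative sketch, not the code of the ar module: the function name `recognize_action`, its parameters, the example action names, and the threshold value of 0.5 are all assumptions made for this example.

```python
from typing import Dict, Optional, Set


def recognize_action(
    few_shot_scores: Dict[str, float],
    open_set_score: float,
    gaze_required: Set[str],
    mutual_gaze: bool,
    open_set_threshold: float = 0.5,  # assumed value, not taken from the project
) -> Optional[str]:
    """Return the recognized action, or None if the prediction is rejected."""
    if not few_shot_scores:
        return None  # empty support set: nothing can be recognized

    # Few-Shot step: pick the most likely action in the support set.
    best_action = max(few_shot_scores, key=few_shot_scores.get)

    # Open-Set step: reject the prediction when the score is too low.
    if open_set_score < open_set_threshold:
        return None

    # Optional Mutual Gaze Constraint: only applies to actions that request it.
    if best_action in gaze_required and not mutual_gaze:
        return None

    return best_action


# Hypothetical support set scores; actions can be added or removed at any time.
scores = {"wave": 0.7, "handshake": 0.3}
scores["point"] = 0.1   # add an action
del scores["point"]     # remove it again

print(recognize_action(scores, open_set_score=0.9,
                       gaze_required={"handshake"}, mutual_gaze=False))
```

Keeping the Open-Set check separate from the Few-Shot ranking means an unknown movement is rejected even when one support action scores higher than the others.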
This repository contains different modules:
- hpe: a Human Pose Estimation module, an accelerated version of MetrABS built with NVIDIA TensorRT
- ar: an Action Recognition module, an evolution of the implementation of our paper One-Shot Open-Set Skeleton-Based Action Recognition by Stefano Berti, Andrea Rosasco, Michele Colledanchise, and Lorenzo Natale with Istituto Italiano di Tecnologia
- focus: a Focus Detection module that uses MPIIGaze for Gaze Estimation and checks whether the estimated gaze intersects the camera
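As an illustration of the intersection check performed by the focus module, the sketch below tests whether a gaze ray passes close to the camera. It is a minimal geometric example under stated assumptions, not the module's implementation: the camera is assumed to sit at the origin of the coordinate frame, and the function name `gaze_hits_camera` and the 0.1 m tolerance radius are hypothetical.

```python
import math
from typing import Sequence


def gaze_hits_camera(
    eye_pos: Sequence[float],
    gaze_dir: Sequence[float],
    radius: float = 0.1,  # assumed tolerance around the camera, in meters
) -> bool:
    """Check whether the gaze ray passes within `radius` of the camera.

    The camera is assumed to be at the origin of the coordinate frame.
    """
    norm = math.sqrt(sum(c * c for c in gaze_dir))
    if norm == 0.0:
        return False  # no valid gaze direction
    direction = [c / norm for c in gaze_dir]

    # Distance along the ray to the point closest to the camera (origin).
    t = sum(-e * d for e, d in zip(eye_pos, direction))
    if t < 0.0:
        return False  # the camera lies behind the gaze direction

    # Closest point on the ray and its distance from the camera.
    closest = [e + t * d for e, d in zip(eye_pos, direction)]
    distance = math.sqrt(sum(c * c for c in closest))
    return distance <= radius


# A person one meter in front of the camera, looking straight at it.
print(gaze_hits_camera(eye_pos=(0.0, 0.0, 1.0), gaze_dir=(0.0, 0.0, -1.0)))
```

A sphere around the camera, instead of an exact point, tolerates the noise that any gaze estimator produces.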
The program is divided into two parts:
- source.py runs on the host machine: it connects to the RealSense (or webcam), provides frames to main.py, and visualizes the results with the VISPYVisualizer
- main.py runs either in a Conda environment or in a Docker container and is responsible for all the computation.
Since the hpe module is accelerated with TensorRT engines, which must be built on the target machine, the engine build is provided through the Dockerfile, which allows for a fast installation. See the installation instructions of the Human Pose Estimation module.
Follow the instructions in the README.md of each module: hpe, ar, and focus. Install Vispy and pyrealsense2, then build the Docker image with:
docker build -t ecub .
To run, start two separate processes:
python manager.py
python source.py
Launch the main script with the following command (replace PATH with %cd% on Windows or $(pwd) on Ubuntu):
docker run -it --rm --gpus=all -v "PATH":/home/ecub ecub:latest python main.py
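To avoid the platform-specific PATH substitution, the same command can be assembled from Python. The helper below is a hypothetical convenience sketch and not part of the repository; the name `build_docker_command` is an assumption, while the flags, mount point, and image tag are copied from the command above.

```python
import os
import shlex
from typing import List, Optional


def build_docker_command(workdir: Optional[str] = None,
                         image: str = "ecub:latest") -> List[str]:
    """Build the `docker run` argument list, mounting `workdir` at /home/ecub."""
    # Default to the current directory, resolved to an absolute path.
    host_dir = os.path.abspath(workdir or os.getcwd())
    return [
        "docker", "run", "-it", "--rm", "--gpus=all",
        "-v", f"{host_dir}:/home/ecub",
        image, "python", "main.py",
    ]


# Print the command for inspection (shown with POSIX shell quoting).
print(shlex.join(build_docker_command()))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when the path contains spaces.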