Skip to content

Latest commit

 

History

History
210 lines (151 loc) · 20.7 KB

PPHuman_QUICK_STARTED_en.md

File metadata and controls

210 lines (151 loc) · 20.7 KB

English | 简体中文

Quick Start for PP-Human

Contents

Environment Preparation

Environment requirements: PaddleDetection>= release/2.4 or develop version

Installation of PaddlePaddle and PaddleDetection

# PaddlePaddle CUDA10.1
python -m pip install paddlepaddle-gpu==2.2.2.post101 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html

# PaddlePaddle CPU
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

#Clone PaddleDetection repositories
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git

# Install dependencies
cd PaddleDetection
pip install -r requirements.txt
  1. For installation details, please refer to Installation Tutorials
  2. If you need TensorRT inference acceleration (speed measurement), please install PaddlePaddle with TensorRT version. You can download and install it from the PaddlePaddle Installation Package or follow the [Instructions](https://www. paddlepaddle.org.cn/inference/master/optimize/paddle_trt.html) or use docker, or self-compiling to prepare the environment.

Model Download

PP-Human provides object detection, attribute recognition, behaviour recognition and ReID pre-trained models for different applications. Developers can download them directly.

Task End-to(ms) Model Solution Model Size
Pedestrian Detection (high precision) 25.1ms Multi-Object Tracking 182M
Pedestrian Detection (Lightweight) 16.2ms Multi-Object Tracking 27M
Pedestrian Tracking (high precision) 31.8ms Multi-Object Tracking 182M
Pedestrian Tracking (Lightweight) 21.0ms Multi-Object Tracking 27M
Attribute Recognition (high precision) Single Person 8.5ms Object Detection
Attribute Recognition
Object Detection:182M
Attribute Recogniton:86M
Attribute Recognition (Lightweight) Single Person 7.1ms Object Detection
Attribute Recogniton
Object Detection:182M
Attribute Recogniton:86M
Falling Detection Single Person 10ms Multi-Object Tracking
Keypoint Detection
Skeleton Based Action Recognition
Multi-Object Tracking:182M
Keypoint Detection:101M
Skeleton Based Action Recognition:21.8M
Breaking-In Detection 31.8ms Multi-Object Tracking Multi-Object Tracking:182M
Fighting Detection 19.7ms Video Classification 90M
Smoking Detection Single Person 15.1ms Object Detection
Object Detection Based On Body ID
Object Detection:182M
Object Detection Based On Body ID:27M
Phone-calling Detection Single Person 6.0ms Object Detection
Image Classification Based On Body ID
Object Detection:182M
Image Classification Based On Body ID:45M

Download the model and unzip it into the . /output_inference folder.

In the configuration file, the model path defaults to the download path of the model. If the user does not change it, the corresponding model will be downloaded automatically upon inference.

Note:

  • Model accuracy is tested on fused datasets, which contain both open source and enterprise datasets.
  • ReID model accuracy is tested on the Market1501 dataset
  • Prediction speed is obtained at T4 with TensorRT FP16 enabled, which includes data pre-processing, model inference and post-processing.

Configuration

The PP-Human-related configuration is located in deploy/pipeline/config/infer_cfg_pphuman.yml, and this configuration file contains all the features currently supported by PP-Human. If you want to see the configuration for a specific feature, please refer to the relevant configuration in deploy/pipeline/config/examples/. In addition, the contents of the configuration file can be modified with the -ocommand line parameter. E.g. to modify the model directory of an attribute, developers can run ```-o ATTR.model_dir="DIR_PATH"``.

The features and corresponding task types are as follows.

Input Feature Task Config
Image Attribute Recognition Object Detection Attribute Recognition DET ATTR
Single-camera video Attribute Recognition Multi-Object Tracking Attribute Recognition MOT ATTR
Single-camera video Behaviour Recognition Multi-Object Tracking Keypoint Detection Falling detection MOT KPT SKELETON_ACTION

Take attribute recognition based on video input as an example: Its task type includes multi-object tracking and attributes recognition. The specific configuration is as follows.

crop_thresh: 0.5
attr_thresh: 0.5
visual: True

MOT:
  model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
  tracker_config: deploy/pipeline/config/tracker_config.yml
  batch_size: 1
  enable: True

ATTR:
  model_dir:  https://bj.bcebos.com/v1/paddledet/models/pipeline/PPLCNet_x1_0_person_attribute_945_infer.zip
  batch_size: 8
  enable: True

Note:

  • If developer needs to carry out different tasks, set the corresponding enables option to be True in the configuration file.

Inference Deployment

  1. Use the default configuration directly or the configuration file in examples, or modify the configuration in infer_cfg_pphuman.yml

    # Example: In pedestrian detection model, specify configuration file path and test image, and image input opens detection model by default
    python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml --image_file=test_image.jpg --device=gpu
    # Example: In pedestrian attribute recognition, directly configure the examples
    python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_human_attr.yml --video_file=test_video.mp4 --device=gpu
    
  2. Use the command line to enable functions or change the model path.

# Example: Pedestrian tracking, specify config file path, model path and test video. The specified model path on the command line has a higher priority than the config file.
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml -o MOT.enable=True MOT.model_dir=ppyoloe_infer/ --video_file=test_video.mp4 --device=gpu

# Example: In behaviour recognition, with fall recognition as an example, enable the SKELETON_ACTION model on the command line
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml -o SKELETON_ACTION.enbale=True --video_file=test_video.mp4 --device=gpu
  1. For rtsp stream, use --rtsp RTSP [RTSP ...] parameter to specify one or more rtsp streams. Separate the multiple addresses with a space, or replace the video address directly after the video_file with the rtsp stream address), examples as follows
# Example: Single video stream for pedestrian attribute recognition
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_human_attr.yml -o visual=False --rtsp rtsp://[YOUR_RTSP_SITE]  --device=gpu
# Example: Multiple-video stream for pedestrian attribute recognition
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_human_attr.yml -o visual=False --rtsp rtsp://[YOUR_RTSP_SITE1] rtsp://[YOUR_RTSP_SITE2] --device=gpu                                                                      |

Jetson Deployment

Due to the large gap in computing power of the Jetson platform compared to the server, we suggest:

  1. choose a lightweight model, especially for tracking model, ppyoloe_s: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip is recommended
  2. For frame skipping of tracking; we recommend 2 or 3: skip_frame_num: 3

With this recommended configuration, it is possible to achieve higher speeds on the TX2 platform. It has been tested with attribute case, with speeds up to 20fps. The configuration file can be modified directly (recommended) or from the command line (not recommended due to its long fields).

Parameters

Parameters Necessity Implications
--config Yes Path to configuration file
-o Option Overwrite the corresponding configuration in the configuration file
--image_file Option Images to be predicted
--image_dir Option Path to the images folder to be predicted
--video_file Option Video to be predicted, or rtsp stream address (rtsp parameter recommended)
--rtsp Option rtsp video stream address, supports one or more simultaneous streams input
--camera_id Option The camera ID for prediction, default is -1 ( for no camera prediction, can be set to 0 - (number of cameras - 1) ), press q in the visualization interface during the prediction process to output the prediction result to: output/output.mp4
--device Option Running device, options include CPU/GPU/XPU, and the default is CPU.
--output_dir Option The root directory for the visualization results, and the default is output/
--run_mode Option For GPU, the default is paddle, with (paddle/trt_fp32/trt_fp16/trt_int8) as optional
--enable_mkldnn Option Whether to enable MKLDNN acceleration in CPU prediction, the default is False
--cpu_threads Option Set the number of cpu threads, and the default is 1
--trt_calib_mode Option Whether TensorRT uses the calibration function, and the default is False; set to True when using TensorRT's int8 function and False when using the PaddleSlim quantized model
--do_entrance_counting Option Whether to count entrance/exit traffic flows, the default is False
--draw_center_traj Option Whether to map the trace, the default is False
--region_type Option 'horizontal' (default), 'vertical': traffic count direction; 'custom': set break-in area
--region_polygon Option Set the coordinates of the polygon multipoint in the break-in area. No default.
--do_break_in_counting Option Area break-in checks

Solutions

The overall solution for PP-Human v2 is shown in the graph below:

Pedestrian detection

  • Take PP-YOLOE L as the object detection model
  • For detailed documentation, please refer to [PP-YOLOE](... /... /... /... /configs/ppyoloe/) and Multiple-Object-Tracking

Pedestrian tracking

  • Vehicle tracking by SDE solution
  • Adopt PP-YOLOE L (high precision) and S (lightweight) for detection models
  • Adopt the OC-SORT solution for racking module
  • Refer to [OC-SORT](... /... /... /... /configs/mot/ocsort) and Multi-Object Tracking for details

Multi-camera & multi-pedestrain tracking

  • Use PP-YOLOE & OC-SORT to acquire single-camera multi-object tracking trajectory
  • Extract features for each frame using ReID (StrongBaseline network).
  • Match multi-camera trajectory features to obtain multi-camera tracking results.
  • Refer to Multi-camera & multi-pedestrain tracking for details.

Attribute Recognition

  • Use PP-YOLOE + OC-SORT to track the human body.
  • Use PP-HGNet, PP-LCNet (multi-classification model) to complete the attribute recognition. Main attributes include age, gender, hat, eyes, top and bottom dressing style, backpack.
  • Refer to attribute recognition for details.

Behaviour Recognition:

  • Four behaviour recognition solutions are provided:

    1. Behaviour recognition based on skeletal points, e.g. falling recognition
    1. Behaviour recognition based on image classification, e.g. phone call recognition
    1. Behaviour recognition based on detection, e.g. smoking recognition
    1. Behaviour recognition based on Video classification, e.g. fighting recognition
  • For details, please refer to Behaviour Recognition