@misc{CV2018,
author = {Donny You},
howpublished = {\url{https://github.com/donnyyou/PyTorchCV}},
year = {2018}
}
This repository provides source code for several deep-learning-based computer vision (CV) problems. We will do our best to keep it up to date. If you find a problem with this repository, please open an issue and we will fix it promptly.
- Image Classification:
- VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
- ResNet: Deep Residual Learning for Image Recognition
- DenseNet: Densely Connected Convolutional Networks
- MobileNetV2: Inverted Residuals and Linear Bottlenecks
- ResNeXt: Aggregated Residual Transformations for Deep Neural Networks
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
- Semantic Segmentation:
- DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
- PSPNet: Pyramid Scene Parsing Network
- DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
- Object Detection:
- SSD: Single Shot MultiBox Detector
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- YOLOv3: An Incremental Improvement
- FPN: Feature Pyramid Networks for Object Detection
- Pose Estimation:
- CPM: Convolutional Pose Machines
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
- Instance Segmentation:
- Mask R-CNN
- ResNet: Deep Residual Learning for Image Recognition
- Cityscapes (Single Scale Whole Image Test): Base LR 0.01, Crop Size 769

| Checkpoints | Backbone | Train | Test | mIOU | Batch Size | Iters | Scripts |
|---|---|---|---|---|---|---|---|
| PSPNet | 3x3-Res101 | train | val | - | 8 | 40k | PSPNet |
| DeepLabV3 | 3x3-Res101 | train | val | - | 8 | 40k | DeepLabV3 |
- ADE20K (Single Scale Whole Image Test): Base LR 0.02, Crop Size 520

| Checkpoints | Backbone | Train | Test | mIOU | PixelACC | Batch Size | Iters | Scripts |
|---|---|---|---|---|---|---|---|---|
| PSPNet | 3x3-Res50 | train | val | - | - | 16 | 150k | PSPNet |
| DeepLabV3 | 3x3-Res50 | train | val | - | - | 16 | 150k | DeepLabV3 |
| PSPNet | 3x3-Res101 | train | val | - | - | 16 | 150k | PSPNet |
| DeepLabV3 | 3x3-Res101 | train | val | - | - | 16 | 150k | DeepLabV3 |
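
The mIOU and PixelACC columns above refer to the usual segmentation metrics. As a rough illustration of how they are commonly derived from a class confusion matrix, here is a minimal pure-Python sketch. The function name `segmentation_metrics` and the matrix layout are assumptions made for this example; this is not the repository's evaluation code.

```python
def segmentation_metrics(confusion):
    """Return (mean IoU, pixel accuracy) from a square confusion matrix.

    Assumed layout: confusion[i][j] counts pixels whose ground-truth
    class is i and whose predicted class is j.
    """
    num_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(num_classes))

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        gt = sum(confusion[c])                    # pixels labelled as class c
        pred = sum(row[c] for row in confusion)   # pixels predicted as class c
        union = gt + pred - tp
        if union > 0:                             # skip classes that never occur
            ious.append(tp / union)

    mean_iou = sum(ious) / len(ious) if ious else 0.0
    pixel_acc = correct / total if total else 0.0
    return mean_iou, pixel_acc


# Toy two-class example: 8 pixels, 6 of them classified correctly.
miou, acc = segmentation_metrics([[3, 1], [1, 3]])
print(miou, acc)
```

Classes that appear in neither the ground truth nor the predictions are skipped here; other implementations may handle them differently.
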
- SSD: Single Shot MultiBox Detector

| Model | Backbone | Training data | Testing data | mAP | FPS | Setting |
|---|---|---|---|---|---|---|
| SSD-300 (original) | VGG16 | VOC07+12 trainval | VOC07 test | 0.772 | - | - |
| SSD-300 (ours) | VGG16 | VOC07+12 trainval | VOC07 test | 0.786 | - | SSD300 |
| SSD-512 (original) | VGG16 | VOC07+12 trainval | VOC07 test | 0.798 | - | - |
| SSD-512 (ours) | VGG16 | VOC07+12 trainval | VOC07 test | 0.808 | - | SSD512 |
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

| Model | Backbone | Training data | Testing data | mAP | FPS | Setting |
|---|---|---|---|---|---|---|
| Faster R-CNN (original) | VGG16 | VOC07 trainval | VOC07 test | 0.699 | - | - |
| Faster R-CNN (ours) | VGG16 | VOC07 trainval | VOC07 test | 0.706 | - | Faster R-CNN |
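
The tables above do not state which evaluation protocol produced the mAP values. Assuming the 11-point interpolated average precision commonly used with the VOC07 test set, a per-class AP could be sketched as follows; mAP would then be the mean of this value over all classes. The function name `voc07_average_precision` is hypothetical and this is not the repository's evaluation code.

```python
def voc07_average_precision(recalls, precisions):
    """11-point interpolated average precision for one class.

    recalls and precisions are parallel lists holding the cumulative
    recall and precision after each detection, with detections sorted
    by descending confidence.
    """
    ap = 0.0
    for i in range(11):
        threshold = i / 10.0
        # Interpolated precision: best precision at any recall >= threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0


# One detection that is correct and recovers half of the ground-truth boxes.
print(voc07_average_precision([0.5], [1.0]))
```
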
- YOLOv3: An Incremental Improvement
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
- Mask R-CNN
Take OpenPose as an example.
- Train the OpenPose model:

      python main.py --hypes hypes/pose/coco/op_coco_pose.json \
                     --base_lr 0.001 \
                     --phase train \
                     --gpu 0 1
- Fine-tune the OpenPose model (resume from a checkpoint):

      python main.py --hypes hypes/pose/coco/op_coco_pose.json \
                     --base_lr 0.001 \
                     --phase train \
                     --resume checkpoints/pose/coco/coco_open_pose_65000.pth \
                     --gpu 0 1
- Test the OpenPose model on a single image (test_img):

      python main.py --hypes hypes/pose/coco/op_coco_pose.json \
                     --phase test \
                     --resume checkpoints/pose/coco/coco_open_pose_65000.pth \
                     --test_img val/samples/ski.jpg \
                     --gpu 0
- Test the OpenPose model on a directory of images (test_dir):

      python main.py --hypes hypes/pose/coco/op_coco_pose.json \
                     --phase test \
                     --resume checkpoints/pose/coco/coco_open_pose_65000.pth \
                     --test_dir val/samples \
                     --gpu 0
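
If you want to launch these runs from a script instead of typing them by hand, the single-image test command can be assembled programmatically. The sketch below only uses the flags shown above; the helper name `build_test_command` is hypothetical and not part of the repository.

```python
import shlex


def build_test_command(hypes, checkpoint, image, gpus=(0,)):
    """Assemble the argv list for a single-image test run of main.py."""
    return [
        "python", "main.py",
        "--hypes", hypes,
        "--phase", "test",
        "--resume", checkpoint,
        "--test_img", image,
        "--gpu", *[str(g) for g in gpus],
    ]


cmd = build_test_command(
    "hypes/pose/coco/op_coco_pose.json",
    "checkpoints/pose/coco/coco_open_pose_65000.pth",
    "val/samples/ski.jpg",
)
# Print the command as a shell-quoted string for inspection.
print(shlex.join(cmd))
# To actually run it from the repository root:
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.
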