1. State-of-the-Art on Your Own Data

In this module, we'll cover quite a few state-of-the-art computer vision algorithms. One of the really exciting things about computer vision right now is the amount of high-quality, publicly available code. For this part of your assignment, your job is to run one publicly available algorithm on your own video or images. Your deliverable is a short video, posted to YouTube, showing your results. For example, you could shoot your own video, use Mask R-CNN to process each frame, and stitch the results together into a short video.
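The frame-by-frame workflow described above could be sketched roughly as follows. This is a minimal sketch, not a reference implementation: it assumes OpenCV (`cv2`) is installed, and `run_model_on_frame`, `read_frames`, `write_video`, the `mp4v` codec, and the 30 fps default are all illustrative choices that do not come from this repository. Swap the placeholder for a call to whichever model you pick.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def process_frames(frames: Iterable[Frame],
                   process_fn: Callable[[Frame], Frame]) -> Iterator[Frame]:
    """Apply process_fn to every frame, preserving frame order."""
    for frame in frames:
        yield process_fn(frame)


def read_frames(path: str):
    """Yield frames from a video file one at a time (requires OpenCV)."""
    import cv2  # imported lazily so process_frames works without OpenCV

    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file or unreadable video
                break
            yield frame
    finally:
        capture.release()


def write_video(frames, out_path: str, fps: float = 30.0) -> int:
    """Stitch frames into a video file; return the number of frames written."""
    import cv2

    writer = None
    count = 0
    try:
        for frame in frames:
            if writer is None:
                # Size the output from the first frame (assumes all frames match).
                height, width = frame.shape[:2]
                fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumed codec
                writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
            writer.write(frame)
            count += 1
    finally:
        if writer is not None:
            writer.release()
    return count


def run_model_on_frame(frame):
    """Hypothetical placeholder: replace with a call to your chosen model."""
    return frame
```

With a video named `input.mp4` (a hypothetical filename), the pieces chain together as `write_video(process_frames(read_frames("input.mp4"), run_model_on_frame), "output.mp4")`. Because the helpers are generators, only one frame is held in memory at a time.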

A Sample of The Computer Vision State of the Art in 2019

| PROBLEM | PAPER | CODE |
| --- | --- | --- |
| Classification | “ResNet” Deep Residual Learning for Image Recognition | Implemented in keras, pytorch, fastai |
| Detection | RetinaNet: Focal Loss for Dense Object Detection | Part of FAIR’s Detectron |
| | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Part of Tensorflow Object Detection API |
| | SSD: Single Shot MultiBox Detector | Part of Tensorflow Object Detection API |
| | YOLOv3: An Incremental Improvement | CODE |
| Semantic Segmentation | “Deeplab v3” Rethinking Atrous Convolution for Semantic Image Segmentation | CODE |
| Instance Segmentation | Mask R-CNN | CODE |
| Human Pose Estimation | OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields | CODE |
| Hand Pose Estimation | GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB | |
| Face Detection | Selective Refinement Network for High Performance Face Detection | CODE |
| Face Recognition | FaceNet: A Unified Embedding for Face Recognition and Clustering | CODE |
| Tracking | Fast Online Object Tracking and Segmentation: A Unifying Approach | CODE |
| Depth Estimation | Digging Into Self-Supervised Monocular Depth Estimation | CODE |
| Structure from Motion | | opensfm |
| Image Generation | Large Scale GAN Training for High Fidelity Natural Image Synthesis | |
| Face Generation | StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks | CODE |
| Image to Image | Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | CODE |
| Style Transfer | A Closed-form Solution to Photorealistic Image Stylization | CODE |
| Keypoint Detection and Tracking | SuperPoint: Self-Supervised Interest Point Detection and Description | CODE |
| Image Captioning | Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering | CODE |
| Text to Image | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | CODE |

Setup

The Python 3 Anaconda Distribution is the easiest way to get going with the notebooks and code presented here.

(Optional) You may want to create a virtual environment for this repository:

conda create -n cv python=3 
source activate cv

You'll need to install Jupyter Notebook to run the notebooks:

conda install jupyter

# You may also want to install nb_conda (enables some nice things, like changing virtual environments from within the notebook)
conda install nb_conda

This repository requires a few extra packages. You can install them with:

conda install -c pytorch -c fastai fastai
conda install jupyter
conda install -c conda-forge opencv
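After running the install commands, a quick check like the one below can confirm that the packages are visible to the active environment. This is a minimal sketch: `check_modules` is a hypothetical helper, not part of this repository, and the module list assumes the usual import names (`cv2` for the conda `opencv` package, `notebook` for Jupyter Notebook).

```python
import importlib.util

# Assumed import names for the packages installed above.
REQUIRED_MODULES = ("fastai", "cv2", "notebook")


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: bool} saying whether each module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the `cv` environment activated. Anything reported as missing was probably installed into a different environment.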

(Optional) jupyterthemes can be nice when presenting notebooks, as it offers some cleaner visual themes than the stock notebook and makes it easy to adjust the default font size for code, markdown, etc. You can install it with pip:

pip install jupyterthemes

Recommended jupyterthemes settings for presenting these notebooks (type into a terminal before launching the notebook):

jt -t grade3 -cellw=90% -fs=20 -tfs=20 -ofs=20 -dfs=20

Recommended jupyterthemes settings for viewing these notebooks (type into a terminal before launching the notebook):

jt -t grade3 -cellw=90% -fs=14 -tfs=14 -ofs=14 -dfs=14