DANCING-AI

  1. Extract pose coordinates from dance videos using OpenPose human pose estimation.
  2. Train an LSTM network on the extracted coordinates, with the song audio as input and the coordinates as output.
  3. Use the trained LSTM to predict dance coordinates for the rest of the song (95% of the audio is used for training and the remaining 5% for prediction).
  4. Render output videos by joining the predicted coordinates into dancing human stick figures.
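
The 95% / 5% split in step 3 can be sketched as follows. This is a minimal illustration, not the code from main.py: the function name `split_train_predict`, the list-of-frames layout, and the toy data are all assumptions. It only shows the idea of a chronological split, where the first part of the song is used for training and the tail is held out for prediction.

```python
def split_train_predict(audio_features, pose_coords, train_fraction=0.95):
    """Split frame-aligned sequences chronologically, without shuffling.

    Hypothetical helper: both inputs are assumed to hold one entry per frame.
    """
    if len(audio_features) != len(pose_coords):
        raise ValueError("audio features and pose coordinates must have one entry per frame")
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")

    # Index of the first held-out frame.
    cut = round(len(audio_features) * train_fraction)

    # The first part of the song supplies (input, target) pairs for training.
    train = (audio_features[:cut], pose_coords[:cut])
    # For the tail, only the audio is kept: the network predicts the coordinates.
    predict_audio = audio_features[cut:]
    return train, predict_audio


# Toy data: 200 frames, 1 audio feature and one (x, y) joint per frame.
frames = 200
audio_features = [[float(i)] for i in range(frames)]
pose_coords = [[float(i), float(i) + 1.0] for i in range(frames)]

(train_x, train_y), predict_x = split_train_predict(audio_features, pose_coords)
print(len(train_x), len(predict_x))
```

Keeping the split chronological matters here because the held-out 5% is meant to be "the rest of the song"; a shuffled split would mix future frames into training.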

Requirements

   opencv-contrib-python==4.7.0.72
   pandas==2.0.1
   librosa==0.10.0.post2
   moviepy==1.0.3
   yt-dlp==2023.3.4
   tensorflow==2.12.0
   keras==2.12.0

Training/Demo

  1. Run get_data.py to download videos and audio into the data folder. You can add YouTube video links to the "video_links.txt" file for downloading. Alternatively, copy videos ('.mp4' format) and audio ('.wav' format) directly into the data folder.
  2. Download the pretrained weights for pose estimation from here. Download pose_iter_440000.caffemodel and save it in the "models" folder.
  3. Run main.py to train the LSTM and display the predicted dance video.
 python main.py --video "path to input video" --audio "path to input audio" --background "path to background image" --display
 Example - python main.py --video data/0.mp4 --audio data/0.wav --background inputs/bg0.jpg --display
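
For readers who want to see how the flags in the command above fit together, here is a minimal `argparse` sketch. It is not the parser from main.py: the flag names come from the command line shown above, but which flags are required, their defaults, and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the example command."""
    parser = argparse.ArgumentParser(
        description="Train the LSTM and render the predicted dance video."
    )
    # Assumed required: training needs both the dance video and its audio.
    parser.add_argument("--video", required=True, help="path to input video (.mp4)")
    parser.add_argument("--audio", required=True, help="path to input audio (.wav)")
    # Assumed optional: a background image for the stick-figure output.
    parser.add_argument("--background", default=None, help="path to background image")
    # Boolean switch: present means show the predicted video.
    parser.add_argument("--display", action="store_true", help="display the predicted video")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--video", "data/0.mp4", "--audio", "data/0.wav",
     "--background", "inputs/bg0.jpg", "--display"]
)
print(args.video, args.audio, args.background, args.display)
```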

   Note: If your GPU has 3 GB of RAM or less, reduce the memory limit in this line to a value below your GPU RAM.

Pose estimation using OpenPose

Predictions

References

  1. https://www.learnopencv.com/deep-learning-based-human-pose-estimation-using-opencv-cpp-python/
  2. https://github.com/CMU-Perceptual-Computing-Lab/openpose
  3. https://python-pytube.readthedocs.io/en/latest/
  4. https://zulko.github.io/moviepy/
  5. https://librosa.org/librosa/
  6. https://www.youtube.com/channel/UCX9y7I0jT4Q5pwYvNrcHI_Q