Music Generation

Music Generation Using MusicGen and Audiocraft

Uses Streamlit as the GUI.

Meta recently released MusicGen.
It can generate short new pieces of music based on text prompts,
which can optionally be aligned to an existing melody.

MusicGen is based on a Transformer model.
MusicGen predicts the next section in a music sequence.

The researchers decompose the audio data into smaller components
using Meta's EnCodec audio tokenizer.
MusicGen is a single-stage model that processes tokens in parallel.
It is fast and efficient but does require a bit of VRAM to run,
currently around 16 GB for the smallest model.
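
To make the parallel token processing more concrete, here is a minimal sketch of a codebook "delay" interleaving, in which codebook k lags k steps behind codebook 0 so that each decoding step emits one token per codebook. It assumes this is the pattern MusicGen uses, and the defaults of 4 codebooks at 50 frames per second are taken from the published MusicGen model description rather than from this repo. The function names are hypothetical.

```python
from typing import List, Optional


def delay_pattern(num_frames: int, num_codebooks: int) -> List[List[Optional[int]]]:
    """For each decoding step, list the frame index each codebook emits.

    None marks a padding slot where a codebook has no frame to emit yet
    (at the start) or has already finished (at the end).
    """
    steps = num_frames + num_codebooks - 1
    pattern = []
    for step in range(steps):
        row = []
        for k in range(num_codebooks):
            frame = step - k  # codebook k lags k steps behind codebook 0
            row.append(frame if 0 <= frame < num_frames else None)
        pattern.append(row)
    return pattern


def decoding_steps(seconds: float, frame_rate: int = 50, num_codebooks: int = 4) -> int:
    """Number of autoregressive steps needed under the delay pattern.

    frame_rate and num_codebooks are assumed defaults, not values read
    from a loaded model.
    """
    num_frames = int(seconds * frame_rate)
    return num_frames + num_codebooks - 1


if __name__ == "__main__":
    for row in delay_pattern(num_frames=3, num_codebooks=2):
        print(row)
    print(decoding_steps(10))
```

Under these assumptions, 10 seconds of audio needs 503 decoding steps, whereas emitting the four codebooks one after another would need 2000.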

The researchers used 20k hours of licensed music for training.
In particular, an internal dataset of 10,000 high-quality music tracks,
as well as music data from Shutterstock and Pond5.

A sample output can be found here:
https://github.com/vluz/MusicGeneration/blob/main/20240116-211520.wav
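
For orientation, here is a minimal sketch of how a clip like this could be generated from a text prompt with the Audiocraft Python API. It is not the code in mg.py. The timestamp-style file name is an assumption inferred from the sample file name above. The model name, prompt, duration, and helper names are illustrative choices.

```python
from datetime import datetime
from typing import Optional


def timestamped_stem(now: Optional[datetime] = None) -> str:
    """Build a file stem such as 20240116-211520 (assumed naming scheme)."""
    now = now or datetime.now()
    return now.strftime("%Y%m%d-%H%M%S")


def generate_to_file(prompt: str, duration: int = 8,
                     model_name: str = "facebook/musicgen-small") -> str:
    """Generate one clip from a text prompt and save it as a .wav file."""
    # Imported lazily so the helper above works without audiocraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(model_name)     # downloads weights on first use
    model.set_generation_params(duration=duration)  # clip length in seconds
    wav = model.generate([prompt])                  # tensor of shape [batch, channels, samples]
    # audio_write appends the .wav extension to the stem.
    path = audio_write(timestamped_stem(), wav[0].cpu(), model.sample_rate,
                       strategy="loudness")
    return str(path)


if __name__ == "__main__":
    print(generate_to_file("lo-fi piano with soft drums"))
```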


Open a command prompt and cd to a new directory of your choosing:

(optional; recommended) Create a virtual environment with:

python -m venv "venv"
venv\Scripts\activate

To install do:

git clone https://github.com/vluz/MusicGeneration.git
cd MusicGeneration
pip install -r requirements.txt

Here is a tested set of requirements, updated 11-02-2024:

audiocraft==1.3.0a1
streamlit==1.30.0
torch==2.1.2+cu121
torchaudio==2.1.2+cu121
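
If something misbehaves after installation, it can help to compare the installed versions with the pinned ones. The following is a small sketch using only the standard library. The helper names are hypothetical, and a mismatch does not necessarily mean the app will fail.

```python
from importlib import metadata
from typing import Callable, Dict

# Versions copied from the tested set above.
PINNED = {
    "audiocraft": "1.3.0a1",
    "streamlit": "1.30.0",
    "torch": "2.1.2+cu121",
    "torchaudio": "2.1.2+cu121",
}


def installed_version(package: str) -> str:
    """Return the installed version string, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def check_pins(pinned: Dict[str, str],
               lookup: Callable[[str], str] = installed_version) -> Dict[str, str]:
    """Return {package: found_version} for every package that differs from its pin."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = found
    return mismatches


if __name__ == "__main__":
    for package, found in check_pins(PINNED).items():
        print(f"{package}: pinned {PINNED[package]}, found {found}")
```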

On first run, it may download several models.
The GUI may be blank or unresponsive for the duration of the setup.
Both installing the requirements above and the first run will take quite some time.
Please allow it time to finish.
All runs after the first are faster to load.

To run do:
streamlit run mg.py

The GUI will open in your default browser.


TODO: Take advantage of caching to speed up the app
TODO: Use experimental garbage collect to limit memory use

Note: Do not use this in production; it is untested.