Skip to content

Synthesis of percussion sounds using sinusoidal modelling, DDSP noise synthesis, and a neural source filter approach.

Notifications You must be signed in to change notification settings

jorshi/drumblender

Repository files navigation

Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis

Demo Paper

Jordie Shier1, Franco Caspe1, Andrew Robertson2,
Mark Sandler1, Charalampos Saitis1, and Andrew McPherson3

1 Centre for Digital Music, Queen Mary University of London
2 Ableton AG
3 Dyson School of Design Engineering, Imperial College London

Abstract

Differentiable digital signal processing (DDSP) techniques, including methods for audio synthesis, have gained attention in recent years and lend themselves to interpretability in the parameter space. However, current differentiable synthesis methods have not explicitly sought to model the transient portion of signals, which is important for percussive sounds. In this work, we present a unified synthesis framework aiming to address transient generation and percussive synthesis within a DDSP framework. To this end, we propose a model for percussive synthesis that builds on sinusoidal modeling synthesis and incorporates a modulated temporal convolutional network for transient generation. We use a modified sinusoidal peak picking algorithm to generate time-varying non-harmonic sinusoids and pair it with differentiable noise and transient encoders that are jointly trained to reconstruct drumset sounds. We compute a set of reconstruction metrics using a large dataset of acoustic and electronic percussion samples that show that our method leads to improved onset signal reconstruction for membranophone percussion instruments.

Install

Clone the repo and then install the drumblender package. Requires Python version 3.9 or greater.

pip install --upgrade pip
pip install -e ".[modal]"

If you don't need to run modal extraction on audio, for example, if using a pre-processed dataset for training. You can exclude the optional modal requirements. The difference is that modal includes nn-audio which depends on an older version of numpy.

pip install -e .

Pre-trained Models

A set of pre-trained models are available in a submodule, to download:

git submodule update --init --recursive

Inference

To resynthesise an audio sample using a trained model:

drumblender-synth config checkpoint input output

For example:

drumblender-synth models/forum-acusticum-2023/noise_parallel_transient_params.yaml models/forum-acusticum-2023/noise_parallel_transient_params.ckpt audio/a_snare.wav snare_resynth.wav

Dataset

We used a private dataset of commercially recorded and produced one-shot acoustic and electronic drum samples. We unfortunately can't share those, but we've included a config using Freesound One-Shot Percussive Sounds to provide an example of dataset configuration and training.

To download a pre-processed version of that dataset:

wget https://pub-814e66019388451395cf43c0b6f10300.r2.dev/drumblender-freesound-v0.tar.gz
mkdir -p dataset/freesound
tar -zxf drumblender-freesound-v0.tar.gz -C dataset/freesound

Training

Use the drumblender command to train a new model.

drumblender fit -c cfg/01_noise_params.yaml --data cfg/data/freesound.yaml

Training config files can be found in the directory cfg. The configuration files in the root of that directory are the configurations used to train and test the different model configurations presented in the Forum Acusticum paper.

We used the PyTorch Lightning LightningCLI.

Testing

Pass test as an argument to drumblender to test a trained model. For example, to test a model on the test set of the Freesound Percussive One-Shot dataset:

drumblender test -c models/forum-acusticum-2023/noise_parallel_transient_params.yaml --ckpt models/forum-acusticum-2023/noise_parallel_transient_params.ckpt --data cfg/data/freesound.yaml --trainer.logger CSVLogger --model.test_metrics cfg/metrics/drumblender_metrics.yaml

The --trainer.logger argument overrides the logging configuration in the saved yaml file. --model.test_metrics adds extra metrics used in the evaluation for the paper.

To run this command on a CPU you can add the argument --trainer.accelerator cpu

System Information

For Forum Acusticum 2023, experiments were run using

  • Python version: 3.10.10.
  • GPU: Tesla V100-PCIE-16GB

The exact Python packages in the environment during model training are in train-packages.txt.

For Developers

To install dev requirements and pre-commit hooks:

$ pip install -e ".[dev]"

Install pre-commit hooks if developing and contributing:

$ pre-commit install

About

Synthesis of percussion sounds using sinusoidal modelling, DDSP noise synthesis, and a neural source filter approach.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published