
Glossary

Mateusz Korzeniowski edited this page Apr 12, 2018 · 3 revisions

Audio representation glossary

  • Audio tag - ID3 tag embedded in audio files (MP3, FLAC etc.; WAV does not support ID3 tags) containing information such as artist name, song title and genre; details: https://en.wikipedia.org/wiki/ID3
  • Audio meta - audio stream metadata; contains information on sample rate, bit depth, frames count, file size etc.; for MP3 files it also contains the bitrate (kbit/s)
  • Sample rate - number of frames per second; see Sampling
  • Audio frame - single measurement of the audio level at one point in time (time domain)
  • Bit depth - number of bits used to store each audio-wave level; standard: 16 bit; see Audio bit depth
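
The terms above can be illustrated with a minimal sketch that uses only Python's standard `wave` module; it is not how Audiopyle itself reads metadata. It writes a short 16-bit mono tone into memory and reads the sample rate, bit depth and frames count back. The helper names (`write_test_tone`, `read_audio_meta`) and the dictionary keys are hypothetical, chosen for this example only.

```python
import io
import math
import struct
import wave


def write_test_tone(buffer, sample_rate=44100, seconds=1.0, freq=440.0):
    """Write a mono 16-bit sine tone into a file-like object (illustration only)."""
    frames_count = int(sample_rate * seconds)
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit depth
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(16000 * math.sin(2 * math.pi * freq * n / sample_rate)))
            for n in range(frames_count)
        )
        wav.writeframes(frames)


def read_audio_meta(buffer):
    """Collect basic stream metadata; the key names are hypothetical."""
    with wave.open(buffer, "rb") as wav:
        sample_rate = wav.getframerate()
        frames_count = wav.getnframes()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": sample_rate,            # frames per second
            "bit_depth": wav.getsampwidth() * 8,   # bytes per sample -> bits
            "frames_count": frames_count,
            "length_sec": frames_count / sample_rate,
        }


buffer = io.BytesIO()
write_test_tone(buffer)
buffer.seek(0)
meta = read_audio_meta(buffer)
print(meta)
```

Dividing the frames count by the sample rate gives the length of the audio in seconds, which is why these two values appear together in the metadata.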

VAMP plugin glossary

  • VAMP plugin - system library that does the actual feature extraction; see Vamp plugins page; each plugin has a single set of parameters defined
  • VAMP plugin key - string in format "plugin_vendor:plugin_name"; a single plugin can have multiple outputs
  • VAMP plugin full key - Audiopyle-specific string in format "plugin_vendor:plugin_name:plugin_output"; analysis can be done only using this combination, so from the Audiopyle perspective a single analysis consists of a full key, an audio file name and a plugin config
  • Plugin config - set of parameters passed to the vampy library and further to the VAMP plugin; contains two mandatory parameters, block_size and step_size, each of which falls back to the plugin's preferred value when set to 0
  • Block size / window size - size of a block (in frames) analyzed by the VAMP plugin in a single step
  • Block size increment / window increment - number of frames by which the block is advanced between steps
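
As a rough illustration of the key format and of how block size and increment interact, here is a minimal sketch in plain Python. It does not call any VAMP library. The function names are hypothetical, the key used is only an example, and the sketch assumes that step_size plays the role of the block increment and that the last block is simply cut short at the end of the audio (a real host may pad it instead).

```python
def split_full_key(full_key):
    """Split 'plugin_vendor:plugin_name:plugin_output' into plugin key and output."""
    parts = full_key.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected 'plugin_vendor:plugin_name:plugin_output'")
    vendor, name, output = parts
    return {"plugin_key": f"{vendor}:{name}", "output": output}


def block_ranges(frames_count, block_size, step_size):
    """Yield (start, end) frame indices of each analyzed block.

    Assumption for this sketch: the final blocks are truncated at the
    end of the audio rather than padded.
    """
    if block_size <= 0 or step_size <= 0:
        raise ValueError("block_size and step_size must be positive here")
    for start in range(0, frames_count, step_size):
        yield start, min(start + block_size, frames_count)


key = split_full_key("vamp-example-plugins:amplitudefollower:amplitude")
blocks = list(block_ranges(frames_count=10, block_size=4, step_size=2))
print(key)
print(blocks)
```

With a block size of 4 and an increment of 2, consecutive blocks overlap by two frames; setting both to the same value would give blocks that do not overlap.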

VAMP feature glossary

  • Audio feature - raw feature from VAMP plugin output, wrapped in audiopyle-specific class abstraction
  • Variable step feature - one of two feature types from VAMP plugin output; a list of dictionaries, each with a timestamp, a value and, optionally, a label
  • Constant step feature - the second feature type; a dictionary containing the feature_step value (seconds between subsequent measures) and a vector or matrix of measured values
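
The difference between the two feature types can be sketched with plain data classes. These are not Audiopyle's actual classes: the class and field names below are hypothetical (only feature_step comes from the description above), and the sketch assumes that the first measure of a constant step feature sits at time 0.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VariableStepItem:
    """One entry of a variable step feature (hypothetical shape)."""
    timestamp: float              # seconds from the start of the audio
    value: List[float]
    label: Optional[str] = None   # labels are optional


@dataclass
class ConstantStepFeature:
    """Constant step feature: one step value plus a vector of measures."""
    feature_step: float           # seconds between subsequent measures
    values: List[float]

    def timestamps(self):
        # Assumption: the first measure is taken at time 0.
        return [i * self.feature_step for i in range(len(self.values))]


def to_variable_step(feature):
    """Expand a constant step feature into explicit (timestamp, value) items."""
    return [
        VariableStepItem(timestamp=t, value=[v])
        for t, v in zip(feature.timestamps(), feature.values)
    ]


constant = ConstantStepFeature(feature_step=0.5, values=[1.0, 2.0, 3.0])
items = to_variable_step(constant)
print(constant.timestamps())
print(items)
```

A constant step feature stores the spacing once, so timestamps can always be recomputed; a variable step feature has to carry a timestamp with every value because the spacing is not fixed.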

Signal processing glossary

  • Constant-Q, constant-Q transform - transforms a signal data series to the frequency domain; think of it as the fancy frequency bar charts in your music player
  • Spectrogram - chart showing how the volume of each frequency in a song changes over time (time, frequency and volume axes)
  • RMS - Root-Mean-Square, the effective value of the whole waveform: the square root of the mean of the squared sample values. Compare with the peak value, which is the highest level the waveform ever reaches, like the peak is the highest point on a mountain.
  • BPM - beats per minute, used to express song tempo
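  • tuning frequency - reference frequency to which the music is tuned, usually given as the frequency of the note A4 (concert pitch, commonly 440 Hz)
  • chromagram - chart showing how the energy in each of the 12 pitch classes (C, C#, D, ..., B) changes over time, regardless of octave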
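
To make the difference between peak and RMS concrete, here is a minimal sketch in plain Python that computes both for a full-scale sine wave. The function names and the test signal are chosen for this example only.

```python
import math


def peak(samples):
    """Highest absolute level the waveform reaches."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root mean square: square root of the mean of the squared samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# One second of a 100 Hz sine at 8000 frames per second (100 full cycles).
sample_rate = 8000
sine = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]

sine_peak = peak(sine)
sine_rms = rms(sine)
print(sine_peak, sine_rms)
```

For a sine wave the RMS is the peak divided by the square root of 2 (about 0.707 of the peak), which is why a signal can have a high peak and still a modest effective level.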

Toolbox:
