Simple Pooling Front-ends for Efficient Audio Classification

This repository contains the code and models for "Simple Pooling Front-ends for Efficient Audio Classification" [ICASSP 2023].

See usage in main.py:

  • frontend: computes the mel-spectrogram and reduces its temporal dimension (using spectral pooling)
  • backbone: CNN-based model for audio feature extraction (a pre-trained model is available)
  • input: waveform batch of shape [batch, samples]
  • embedding: [batch, T, 512], where T is the time dimension
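
As a rough illustration of the temporal reduction step, the sketch below shows one common form of spectral pooling: transform a sequence to the frequency domain, keep only the lowest-frequency coefficients, and transform back to a shorter sequence. It is written in plain Python for readability and is not taken from this repository; the function names `spectral_pool_1d` and `spectral_pool_time` are hypothetical, and the actual front-end in main.py may differ in its details.

```python
import cmath
import math


def spectral_pool_1d(x, out_len):
    """Shorten a real sequence to out_len samples by truncating its spectrum.

    Illustrative sketch only (an assumption, not the repository's code).
    """
    n = len(x)
    if not 1 <= out_len <= n:
        raise ValueError("out_len must be between 1 and len(x)")
    # Forward real DFT, computing only the low-frequency bins that are kept.
    kept = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(out_len // 2 + 1)
    ]
    pooled = []
    for t in range(out_len):
        acc = kept[0].real
        for k in range(1, len(kept)):
            term = (kept[k] * cmath.exp(2j * math.pi * k * t / out_len)).real
            # For even out_len the highest kept bin has no mirrored partner.
            is_nyquist = out_len % 2 == 0 and k == out_len // 2
            acc += term if is_nyquist else 2 * term
        # Dividing by the original length keeps the signal amplitude unchanged.
        pooled.append(acc / n)
    return pooled


def spectral_pool_time(frames, out_len):
    """Pool a [T][F] spectrogram (list of frames) along time to out_len frames."""
    n_bins = len(frames[0])
    columns = [
        spectral_pool_1d([frame[b] for frame in frames], out_len)
        for b in range(n_bins)
    ]
    return [[columns[b][t] for b in range(n_bins)] for t in range(out_len)]
```

For example, `spectral_pool_1d([3.0] * 8, 4)` returns four values close to 3.0, since a constant signal has only a zero-frequency component. Unlike average or max pooling, this truncation discards only the highest-frequency variation along time, which is why it is a natural fit for shortening a spectrogram before the backbone.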

Citation

@inproceedings{liu2022simple,
  title={Simple Pooling Front-ends For Efficient Audio Classification},
  author={Liu, Xubo and Liu, Haohe and Kong, Qiuqiang and Mei, Xinhao and Plumbley, Mark D and Wang, Wenwu},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year = {2023}
}