renaming to naive - updating readme
jorshi committed Jul 9, 2021
1 parent 917dc93 commit 8c526f6
Showing 5 changed files with 14 additions and 4 deletions.
14 changes: 12 additions & 2 deletions README.md
@@ -2,7 +2,7 @@
# HEAR 2021 Baseline

A simple DSP-based audio embedding consisting of a Mel-frequency spectrogram followed
-by a random projection. Serves as the baseline model for the HEAR 2021 and implements
+by a random projection. Serves as the naive baseline model for the HEAR 2021 and implements
the [common API](https://neuralaudio.ai/hear2021-holistic-evaluation-of-audio-representations.html#common-api)
required by the competition evaluation.

@@ -26,6 +26,16 @@ git clone https://github.com/neuralaudio/hear-baseline.git
python3 -m pip install ./hear-baseline
```

+### Naive Baseline Model
+The naive baseline model produces log-scaled Mel-frequency spectrograms using a
+256-band Mel filter. Each frame of the spectrogram is then projected to 4096
+dimensions using a random projection matrix. Weights for the projection matrix were
+generated by sampling a normal distribution and are stored in this repository in the
+file `saved_models/naive_baseline.pt`.
+
+Using a random projection is less efficient
+than a CNN but is one of the simplest models to implement from a coding perspective.
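
Editorial sketch (not part of the commit): the pipeline described in the added README section can be illustrated roughly as follows. This is a minimal NumPy sketch, not the repository's implementation, which is built on PyTorch. The FFT size, hop size, sample rate, and log offset are assumptions chosen for illustration, and the projection matrix is freshly sampled here rather than loaded from `saved_models/naive_baseline.pt`. The helper names (`mel_filterbank`, `naive_embedding`) are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular Mel filters with shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge frequencies, evenly spaced on the Mel scale
    hz_points = mel_to_hz(
        np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    )
    lower = hz_points[:-2, None]
    center = hz_points[1:-1, None]
    upper = hz_points[2:, None]
    rising = (fft_freqs[None, :] - lower) / (center - lower)
    falling = (upper - fft_freqs[None, :]) / (upper - center)
    return np.maximum(0.0, np.minimum(rising, falling))


def naive_embedding(audio, projection, n_fft=4096, hop=1024,
                    sample_rate=44100, eps=1e-4):
    """Log-Mel spectrogram frames multiplied by a random projection.

    n_fft, hop, sample_rate and eps are illustrative assumptions.
    """
    n_mels = projection.shape[0]
    n_frames = 1 + (len(audio) - n_fft) // hop
    # Slice the signal into overlapping, Hann-windowed frames
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(n_fft)
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    mel = magnitude @ mel_filterbank(n_mels, n_fft, sample_rate).T
    log_mel = np.log(mel + eps)        # (n_frames, n_mels)
    return log_mel @ projection        # (n_frames, embedding_size)


rng = np.random.default_rng(0)
# 256 Mel bands projected to 4096 dimensions, weights drawn from a normal distribution
projection = rng.standard_normal((256, 4096))
audio = rng.standard_normal(2 * 44100)  # 2 seconds of white noise
embeddings = naive_embedding(audio, projection)
print(embeddings.shape)
```

Each row of `embeddings` corresponds to one spectrogram frame, so the output has one 4096-dimensional vector per frame.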

### Usage

Audio embeddings can be computed using one of two methods: 1)
@@ -38,7 +48,7 @@ import torch
import hearbaseline

# Load model with weights - located in the root directory of this repo
-model = hearbaseline.load_model("./baseline_weights.pt")
+model = hearbaseline.load_model("saved_models/naive_baseline.pt")

# Create a batch of 2 white noise clips that are 2-seconds long
# and compute scene embeddings for each clip
2 changes: 1 addition & 1 deletion hearbaseline/__init__.py
@@ -1 +1 @@
-from .baseline import load_model, get_scene_embeddings, get_timestamp_embeddings
+from .naive import load_model, get_scene_embeddings, get_timestamp_embeddings
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion tests/test_baseline.py
@@ -10,7 +10,7 @@
get_timestamp_embeddings,
)
from hearbaseline.util import frame_audio
-import hearbaseline.baseline as baseline
+import hearbaseline.naive as baseline


torch.backends.cudnn.deterministic = True
