I'm trying to manually implement this in another language. Can you confirm this codebase is the one that produced the results in "Speech Denoising without Clean Training Data: a Noise2Noise Approach"? There are a few discrepancies I've noticed so far:
A model complexity of (45//1.414) would seem to result in 31 encoder channels (and 62 for the deeper layers), rather than the 32 described.
The complex batchnorm module seems to apply batchnorm separately to the real and imaginary components of the complex number, rather than using the whitening approach described in "Deep Complex Networks".
Similarly, the masking process seems to multiply the real and imaginary components of the spectrogram separately, rather than using complex multiplication.
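To make the three points concrete, here is a minimal plain-Python sketch of the difference in each case. It is not code from this repository: the function names are hypothetical, the learned scale and shift of batchnorm are omitted, and the doubling of channels in deeper layers is an assumption taken from the 31/62 figures above.

```python
import math

# Point 1: channel count implied by a model complexity of 45.
base_channels = int(45 // 1.414)    # 45 / 1.414 is about 31.82, floored to 31
deep_channels = 2 * base_channels   # assumption: deeper layers double the width

# Point 2: two ways to normalize a batch of complex values.
def naive_complex_bn(zs, eps=1e-5):
    """Standardize the real and imaginary parts independently."""
    def standardize(xs):
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        return [(x - mean) / math.sqrt(var + eps) for x in xs]
    re = standardize([z.real for z in zs])
    im = standardize([z.imag for z in zs])
    return [complex(r, i) for r, i in zip(re, im)]

def whitening_complex_bn(zs, eps=1e-5):
    """Multiply the centered (re, im) vector by the inverse square root
    of its 2x2 covariance matrix, which also removes re/im correlation."""
    n = len(zs)
    mean = sum(zs) / n
    re = [z.real - mean.real for z in zs]
    im = [z.imag - mean.imag for z in zs]
    vrr = sum(r * r for r in re) / n + eps
    vii = sum(i * i for i in im) / n + eps
    vri = sum(r * i for r, i in zip(re, im)) / n
    s = math.sqrt(vrr * vii - vri * vri)   # sqrt of the determinant
    t = math.sqrt(vrr + vii + 2 * s)       # sqrt of (trace + 2s)
    wrr, wii, wri = (vii + s) / (s * t), (vrr + s) / (s * t), -vri / (s * t)
    return [complex(wrr * r + wri * i, wri * r + wii * i)
            for r, i in zip(re, im)]

# Point 3: two ways to apply a complex mask to one spectrogram bin.
def mask_componentwise(mask, spec):
    """Multiply real by real and imaginary by imaginary."""
    return complex(mask.real * spec.real, mask.imag * spec.imag)

def mask_complex(mask, spec):
    """True complex multiplication (scales magnitude and rotates phase)."""
    return mask * spec
```

For example, with a mask of `0.5+0.5j` and a spectrogram value of `2+4j`, the component-wise product is `1+2j` while the complex product is `-1+3j`, so the two masking schemes are not interchangeable.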
I appreciate any insight you may have about these points.