Hi @cschaefer26,
Nice work! I'm using your repo. However, when aligning longer audio (> 1 minute) with its character (phone) sequence at inference time, the number of predicted values in the duration file (.npy) does not match the number of characters (phones) I pass in along with the audio file. What could be the problem here? I want to use a pretrained model (trained on a Bangla dataset of audio and phoneme sequences) for phoneme duration prediction, so accuracy is a major concern for me. A minimal sanity check I run after extraction is shown below.
Note: for training I used longer audio files of 10-15 seconds with their corresponding transcriptions (phoneme sequences), and I customized your code (preprocess.py and extract_durations.py) to run inference on a single audio file and its transcription.
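For reference, this is roughly the check I use to compare the predicted durations with the input phonemes; the file paths are placeholders for my own data layout, and I assume the phoneme sequence is stored space-separated in a text file:

```python
import numpy as np

# Placeholder paths, adjust to your own data layout.
duration_path = 'durations/sample_0001.npy'
phoneme_path = 'phonemes/sample_0001.txt'

durations = np.load(duration_path)  # one value per phoneme
with open(phoneme_path, encoding='utf-8') as f:
    phonemes = f.read().split()     # assumes space-separated phonemes

print(f'phonemes: {len(phonemes)}, durations: {len(durations)}')
if len(phonemes) != len(durations):
    print('Mismatch: some input symbols were probably dropped during preprocessing.')
```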
Hi, did you ensure that all the audio files were preprocessed before training? Because the preprocessing builds up a phoneme set from the training data, I'd suspect that you are applying the model to new files containing unknown phonemes that get filtered out (that's just a guess).
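If it helps, a quick check could look something like this; the pickle path is just an example, it depends on how you stored the phoneme set during preprocessing:

```python
import pickle

# Example only: load whatever phoneme/symbol set your preprocessing step produced.
with open('data/phoneme_set.pkl', 'rb') as f:
    known_phonemes = set(pickle.load(f))

# Phoneme sequence of the new utterance you want to align (space-separated text file).
with open('phonemes/new_utterance.txt', encoding='utf-8') as f:
    new_phonemes = f.read().split()

unknown = sorted({p for p in new_phonemes if p not in known_phonemes})
print('symbols that would be filtered out:', unknown)
```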
Hi @cschaefer26, your guess is correct: I applied the model to new files containing unknown phonemes. Thanks for your reply. However, when I align an audio file that contains intermediate silences (which are inherent to the recording) with its phoneme sequence, the accuracy of the predicted phone durations is quite low, because the intermediate silence gets merged into the durations of the neighbouring phones. Any suggestions, please?
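For context, I am currently experimenting with detecting the silent spans up front, roughly like this; the top_db value, the hop length, and the idea of later inserting a dedicated pause token into the phoneme sequence are just my own guesses:

```python
import librosa

# Rough sketch: find the non-silent spans so the silences can be handled explicitly,
# e.g. by inserting a pause token into the phoneme sequence at the corresponding frames.
wav, sr = librosa.load('audio/new_utterance.wav', sr=22050)
intervals = librosa.effects.split(wav, top_db=40)  # (start, end) sample indices of speech

hop_length = 256  # should match the hop length used for mel extraction
for start, end in intervals:
    print(f'speech from frame {start // hop_length} to {end // hop_length}')
```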