The sample level prediction function could be incorrect? (correct me if I'm wrong) #18

guozixunnicolas · 2019-11-03T06:26:40Z

Hi there,

Thank you for your work! It has been a lot of help.

But I think this code differs from the original paper and the original Theano implementation in a way that may lead to errors. In the original paper and code, sample-level prediction partitions the input samples into overlapping frames of length frame_size. For example, if seq_input has shape (batch, seq_len), the sample-level input consists of seq_input[:, 0:frame_size], seq_input[:, 1:frame_size+1], seq_input[:, 2:frame_size+2], and so on. As a result, the sample-level input has shape [total_number_of_overlapping_frames (batch*seq_len), frame_size]. In the original Theano implementation, the function images2neibs does this work; you can find it here: https://github.com/soroushmehr/sampleRNN_ICLR2017/blob/2a3dbdf9eb00f03e64adf58e6780e2a48b9ff6dc/models/two_tier/two_tier.py#L394
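To make the framing concrete, here is a minimal, framework-agnostic sketch of what I mean by overlapping frames. It uses plain Python lists rather than tensors, and the helper name `overlapping_frames` is my own (hypothetical); it is not a function from this repo or from the Theano code. Note that a row of length T only yields T - frame_size + 1 full windows, so getting exactly seq_len frames per row would require frame_size - 1 extra context samples; how that padding is handled is an assumption I have left out here.

```python
def overlapping_frames(seq_input, frame_size):
    """Return every stride-1 window of length frame_size, row by row.

    seq_input: list of equal-length rows, i.e. shape (batch, T).
    Result: list of frames, shape (batch * (T - frame_size + 1), frame_size).
    Hypothetical helper for illustration only.
    """
    if frame_size < 1:
        raise ValueError("frame_size must be positive")
    frames = []
    for row in seq_input:
        # Number of full windows that fit in this row.
        n_windows = len(row) - frame_size + 1
        for start in range(n_windows):
            # Equivalent to seq_input[:, start:start + frame_size] for one row.
            frames.append(list(row[start:start + frame_size]))
    return frames


batch = [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]]
frames = overlapping_frames(batch, frame_size=3)
for frame in frames:
    print(frame)
```

Each consecutive frame is shifted by one sample, so neighbouring frames share frame_size - 1 samples. With non-overlapping frames (a stride of frame_size) the result would have far fewer rows, which is the discrepancy I am asking about.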

I am not sure whether this has been implemented in the sample_level_prediction function. I found this issue because I cannot generate useful audio when frame_size is anything other than 2.

Also, please don't hesitate to correct me if I am wrong somewhere.

Best regards,

Nic
