RuntimeError: Input tensor has to be 2D. - When using Web GUI demo with own audio(.mp3) #14

Ztfrederickzheng · 2024-08-08T14:20:39Z

INFO:werkzeug:127.0.0.1 - - [09/Aug/2024 14:15:00] "POST /upload HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [09/Aug/2024 14:15:00] "GET /uploads/testing2.MP3?latency=320 HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 1498, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 1476, in wsgi_app
    response = self.handle_exception(e)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 909, in uploaded_file
    run(path)
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 836, in run
    action=agent.policy()
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 468, in policy
    feature = self.feature_extractor(self.states.source)
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 100, in __call__
    waveform, sample_rate = convert_waveform(
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/fairseq/fairseq/data/audio/audio_utils.py", line 60, in convert_waveform
    converted, converted_sample_rate = ta_sox.apply_effects_tensor(
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/torchaudio/sox_effects/sox_effects.py", line 156, in apply_effects_tensor
    return sox_ext.apply_effects_tensor(tensor, sample_rate, effects, channels_first)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/torch/_ops.py", line 1061, in __call__
    return self._op(*args, **(kwargs or {}))
RuntimeError: Input tensor has to be 2D.

thetushargoyal · 2024-08-26T13:54:08Z

hey, were you able to solve this issue?

annalenahansen · 2024-10-20T08:58:43Z

I fixed it by converting the audio to 16 kHz mono with ffmpeg before uploading:
ffmpeg -i output.wav -ar 16000 -ac 1 output_mono_16khz.wav
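
If you want to script that conversion instead of typing it each time, here is a minimal sketch that wraps the same ffmpeg call in Python. The helper names `build_ffmpeg_command` and `convert_to_mono_16k` are made up for illustration and are not part of StreamSpeech. It assumes ffmpeg is installed and on your PATH. The only flag added beyond the command above is `-y`, which overwrites the output file if it already exists.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list for resampling and downmixing.

    Hypothetical helper: mirrors `ffmpeg -i src -ar 16000 -ac 1 dst`.
    """
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", str(src),            # input file (.mp3, .wav, ...)
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", str(channels),      # output channel count (1 = mono)
        str(dst),
    ]


def convert_to_mono_16k(src, dst):
    """Run ffmpeg and return the output path; raises if ffmpeg fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return Path(dst)


if __name__ == "__main__":
    # Example file names only; substitute your own recording.
    print(convert_to_mono_16k("output.wav", "output_mono_16khz.wav"))
```

Then upload the converted file in the Web GUI rather than the original recording.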
