RuntimeError: Input tensor has to be 2D. - When using Web GUI demo with own audio(.mp3) #14

Open
Ztfrederickzheng opened this issue Aug 8, 2024 · 2 comments

@Ztfrederickzheng

INFO:werkzeug:127.0.0.1 - - [09/Aug/2024 14:15:00] "POST /upload HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [09/Aug/2024 14:15:00] "GET /uploads/testing2.MP3?latency=320 HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 1498, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 1476, in wsgi_app
    response = self.handle_exception(e)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 909, in uploaded_file
    run(path)
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 836, in run
    action=agent.policy()
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 468, in policy
    feature = self.feature_extractor(self.states.source)
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/demo/app.py", line 100, in __call__
    waveform, sample_rate = convert_waveform(
  File "/home/zheng/fuchengzheng/steamspeech/StreamSpeech/fairseq/fairseq/data/audio/audio_utils.py", line 60, in convert_waveform
    converted, converted_sample_rate = ta_sox.apply_effects_tensor(
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/torchaudio/sox_effects/sox_effects.py", line 156, in apply_effects_tensor
    return sox_ext.apply_effects_tensor(tensor, sample_rate, effects, channels_first)
  File "/home/zheng/anaconda3/envs/streamspeech/lib/python3.10/site-packages/torch/_ops.py", line 1061, in __call__
    return self._op(*args, **(kwargs or {}))
RuntimeError: Input tensor has to be 2D.

@thetushargoyal

hey, were you able to solve this issue?

@annalenahansen

I fixed it by converting the audio to mono and resampling it to 16 kHz with ffmpeg:
ffmpeg -i output.wav -ar 16000 -ac 1 output_mono_16khz.wav
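
For anyone who wants to script the same workaround before uploading to the demo, here is a minimal Python sketch. It only wraps the ffmpeg command above and reads back the WAV header to confirm the result. The function names (`build_ffmpeg_command`, `convert_to_mono_wav`, `wav_format`) are hypothetical helpers, not part of StreamSpeech or fairseq, and the `-y` overwrite flag is an addition not present in the original command. That multi-channel input is what triggers the 2D error is an assumption based on the fix reported here, not something confirmed in the demo code.

```python
import shutil
import subprocess
import wave
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list for a resampled, downmixed conversion."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it exists (added for scripting)
        "-i", str(src),           # input file, e.g. the original .mp3 or .wav
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", str(channels),     # output channel count (1 = mono)
        str(dst),
    ]


def convert_to_mono_wav(src, dst, sample_rate=16000):
    """Run ffmpeg to write a mono WAV at the given sample rate; return its path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return Path(dst)


def wav_format(path):
    """Return (channels, sample_rate) read from a PCM WAV file header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnchannels(), wav.getframerate()
```

A possible use would be `convert_to_mono_wav("testing2.MP3", "testing2_mono_16khz.wav")` followed by `wav_format("testing2_mono_16khz.wav")` to check that the file reports one channel at 16000 Hz before uploading it.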
