Some questions about training #8
Comments
The first error sounds like some sort of hardware, driver, or PyTorch error. It is probably unrelated to the code of this repository - maybe check your CUDA and PyTorch installations.
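One quick way to rule the environment in or out (this snippet is not from the repository, just a generic sanity check) is to run a tiny backward pass on the GPU outside of the training code:

```python
import torch

print(torch.__version__)          # PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built against
print(torch.cuda.is_available())  # should be True

# Run a tiny backward pass on the GPU; if this alone raises a CUDA error,
# the problem is the driver/CUDA/PyTorch setup, not this repository's code.
x = torch.randn(8, 8, device='cuda', requires_grad=True)
(x @ x).sum().backward()
print('GPU backward pass OK')
```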
It seems you loaded the wrong model; the output dimension should be 80, which matches num_speech_features.
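For reference, one way to check which checkpoint you actually loaded is to print the parameter shapes and confirm the final projection has 80 output features. This is only an illustrative sketch: the checkpoint path is hypothetical, and it assumes the file holds a plain state dict.

```python
import torch

# Hypothetical checkpoint path; substitute the file you are actually loading.
state_dict = torch.load('model.pt', map_location='cpu')

# The final projection's weight should have 80 output rows,
# matching num_speech_features.
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```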
I encountered the problem when I reproduced normalizers.pkl by running make_normalizers() in read_emg.py. Naturally, doing so resulted in a pkl that differs from the original file in the repository. Do you know why this is? Thanks for your contribution! @dgaddy
It's been quite a while so I don't really remember, but it's possible I manually adjusted the normalizers to scale down the size of the inputs or outputs. Sometimes larger values for inputs or outputs can make training less stable. You could try adjusting them and see if that helps. (Inputs seem more likely to help. You would want to increase the normalizer feature_stddevs values to decrease the feature scales. Multiplying by something like 2 or 5 seems reasonable. It might also help to compare the values in your normalizers file against the one in the repository.)
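To make that concrete, here is a rough sketch (not code from the repository; the file names, the (mfcc_norm, emg_norm) tuple order, and the feature_stddevs attribute are assumptions based on the comment above) of how you might compare and rescale the normalizers:

```python
import pickle
import numpy as np

# Load your regenerated normalizers and the repository's copy for comparison.
# 'normalizers.pkl' / 'normalizers_original.pkl' are hypothetical paths;
# adjust to whatever make_normalizers() actually wrote.
with open('normalizers.pkl', 'rb') as f:
    mfcc_norm, emg_norm = pickle.load(f)
with open('normalizers_original.pkl', 'rb') as f:
    mfcc_norm_ref, emg_norm_ref = pickle.load(f)

print('EMG stddev ratio (yours / repo):',
      np.asarray(emg_norm.feature_stddevs) / np.asarray(emg_norm_ref.feature_stddevs))

# Scaling the input stddevs up by 2x-5x scales the normalized EMG features down,
# which can make training more stable.
emg_norm.feature_stddevs = np.asarray(emg_norm.feature_stddevs) * 2

with open('normalizers.pkl', 'wb') as f:
    pickle.dump((mfcc_norm, emg_norm), f)
```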
Epoch 1, Batch 3, Loss: 7.225614070892334
Train step: 2it [00:05, 2.95s/it]
Traceback (most recent call last):
File "/mnt/e/code/silent_speech/transduction_model.py", line 365, in
main()
File "/mnt/e/code/silent_speech/transduction_model.py", line 361, in main
model = train_model(trainset, devset, device, save_sound_outputs=save_sound_outputs)
File "/mnt/e/code/silent_speech/transduction_model.py", line 260, in train_model
loss.backward()  # backpropagation
File "/home/ffy/anaconda3/envs/ffy112/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/home/ffy/anaconda3/envs/ffy112/lib/python3.9/site-packages/torch/autograd/init.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: unknown error
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

What problem did I encounter here? I lowered the batch size, but that didn't help and the error still occurred.