This repository has been archived by the owner on Nov 1, 2021. It is now read-only.
I want to train on 6 GPUs, which starts 6 luajit jobs. However, sometimes only 5 of them start. For now, I restart the training whenever this happens. Do you have any idea what causes it?
Thank you,
Chien-Lin
We found that the cause is this error: "/ipc/DiscoveredTree.lua:15: ERROR: (/home/chienh/big/twitter/torch-ipc/src/cliser.c, 318): (9, Bad file descriptor)".
The error only happens when the server is busy with other jobs. Do you have any idea why?