Because of the immense action space, placement is a difficult RL task, so this situation is common. There are several possible reasons for your curve, such as a lack of pre-training or overfitting. To make training more stable, tuning the reward function and hyperparameters such as the learning rate may also help.
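As a rough illustration of the learning-rate suggestion, here is a minimal sketch, not taken from this repository. The function names `moving_average` and `linear_lr`, and the default values `3e-4` and `1e-5`, are assumptions chosen only for the example. The first helper smooths noisy per-episode returns so the trend of the curve is easier to judge; the second linearly decays the learning rate over training, which is one common way to stabilise late-stage updates.

```python
def moving_average(values, window):
    """Trailing mean over at most the `window` most recent values."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def linear_lr(step, total_steps, lr_start=3e-4, lr_end=1e-5):
    """Linearly anneal the learning rate (default values are illustrative)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return lr_start + frac * (lr_end - lr_start)
```

If the smoothed return is still flat or oscillating after many episodes, that points more toward the reward design or missing pre-training than toward ordinary noise in the curve.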
When I train, it does not seem to converge.