Releases: trackmania-rl/tmrl
Release 0.6.6
Release 0.6.6
Version 0.6.6 fixes bugs in the continual learning and continual evaluation setting, and updates the competition tutorial so that its default hyperparameters make sense with the current reward scaling of the TrackMania environment.
- It is now possible to use an infinite number of episodes or samples
- The SAC hyperparameters in the competition tutorial have been adjusted to be meaningful; wandb logging is also deactivated by default in this script (uncomment it if needed)
- The competition evaluation script now works properly
Release 0.6.5
Minor release 0.6.5
Release 0.6.5 removes the broken dependency on the keyboard library.
If you wish to decide manually where reward recording starts and stops in TrackMania, install the keyboard library yourself (pip install keyboard) and use the --use-keyboard modifier on top of --record-reward. You can then press e to start recording and q to stop recording. If you don't use this modifier, reward recording starts when you launch the script (wait for the confirmation to display before you start driving) and ends when you cross the finish line.
Release 0.6.4
Minor release 0.6.4
Fixes #94
Release 0.6.3
Minor release 0.6.3
This release fixes TrackMania tools for Linux.
- --check-environment now works for LIDAR on Linux
- --record-reward does not use keyboard anymore on Linux
Release 0.6.2
Minor release 0.6.2
Release 0.6.2 introduces a mechanism for automatically healing the tmrl installation when the TmrlData folder is missing, as pip sometimes silently fails to download it during installation.
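As a rough sketch of the idea (not tmrl's actual internals; the path and function name below are assumptions), the check boils down to something like:

```python
from pathlib import Path

# Hypothetical location; tmrl's real path handling may differ.
TMRL_DATA = Path.home() / "TmrlData"

def ensure_tmrl_data():
    """Recreate the TmrlData folder if pip silently failed to ship it."""
    if not TMRL_DATA.is_dir():
        TMRL_DATA.mkdir(parents=True, exist_ok=True)
        # ...then repopulate default config, checkpoints, etc.
```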
Release 0.6.1
Release 0.6.1
- Same as 0.6.0, but with a fixed PyPI installer.
Release 0.6.0
Major release 0.6.0
This release introduces support for non-real-time environments in the TMRL library, and support for Linux in the TrackMania pipeline.
Version 0.6.0 is backward-incompatible and requires a clean installation.
Major changes
- TrackMania example pipeline (see config.json):
  - Support for Linux
  - Support for saving replays automatically
  - Support for reward shaping
- TMRL library:
  - Support for non-real-time environments and Trainer/Worker synchronization
  - Generic training pipeline (in particular, introduced a generic Memory class for lazy developers, compatible with random sampling in 1-step TD learning; see the sketch below)
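As an illustration of the kind of replay memory this enables (uniform random sampling of 1-step TD transitions), here is a minimal sketch; the class and method names are hypothetical and do not mirror tmrl's actual Memory API:

```python
import random

class SimpleMemory:
    """Illustrative replay memory; not tmrl's actual Memory class."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.data = []  # (obs, action, reward, next_obs, terminated) tuples

    def append(self, transition):
        self.data.append(transition)
        if len(self.data) > self.capacity:
            self.data.pop(0)  # drop the oldest transition

    def sample(self, batch_size):
        # uniform random sampling, which is all 1-step TD learning needs
        return random.sample(self.data, batch_size)
```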
Minor changes
- TrackMania example pipeline (see config.json):
  - More fine-tuning options for SAC
  - The default Adam betas are set to the RL-compatible setting described by Mahmood et al. 2023, in an attempt to avoid policy collapse (see the sketch below)
  - The default hyperparameters changed to full vision-based training instead of LIDAR training
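For reference, Adam's betas are passed directly to the optimizer constructor in PyTorch; the network and the beta values below are placeholders, not necessarily what config.json ships:

```python
import torch

policy = torch.nn.Linear(8, 2)  # stand-in for the actual policy network
# betas=(b1, b2) control the decay of Adam's first and second moment estimates;
# the values here are placeholders, check config.json for the shipped defaults.
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3, betas=(0.997, 0.997))
```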
Release 0.5.3
Compatibility update 0.5.3
This update introduces compatibility fixes for the latest versions of libraries such as pandas and torch.
Linux users: patience, tmrl 0.6.0 is on its way... 😎
Release 0.5.2
Minor release
Release 0.5.2 fixes a bug in the SAC implementation provided in version 0.5.1, which had been inadvertently pushed from a development branch.
Note: this version introduces a new way of recording replays in TrackMania, but this is not officially supported yet and won't work without the corresponding OpenPlanet script. If you want to use this feature before the next version is out, please contact us.
Release 0.5.1
Release 0.5.1
This release complies with rtgym>=0.9, which in turn complies with the gymnasium signature of the reset function.
If you are using custom rtgym interfaces in tmrl, you will want to update your reset implementations. This is straightforward; you can just replace:
def reset(self):
with:
def reset(self, seed=None, options=None):
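For instance, an updated interface might look like the sketch below; the class and observation are illustrative, and the (observation, info) return follows the gymnasium convention (check the rtgym documentation for the exact expectations of your version):

```python
class MyCustomInterface:  # illustrative; a real interface derives from rtgym's interface class
    def __init__(self):
        self.initial_obs = [0.0, 0.0, 0.0]  # placeholder observation

    def reset(self, seed=None, options=None):
        # seed and options are part of the gymnasium signature; return the
        # gymnasium-style (observation, info) pair
        return self.initial_obs, {}
```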