Tutel v0.1.1
What's New in v0.1.1:
- Enable fp16 support for AMDGPU.
- Use NVRTC for JIT compilation when available.
- Add a new system_init interface for initializing NUMA settings on distributed GPUs.
- Add more gating types: Top3Gate and Top4Gate.
- Allow higher-level code to change the capacity value in the Tutel fast dispatcher.
- Add a custom AllToAll extension for older PyTorch versions that lack a built-in AllToAll operator.
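For readers unfamiliar with the terms, the sketch below illustrates what top-k gating and a per-expert capacity mean in a mixture-of-experts layer. It is a plain-Python illustration only and does not use or mirror Tutel's API: the names top_k_gate and dispatch_with_capacity are hypothetical, and the drop-on-overflow policy is an assumption made for the example.

```python
def top_k_gate(scores, k):
    """Return the indices of the k highest-scoring experts, best first."""
    return sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)[:k]


def dispatch_with_capacity(token_scores, k, capacity):
    """Route each token to its top-k experts.

    An assignment is dropped once the chosen expert already holds
    `capacity` tokens (assumed policy, for illustration only).
    """
    num_experts = len(token_scores[0])
    slots = [[] for _ in range(num_experts)]
    for token_id, scores in enumerate(token_scores):
        for expert in top_k_gate(scores, k):
            if len(slots[expert]) < capacity:
                slots[expert].append(token_id)
    return slots


# Three tokens, three experts; each row holds one token's gate scores.
scores = [
    [0.1, 0.6, 0.3],
    [0.2, 0.7, 0.1],
    [0.5, 0.3, 0.2],
]
print(dispatch_with_capacity(scores, k=2, capacity=2))
```

Raising k (as with Top3Gate or Top4Gate) sends each token to more experts, while the capacity value bounds how many tokens any single expert accepts.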
How to Set Up:
python3 -m pip install --user https://github.com/microsoft/tutel/archive/refs/tags/v0.1.1.tar.gz
Contributors: @jspark1105, @ngoyal2707, @guoshzhao, @ghostplant.