float8 training: add static scaling (#760)
Summary:

This is useful for things such as:
* activation_with_bounded_range -> linear (can set static scale to activation range)
* bounding weight scales to known quantities if the modeling user can guarantee their magnitude throughout training

We don't have signal yet that this is useful for production things, but it would be good to land this to enable easy experimentation.

Test Plan:

Unit and integration tests pass:
```
./test/test_everything.sh
// note that there is a failure in `test_fsdp2.py` which is present on main
```

Use float8 profiling script to see GPU kernel time go down as we enable static scaling on a toy model: https://gist.github.com/vkuzo/b2cf46f7cccb691125566873859ca39d

Reviewers:

Subscribers:

Tasks:

Tags:
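To illustrate the idea behind the summary, here is a minimal pure-Python sketch of the difference between dynamic and static scaling. It is not the code from this commit and does not use the library's API: the helper names (`dynamic_scale`, `static_scale`, `scale_and_clamp`) are hypothetical, and the "cast" is emulated by clamping to the float8 e4m3fn maximum (448.0) without real float8 rounding. The point is only that a static scale is computed from a bound known ahead of time, so the per-tensor absolute-max reduction can be skipped.

```python
E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


def dynamic_scale(values, fp8_max=E4M3_MAX):
    """Hypothetical helper: derive the scale from the data on every call.

    Requires a full reduction (abs-max) over the tensor each time.
    """
    amax = max(abs(v) for v in values)
    return fp8_max / max(amax, 1e-12)  # guard against an all-zero tensor


def static_scale(known_bound, fp8_max=E4M3_MAX):
    """Hypothetical helper: derive the scale once from a known magnitude bound.

    No reduction over the data is needed; the caller must guarantee the bound.
    """
    if known_bound <= 0:
        raise ValueError("known_bound must be positive")
    return fp8_max / known_bound


def scale_and_clamp(values, scale, fp8_max=E4M3_MAX):
    """Scale values and saturate to the float8 range (rounding is not emulated)."""
    return [min(max(v * scale, -fp8_max), fp8_max) for v in values]


# Example: an activation known to lie in [-1, 1] (for instance, a tanh output).
activations = [0.0, 0.25, -0.5, 1.0]
scale = static_scale(1.0)
scaled = scale_and_clamp(activations, scale)
```

If the guaranteed bound is violated, values saturate at the float8 maximum instead of being rescaled, which is why the summary restricts static scaling to cases where the magnitude is known throughout training.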
Showing 12 changed files with 449 additions and 61 deletions.