Add support for automatically registering models to UC at the end of training #618
Conversation
What scale of models can this take? What happens if I try shoving in a 30B?
Should this be here instead of the mlflow logger, as a public fn that this callback calls?
@mvpatel2000 large models work, at least 30B; I haven't had cluster space to run a 70B yet. This does call the mlflow logger.
ok lgtm
- fixed typing for mosaicmllogger
- fixed formatting, moved logging logic out of logger builder
- removed callback, moved all to train.py
- Convert to DataSpec and add token counts that include padding (#676)
- Add support for automatically registering models to UC at the end of training (#618)
- removed callback from init
- removed generate callback
- sorted imports
- formatting changes
Adds support for directly logging models to MLflow during training for easy deployment on Databricks Model Serving. Tested and was able to deploy both an MPT model and a Llama model successfully.
Along for the ride: fixed the HF checkpointer to work with non-FSDP models as well.
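For reference, registering a trained Hugging Face model to Unity Catalog via MLflow looks roughly like the sketch below. This is an illustrative minimal example, not the PR's actual implementation: the helper names `uc_model_name` and `register_to_uc` are hypothetical, while `mlflow.set_registry_uri` and `mlflow.transformers.log_model` are real MLflow APIs. Running the registration itself assumes a UC-enabled Databricks workspace.

```python
def uc_model_name(catalog: str, schema: str, model: str) -> str:
    """Unity Catalog registered models use a three-level namespace:
    <catalog>.<schema>.<model>."""
    return f"{catalog}.{schema}.{model}"


def register_to_uc(model, tokenizer, catalog: str, schema: str, name: str):
    """Hedged sketch: log a transformers model to MLflow and register it
    in Unity Catalog. Requires a UC-enabled MLflow registry to actually run."""
    import mlflow

    # Route model-registry calls to Unity Catalog instead of the
    # workspace registry.
    mlflow.set_registry_uri("databricks-uc")

    # Logging with registered_model_name both logs the artifact and
    # creates/updates the registered model in one call.
    mlflow.transformers.log_model(
        transformers_model={"model": model, "tokenizer": tokenizer},
        artifact_path="model",
        registered_model_name=uc_model_name(catalog, schema, name),
    )


if __name__ == "__main__":
    # Only the name helper is exercised here; registration needs a workspace.
    print(uc_model_name("main", "llm", "mpt-7b-finetuned"))
```

Registering at the end of training (rather than via a separate callback) keeps the logic in `train.py`, which matches the reviewer discussion above about calling the MLflow logger directly.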