Slaying OOMs - Mark Saroufim & Jane Xu, Meta
INT8 mixed
Lecture 6 Optimizing Optimizers
- https://www.youtube.com/watch?v=hIop0mWKPHc
- https://docs.google.com/presentation/d/13WLCuxXzwu5JRZo0tAfW0hbKHQMvFw4O/edit
Others
- https://github.com/gpu-mode/resource-stream#cuda-grandmasters
- https://srush.github.io/annotated-mamba/hard.html
- https://github.com/gpu-mode/triton-index
weighted_cross_entropy and weighted_token_accuracy
- https://github.com/imoneoi/openchat/blob/master/ochat/models/unpadded_mistral.py#L44-L51
- https://www.tensorflow.org/api_docs/python/tf/nn/weighted_cross_entropy_with_logits
Everything you need to know about Fine-tuning and Merging LLMs: Maxime Labonne