[docs] revamp documentation (#17)

Co-authored-by: Sean Hughes <[email protected]>
ServiceNow · Nov 18, 2024 · 5eeef37
1 parent f4053af
commit 5eeef37

Showing 43 changed files with 1,415 additions and 762 deletions.
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -108,6 +108,7 @@ echo "=== END OF ENVIRONMENT INFORMATION ==="
 # 📝 Additional Context
 
 Include any other information that may help us understand the issue, such as:
+
 - Recent changes to the configuration or code.
 - Whether the issue occurs consistently or intermittently.
 - Any troubleshooting steps you have already tried.
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
@@ -25,40 +25,45 @@ List the key changes introduced in this PR:
 1. Change A
 2. Change B
 
-# ✅ Checklist
+## ✅ Checklist
 
 Make sure the following tasks are completed before submitting the PR:
 
-### General:
-- [ ] 📜 I have read and followed the [contributing guidelines](CONTRIBUTING.md).
+### General
+
+- [ ] 📜 I have read and followed the [contributing guidelines](https://servicenow.github.io/Fast-LLM/developers/contributing).
+- [ ] 🏷️ I am using a clear and descriptive PR title that summarizes the key change or feature introduced.
 - [ ] 🎉 The functionality is complete, and I have tested the changes.
 - [ ] 📝 I have updated the documentation if needed.
 - [ ] ⚠️ The change does not introduce any new issues (e.g., runtime warnings, type checker errors, linting problems, unhandled edge cases).
 - [ ] 🧩 I have commented my code, especially in hard-to-understand areas.
 
-### Dependencies and Configuration:
+### Dependencies and Configuration
+
 - [ ] 🐋 I have updated the Docker configuration or dependencies, if applicable.
 - [ ] 🔄 I have ensured compatibility with the existing setup after dependency changes.
 
-### Testing:
+### Testing
+
 - [ ] 🧪 I have added or updated tests to cover my changes.
 - [ ] ✔️ New and existing tests pass locally with my changes.
 - [ ] 🚦 I have tested these changes on GPUs and verified training stability.
 - [ ] 🏋️ I have tested the changes on realistic training workloads, if applicable.
 
-### Performance Impact:
+### Performance Impact
+
 - [ ] 📊 I have run benchmarks where applicable to evaluate the performance impact.
 - [ ] ✅ The benchmarks show no performance regression.
 - [ ] 🚀 The benchmarks indicate a potential performance improvement.
 - [ ] ⚠️ The benchmarks indicate a potential performance degradation.
 - [ ] 📈 I have provided benchmark results and detailed any performance impact below, if applicable.
 
-# 📊 Performance Impact Details
+## 📊 Performance Impact Details
 
 If there is any impact on performance, describe it and provide benchmark results, if applicable:
 
 ---
 
-# 📝 Additional Notes
+## 🗒️ Additional Notes
 
 Include any additional context, information, or considerations here, such as known issues, follow-up tasks, or backward compatibility concerns.
diff --git a/.gitignore b/.gitignore
@@ -8,6 +8,7 @@ __pycache__/
 
 # Doc build
 .cache
+site
 
 # Distribution / packaging
 *.egg-info/
@@ -27,3 +28,11 @@ venv.bak/
 # Project specifics
 /.idea/
 /.vscode/
+
+# Devenv
+.devenv*
+devenv.local.nix
+devenv.*
+
+# direnv
+.direnv
diff --git a/.markdownlint.yaml b/.markdownlint.yaml
@@ -0,0 +1,35 @@
+# See https://github.com/DavidAnson/markdownlint/blob/v0.32.1/schema/.markdownlint.yaml for schema documentation
+
+# Default state for all rules
+default: true
+
+# MD007/ul-indent : Unordered list indentation : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md007.md
+MD007:
+  # Spaces for indent
+  indent: 2
+
+# MD010/no-hard-tabs : Hard tabs : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md010.md
+MD010:
+  # Include code blocks
+  code_blocks: false
+  # Fenced code languages to ignore
+  ignore_code_languages: []
+  # Number of spaces for each hard tab
+  spaces_per_tab: 2
+
+# MD013/line-length : Line length : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md013.md
+MD013: false
+
+# MD024/no-duplicate-heading : Multiple headings with the same content : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md024.md
+MD024: false
+
+# MD030/list-marker-space : Spaces after list markers : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md030.md
+MD030:
+  # Spaces for single-line unordered list items
+  ul_single: 1
+  # Spaces for single-line ordered list items
+  ol_single: 1
+  # Spaces for multi-line unordered list items
+  ul_multi: 1
+  # Spaces for multi-line ordered list items
+  ol_multi: 1
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -48,3 +48,7 @@ repos:
         args:
             - "--config"
             - "./pyproject.toml"
+-   repo: https://github.com/markdownlint/markdownlint
+    rev: v0.11.0
+    hooks:
+    -   id: markdownlint
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
@@ -1,8 +1,8 @@
-### ServiceNow Open Source Code-of-Conduct
+# ServiceNow Open Source Code-of-Conduct
 
 This code of conduct provides guidelines for participation in ServiceNow-managed open-source communities and projects.
 
-**Discussion forum guidelines**
+## Discussion forum guidelines
 
 Communities thrive when members support each other and provide useful feedback.
 
@@ -11,12 +11,12 @@ Communities thrive when members support each other and provide useful feedback.
 - User Contributions must not include material that is defamatory, obscene, indecent, abusive, offensive, harassing, violent, hateful, inflammatory or otherwise objectionable.
 - Lively and collegial discussions are always encouraged in a healthy community. It is okay to argue facts but not okay to argue personalities or personal beliefs.
 - Do not use text formats such as all caps or bold that may be read as annoying, rude or send a strong message.
-- Do not publish anyone’s private personal information without their explicit consent.
+- Do not publish anyone's private personal information without their explicit consent.
 - Avoid using abbreviations or terminology that others may not understand. An abbreviation may mean something to you but in another context or country, it may have another meaning.
 - Be accountable for your actions by correcting your mistakes and indicating where you have changed a previous post of yours.
 - Mark content as correct and helpful, and provide feedback. If you read a discussion post that you find helpful, we encourage you to leave a positive vote and comment in the replies. If you find a post that is unhelpful, please provide more information in the issue comments.
 
-**Issue board guidelines**
+## Issue board guidelines
 
 Many open-source projects provide an Issues board, with similar functionality to a Discussions forum. The same rules from the discussion forum guidelines apply to the Issues board.
 
@@ -25,22 +25,22 @@ ServiceNow suggests the following technical support pathways for open-source pro
 1. Clearly identify and document the issue or question you have.
 2. View the Documentation.
 3. Search the Discussions.
-4. Search the project knowledge base or Wiki for known errors, useful solutions, and troubleshooting tips.
-5. Check the project guidelines in the [`CONTRIBUTING.md`](CONTRIBUTING.md) file if you would like details on how you can submit a change. Community contributions are valued and appreciated!
-6. Log an Issue if it hasn’t already been logged. If the issue has already been logged by another user, vote it up, and add a comment with additional or missing information. Do your best to choose the correct category when logging a new issue. This will make it easier to differentiate bugs from new feature requests or ideas. If after logging an issue you find the solution, please close your issue and provide a comment with the solution. This will help the project owners and other users.
+4. Search the project documentation for known errors, useful solutions, and troubleshooting tips.
+5. Check the project contribution guidelines if you would like details on how you can submit a change. Community contributions are valued and appreciated!
+6. Log an Issue if it hasn't already been logged. If the issue has already been logged by another user, vote it up, and add a comment with additional or missing information. Do your best to choose the correct category when logging a new issue. This will make it easier to differentiate bugs from new feature requests or ideas. If after logging an issue you find the solution, please close your issue and provide a comment with the solution. This will help the project owners and other users.
 7. Contact the project team contributors of the project to see if they can help as a last resort only.
 
-**Repositories**
+## Repositories
 
 - Read and follow the license instructions
-- Remember to include citations if you use someone else’s work in your own project. Use the [`CITATION.cff`](CITATION.cff) to find the correct project citation reference.
-- ‘Star’ project repos to save for future reference.
-- ‘Watch’ project repos to get notifications of changes – this can get noisy for some projects, so only watch the ones you really need to track closely.
+- Remember to include citations if you use someone else's work in your own project. Use the [`CITATION.cff`](CITATION.cff) to find the correct project citation reference.
+- ‘Star' project repos to save for future reference.
+- ‘Watch' project repos to get notifications of changes – this can get noisy for some projects, so only watch the ones you really need to track closely.
 
-**Enforcement and reporting**
+## Enforcement and reporting
 
-We encourage community members and users to help each other and to resolve issues amongst themselves as much as possible. If a matter cannot be resolved in good faith within the means available, please reach out to a team member or email [email protected].
+We encourage community members and users to help each other and to resolve issues amongst themselves as much as possible. If a matter cannot be resolved in good faith within the means available, please reach out to a team member or email [[email protected]](mailto:[email protected]).
 
-**ServiceNow Disclaimer.**
+## ServiceNow Disclaimer
 
 We may, but are under no obligation to, monitor or censor comments made by users or content provided by contributors and we are not responsible for the accuracy, completeness, appropriateness or legality of anything posted, depicted or otherwise provided by third‑party users and we disclaim any and all liability relating thereto.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -1,62 +1,3 @@
-# Contributing to Fast-LLM 🚀
+# Contributing to Fast-LLM
 
-Thank you for your interest in contributing to Fast-LLM! We're thrilled to have you here, and your support is invaluable in helping us accelerate LLM training to full speed. This guide will walk you through the steps to contribute, from reporting issues to submitting changes and setting up your development environment.
-
-If you have questions or want to start a discussion, feel free to [open a discussion](https://github.com/ServiceNow/Fast-LLM/discussions) on our GitHub page.
-
-## Getting Started
-
-To get started with contributing to Fast-LLM, follow these steps to set up your environment:
-
-1. **Set Up the Development Environment**: Fast-LLM is built on [PyTorch](https://pytorch.org/) and [Triton](https://triton-lang.org/). Check out our [setup guide](https://servicenow.github.io/Fast-LLM/development/setup) for instructions on getting everything ready, including the development environment and dependencies.
-2. **Learn Our Best Practices**: Get familiar with our [development best practices](https://servicenow.github.io/Fast-LLM/development/dev-practices/), which cover code style, pre-commit hooks, and testing strategies.
-3. **Launch Fast-LLM Locally or with Docker**: Need help getting started? Follow the instructions in the [launching section](https://servicenow.github.io/Fast-LLM/development/launching) to get Fast-LLM up and running.
-
-## How to Report a Bug 🐞
-
-Found a bug? Let's squash it together! [Open an issue](https://github.com/ServiceNow/Fast-LLM/issues/new/choose) and select "Bug report." Please include as much information as possible:
-
-- Steps to reproduce the issue.
-- What you expected to happen versus what actually happened.
-- Logs, Fast-LLM configuration, and error messages.
-- Details about your environment setup (e.g., CUDA hardware, PyTorch version, CUDA version).
-
-If you're familiar with the codebase, consider adding a failing unit test to demonstrate the problem (optional, but helpful!).
-
-## Proposing Changes
-
-Before diving into code, [open an issue](https://github.com/ServiceNow/Fast-LLM/issues) to discuss your proposal. This is especially important if you're planning significant changes or adding new dependencies. Once your idea is approved, follow these steps:
-
-1. **Fork the Repository**: [Fork Fast-LLM](https://github.com/ServiceNow/Fast-LLM/fork) to your own GitHub account.
-2. **Clone Your Fork Locally**: Use `git clone` to bring the code to your local machine.
-3. **Create a New Branch**: Name your branch descriptively, such as `feature/awesome-feature` or `fix/nasty-bug`.
-4. **Make Your Changes**: Work your magic! Don't forget to add or update tests, benchmarks, or configurations as needed.
-5. **Create a Properly Titled Pull Request**: When you're ready to open a PR, make sure to use a clear and descriptive title that follows our [PR title guidelines](https://servicenow.github.io/Fast-LLM/development/pr-title-guidelines). This title will become the commit message for the squashed merge.
-6. **Push to Your Fork**: Push the branch to your GitHub fork.
-7. **Open a Pull Request**: [Submit a pull request](https://github.com/ServiceNow/Fast-LLM/compare) to the `main` branch. Reference the original issue number and provide a brief summary of your changes.
-
-### Guidelines for a Successful Pull Request
-
-Here are some tips to ensure your pull request gets reviewed and merged promptly:
-
-- **Follow our coding standards**: Stick to our [development best practices](https://servicenow.github.io/Fast-LLM/development/dev-practices/) to keep the code clean and consistent.
-- **Write tests**: Verify your changes with unit tests for new features or bug fixes.
-- **Test on GPUs and real-world workloads**: Since Fast-LLM is all about training large language models, make sure your changes work smoothly in GPU environments and on typical training setups.
-- **Run benchmarks and performance tests**: Make sure your changes don't slow things down. If there's any impact on performance, provide benchmark results to back it up.
-- **Avoid introducing new issues**: Check that there are no new runtime warnings, type checker errors, linting problems, or unhandled edge cases.
-- **Comment non-trivial code**: Make your code easy to understand for others.
-- **Keep sensitive data out**: Make sure your code or commit messages don't expose private or proprietary information.
-- **Use the [PR template](https://github.com/ServiceNow/Fast-LLM/blob/main/.github/PULL_REQUEST_TEMPLATE.md)**: Complete the checklist to make sure everything is in order before hitting submit.
-
-## Seeking Help or Clarification
-
-If you're unsure about something or need help, you've got options:
-
-- **GitHub Discussions**: [Start a discussion](https://github.com/ServiceNow/Fast-LLM/discussions) if you need advice or just want to chat.
-- **Project Maintainers**: Mention a maintainer in an issue or pull request if you need a review or guidance.
-
-## Contributors
-
-We're grateful for all the awesome contributors who help make Fast-LLM better. Join our contributors' list and make your first contribution!
-
-To learn more about the team and maintainers, visit our [About page](https://servicenow.github.io/Fast-LLM/about-us/).
+Please refer to the [contributing guidelines](https://servicenow.github.io/Fast-LLM/developers/contributing) for more information on how to contribute to Fast-LLM.
diff --git a/README.md b/README.md
@@ -14,7 +14,11 @@ Made with ❤️ by [ServiceNow Research][servicenow-research]
 
 ## Overview
 
-Fast-LLM is a new open-source library for training large language models, built on [PyTorch][pytorch] and [Triton][triton]. It is extremely fast, scales to large clusters, supports a wide range of model architectures, and is easy to use. Unlike commercial frameworks like Megatron-LM, which are largely closed off and fragmented across forks, Fast-LLM is fully open-source and encourages community-driven development. Researchers can freely customize and optimize as needed, making it a flexible and hackable alternative that combines the speed of specialized tools with the openness of libraries like [Hugging Face Transformers][transformers].
+Fast-LLM is a cutting-edge open-source library for training large language models with exceptional speed, scalability, and flexibility. Built on [PyTorch][pytorch] and [Triton][triton], Fast-LLM empowers AI teams to push the limits of generative AI, from research to production.
+
+Optimized for training models of all sizes—from small 1B-parameter models to massive clusters with 70B+ parameters—Fast-LLM delivers faster training, lower costs, and seamless scalability. Its fine-tuned kernels, advanced parallelism techniques, and efficient memory management make it the go-to choice for diverse training needs.
+
+As a truly open-source project, Fast-LLM allows full customization and extension without proprietary restrictions. Developed transparently by a community of professionals on GitHub, the library benefits from collaborative innovation, with every change discussed and reviewed in the open to ensure trust and quality. Fast-LLM combines professional-grade tools with unified support for GPT-like architectures, offering the cost efficiency and flexibility that serious AI practitioners demand.
 
 > [!NOTE]
 > Fast-LLM is not affiliated with Fast.AI, FastHTML, FastAPI, FastText, or other similarly named projects. Our library's name refers to its speed and efficiency in language model training.
@@ -25,7 +29,7 @@ Fast-LLM is a new open-source library for training large language models, built
     - ⚡️ Optimized kernel efficiency and reduced overheads.
     - 🔋 Optimized memory usage for best performance.
     - ⏳ Minimizes training time and cost.
-  
+
 2. 📈 **Fast-LLM is Highly Scalable**:
     - 📡 Distributed training across multiple GPUs and nodes using 3D parallelism (Data, Tensor, and Pipeline).
     - 🔗 Supports sequence length parallelism to handle longer sequences effectively.
@@ -49,7 +53,7 @@ Fast-LLM is a new open-source library for training large language models, built
 
 5. 🌐 **Fast-LLM is Truly Open Source**:
     - ⚖️ Licensed under [Apache 2.0][license] for maximum freedom to use Fast-LLM at work, in your projects, or for research.
-    - 💻 Fully developed on GitHub with a public [roadmap][roadmap] and transparent [issue tracking][issues].
+    - 💻 Transparently developed on GitHub with public [roadmap][roadmap] and [issue tracking][issues].
     - 🤝 Contributions and collaboration are always welcome!
 
 ## Usage

diff --git a/SECURITY.md b/SECURITY.md
@@ -16,7 +16,7 @@ If you find a vulnerability in ServiceNow systems, products, or network infrastr
 If you find a vulnerability in this open-source project published by the ServiceNow Research team, please email [[email protected]](mailto:[email protected]) to report your findings.
 
 We will process your report as soon as possible, depending on the severity of your report. We appreciate everyone's help in disclosing vulnerabilities in a responsible manner.
- 
+
 ## Guidelines
 
 Please follow the guidelines below when [disclosing vulnerabilities](https://www.servicenow.com/company/trust/privacy/responsible-disclosure.html):