[$$ BOUNTY] Add Phi-2 (2.7B) Model to TT-Buda Model Demos #21

Closed
Shubhamsaboo opened this issue Mar 5, 2024 · 14 comments

@Shubhamsaboo
Contributor

Shubhamsaboo commented Mar 5, 2024

Background:

TT-Buda, developed by Tenstorrent, is a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.

Bounty Objective:

We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.

Task Details:

Integrate Phi-2 (2.7B) into the TT-Buda demonstrations.

Requirements:

  • The submission must include a comprehensive README.md detailing the model's architecture, implementation details, and usage instructions.
  • The model should be fully functional and tested on Tenstorrent hardware, ensuring compatibility and performance optimization.
  • Include sample inputs and outputs, demonstrating the model's capabilities.
  • Provide documentation on any dependencies and installation procedures.

Contribution Guidelines:

  • Fork the TT-Buda model demos repository.
  • Create a new directory within the model_demos folder following the naming convention: model_yourModelName.
  • Ensure your code adheres to the coding standards and guidelines provided in the repository's CONTRIBUTING.md file.
  • Submit a pull request with a detailed description of your model and any relevant information that will help reviewers understand and evaluate your contribution.

Evaluation Criteria:

  • Innovation and Relevance: How does the model contribute new ideas or solutions? Is it relevant to current challenges in AI?
  • Implementation Quality: Code readability, structure, and adherence to best practices.
  • Performance: Efficiency and performance on Tenstorrent hardware.
  • Documentation: Clarity and completeness of the accompanying documentation.

Rewards:

Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 cash bounty.

Get Started with Grayskull DevKit

Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.

Connect on Discord

Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!

@Shubhamsaboo Shubhamsaboo added the good first issue Good for newcomers label Mar 5, 2024
@tenstorrent tenstorrent deleted a comment from algora-pbc Mar 14, 2024
@tenstorrent tenstorrent deleted a comment from Lemmynjash Mar 19, 2024
@EwoutH

EwoutH commented Apr 23, 2024

phi-3-mini is now released:

The 3.8 billion parameter model should easily fit on the 8GB LPDDR4 if quantized to 8-bit. Maybe the bounty could be updated for Phi-3.
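A rough back-of-the-envelope check of that claim, counting weights only. The helper name `weight_footprint_gb` is made up for illustration, and the estimate ignores activations, the KV cache, and runtime overhead, so it is a lower bound on memory use rather than proof that the model runs.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


DRAM_GB = 8.0  # LPDDR4 capacity quoted in the comment above

# 3.8 billion parameters at one byte each.
phi3_mini_int8 = weight_footprint_gb(3.8e9, 8)
# The same model at 16 bits per weight, for comparison.
phi3_mini_fp16 = weight_footprint_gb(3.8e9, 16)

print(f"8-bit:  {phi3_mini_int8:.2f} GB of {DRAM_GB:.0f} GB")
print(f"16-bit: {phi3_mini_fp16:.2f} GB of {DRAM_GB:.0f} GB")
```

At 8-bit the weights take roughly half of the 8 GB, while at 16-bit they would take nearly all of it, which is why quantization matters here.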

@EwoutH

EwoutH commented Apr 23, 2024

Phi-3-mini weights are released: https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3

@notashes

Hello, I would love to give it a shot. Could you please assign me to the issue? (Either Phi-2 or Phi-3 works.)

@Shubhamsaboo
Contributor Author

Sounds good, done @edgerunnergit!

I recommend checking with the community members on Discord to see whether somebody is already working on it. That would help you get started, or you could even team up to solve this challenge.

@Shubhamsaboo Shubhamsaboo added bounty and removed good first issue Good for newcomers labels Apr 26, 2024
@EwoutH

EwoutH commented May 7, 2024

@edgerunnergit do you have something to share at this point already?

@notashes

notashes commented May 8, 2024

@EwoutH I had a rough look but couldn't actually start yet.
I would still love to give it a go. However, I'm blocked until Friday because of final exams.

@EwoutH

EwoutH commented May 21, 2024

Microsoft released some new Phi-3 models!

The vision model is only 4.15B params, so it should be able to run on 8GB Grayskull cards when quantized to 8-bit.

Medium is likely too large (14B), but small (7.39B) might fit using the block floating point format Grayskull supports, BFP4 (see also #59).
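A sketch of the arithmetic behind these estimates, weights only. The block layout is an assumption and is not stated in this thread: each value is taken to use 4 bits, with one shared 8-bit exponent per block of 16 values, giving an effective 4.5 bits per weight. The function and dictionary names are hypothetical.

```python
def bfp_bits_per_value(value_bits: int, block_size: int = 16,
                       exponent_bits: int = 8) -> float:
    """Effective bits per value when a block shares one exponent.

    Assumed layout (not confirmed here): 16 values per 8-bit exponent.
    """
    return value_bits + exponent_bits / block_size


def weights_gb(num_params: float, bits_per_value: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_value / 8 / 1e9


DRAM_GB = 8.0  # Grayskull card memory mentioned above
MODELS = {"vision": 4.15e9, "small": 7.39e9, "medium": 14e9}
FORMATS = {"8-bit": 8.0, "BFP4": bfp_bits_per_value(4)}

for name, params in MODELS.items():
    for fmt, bits in FORMATS.items():
        size = weights_gb(params, bits)
        print(f"{name:7s} {fmt:5s} {size:6.2f} GB "
              f"(headroom {DRAM_GB - size:+.2f} GB)")
```

Under these assumptions the small model's weights come to a bit over 4 GB in BFP4, while the medium model's weights alone would nearly fill the 8 GB, leaving little room for activations or a KV cache.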

@mvkvc

mvkvc commented May 22, 2024

Got my Grayskull e75 this week, started working on phi-3-mini-4k and will share any progress.

@EwoutH

EwoutH commented Jul 20, 2024

Phi-3-mini had a June 2024 Update, which makes it even more performant than before. See benchmarks for the 4k and 128k variants.

@mvkvc mvkvc mentioned this issue Jul 21, 2024
@Shubhamsaboo
Contributor Author

FYI: @jush has opened the PR for adding Phi-2 and will be able to add it to the model demo. Let me know if any of you are working on Phi-3.

@JushBJJ
Contributor

JushBJJ commented Jul 26, 2024

PR for Phi 2:

#117

@Shubhamsaboo Shubhamsaboo assigned JushBJJ and unassigned notashes Jul 29, 2024
@Shubhamsaboo
Contributor Author

Claimed by @JushBJJ. It will be closed once it's merged into main.

Congrats Jush!

@EwoutH

EwoutH commented Sep 9, 2024

Awesome news!

Would also really love phi-3.5-mini-instruct support, as a potential next goal. It's 3.82B params (so a bit bigger than the 2.78B Phi-2), but also way more capable, with really impressive results for such a small model.

But great to hear that "small" LLMs are now running on Tenstorrent hardware! Very curious how fast it is (tokens/s), at which power level (watts), and how efficient (tokens/s per watt).
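For anyone who wants to report those numbers, here is a minimal sketch of how the three metrics relate. Efficiency is throughput divided by average power, which works out to tokens per joule. The function names are hypothetical, and the inputs are placeholders chosen for easy arithmetic, not measurements from any Tenstorrent card.

```python
def tokens_per_second(tokens_generated: int, elapsed_s: float) -> float:
    """Generation throughput."""
    return tokens_generated / elapsed_s


def tokens_per_joule(tokens_generated: int, elapsed_s: float,
                     avg_power_w: float) -> float:
    """Energy efficiency: tokens divided by energy used (watts * seconds)."""
    energy_j = avg_power_w * elapsed_s
    return tokens_generated / energy_j


# Placeholder inputs, NOT benchmark results.
EXAMPLE_TOKENS = 256
EXAMPLE_SECONDS = 32.0
EXAMPLE_WATTS = 64.0

rate = tokens_per_second(EXAMPLE_TOKENS, EXAMPLE_SECONDS)
eff = tokens_per_joule(EXAMPLE_TOKENS, EXAMPLE_SECONDS, EXAMPLE_WATTS)

print(f"throughput: {rate:.2f} tokens/s")
print(f"efficiency: {eff:.4f} tokens/J (tokens/s per watt)")
```

Average power would have to come from an external measurement or a board telemetry tool. Which one to use is left open here.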

@Shubhamsaboo
Contributor Author

Thanks @EwoutH. Will be looking into that for future bounties.

Closing this one as merged to main. Congrats again @JushBJJ.
