
Tutorial on implementing tree of thoughts (ToT) framework using a model #26726

Open · rajveer43 opened this issue Oct 11, 2023 · 23 comments
Labels: Feature request (Request for a new feature)

Comments

@rajveer43 (Contributor) commented Oct 11, 2023

Feature request

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: https://github.com/princeton-nlp/tree-of-thought-llm
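For intuition, the search loop behind ToT can be sketched in a few dozen lines on top of the transformers pipeline API. This is a minimal illustration, not the paper's implementation: the model choice (Zephyr), the prompts, the 1-to-9 scoring scheme, and the helper names `generate_thoughts`, `score_thought`, and `tot_bfs` are all assumptions for the sake of the example.

```python
from transformers import pipeline

# Any open instruction-tuned causal LM works here; Zephyr is an assumption.
generator = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")

def generate_thoughts(state, k=3):
    # Propose k candidate intermediate steps ("thoughts") for the current state.
    prompt = f"Problem state:\n{state}\nPropose one possible next step:"
    outputs = generator(prompt, num_return_sequences=k, do_sample=True,
                        max_new_tokens=64)
    return [o["generated_text"][len(prompt):].strip() for o in outputs]

def score_thought(state, thought):
    # Self-evaluation: ask the model to rate the candidate, then crudely
    # parse the first digit it produces (a sketch, not robust parsing).
    prompt = (f"Problem state:\n{state}\nCandidate step: {thought}\n"
              "Rate how promising this step is from 1 to 9:")
    text = generator(prompt, max_new_tokens=8)[0]["generated_text"]
    digits = [int(c) for c in text[len(prompt):] if c.isdigit()]
    return digits[0] if digits else 0

def tot_bfs(initial_state, depth=3, beam=2):
    # Breadth-first search over thoughts: expand, score, prune to `beam`.
    frontier = [initial_state]
    for _ in range(depth):
        scored = [(s + "\n" + t, score_thought(s, t))
                  for s in frontier for t in generate_thoughts(s)]
        scored.sort(key=lambda x: x[1], reverse=True)
        frontier = [s for s, _ in scored[:beam]]
    return frontier
```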

Motivation

A comprehensive tutorial on implementing Tree of Thoughts with any open-source model would give users a deeper understanding of the framework.

Your contribution

https://github.com/princeton-nlp/tree-of-thought-llm
https://arxiv.org/abs/2305.10601

@LysandreJik (Member)

cc @gante @patrickvonplaten @MKhalusova who have been working on a rework of our generation docs!

@gante (Member) commented Oct 24, 2023

Hi @rajveer43 👋 My apologies for the delayed response, I am still catching up on notifications from my recent holidays 🤗

We would love to host a comprehensive tutorial about Tree of Thoughts! My suggestion would be to:

  1. Write a community blog post with the comprehensive tutorial (instructions on how to do it here; Example of a high-quality community blog post here). I'd be happy to review it if you're interested!
  2. We amplify it on social media, to expand its reach
  3. On the yet-to-be-created "advanced generation use cases" documentation page in transformers, we would add a very short demo, linking back to your blog post

What do you think? 🤗

@rajveer43 (Contributor, Author)

> Hi @rajveer43 👋 My apologies for the delayed response, I am still catching up on notifications from my recent holidays 🤗
>
> We would love to host a comprehensive tutorial about Tree of Thoughts! My suggestion would be to:
>
> 1. Write a community blog post with the comprehensive tutorial (instructions on how to do it here; example of a high-quality community blog post here). I'd be happy to review it if you're interested!
> 2. We amplify it on social media, to expand its reach
> 3. On the yet-to-be-created "advanced generation use cases" documentation page in transformers, we would add a very short demo, linking back to your blog post
>
> What do you think? 🤗

@gante I am also excited to see a demo of Tree of Thoughts added to the "advanced generation use cases" documentation page in Transformers. I think this will be a valuable resource for the community.

I would be happy to write a comprehensive tutorial about Tree of Thoughts for the Hugging Face community blog post. I will try my best to make it as informative and helpful as possible, and I will be sure to include instructions on how to use it, as well as examples of its use cases.

Could you guide me on which model is best suited for it?

@MKhalusova (Contributor)

Feel free to ping me for the blog post PR review (in addition to @gante).

@gante (Member) commented Oct 24, 2023

@rajveer43 if you have positive results with a 7B model, preferably a 7B model whose access is fully open (e.g. Llama 2 is NOT fully open, as it requires filling in a form), then that would be my suggestion. 7B models can be loaded by most people :)

If you have no model preference, I'd point you to our Zephyr model, or suggest having a look at the LLM leaderboard.
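For reference, a minimal way to load Zephyr through the pipeline API and query it with its chat template. The generation settings and the Game of 24 prompt are illustrative, and `device_map="auto"` assumes `accelerate` is installed.

```python
import torch
from transformers import pipeline

# Load Zephyr in bfloat16 and place it automatically across available devices.
pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta",
                torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user",
             "content": "Use 4, 9, 10 and 13 with + - * / to reach 24."}]
# Zephyr is a chat model, so format the request with its chat template.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False,
                                            add_generation_prompt=True)
out = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(out[0]["generated_text"])
```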

@rajveer43 (Contributor, Author) commented Oct 25, 2023

> @rajveer43 if you have positive results with a 7B model, preferably a 7B model whose access is fully open (e.g. Llama 2 is NOT fully open, as it requires filling in a form), then that would be my suggestion. 7B models can be loaded by most people :)
>
> If you have no model preference, I'd point you to our Zephyr model, or suggest having a look at the LLM leaderboard.

A 7B model will be appropriate. There are three tasks in the ToT paper:

  1. Game of 24
  2. Creative Writing
  3. Mini Crosswords

The Zephyr model card states that:
[screenshot from the model card not reproduced here]

so Zephyr may not be that useful here; another model such as Mistral or Fuyu may be a better choice.

The tasks in ToT are a type of text generation and question answering.
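For context on the first task: in Game of 24, the model is given four numbers and must reach 24 using +, -, *, and /. A few-shot "propose" prompt in the spirit of the princeton-nlp/tree-of-thought-llm repo might look like the sketch below; the wording is illustrative, not copied from the repo.

```python
# Illustrative few-shot "propose" prompt for Game of 24; each step combines
# two numbers and lists what remains, so the search can branch on each line.
propose_prompt = """Input: 4 9 10 13
Possible next steps:
4 + 9 = 13 (left: 10 13 13)
10 - 4 = 6 (left: 6 9 13)
13 - 9 = 4 (left: 4 4 10)

Input: {numbers}
Possible next steps:
"""
```

Each candidate line the model produces then becomes one "thought", i.e. one child node in the search tree.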

@rajveer43 (Contributor, Author)

This is still under development.

@huggingface huggingface deleted a comment from github-actions bot Nov 21, 2023
@rajveer43 (Contributor, Author)

@gante where should the tutorial be located?

@MKhalusova (Contributor)

> @gante where should the tutorial be located?

Based on earlier discussion, it should be in a community blog post. More context and instructions in the comment above: #26726 (comment)

@rajveer43 (Contributor, Author)

> @gante where should the tutorial be located?
> Based on earlier discussion, it should be in a community blog post. More context and instructions in the comment above: #26726 (comment)

I should target the blog repository for it. Okay, got it.

@github-actions (bot)

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@rajveer43 (Contributor, Author)

Work in progress!

@huggingface huggingface deleted a comment from github-actions bot Jan 23, 2024
@huggingface huggingface deleted a comment from github-actions bot Feb 19, 2024
@huggingface huggingface deleted a comment from github-actions bot Mar 15, 2024
@amyeroberts (Collaborator)

@rajveer43 Any update on this?

@rajveer43 (Contributor, Author)

No, I am not working on this; we can close it.

@amyeroberts (Collaborator)

I'll leave it open for now, in case anyone else in the community wants to work on this, and let it close if there's no activity.

@amyeroberts amyeroberts added the Feature request Request for a new feature label Mar 18, 2024
@rahulbshrestha commented May 15, 2024

Hey @amyeroberts! I wanted to work on this but I'm unclear about what is expected. If I create a Jupyter notebook where I implement Tree of Thoughts from scratch, and then use it to solve a problem (e.g. Game of 24), will that be enough? When would I need to use transformers in this case?

@amyeroberts (Collaborator)

> When would I need to use transformers in this case?

Hi @rahulbshrestha, yes, if this is to be added to the section of advanced uses in the generation section of the docs, then it should use transformers and the library's generate API.
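For example, a single expansion step with the generate API might look like the following sketch; the model (Mistral-7B-Instruct) and the prompt are stand-ins, and `device_map="auto"` assumes `accelerate` is installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model choice is illustrative; any open causal LM works the same way.
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype=torch.bfloat16,
                                             device_map="auto")

prompt = "Numbers: 4 9 10 13. Propose one arithmetic step toward 24:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample several candidate thoughts in a single generate call.
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=True,
                         temperature=0.7, num_return_sequences=3)
for seq in outputs:
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(seq[inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```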

@rahulbshrestha

> 1. Write a community blog post with the comprehensive tutorial (instructions on how to do it here; example of a high-quality community blog post here). I'd be happy to review it if you're interested!

@gante Hi! I sent a request but haven't been added yet to the blog-explorers organization, therefore, I can't read any of the instructions. Could I be added please (my handle)?

Also, where should I place the blog post? I'm thinking of creating a Jupyter notebook here: https://github.com/huggingface/blog, which I'll later convert to a .md file. Thanks for the help!

@rahulbshrestha commented Jun 7, 2024

Hi @amyeroberts @gante @MKhalusova ! I created a draft notebook here, and I would love to get feedback :)

A couple of points:

  • I observed better results with GPT-4 than with Mistral-7B, so although I've mentioned both models, the experiments use GPT-4 only. Is this fine, or would you prefer I only use an open-source LLM from Hugging Face?
  • I have created a Jupyter notebook, but I'll convert it to a .md file in the end.

@amyeroberts (Collaborator)

@rahulbshrestha Thanks for sharing!

> I observed better results with GPT-4 than with Mistral-7B, so although I've mentioned both models, the experiments use GPT-4 only. Is this fine, or would you prefer I only use an open-source LLM from Hugging Face?

An open model please!

@gante (Member) commented Jun 14, 2024

@rahulbshrestha yeah, using open models is very important for a full analysis of the process :) For instance, one might come up with a better strategy by looking at your blog post and combining it with the model's internal variables.
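As a concrete example of what "internal model variables" enables: with an open model you can recover per-token log-probabilities for each sampled thought directly from generate and use them as a cheap ranking signal. A sketch, reusing `model`, `tokenizer`, and `inputs` from the earlier generate example:

```python
# Sample candidate thoughts while also returning the sampling scores.
out = model.generate(**inputs, max_new_tokens=48, do_sample=True,
                     temperature=0.7, num_return_sequences=3,
                     return_dict_in_generate=True, output_scores=True)

# Per-token log-probabilities of the sampled continuations.
transition_scores = model.compute_transition_scores(
    out.sequences, out.scores, normalize_logits=True)

# Mean log-probability per candidate as a crude confidence score
# (padding after EOS is ignored here for brevity).
mean_logprobs = transition_scores.mean(dim=-1)
```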

@LysandreJik (Member)

cc @aymeric-roucher as well, as this may be relevant for Agents.

@rahulbshrestha

Hi! I used Mistral-7B and got worse results; e.g., in the Game of 24, the model doesn't come up with the correct solution. What should I do in this case? I don't have the resources to test with larger language models.
