Tutorial on implementing the Tree of Thoughts (ToT) framework using a model #26726
Comments
cc @gante @patrickvonplaten @MKhalusova who have been working on a rework of our generation docs!
Hi @rajveer43 👋 My apologies for the delayed response, I am still catching up on notifications from my recent holidays 🤗 We would love to host a comprehensive tutorial about Tree of Thoughts! My suggestion would be to:
What do you think? 🤗
@gante I am also excited to see a demo of Tree of Thoughts added to the "advanced generation use cases" documentation page in Transformers. I think this will be a valuable resource for the community. I would be happy to write a comprehensive tutorial about Tree of Thoughts for the Hugging Face community blog. I will try my best to make it as informative and helpful as possible, and I will be sure to include instructions on how to use it, as well as examples of its use cases. Would you guide me on which model is best suited for it?
Feel free to ping me for the blog post PR review (in addition to @gante).
@rajveer43 if you have positive results with a 7B model, preferably a 7B model whose access is fully open (e.g. Llama 2 is NOT fully open, as it requires filling in a form), then that would be my suggestion. 7B models can be loaded by most people :) If you have no model preference, then I'd point to our Zephyr model, or suggest having a look at the LLM leaderboard.
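(Not part of the thread: a minimal sketch of loading and sampling from an openly licensed 7B model with transformers; the `HuggingFaceH4/zephyr-7b-beta` checkpoint name and the Game of 24 prompt are illustrative assumptions.)

```python
# Minimal sketch: load an open 7B chat model with transformers and sample from it.
# The checkpoint name and the prompt below are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceH4/zephyr-7b-beta"  # assumed openly licensed 7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on a single GPU
    device_map="auto",          # requires `accelerate` to be installed
)

prompt = "Use the numbers 4, 9, 10, 13 and the operations + - * / to reach 24."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```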
A 7B version will be appropriate. There are basically three tasks, so using Zephyr will not be that useful; some other model like Mistral or Fuyu may suit the tasks better.
This is still under development.
@gante where should the tutorial be located?
I should target the blog repository for it, okay, got it.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Under work!
@rajveer43 Any update on this?
No, I am not working on this; we can close it.
I'll leave it open for now, in case anyone else in the community wants to work on this, and let it close if there's no activity.
Hey @amyeroberts! I wanted to work on this PR but I'm unclear about what is expected. If I create a Jupyter notebook, where I implement Tree of Thoughts from scratch, and then use it to solve a problem (e.g. Game of 24), will that be enough? When will I need to use transformers in this case?
Hi @rahulbshrestha, yes, if this is to be added to the section of advanced uses in the generation section of the docs, then it should use transformers and the library's generate API.
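(As a rough illustration of what using the library's generate API for the thought-proposal step could look like; `propose_thoughts` and its prompt are hypothetical names, not code from the notebook discussed in this thread.)

```python
# Hypothetical helper: sample several candidate "thoughts" per step with
# transformers' generate() API. The function name and the proposal prompt are
# illustrative; the model and tokenizer are assumed to be already loaded.
def propose_thoughts(model, tokenizer, state, n_candidates=5):
    """Sample n_candidates one-step continuations ("thoughts") of the current state."""
    propose_prompt = f"Current state:\n{state}\nPropose the next step:"
    inputs = tokenizer(propose_prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.8,
        num_return_sequences=n_candidates,  # one generation per candidate thought
    )
    # Keep only the newly generated tokens, dropping the prompt.
    new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
    return [tokenizer.decode(t, skip_special_tokens=True).strip() for t in new_tokens]
```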
@gante Hi! I sent a request but haven't been added yet to the blog-explorers organization, so I can't read any of the instructions. Could I be added please (my handle)? Also, where should I place the blog? I'm thinking of creating a Jupyter notebook here: https://github.com/huggingface/blog, which I'll later change to a .md file. Thanks for the help!
Hi @amyeroberts @gante @MKhalusova ! I created a draft notebook here, and I would love to get feedback :) A couple points:
@rahulbshrestha Thanks for sharing!
An open model please!
@rahulbshrestha yeah, using open models is very important for a full analysis of the process :) For instance, one might come up with a better strategy by looking at your blog post and combining it with internal model variables.
cc @aymeric-roucher as well, maybe relevant for Agents.
Hi! I used Mistral-7B and got worse results, e.g. in the Game of 24 the model doesn't come up with the correct solution. What should I do in this case? I don't have the resources to test with larger language models.
Feature request
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: https://github.com/princeton-nlp/tree-of-thought-llm
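(To make the search procedure described above concrete, here is a rough, model-agnostic sketch of a breadth-first ToT loop; `propose_thoughts` and `score_thought` are assumed helpers, e.g. both backed by the same language model, and this is not code from the paper's repository.)

```python
# Rough sketch of a breadth-first Tree of Thoughts search: expand each partial
# solution into candidate thoughts, self-evaluate them, and keep only the most
# promising beam at every depth. Both helper functions are assumptions.
def tree_of_thoughts_bfs(initial_state, propose_thoughts, score_thought,
                         max_depth=3, beam_width=5):
    frontier = [initial_state]
    for _ in range(max_depth):
        candidates = []
        for state in frontier:
            # Expansion step: generate several candidate next thoughts per state.
            for thought in propose_thoughts(state):
                candidates.append(state + "\n" + thought)
        if not candidates:
            break
        # Evaluation step: score candidates (e.g. with the same LM) and prune.
        candidates.sort(key=score_thought, reverse=True)
        frontier = candidates[:beam_width]
    # Return the highest-ranked surviving state.
    return frontier[0]
```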
Motivation
A comprehensive tutorial on implementing Tree of Thoughts using any open-source model will give users a better understanding of it.
Your contribution
https://github.com/princeton-nlp/tree-of-thought-llm
https://arxiv.org/abs/2305.10601