The goal is to collect the three main Strawberry papers that seemed to have high potential for improving self-improvement techniques for AI.
- Go through the news articles around the Q* and Strawberry leaks to find the papers relating to the project, listed below.
- Rank them by strength of effect, ease of implementation, ease of operationalization, raw dankness, etc., and attempt to sort them into relevant buckets.
- Write up code snippets so that people can treat them as templates.
One was on self-alignment, one was on distillation techniques (I think), and I'm pretty sure the third was Let's Verify Step by Step.
An open source repository dedicated to compiling papers and code relating to Strawberry/Q*.
A Very Rough Perplexity Page Describing It
AI Explained Video (Q* Explained)
Start by dumping links to the research papers here:
-
Description: Oops, I think this was the OG STaR paper (Self-Taught Reasoner), not Quiet-STaR. STaR bootstraps reasoning by having the model generate rationales, keeping only the ones that lead to correct answers, and fine-tuning on those.
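A minimal sketch of the STaR-style bootstrap loop, for anyone who wants a template. The helpers `generate_rationale`, `extract_answer`, and `finetune` are hypothetical stand-ins for your own model calls, not anything from the paper's code.

```python
# Sketch of one STaR round: generate rationales, keep the ones whose final
# answer matches the gold answer, fine-tune on the survivors.
# generate_rationale(model, question) -> str, extract_answer(rationale) -> str,
# and finetune(model, examples) -> model are hypothetical callables you supply.

def star_iteration(model, problems, generate_rationale, extract_answer, finetune):
    """One STaR bootstrap round over (question, gold_answer) pairs."""
    keep = []
    for question, gold_answer in problems:
        rationale = generate_rationale(model, question)
        if extract_answer(rationale) == gold_answer:
            # Only rationales that reach the correct answer become training data.
            keep.append({"question": question, "rationale": rationale})
    return finetune(model, keep)
```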
-
Description: Let's Verify Step by Step improves model reasoning by verifying each step in a reasoning chain rather than just the final answer. It trains a process reward model (process supervision) and shows that this markedly improves the accuracy of model-generated solutions compared to checking outcomes alone.
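Rough template of how a process reward model (PRM) gets used to rank candidate solutions. `score_step` is a hypothetical stand-in for the PRM; aggregating per-step correctness probabilities into a single solution score (product here, minimum is another option discussed in this line of work) is the part this sketch illustrates.

```python
# Rank candidate solutions by per-step verification instead of final answers.
# score_step(question, steps_so_far, step) -> probability in [0, 1] that the
# step is correct; it stands in for the process reward model.

import math

def solution_score(question, steps, score_step):
    """Aggregate per-step correctness probabilities (product, via log-sum)."""
    log_score = 0.0
    for i, step in enumerate(steps):
        p = score_step(question, steps[:i], step)
        log_score += math.log(max(p, 1e-9))  # guard against log(0)
    return math.exp(log_score)

def pick_best(question, candidates, score_step):
    """Pick the candidate (a list of reasoning steps) with the best process score."""
    return max(candidates, key=lambda steps: solution_score(question, steps, score_step))
```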
-
Description: Attention Is All You Need introduces the Transformer architecture, which has become the foundation for many advanced models, including GPT. It shows how attention mechanisms alone, without recurrence or convolution, can make sequence processing more efficient and effective.
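Tiny NumPy sketch of the scaled dot-product attention at the heart of the paper, softmax(QK^T / sqrt(d_k))V. Multi-head attention, masking, and positional encodings are left out for brevity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (n, d_k); V: (n, d_v). Returns the (n, d_v) attention output."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # weighted sum of values

# Example: self-attention over three tokens with 4-dimensional embeddings.
x = np.random.randn(3, 4)
out = scaled_dot_product_attention(x, x, x)
```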
-
GSM8K: A Dataset for Math Word Problems
Description: The GSM8K dataset consists of roughly 8.5K linguistically diverse, high-quality grade-school math word problems. It is used to benchmark AI models on multi-step mathematical reasoning.
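Quick way to pull the dataset down, assuming the public Hugging Face `gsm8k` dataset card (`pip install datasets`).

```python
from datasets import load_dataset

gsm8k = load_dataset("gsm8k", "main")     # "main" config; a "socratic" variant also exists
example = gsm8k["train"][0]
print(example["question"])                 # the word problem
print(example["answer"])                   # step-by-step solution ending in "#### <number>"

# The final numeric answer follows the "####" delimiter, handy for exact-match scoring.
final_answer = example["answer"].split("####")[-1].strip()
```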
-
Description: This paper explores the concept of test-time computation, where additional computational resources are used during the inference phase to generate multiple candidate solutions. The best solution is then selected using a verification model, improving overall accuracy.
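Best-of-N template for this idea. `generate` and `verify` are hypothetical stand-ins for whatever sampler and verifier model you plug in.

```python
# Spend extra inference-time compute: sample n candidate solutions and keep
# the one the verifier scores highest.
# generate(question) -> str candidate; verify(question, candidate) -> float score.

def best_of_n(question, generate, verify, n=16):
    """Sample n candidates and return the highest-scoring one."""
    candidates = [generate(question) for _ in range(n)]
    scored = [(verify(question, c), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])[1]
```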
-
Description: This paper goes into metacognition and fine-grained to coarse-grained skill acquisition in ways that transfer between LLMs.
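Very loose sketch (lots of assumptions) of one way to group fine-grained skill labels into coarse skill clusters that another model could reuse as prompt exemplars. `embed` is a hypothetical text-embedding function; the clustering is plain scikit-learn KMeans, not anything taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_coarse_skills(fine_skill_labels, embed, n_coarse=10):
    """Cluster fine-grained skill names into coarse skill groups.

    fine_skill_labels: list of strings (e.g. skill tags a strong model assigned
    to problems); embed(text) -> 1-D vector is a hypothetical embedding call.
    Returns {coarse_skill_id: [fine-grained labels]}.
    """
    vectors = np.stack([embed(label) for label in fine_skill_labels])
    clusters = KMeans(n_clusters=n_coarse, n_init=10).fit_predict(vectors)
    coarse = {}
    for label, cluster_id in zip(fine_skill_labels, clusters):
        coarse.setdefault(int(cluster_id), []).append(label)
    return coarse
```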