The goal is to collect the three main Strawberry papers that seemed to have high potential for improving self-improvement techniques for AI.
- Go through the news articles around the Q* and Strawberry leaks to find the papers relating to the project, listed below.
- Rank them by strength of effect, ease of implementation, ease of operationalization, raw dankness, etc., and attempt to sort them into relevant buckets.
- Write up code snippets so that people can treat them as templates.
One was on self-alignment, one was on distillation techniques (I think), and I'm pretty sure the third was Let's Verify Step by Step.
An open source repository dedicated to compiling papers and code relating to Strawberry/Q*.
A Very Rough Perplexity Page Describing It
AI Explained Video (Q* Explained)
Start by dumping links to the research papers here:
-
Description: Oops, I think this was the OG STaR paper (Self-Taught Reasoner), not Quiet-STaR. STaR bootstraps reasoning by having the model generate rationales, keeping only the ones that lead to correct answers, and fine-tuning on those.
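A minimal sketch of the STaR-style bootstrap loop, for anyone who wants a template. The helpers `generate_rationale`, `extract_answer`, and `finetune` are hypothetical stand-ins for your own model calls, not anything from the paper's code.

```python
# Sketch of one STaR round: generate rationales, keep the ones whose final
# answer matches the gold answer, fine-tune on the survivors.
# generate_rationale(model, question) -> str, extract_answer(rationale) -> str,
# and finetune(model, examples) -> model are hypothetical callables you supply.

def star_iteration(model, problems, generate_rationale, extract_answer, finetune):
    """One STaR bootstrap round over (question, gold_answer) pairs."""
    keep = []
    for question, gold_answer in problems:
        rationale = generate_rationale(model, question)
        if extract_answer(rationale) == gold_answer:
            # Only rationales that reach the correct answer become training data.
            keep.append({"question": question, "rationale": rationale})
    return finetune(model, keep)
```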
-
Description: Let's Verify Step by Step improves model reasoning by verifying each step in a reasoning chain rather than just the final answer. It trains a process reward model (process supervision) and shows that this markedly improves the accuracy of model-generated solutions compared to checking outcomes alone.
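Rough template of how a process reward model (PRM) gets used to rank candidate solutions. `score_step` is a hypothetical stand-in for the PRM; aggregating per-step correctness probabilities into a single solution score (product here, minimum is another option discussed in this line of work) is the part this sketch illustrates.

```python
# Rank candidate solutions by per-step verification instead of final answers.
# score_step(question, steps_so_far, step) -> probability in [0, 1] that the
# step is correct; it stands in for the process reward model.

import math

def solution_score(question, steps, score_step):
    """Aggregate per-step correctness probabilities (product, via log-sum)."""
    log_score = 0.0
    for i, step in enumerate(steps):
        p = score_step(question, steps[:i], step)
        log_score += math.log(max(p, 1e-9))  # guard against log(0)
    return math.exp(log_score)

def pick_best(question, candidates, score_step):
    """Pick the candidate (a list of reasoning steps) with the best process score."""
    return max(candidates, key=lambda steps: solution_score(question, steps, score_step))
```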
-
Description: Attention Is All You Need introduces the Transformer architecture, which has become the foundation for many advanced models, including GPT. It shows how attention mechanisms alone, without recurrence or convolution, can make sequence processing more efficient and effective.
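Tiny NumPy sketch of the scaled dot-product attention at the heart of the paper, softmax(QK^T / sqrt(d_k))V. Multi-head attention, masking, and positional encodings are left out for brevity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (n, d_k); V: (n, d_v). Returns the (n, d_v) attention output."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # weighted sum of values

# Example: self-attention over three tokens with 4-dimensional embeddings.
x = np.random.randn(3, 4)
out = scaled_dot_product_attention(x, x, x)
```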
-
GSM8K: A Dataset for Math Word Problems
Description: The GSM8K dataset consists of roughly 8.5K linguistically diverse, high-quality grade-school math word problems. It is used to benchmark AI models on multi-step mathematical reasoning.
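Quick way to pull the dataset down, assuming the public Hugging Face `gsm8k` dataset card (`pip install datasets`).

```python
from datasets import load_dataset

gsm8k = load_dataset("gsm8k", "main")     # "main" config; a "socratic" variant also exists
example = gsm8k["train"][0]
print(example["question"])                 # the word problem
print(example["answer"])                   # step-by-step solution ending in "#### <number>"

# The final numeric answer follows the "####" delimiter, handy for exact-match scoring.
final_answer = example["answer"].split("####")[-1].strip()
```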
-
Description: This paper explores the concept of test-time computation, where additional computational resources are used during the inference phase to generate multiple candidate solutions. The best solution is then selected using a verification model, improving overall accuracy.
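Best-of-N template for this idea. `generate` and `verify` are hypothetical stand-ins for whatever sampler and verifier model you plug in.

```python
# Spend extra inference-time compute: sample n candidate solutions and keep
# the one the verifier scores highest.
# generate(question) -> str candidate; verify(question, candidate) -> float score.

def best_of_n(question, generate, verify, n=16):
    """Sample n candidates and return the highest-scoring one."""
    candidates = [generate(question) for _ in range(n)]
    scored = [(verify(question, c), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])[1]
```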
-
Description: This paper goes into metacognition and fine-grained to coarse-grained skill acquisition in ways that transfer between LLMs.
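Very loose sketch (lots of assumptions) of one way to group fine-grained skill labels into coarse skill clusters that another model could reuse as prompt exemplars. `embed` is a hypothetical text-embedding function; the clustering is plain scikit-learn KMeans, not anything taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_coarse_skills(fine_skill_labels, embed, n_coarse=10):
    """Cluster fine-grained skill names into coarse skill groups.

    fine_skill_labels: list of strings (e.g. skill tags a strong model assigned
    to problems); embed(text) -> 1-D vector is a hypothetical embedding call.
    Returns {coarse_skill_id: [fine-grained labels]}.
    """
    vectors = np.stack([embed(label) for label in fine_skill_labels])
    clusters = KMeans(n_clusters=n_coarse, n_init=10).fit_predict(vectors)
    coarse = {}
    for label, cluster_id in zip(fine_skill_labels, clusters):
        coarse.setdefault(int(cluster_id), []).append(label)
    return coarse
```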