
PaperHelper: Knowledge-Based LLM QA Paper Reading Assistant with Reliable References

[1] Introduction

Thanks to Ms. Freax for designing the Athenas-Oracle project.

Based on it, we made improvements and designed PaperHelper for machine learning scientists. By combining RAG Fusion and RAFT (RAG fine-tuning, with the backend fine-tuned using the GPT-4-1106-Preview API on the 52,000-paper MLArxivPapers corpus and the ArxivQA dataset), it effectively reduces hallucinations and improves retrieval relevance. We implemented an end-to-end application with parallel generation that provides paper readers with useful information based on references ranked by relevance. We also incorporated structural relationships to represent the extracted information.

In short, everything is designed to help machine learning researchers read papers more efficiently and to provide the most reliable references based on each paper's citations.

[2] Implementation Details

Overview

The assistant utilizes three tools: search, gather evidence, and answer questions. These tools enable it to find and parse relevant full-text research papers, identify the sections of a paper that help answer the question, summarize those sections in the context of the question (called evidence), and then generate an answer based on that evidence. It is implemented as an agent, so the LLM orchestrating the tools can adjust the input to paper searches, gather evidence with different phrases, and assess whether an answer is complete.
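To make the control flow concrete, here is a minimal sketch of the three-tool agent loop with all tool implementations stubbed out; the names (search_papers, gather_evidence, answer_question, AgentState) are illustrative placeholders, not PaperHelper's actual API.

```python
# Minimal sketch of the three-tool agent loop described above. All tools are stubs.
from dataclasses import dataclass, field

@dataclass
class Evidence:
    paper_id: str
    section: str
    summary: str

@dataclass
class AgentState:
    question: str
    papers: list = field(default_factory=list)
    evidence: list = field(default_factory=list)

def search_papers(query: str) -> list[str]:
    """Tool 1: find candidate full-text papers for a query (stubbed)."""
    return [f"paper-about-{query.replace(' ', '-')}"]

def gather_evidence(state: AgentState, phrase: str) -> None:
    """Tool 2: pull question-relevant sections and summarize them as evidence (stubbed)."""
    for paper_id in state.papers:
        state.evidence.append(Evidence(paper_id, "related work", f"Summary of '{phrase}' in {paper_id}"))

def answer_question(state: AgentState) -> str:
    """Tool 3: compose an answer grounded in the gathered evidence (stubbed)."""
    cited = ", ".join(e.paper_id for e in state.evidence) or "no sources"
    return f"Answer to '{state.question}' based on: {cited}"

def run_agent(question: str, max_rounds: int = 3) -> str:
    """The orchestrating LLM would pick tools and rephrase queries each round;
    here the loop is hard-coded just to show the control flow."""
    state = AgentState(question)
    for round_idx in range(max_rounds):
        state.papers += search_papers(f"{question} (rephrasing {round_idx})")
        gather_evidence(state, question)
        if len(state.evidence) >= 3:  # stand-in for the LLM judging the answer complete
            break
    return answer_question(state)

print(run_agent("How does RAFT reduce hallucination?"))
```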

Basic RAG

The basic RAG pipeline simply splits the search prompt into individual keywords in a crude manner, and may produce hallucinations without truly understanding the user's intent.

[Figure: Basic RAG]
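As a toy illustration of this limitation (not PaperHelper code), the snippet below retrieves chunks purely by word overlap with the crudely split query, which can surface the wrong sense of an ambiguous term:

```python
# Toy illustration of the "crude keyword split" behavior of basic RAG:
# the query is tokenized into words and chunks are scored by word overlap,
# so intent and phrasing are lost.
def naive_retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(c.lower().split())), c) for c in chunks]
    return [c for score, c in sorted(scored, reverse=True)[:k] if score > 0]

chunks = [
    "RAFT fine-tunes the model on domain documents before retrieval.",
    "A raft is a flat floating structure used on water.",
]
# Word overlap ranks the irrelevant "floating raft" chunk above the RAFT chunk.
print(naive_retrieve("what is RAFT fine-tuning", chunks))
```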

RAG Fusion with RAFT

Our system also integrates the RAFT method. This approach enhances the capability of LLMs on domain-specific RAG tasks, building on the core idea that letting an LLM "learn" the documents in advance improves RAG performance.
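For intuition, the following sketch shows the RAG Fusion side of this combination under simplified assumptions: several rephrasings of the user query each retrieve a ranked list, and the lists are merged with reciprocal rank fusion (RRF). The retrieve function and the query variants are placeholders; in PaperHelper the rephrasings would come from the RAFT fine-tuned model and retrieval from the vector store.

```python
# Sketch of the RAG Fusion idea: retrieve with several query variants,
# then merge the ranked lists with reciprocal rank fusion (RRF).
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Standard RRF: score(d) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def retrieve(query: str) -> list[str]:
    """Placeholder retriever returning a ranked list of document ids."""
    fake_index = {
        "raft fine-tuning": ["raft_paper", "rag_survey", "lora_paper"],
        "retrieval augmented fine tuning": ["raft_paper", "dsp_paper"],
        "reduce hallucination with retrieval": ["rag_survey", "raft_paper"],
    }
    return fake_index.get(query, [])

query_variants = [
    "raft fine-tuning",
    "retrieval augmented fine tuning",
    "reduce hallucination with retrieval",
]
fused = reciprocal_rank_fusion([retrieve(q) for q in query_variants])
print(fused)  # documents retrieved by several variants bubble to the top
```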

We fine-tuned a model through the OpenAI API on 52,000 domain-specific papers from the field of machine learning to augment PaperHelper's knowledge of the machine learning domain, thereby helping machine learning scientists read papers more efficiently and accurately.
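A minimal sketch of what such a fine-tuning job looks like with the OpenAI Python client (openai >= 1.0) is shown below; the training-file name and the base-model name are placeholders, and which base models can actually be fine-tuned depends on your account.

```python
# Sketch of launching a fine-tuning job with the OpenAI Python client.
# "ml_papers_qa.jsonl" is a placeholder for chat-formatted QA pairs built
# from the 52k papers / ArxivQA data; the model name is also a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Upload the chat-formatted JSONL training file.
training_file = client.files.create(
    file=open("ml_papers_qa.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Launch the fine-tuning job on the chosen base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4-1106-preview",  # placeholder; substitute a model your account can fine-tune
)
print(job.id, job.status)
```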

Extracting Relevant References

With RAFT in place, we can extract the reference section at the end of an article more efficiently. First, we use RAG to traverse all the references in the article. Then, drawing on the knowledge of the LLM, we refine this information with top-k selection to identify the literature most relevant to the article.
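The top-k step can be pictured as follows. The embed() function here is a random placeholder standing in for a real embedding model and the reference strings are made up, so only the shape of the computation matches what is described above.

```python
# Sketch of the top-k step: embed the paper's abstract and each extracted
# reference entry, then keep the k references closest to the paper.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: a real system would call an embedding model,
    so similarities here are arbitrary."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def top_k_references(abstract: str, references: list[str], k: int = 3) -> list[str]:
    query = embed(abstract)
    sims = [float(embed(ref) @ query) for ref in references]  # cosine similarity of unit vectors
    order = np.argsort(sims)[::-1][:k]
    return [references[i] for i in order]

refs = [
    "Reference 1: a survey of retrieval-augmented generation",
    "Reference 2: retrieval-aware fine-tuning for domain-specific QA",
    "Reference 3: an unrelated citation",
]
print(top_k_references("We study retrieval-augmented fine-tuning of LLMs ...", refs, k=2))
```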

Through the RAFT method, the model integrates up-to-date knowledge, enabling readers to explore academic papers further based on current information rather than outdated or misleading content.

[3] Usage

Use the following commands step by step:

  1. Clone the repository
git clone https://github.com/JerryYin777/PaperHelper.git
  2. Install dependencies
cd PaperHelper
pip install -r requirements.txt
  3. Set the OpenAI API key
cd .streamlit
touch secrets.toml  # put OPENAI_API_KEY = "sk-yourapikeyhere" in this file
  4. Start PaperHelper
streamlit run app.py

Note:

  1. Set allow_dangerous_deserialization: bool = True first; this option can be found in faiss.py (see the sketch after this list).
  2. Embed your PDF in the application first (click the button); otherwise you may get the error Exception: Directory index does not exist.
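For context, note 1 corresponds to the allow_dangerous_deserialization flag that LangChain's FAISS wrapper requires when loading a locally saved (pickle-backed) index. The sketch below shows the flag being passed at load time; the index path and embedding model are assumptions, and PaperHelper's actual loading code may differ.

```python
# Loading a locally saved FAISS index through LangChain requires opting in
# to pickle deserialization via allow_dangerous_deserialization=True.
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()  # uses OPENAI_API_KEY from secrets/env

vectorstore = FAISS.load_local(
    "index",                               # placeholder directory holding the saved index
    embeddings,
    allow_dangerous_deserialization=True,  # the flag from note 1
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
```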
