RUC-NLPIR/RAG-Reading-List

RAG-Reading-List

This repository collects recent studies on RAG methods, benchmarks, and toolkits. Updates through pull requests are welcome.

📄 Method

Text only

  1. Measuring and Narrowing the Compositionality Gap in Language Models
    • EMNLP 2023 Findings, 2022-10-07, https://aclanthology.org/2023.findings-emnlp.378/
    • Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah Smith, Mike Lewis
    • This paper proposes self-ask, an elicitive prompting strategy that asks the LLM itself to decompose a complex query into sub-questions. The paper also introduces the Bamboogle dataset.
  2. Answering Questions by Meta-Reasoning over Multiple Chains of Thought
    • EMNLP 2023, 2023-04-25, https://aclanthology.org/2023.emnlp-main.364/
    • Ori Yoran, Tomer Wolfson, Ben Bogin, Uri Katz, Daniel Deutch, Jonathan Berant
    • This paper proposes a multi-chain reasoning method (meta-CoT) that combines multiple reasoning chains to infer the final answer.
  3. In-Context Retrieval-Augmented Language Models
    • arXiv, 2023-08-01, https://arxiv.org/abs/2302.00083
    • Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham
    • Prepending retrieved results to the input can effectively improve LLMs' performance. Retrieval is repeated after every $k$ tokens, and performance improves when $k$ is small (more frequent retrieval). Reranking the retrieved results is also helpful.
  4. PlanXRAG: Planning-guided Retrieval Augmented Generation
  5. RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation
  6. Retrieval-Augmented Generation with Estimation of Source Reliability
  7. RuleRAG: Rule-guided retrieval-augmented generation with language models for question answering
  8. SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback
  9. RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation
  10. Agentic Information Retrieval
  11. TRACE the Evidence: Constructing Knowledge-Grounded Reasoning Chains for Retrieval-Augmented Generation
    • EMNLP 2024 Findings, 2024-06-17, https://aclanthology.org/2024.findings-emnlp.496
    • Jinyuan Fang, Zaiqiao Meng, Craig MacDonald
    • This paper proposes TRACE, a method that extracts logically connected knowledge triples from the retrieved documents.
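
The retrieval-stride idea summarized under item 3 (In-Context Retrieval-Augmented Language Models) can be illustrated with a small sketch. This is not the authors' implementation: `overlap_retrieve`, `toy_next_token`, `generate_with_stride`, and the `query_len` parameter are hypothetical names introduced here, the retriever is a toy word-overlap scorer, and the "language model" is a stand-in function. The only point is to show how a smaller `k` triggers more retrieval calls during generation.

```python
from typing import Callable, List, Tuple


def overlap_retrieve(query_tokens: List[str], corpus: List[str]) -> str:
    """Toy retriever (assumption): pick the document sharing the most words with the query."""
    query = set(query_tokens)
    return max(corpus, key=lambda doc: len(query & set(doc.split())))


def toy_next_token(retrieved: str, tokens: List[str]) -> str:
    """Stand-in for an LLM (assumption): emit the first retrieved word not yet generated."""
    for word in retrieved.split():
        if word not in tokens:
            return word
    return "<eos>"


def generate_with_stride(
    prompt_tokens: List[str],
    corpus: List[str],
    next_token: Callable[[str, List[str]], str],
    k: int,
    max_new_tokens: int,
    query_len: int = 8,
) -> Tuple[List[str], int]:
    """Generate tokens, refreshing the retrieved document every k generated tokens."""
    tokens = list(prompt_tokens)
    retrieved = ""
    retrieval_calls = 0
    for step in range(max_new_tokens):
        if step % k == 0:
            # Use the most recent tokens as the retrieval query.
            retrieved = overlap_retrieve(tokens[-query_len:], corpus)
            retrieval_calls += 1
        # The retrieved document conditions the next-token prediction.
        tokens.append(next_token(retrieved, tokens))
    return tokens, retrieval_calls


corpus = [
    "paris is the capital of france",
    "berlin is the capital of germany",
]
prompt = ["capital", "of", "france"]
tokens, calls = generate_with_stride(prompt, corpus, toy_next_token, k=2, max_new_tokens=4)
print(tokens, calls)
```

Lowering `k` to 1 in this sketch retrieves before every generated token, while raising it to 4 retrieves only once, which mirrors the frequency trade-off described in the summary.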

Multimodal

  1. MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
  2. MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-Augmented Generation via Knowledge-Enhanced Reranking and Noise-Injected Training
  3. Retrieval-Augmented Multimodal Language Modeling
  4. Enhancing Multi-modal Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation
  5. An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models
  6. VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
    • arXiv, 2024-10-14, https://arxiv.org/abs/2410.10594
    • Shi Yu, Chaoyue Tang, Bokai Xu, Junbo Cui, Junhao Ran, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
    • TL;DR: The authors treat document pages directly as images instead of splitting them into text or image blocks. They build the VisRAG framework from a multimodal retriever and visual-understanding LLMs, and evaluate models on various visual QA tasks.
  7. iRAG: Advancing RAG for Videos with an Incremental Approach
  8. Video Enriched Retrieval Augmented Generation Using Aligned Video Captions
  9. Towards Retrieval Augmented Generation over Large Video Libraries

📊 Benchmark

  1. Not All Languages are Equal: Insights into Multilingual Retrieval-Augmented Generation

Conversational

  1. RAC: Retrieval-augmented Conversation Dataset for Open-domain Question Answering in Conversational Settings

Analysis

  1. Long Context RAG Performance of Large Language Models

🛠️ Toolkit

  1. RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit