Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA
Abstract
Notes Writing enhances iterative RAG by generating concise notes at each retrieval step, improving reasoning and performance with minimal growth in output tokens.
Iterative RAG for multi-hop question answering faces challenges with lengthy contexts and the buildup of irrelevant information. This hinders a model's capacity to process and reason over retrieved content and limits performance. While recent methods focus on compressing retrieved information, they are either restricted to single-round RAG, require finetuning, or lack scalability in iterative RAG. To address these challenges, we propose Notes Writing, a method that generates concise and relevant notes from retrieved documents at each step, thereby reducing noise and retaining only essential information. This indirectly increases the effective context length of Large Language Models (LLMs), enabling them to reason and plan more effectively while processing larger volumes of input text. Notes Writing is framework-agnostic and can be integrated with different iterative RAG methods. We demonstrate its effectiveness with three iterative RAG methods, across two models and four evaluation datasets. Notes Writing yields an average improvement of 15.6 percentage points overall, with a minimal increase in output tokens.
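The core loop described above, where each retrieval step produces short notes rather than appending full documents to the context, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `retrieve` is a toy keyword matcher, and `write_notes` stands in for the LLM call that condenses retrieved documents into question-relevant notes.

```python
def retrieve(query, corpus):
    # Toy retriever: return documents sharing any term with the query.
    # A real system would use a dense or sparse retriever instead.
    terms = set(query.lower().split())
    return [doc for doc in corpus if terms & set(doc.lower().split())]

def write_notes(question, documents):
    # Notes-writing step: in the actual method an LLM condenses the
    # retrieved documents into concise, question-relevant notes.
    # Here we simply truncate each document as a placeholder.
    return [doc[:80] for doc in documents]

def iterative_rag(question, corpus, max_steps=3):
    # Only the concise notes accumulate across steps, not the full
    # retrieved documents, which keeps the working context small.
    notes = []
    query = question
    for _ in range(max_steps):
        documents = retrieve(query, corpus)
        for note in write_notes(question, documents):
            if note not in notes:
                notes.append(note)
        # A real system would prompt the LLM to either answer from the
        # notes or emit the next sub-query; we keep the query fixed here.
        query = question
    return notes
```

The key design point is that the model's reasoning context grows by a few note lines per step instead of by whole retrieved documents, which is what indirectly extends the effective context length.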
Community
Iterative RAG for multi-hop QA faces challenges with lengthy contexts and the buildup of irrelevant information. We propose NotesWriting, which generates concise notes from the retrieved documents at each step, thereby reducing noise and retaining only essential information. This indirectly increases the effective context length of LLMs, enabling them to reason and plan more effectively while processing larger volumes of input text.
The following papers were recommended by the Semantic Scholar API
- Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration (2025)
- An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering (2025)
- DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering (2025)
- Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps (2025)
- CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability (2025)
- Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering (2025)
- Hierarchical Document Refinement for Long-context Retrieval-augmented Generation (2025)