arxiv:2410.21741

Enhancing Financial Question Answering with a Multi-Agent Reflection Framework

Published on Oct 29, 2024

AI-generated summary

A multi-agent framework with critic agents improves performance in financial QA tasks compared to single-agent models.

Abstract

While Large Language Models (LLMs) have shown impressive capabilities in numerous Natural Language Processing (NLP) tasks, they still struggle with financial question answering (QA), particularly when numerical reasoning is required. Recently, LLM-based multi-agent frameworks have demonstrated remarkable effectiveness in multi-step reasoning, which is crucial for financial QA tasks as it involves extracting relevant information from tables and text and then performing numerical reasoning on the extracted data to infer answers. In this study, we propose a multi-agent framework incorporating a critic agent that reflects on the reasoning steps and final answers for each question. Additionally, we enhance our system by adding multiple critic agents, each focusing on a specific aspect of the answer. Our results indicate that this framework significantly improves performance compared to single-agent reasoning, with an average performance increase of 15% for the LLaMA3-8B model and 5% for the LLaMA3-70B model. Furthermore, our framework performs on par with, and in some cases surpasses, larger single-agent LLMs such as LLaMA3.1-405B and GPT-4o-mini, though it falls slightly short compared to Claude-3.5 Sonnet. Overall, our framework presents an effective solution to enhance open-source LLMs for financial QA tasks, offering a cost-effective alternative to larger models like Claude-3.5 Sonnet.
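The abstract describes the framework only at a high level: a reasoning agent answers the question, and one or more critic agents reflect on the reasoning steps and the final answer, each critic focusing on a specific aspect. The sketch below is a minimal, framework-agnostic illustration of such a reasoner-plus-critics reflection loop; the `call_llm` helper, the prompts, the aspect list, and the iteration budget are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a multi-agent reflection loop for financial QA.
# NOTE: `call_llm` is a placeholder for whatever chat-completion client is used
# (e.g. a hosted LLaMA3-8B/70B endpoint); the prompts and critic aspects below
# are assumptions for illustration, not the authors' code.

from typing import Callable

LLMFn = Callable[[str], str]  # maps a prompt string to a model response string

# Each critic focuses on one aspect of the proposed answer (assumed aspects).
CRITIC_ASPECTS = [
    "Are the values extracted from the table/text correct?",
    "Is each arithmetic step in the reasoning chain correct?",
    "Does the final answer actually address the question asked?",
]

def reason(call_llm: LLMFn, context: str, question: str, feedback: str = "") -> str:
    """Reasoner agent: produce step-by-step reasoning and a final answer."""
    prompt = (
        f"Context (tables and text):\n{context}\n\n"
        f"Question: {question}\n"
        f"{'Previous critique to address: ' + feedback if feedback else ''}\n"
        "Think step by step, then give the final numeric answer on the last line."
    )
    return call_llm(prompt)

def critique(call_llm: LLMFn, context: str, question: str, answer: str) -> list[str]:
    """Critic agents: each checks one aspect and reports an issue, or nothing if satisfied."""
    issues = []
    for aspect in CRITIC_ASPECTS:
        prompt = (
            f"Context:\n{context}\n\nQuestion: {question}\n\n"
            f"Proposed reasoning and answer:\n{answer}\n\n"
            f"Check only this aspect: {aspect}\n"
            "Reply 'OK' if it is fine; otherwise describe the problem briefly."
        )
        verdict = call_llm(prompt).strip()
        if verdict.upper() != "OK":
            issues.append(verdict)
    return issues

def answer_with_reflection(call_llm: LLMFn, context: str, question: str,
                           max_rounds: int = 3) -> str:
    """Alternate reasoner and critics until all critics are satisfied or the budget runs out."""
    feedback, answer = "", ""
    for _ in range(max_rounds):
        answer = reason(call_llm, context, question, feedback)
        issues = critique(call_llm, context, question, answer)
        if not issues:
            break
        feedback = " ".join(issues)
    return answer
```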
