arxiv:2502.12859

PAFT: Prompt-Agnostic Fine-Tuning

Published on Feb 18 · Submitted by kittttttt on Feb 19

Abstract

While Large Language Models (LLMs) adapt well to downstream tasks after fine-tuning, this adaptability often compromises prompt robustness, as even minor prompt variations can significantly degrade performance. To address this, we propose Prompt-Agnostic Fine-Tuning (PAFT), a simple yet effective approach that dynamically adjusts prompts during fine-tuning. This encourages the model to learn underlying task principles rather than overfitting to specific prompt formulations. PAFT operates in two stages: First, a diverse set of meaningful, synthetic candidate prompts is constructed. Second, during fine-tuning, prompts are randomly sampled from this set to create dynamic training inputs. Extensive experiments across diverse datasets and LLMs demonstrate that models trained with PAFT exhibit strong robustness and generalization across a wide range of prompts, including unseen ones. This enhanced robustness improves both model performance and inference speed while maintaining training efficiency. Ablation studies further confirm the effectiveness of PAFT.
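
To make the first stage concrete, here is a minimal illustrative sketch of what a candidate prompt pool might look like for a question-answering task. The templates, the `{question}` placeholder, and the validation helper are assumptions made for illustration; they are not the synthetic prompts or tooling from the paper.

```python
# Minimal sketch of a stage-1 artifact: a small pool of diverse candidate
# prompt templates for a question-answering task. PAFT constructs such
# candidates synthetically; these hand-written templates are illustrative
# assumptions only.
CANDIDATE_PROMPTS = [
    "Question: {question}\nAnswer:",
    "Please solve the following problem.\n{question}\nSolution:",
    "You are a careful assistant. Respond to the query below.\n{question}",
    "{question}\nThink it through, then give your final answer.",
    "Task: answer the question.\nInput: {question}\nOutput:",
]


def validate_prompt_pool(prompts, placeholder="{question}"):
    """Keep only templates that contain the required placeholder,
    so every candidate can be filled with a training example."""
    return [p for p in prompts if placeholder in p]


PROMPT_POOL = validate_prompt_pool(CANDIDATE_PROMPTS)
```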

Community

Paper author Paper submitter

Large Language Models (LLMs) often struggle with prompt robustness after fine-tuning, as small variations in prompts can lead to significant performance drops. To tackle this issue, we introduce Prompt-Agnostic Fine-Tuning (PAFT), a method that dynamically adjusts prompts during the fine-tuning process. PAFT consists of two key stages:

  1. Candidate Prompt Construction: A diverse set of meaningful synthetic candidate prompts is generated.
  2. Dynamic Fine-Tuning: During fine-tuning, prompts are randomly sampled from this constructed set to create varied training inputs.

Extensive experiments across multiple datasets and LLMs show that models trained with PAFT achieve improved robustness and generalization to a variety of prompts, including those not seen during training. This enhancement leads to better overall model performance and faster inference times, while also ensuring training efficiency. Ablation studies further validate the effectiveness of the PAFT approach.
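
To ground the second stage, the following is a minimal sketch, assuming a prompt pool from stage 1 and a dataset of question/answer records; the class name `PromptAgnosticDataset` and the field names are hypothetical, not the authors' released code.

```python
# Minimal sketch of PAFT's dynamic fine-tuning stage (stage 2): a PyTorch-style
# dataset wrapper that re-samples a candidate prompt template on every access.
# `PromptAgnosticDataset`, the `{question}` placeholder, and the record fields
# are illustrative assumptions, not the official implementation.
import random

from torch.utils.data import Dataset


class PromptAgnosticDataset(Dataset):
    def __init__(self, examples, prompt_pool, seed=0):
        self.examples = examples        # list of dicts: {"question": ..., "answer": ...}
        self.prompt_pool = prompt_pool  # candidate templates from stage 1
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        ex = self.examples[idx]
        # The template is drawn independently on every lookup, so across epochs
        # the same example is paired with varying prompt formulations.
        template = self.rng.choice(self.prompt_pool)
        return {
            "input_text": template.format(question=ex["question"]),
            "target_text": ex["answer"],
        }
```

Because a template is drawn on every `__getitem__` call, the same underlying example appears under different prompt formulations across epochs, which is the mechanism that discourages overfitting to any single prompt.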


The Prompt-Agnostic Fine-Tuning (PAFT) approach you've introduced is a significant advancement in enhancing the robustness of Large Language Models (LLMs) to prompt variations. By dynamically adjusting prompts during fine-tuning, PAFT effectively encourages models to grasp the underlying task principles rather than overfitting to specific prompt formulations. This methodology markedly improves performance.

At Navigable AI, we've been exploring similar strategies to refine our models' responses to varied prompts. Our approach aligns with the principles of PAFT, aiming to ensure that our AI systems deliver consistent and accurate outputs regardless of prompt variations. For those interested in learning more about our methods and applications, we invite you to visit our website.

Paper author

Hello! Thank you very much for your insightful analysis and positive feedback on the PAFT strategy. We completely agree with your perspective that PAFT, by dynamically adjusting prompts, significantly enhances the robustness of Large Language Models (LLMs) to variations in prompts. This approach not only mitigates the issue of performance fluctuations across different prompts but, more importantly, encourages the model to deeply understand the essence of downstream tasks, thereby improving overall performance.

This robustness is not only a matter of learning the essence of tasks; it also plays a vital role in building user-friendly systems. By reducing the model's dependency on specific prompt formulations, PAFT delivers more consistent and reliable performance across diverse inputs, which is essential for creating intuitive and dependable AI applications.

We are also delighted to see Navigable AI's exploration and implementation of similar strategies. This focus on prompt robustness is a key direction for advancing AI systems. We look forward to more opportunities to exchange ideas and collaborate on driving progress in this field!

Do you have a repository you'd like to share?

Paper author

Thank you for your interest in our work! We’re excited to share that we will soon open-source the code for PAFT, along with all the prompts we’ve generated during our research. We believe this will help the community further explore and build upon the concept of prompt-agnostic fine-tuning.
