arxiv:2506.05629

Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs

Published on Jun 5 · Submitted by abhi1nandy2 on Jun 9

Abstract

A new method using input-dependent soft prompting with a self-attention mechanism improves parameter-efficient fine-tuning for large language models, enhancing zero-shot domain transfer.

AI-generated summary

The performance of large language models on domain-specific tasks often necessitates fine-tuning, which is computationally expensive and technically challenging. This paper focuses on parameter-efficient fine-tuning using soft prompting, a promising approach that adapts pre-trained models to downstream tasks by learning a small set of parameters. We propose a novel Input-Dependent Soft Prompting technique with a self-Attention Mechanism (ID-SPAM) that generates soft prompts based on the input tokens and attends to different tokens with varying importance. Our method is simple and efficient, keeping the number of trainable parameters small. We show the merits of the proposed approach compared to state-of-the-art techniques on various tasks and demonstrate its improved zero-shot domain transfer capability.
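
As a rough illustration of the idea described above, the sketch below shows one way an input-dependent soft-prompt generator could be built: a small self-attention layer runs over the frozen LLM's input token embeddings and the result is pooled into a fixed number of soft prompt vectors. The module name, prompt length, pooling scheme, and projection sizes are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class IDSoftPromptGenerator(nn.Module):
    """Hypothetical input-dependent soft-prompt generator in the spirit of ID-SPAM."""

    def __init__(self, hidden_size: int, prompt_len: int = 10, num_heads: int = 4):
        super().__init__()
        self.prompt_len = prompt_len
        self.hidden_size = hidden_size
        # Lightweight self-attention over the input token embeddings; only this
        # module is trained, while the backbone LLM stays frozen.
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.to_prompt = nn.Linear(hidden_size, prompt_len * hidden_size)

    def forward(self, token_embeds: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, hidden) from the frozen embedding layer.
        # Self-attention lets each position weigh the other input tokens with
        # varying importance.
        attended, _ = self.attn(
            token_embeds, token_embeds, token_embeds,
            key_padding_mask=(attention_mask == 0),
        )
        # Masked mean-pool, then project to a fixed-length block of soft prompts.
        mask = attention_mask.unsqueeze(-1).to(attended.dtype)
        pooled = (attended * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-6)
        prompts = self.to_prompt(pooled).view(-1, self.prompt_len, self.hidden_size)
        return prompts  # (batch, prompt_len, hidden), to be prepended to the input
```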

Community

Paper author and submitter

ID-SPAM (Input-Dependent Soft Prompting technique with a self-Attention Mechanism) is here!
Efficiently adapt LLMs with input-aware soft prompts using self-attention
Minimal parameters, maximum adaptability: say goodbye to heavy fine-tuning!
Superior zero-shot domain transfer across diverse tasks
Accepted at ACL 2025 (Main) Conference

ID-SPAM learns to generate smarter prompts by attending to input tokens with varying importance, outperforming state-of-the-art parameter-efficient tuning methods. Compact, scalable, and ready for real-world domains!
