matlok's Collections
Papers - Fine-tuning
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Paper
•
2310.20587
•
Published
•
16
SELF: Language-Driven Self-Evolution for Large Language Model
Paper
•
2310.00533
•
Published
•
2
QLoRA: Efficient Finetuning of Quantized LLMs
Paper
•
2305.14314
•
Published
•
47
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper
•
2309.14717
•
Published
•
44
Table-GPT: Table-tuned GPT for Diverse Table Tasks
Paper
•
2310.09263
•
Published
•
39
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper
•
2401.01335
•
Published
•
64
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Paper
•
2403.15042
•
Published
•
26
Toolformer: Language Models Can Teach Themselves to Use Tools
Paper
•
2302.04761
•
Published
•
11
The Unreasonable Ineffectiveness of the Deeper Layers
Paper
•
2403.17887
•
Published
•
79
InternLM2 Technical Report
Paper
•
2403.17297
•
Published
•
30
LIMA: Less Is More for Alignment
Paper
•
2305.11206
•
Published
•
21
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper
•
2305.18290
•
Published
•
52
sDPO: Don't Use Your Data All at Once
Paper
•
2403.19270
•
Published
•
41
Deep reinforcement learning from human preferences
Paper
•
1706.03741
•
Published
•
3
Fine-tuning Language Models for Factuality
Paper
•
2311.08401
•
Published
•
28
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Paper
•
2310.12962
•
Published
•
14
Gecko: Versatile Text Embeddings Distilled from Large Language Models
Paper
•
2403.20327
•
Published
•
48
Model Stock: All we need is just a few fine-tuned models
Paper
•
2403.19522
•
Published
•
10
ReFT: Representation Finetuning for Language Models
Paper
•
2404.03592
•
Published
•
92
UltraFeedback: Boosting Language Models with High-quality Feedback
Paper
•
2310.01377
•
Published
•
5
RL for Consistency Models: Faster Reward Guided Text-to-Image Generation
Paper
•
2404.03673
•
Published
•
15
Stream of Search (SoS): Learning to Search in Language
Paper
•
2404.03683
•
Published
•
30
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Paper
•
2404.03820
•
Published
•
25
ORPO: Monolithic Preference Optimization without Reference Model
Paper
•
2403.07691
•
Published
•
64
Learn Your Reference Model for Real Good Alignment
Paper
•
2404.09656
•
Published
•
83
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Paper
•
2402.14811
•
Published
•
4
Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Paper
•
2404.10407
•
Published
•
1
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data
Paper
•
2404.12195
•
Published
•
12
Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
Paper
•
2303.15647
•
Published
•
4
Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer
Paper
•
2205.12148
•
Published
•
2
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models
Paper
•
2406.15718
•
Published
•
14
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Paper
•
2311.06668
•
Published
•
5
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Paper
•
2407.09025
•
Published
•
132
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
63
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Paper
•
2411.00412
•
Published
•
9
CLEAR: Character Unlearning in Textual and Visual Modalities
Paper
•
2410.18057
•
Published
•
200
LoRA vs Full Fine-tuning: An Illusion of Equivalence
Paper
•
2410.21228
•
Published
•
2
Cut Your Losses in Large-Vocabulary Language Models
Paper
•
2411.09009
•
Published
•
44
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
Paper
•
2411.09595
•
Published
•
71
No More Adam: Learning Rate Scaling at Initialization is All You Need
Paper
•
2412.11768
•
Published
•
41