Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 124
Clue-instruct Collection Clue-instruct dataset and different models fine-tuned on it. • 8 items • Updated Nov 8, 2024
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models Paper • 2408.06663 • Published Aug 13, 2024 • 16
Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses Paper • 2408.00584 • Published Aug 1, 2024 • 6
Dynamic Few-Shot Learning for Knowledge Graph Question Answering Paper • 2407.01409 • Published Jul 1, 2024
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Paper • 2309.03883 • Published Sep 7, 2023 • 34
Tokenizer Adaptation Collection Collection of research on adapting tokenizers to specific domains and/or languages, with a special focus on sequence compression. • 4 items • Updated Jul 6, 2024