Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks Paper • 2005.11401 • Published May 22, 2020 • 12
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Paper • 2205.14135 • Published May 27, 2022 • 11
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 48
Stack More Layers Differently: High-Rank Training Through Low-Rank Updates Paper • 2307.05695 • Published Jul 11, 2023 • 22
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 33
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 98
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models Paper • 2308.00675 • Published Aug 1, 2023 • 35
Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve Paper • 2309.13638 • Published Sep 24, 2023
Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks Paper • 2307.02477 • Published Jul 5, 2023
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 87
LLaMA: Open and Efficient Foundation Language Models Paper • 2302.13971 • Published Feb 27, 2023 • 13
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 242
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size Paper • 1602.07360 • Published Feb 24, 2016 • 1
Physics of Language Models: Part 3.2, Knowledge Manipulation Paper • 2309.14402 • Published Sep 25, 2023 • 6
Wuerstchen: Efficient Pretraining of Text-to-Image Models Paper • 2306.00637 • Published Jun 1, 2023 • 12
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation Paper • 2208.12242 • Published Aug 25, 2022 • 11
Adding Conditional Control to Text-to-Image Diffusion Models Paper • 2302.05543 • Published Feb 10, 2023 • 40
UniAudio: An Audio Foundation Model Toward Universal Audio Generation Paper • 2310.00704 • Published Oct 1, 2023 • 19
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models Paper • 2109.10282 • Published Sep 21, 2021 • 6
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29, 2024 • 118