Collections including paper arxiv:2005.14165

- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 14
- Universal Language Model Fine-tuning for Text Classification
  Paper • 1801.06146 • Published • 6
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 11

- The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
  Paper • 2306.01116 • Published • 31
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 11
- RoFormer: Enhanced Transformer with Rotary Position Embedding
  Paper • 2104.09864 • Published • 10
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 11

- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 11
- Learning to summarize from human feedback
  Paper • 2009.01325 • Published • 4
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 16

- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 170
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
  Paper • 2303.12712 • Published • 2
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 5
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Paper • 2201.11903 • Published • 9

- DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
  Paper • 2309.03883 • Published • 33
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 30
- Agents: An Open-source Framework for Autonomous Language Agents
  Paper • 2309.07870 • Published • 42
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47