YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 71 • 4
LongNet: Scaling Transformers to 1,000,000,000 Tokens Paper • 2307.02486 • Published Jul 5, 2023 • 80 • 15
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 Paper • 2306.02707 • Published Jun 5, 2023 • 46 • 18