Long Context Cost-Optimal Grouped-Query Attention for Long-Context LLMs Paper • 2503.09579 • Published Mar 12 • 5
RNN Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling Paper • 2410.07145 • Published Oct 9, 2024 • 2
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling Paper • 2410.07145 • Published Oct 9, 2024 • 2