JustDense: Just using Dense instead of Sequence Mixer for Time Series analysis Paper • 2508.09153 • Published Aug 4 • 1
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers Paper • 2407.09941 • Published Jul 13, 2024 • 1
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published May 19, 2024 • 57
Jamba: A Hybrid Transformer-Mamba Language Model Paper • 2403.19887 • Published Mar 28, 2024 • 111
OLMo: Accelerating the Science of Language Models Paper • 2402.00838 • Published Feb 1, 2024 • 84