SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens • Paper • 2508.05305 • Published Aug 7 • 46
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms • Paper • 2511.04217 • Published 2 days ago • 8