Exploring the Sparse Frontier: How Researchers from Edinburgh, Cohere, and Meta Are Rethinking Attention Mechanisms for Long-Context LLMs
Sparse attention is emerging as a compelling approach for improving the ability of Transformer-based LLMs to handle long sequences. This matters because the standard self-attention mechanism at the core of LLMs scales poorly with sequence length: its computational cost grows quadratically with the length of the input.
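
To make the scaling concrete, here is a minimal NumPy sketch (not the paper's implementation) contrasting dense self-attention, whose score matrix is n×n, with one simple sparse pattern, a sliding window in which each query attends only to its most recent keys. The function names and the window size are illustrative assumptions.

```python
import numpy as np

def dense_attention(Q, K, V):
    """Standard self-attention: the score matrix has shape (n, n), so
    compute and memory grow quadratically with sequence length n."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])              # (n, n) scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def sliding_window_attention(Q, K, V, window=64):
    """Illustrative sparse pattern: each query attends only to the
    `window` most recent keys, so per-query work is O(window), not O(n)."""
    n, d = Q.shape
    out = np.empty_like(V)
    for i in range(n):
        lo = max(0, i - window + 1)
        s = Q[i] @ K[lo:i + 1].T / np.sqrt(d)             # at most `window` scores
        w = np.exp(s - s.max())
        w /= w.sum()
        out[i] = w @ V[lo:i + 1]
    return out

# Toy usage on random data
n, d = 512, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(dense_attention(Q, K, V).shape)                     # (512, 64)
print(sliding_window_attention(Q, K, V, window=64).shape) # (512, 64)
```

The dense version materializes all n² pairwise scores, while the windowed version touches only a fixed number of keys per query; that gap is the basic motivation for the sparse-attention methods the paper studies.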





