Researchers from Fudan University Introduce Lorsa: A Sparse Attention Mechanism That Recovers Atomic Attention Units Hidden in Transformer Superposition

Large Language Models (LLMs) have gained significant attention in recent years, yet understanding their internal mechanisms remains challenging. When examining individual attention heads in Transformer models, researchers have identified specific functionalities in some heads, such as induction heads that predict…
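To make the induction-head idea concrete, here is a minimal sketch (not Lorsa's code; all names are illustrative) of the classic induction pattern: given a repeated sequence [A][B] … [A], an induction head attends from the second [A] back to [B], letting the model predict [B] again. The snippet scores how much of a head's attention mass lands on those induction targets.

```python
# Minimal, illustrative sketch of detecting induction-head behavior.
# Not from the Lorsa paper; function names are hypothetical.
import numpy as np

def induction_targets(tokens):
    """For each position, index of the token that followed the most
    recent earlier occurrence of the current token (-1 if none)."""
    last_seen = {}
    targets = [-1] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok in last_seen:
            targets[i] = last_seen[tok] + 1  # token after prior occurrence
        last_seen[tok] = i
    return targets

def induction_score(attn, tokens):
    """Average attention mass a head places on its induction targets."""
    targets = induction_targets(tokens)
    scores = [attn[i, t] for i, t in enumerate(targets) if 0 <= t <= i]
    return float(np.mean(scores)) if scores else 0.0

# Toy example: "A B C A" -- an induction head should attend from the
# final A (position 3) back to B (position 1).
tokens = ["A", "B", "C", "A"]
attn = np.array([
    [1.0, 0.0,  0.0,  0.0],
    [0.5, 0.5,  0.0,  0.0],
    [0.4, 0.3,  0.3,  0.0],
    [0.1, 0.8,  0.05, 0.05],  # heavy attention to B
])
print(induction_score(attn, tokens))  # high score -> induction-like head
```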

This AI Paper Introduces WebThinker: A Deep Research Agent that Empowers Large Reasoning Models (LRMs) for Autonomous Search and Report Generation

Large reasoning models (LRMs) have shown impressive capabilities in mathematics, coding, and scientific reasoning. However, when relying solely on internal knowledge, they face significant limitations in addressing complex information-research needs. These models struggle with conducting thorough web information retrieval…
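For context, the general "search, reason, write" loop that deep-research agents automate can be sketched as follows. This is not WebThinker's actual implementation; `call_lrm` and `web_search` are hypothetical stand-ins you would back with a real reasoning model and search API.

```python
# Hedged sketch of a generic deep-research agent loop.
# NOT WebThinker's code; call_lrm and web_search are hypothetical stubs.
from typing import Callable

def research_agent(question: str,
                   call_lrm: Callable[[str], str],
                   web_search: Callable[[str], str],
                   max_steps: int = 3) -> str:
    notes = []
    for _ in range(max_steps):
        # Ask the reasoning model what to look up next, given notes so far.
        query = call_lrm(f"Question: {question}\nNotes: {notes}\n"
                         "Propose one web search query, or say DONE.")
        if query.strip() == "DONE":
            break
        # Gather external evidence instead of relying on internal knowledge.
        notes.append(web_search(query))
    # Draft the final report from the accumulated evidence.
    return call_lrm(f"Write a report answering: {question}\nEvidence: {notes}")

# Usage with trivial stubs (replace with a real LRM and search backend):
if __name__ == "__main__":
    answers = iter(["population of Reykjavik", "DONE", "Report: ..."])
    print(research_agent("How many people live in Reykjavik?",
                         call_lrm=lambda prompt: next(answers),
                         web_search=lambda q: f"[search results for '{q}']"))
```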