Latest News

StepFun Introduces Step-Audio-AQAA: A Fully End-to-End Audio Language Model for Natural Voice Interaction

Rethinking Audio-Based Human-Computer Interaction

Machines that can respond to human speech with equally expressive and natural audio have become a major goal in intelligent interaction systems. Audio-language modeling extends this vision by combining speech recognition, natural language understanding, and audio…


EPFL Researchers Unveil FG2 at CVPR: A New AI Model That Slashes Localization Errors by 28% for Autonomous Vehicles in GPS-Denied Environments

Navigating the dense urban canyons of cities like San Francisco or New York can be a nightmare for GPS systems. The towering skyscrapers block and reflect satellite signals, leading to location errors of tens of meters. For you and me,…


Building AI-Powered Applications Using the Plan → Files → Code Workflow in TinyDev

In this tutorial, we introduce the TinyDev class implementation, a minimal yet powerful AI code-generation tool that uses the Gemini API to transform simple app ideas into comprehensive, structured applications. Designed to run effortlessly in a notebook environment, TinyDev follows a clean…
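The Plan → Files → Code workflow can be sketched as three chained steps, each one prompt to a model. The sketch below is illustrative only: the function names, prompts, and the `stub_model` stand-in are assumptions, not TinyDev's actual API, and the stub replaces the real Gemini call so the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical sketch of a Plan -> Files -> Code pipeline; the real
# TinyDev class and its Gemini API calls may differ.

def plan(idea: str, ask: Callable[[str], str]) -> str:
    """Step 1: turn an app idea into a high-level plan."""
    return ask(f"Write a short implementation plan for: {idea}")

def files(plan_text: str, ask: Callable[[str], str]) -> List[str]:
    """Step 2: derive a file list from the plan (one path per line)."""
    reply = ask(f"List the files needed for this plan:\n{plan_text}")
    return [line.strip() for line in reply.splitlines() if line.strip()]

def code(file_list: List[str], ask: Callable[[str], str]) -> Dict[str, str]:
    """Step 3: generate source for each planned file."""
    return {path: ask(f"Write the contents of {path}") for path in file_list}

def stub_model(prompt: str) -> str:
    """Offline stand-in for a Gemini call, keyed on the prompt prefix."""
    if prompt.startswith("Write a short implementation plan"):
        return "A minimal CLI todo app."
    if prompt.startswith("List the files"):
        return "main.py\ntodo.py"
    return f"# generated for: {prompt[:40]}"

# Run the full pipeline: idea -> plan -> file list -> file contents.
project = code(files(plan("a todo app", stub_model), stub_model), stub_model)
```

Swapping `stub_model` for a real model client is the only change needed to make the pipeline generate actual code, which is the appeal of keeping each stage a plain prompt-to-text function.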


Microsoft AI Introduces Code Researcher: A Deep Research Agent for Large Systems Code and Commit History

Rise of Autonomous Coding Agents in System Software Debugging

The use of AI in software development has gained traction with the emergence of large language models (LLMs) capable of performing coding-related tasks. This shift has led to…


Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs

Post-training methods for pre-trained language models (LMs) depend on human supervision through demonstrations or preference feedback to specify desired behaviors. However, this approach faces critical limitations as tasks and model behaviors become very complex. Human supervision is unreliable in these…


Highlighted at CVPR 2025: Google DeepMind’s ‘Motion Prompting’ Paper Unlocks Granular Video Control

Key Takeaways: Researchers from Google DeepMind, the University of Michigan, and Brown University have developed “Motion Prompting,” a new method for controlling video generation using specific motion trajectories. The technique uses “motion prompts,” a flexible representation of movement that can…


Sakana AI Introduces Text-to-LoRA (T2L): A Hypernetwork that Generates Task-Specific LLM Adapters (LoRAs) based on a Text Description of the Task

Transformer models have significantly influenced how AI systems approach tasks in natural language understanding, translation, and reasoning. These large-scale models, particularly large language models (LLMs), have grown in size and complexity to the point where they encompass broad capabilities across…
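The core idea behind T2L can be sketched with a toy hypernetwork: a small network maps a task-description embedding to the low-rank LoRA factors A and B, which are then applied as the standard LoRA update W + BA. The dimensions, the single-linear-layer hypernetwork, and the random embedding below are all illustrative assumptions, not Sakana AI's actual architecture.

```python
import numpy as np

# Toy dimensions (assumed for illustration): model dim d, LoRA rank r,
# text-embedding dim e. T2L's real sizes and layers differ.
d, r, e = 16, 4, 8
rng = np.random.default_rng(0)

# Hypernetwork weights: one linear map per LoRA factor.
W_A = rng.normal(scale=0.02, size=(e, r * d))
W_B = rng.normal(scale=0.02, size=(e, d * r))

def generate_lora(task_embedding: np.ndarray):
    """Map a task-description embedding to LoRA factors A (r x d) and B (d x r)."""
    A = (task_embedding @ W_A).reshape(r, d)
    B = (task_embedding @ W_B).reshape(d, r)
    return A, B

# A frozen base weight matrix, and a vector standing in for the encoded
# text description of the task.
W = rng.normal(size=(d, d))
task = rng.normal(size=e)

A, B = generate_lora(task)
W_eff = W + B @ A   # standard LoRA update: the adapter is rank-r
```

The point of the sketch is the shape of the computation: a different task description yields different A and B, so one trained hypernetwork can emit a fresh rank-r adapter per task without any per-task fine-tuning.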
