yuraedcel28@gmail.com

Multimodal Foundation Models Fall Short on Physical Reasoning: PHYX Benchmark Highlights Key Limitations in Visual and Symbolic Integration

State-of-the-art models now reach human-competitive accuracy on AIME, GPQA, MATH-500, and OlympiadBench, including on Olympiad-level problems. Recent multimodal foundation models have likewise improved results on benchmarks for disciplinary knowledge and mathematical reasoning. However, these evaluations miss a crucial aspect of machine intelligence: physical reasoning, which…

Samsung Researchers Introduced ANSE (Active Noise Selection for Generation): A Model-Aware Framework for Improving Text-to-Video Diffusion Models through Attention-Based Uncertainty Estimation

Video generation models have become a core technology for creating dynamic content by transforming text prompts into high-quality video sequences. Diffusion models, in particular, have established themselves as a leading approach for this task. These models work by starting from…