Latest News

VERINA: Evaluating LLMs on End-to-End Verifiable Code Generation with Formal Proofs

LLM-Based Code Generation Faces a Verification Gap

LLMs have shown strong performance in programming and are widely adopted in tools like Cursor and GitHub Copilot to boost developer productivity. However, due to their probabilistic nature, LLMs cannot provide formal guarantees…

Do AI Models Act Like Insider Threats? Anthropic’s Simulations Say Yes

Anthropic’s latest research investigates a critical security frontier in artificial intelligence: the emergence of insider threat-like behaviors from large language model (LLM) agents. The study, “Agentic Misalignment: How LLMs Could Be Insider Threats,” explores how modern LLM agents respond when…

Teaching Mistral Agents to Say No: Content Moderation from Prompt to Response

In this tutorial, we’ll implement content moderation guardrails for Mistral agents to ensure safe and policy-compliant interactions. By using Mistral’s moderation APIs, we’ll validate both the user input and the agent’s response against categories like financial advice, self-harm, PII, and…

Building Production-Ready Custom AI Agents for Enterprise Workflows with Monitoring, Orchestration, and Scalability

In this tutorial, we walk you through the design and implementation of a custom agent framework built on PyTorch and key Python tooling, ranging from web intelligence and data science modules to advanced code generators. We’ll learn how to wrap…

Texas A&M Researchers Introduce a Two-Phase Machine Learning Method Named ‘ShockCast’ for High-Speed Flow Simulation with Neural Temporal Re-Meshing

Challenges in Simulating High-Speed Flows with Neural Solvers

Modeling high-speed fluid flows, such as those in supersonic or hypersonic regimes, poses unique challenges due to the rapid changes associated with shock waves and expansion fans. Unlike low-speed flows, where fixed…

Google Researchers Release Magenta RealTime: An Open-Weight Model for Real-Time AI Music Generation

Google’s Magenta team has introduced Magenta RealTime (Magenta RT), an open-weight, real-time music generation model that brings unprecedented interactivity to generative audio. Licensed under Apache 2.0 and available on GitHub and Hugging Face, Magenta RT is the first large-scale music…

DeepSeek Researchers Open-Sourced a Personal Project named ‘nano-vLLM’: A Lightweight vLLM Implementation Built from Scratch

DeepSeek researchers have just released a personal project named ‘nano-vLLM’, a minimalistic and efficient implementation of the vLLM (virtual Large Language Model) engine, designed specifically for users who value simplicity, speed, and transparency. Built entirely from scratch in…
