
World News

New MIT class uses anthropology to improve chatbots | MIT News
Young adults growing up in the attention economy — preparing for adult life, with social media and chatbots competing for their attention — can easily fall into unhealthy relationships with digital platforms. But what if chatbots weren’t mere distractions from real life? Could they be designed humanely, as moral partners whose digital goal is…

Google AI Introduces Gemini Embedding 2: A Multimodal Embedding Model that Lets You Bring Text, Images, Video, Audio, and Docs into the Embedding Space
Google expanded its Gemini model family with the release of Gemini Embedding 2. This second-generation model succeeds the text-only gemini-embedding-001 and is designed specifically to address the high-dimensional storage and cross-modal retrieval challenges faced by AI developers building production-grade Retrieval-Augmented Generation (RAG) systems. The Gemini Embedding 2 release marks a significant technical shift in…

How to Build a Self-Designing Meta-Agent That Automatically Constructs, Instantiates, and Refines Task-Specific AI Agents
class MetaAgent:
    def __init__(self, llm: Optional[LocalLLM] = None):
        self.llm = llm or LocalLLM()

    def _capability_heuristics(self, task: str) -> Dict[str, Any]:
        t = task.lower()
        needs_data = any(k in t for k in ["csv", "dataframe", "pandas", "dataset", "table", "excel"])
        needs_math = any(k in t for k in ["calculate", "compute", "probability", "equation", "optimize", "derivative", "integral"])
        needs_writing =…

Fish Audio Releases Fish Audio S2: A New Generation of Expressive Text-to-Speech (TTS) with Absurdly Controllable Emotion
The landscape of Text-to-Speech (TTS) is moving away from modular pipelines toward integrated Large Audio Models (LAMs). Fish Audio’s release of S2-Pro, the flagship model within the Fish Speech ecosystem, represents a shift toward open architectures capable of high-fidelity, multi-speaker synthesis with sub-150ms latency. The release provides a framework for zero-shot voice cloning and…

A better method for planning complex visual tasks | MIT News
MIT researchers have developed a generative artificial intelligence-driven approach for planning long-term visual tasks, like robot navigation, that is about twice as effective as some existing techniques. Their method uses a specialized vision-language model to perceive the scenario in an image and simulate actions needed to reach a goal. Then a second model translates those…

How Joseph Paradiso’s sensing innovations bridge the arts, medicine, and ecology | MIT News
Joseph Paradiso thinks that the most engaging research questions usually span disciplines. Paradiso was trained as a physicist and completed his PhD in experimental high-energy physics at MIT in 1981. His father was a photographer and filmmaker working at MIT, MIT Lincoln Laboratory, and the MITRE Corporation, so he grew up in a house where artists,…
