News
AI models fail at robot control without human-designed building blocks but agentic scaffolding closes the gap
5+ hour, 6+ min ago (396+ words) A new framework from Nvidia, UC Berkeley, and Stanford systematically tests how well AI models can control robots through code. The findings: without human-designed abstractions, even top models fail, but methods like targeted test-time compute scaling close the gap. Researchers…
Microsoft's MAI-Transcribe-1 runs 2.5x faster than its predecessor at $0.36 per audio hour
3+ hour, 13+ min ago (173+ words) Microsoft has introduced MAI-Transcribe-1, a speech-to-text model supporting 25 languages that achieves the lowest word error rate of…
Google's Veo 3.1 Lite cuts video generation costs by more than half
2+ day, 1+ hour ago (179+ words) Google Deepmind is launching Veo 3.1 Lite, its most affordable video generation model yet. It costs less than…
Anthropic accidentally publishes Claude Code source code for anyone to find
2+ day, 1+ hour ago (153+ words) Anthropic inadvertently published parts of the source code for its AI coding tool, Claude Code. Developers discovered…
Qwen3.5-Omni learned to write code from spoken instructions and video without anyone training it to
2+ day, 7+ hour ago (509+ words) Alibaba has released Qwen3.5-Omni, an omnimodal AI model that processes text, images, audio, and video. It claims to beat Gemini 3.1 Pro on audio tasks and picked up an unexpected trick along the way: writing code from spoken instructions and video…
OpenAI launches a Codex plugin that runs inside Anthropic's Claude Code
2+ day, 8+ hour ago (353+ words) OpenAI released a plugin that embeds its AI coding assistant Codex directly into Anthropic's Claude Code. The plugin offers three core features: a standard code review, a more aggressive "adversarial review" that specifically challenges weaknesses and design decisions, and the…
Microsoft rolls out Copilot Cowork more broadly and lets AI models check each other's work
3+ day, 3+ hour ago (193+ words) Microsoft is making "Copilot Cowork" more widely available and…
MetaClaw framework trains AI agents while you're in meetings by checking your Google Calendar
4+ day, 4+ hour ago (525+ words) Researchers from four US universities have built a framework that improves AI agents during operation. It checks the user's Google Calendar to figure out when to train. Most AI agents built on large language models get trained once and then…
Google's new Gemini API Agent Skill patches the knowledge gap AI models have with their own SDKs
5+ day, 3+ hour ago (116+ words) Older 2.5 models saw much smaller…
Meta's hyperagents improve at tasks and improve at improving
5+ day, 8+ hour ago (806+ words) Researchers at Meta and several universities have developed "hyperagents," AI systems that don't just solve tasks but also optimize the very mechanism they use to get better. The approach works across different task areas and could open the door to…