latest articles

OpenAI’s Failed Contract with Users: Safety Systems That Stifle and Mislead

An archival record of OpenAI’s October 2025 policy announcements and the user backlash over guardrails that were never relaxed and over degraded model quality, alongside the Stanford sycophancy study on AI’s dangerous tendency to agree with users. Users demand preservation of GPT-4o, cite harm to vulnerable populations, and migrate to competitors as trust erodes.

Benchmark

What ByteShape's Qwen 3.6 35B Quants Reveal About Model Optimization

ByteShape released GGUF quantizations of Qwen 3.6 35B-A3B with NTP and MTP variants. Discover why lower bits per weight (bpw) isn't always optimal, how MTP boosts GPU generation speed by 20-40%, and why MMLU was excluded. Includes community benchmarks and hardware-specific recommendations.

Benchmark

Gemma 4 MTP Fails to Deliver Speed Gains on Top GPUs

Reddit users tested the work-in-progress Gemma 4 MTP model. Most high-end GPU configurations saw equal or worse performance compared to non-MTP inference, and only a mixed VRAM/CPU setup showed a significant speedup. Testers also reported stability issues, and the community anticipates further optimizations.

Safety

The Myth That Data Centers Are Hiking Your Electric Bill

Contrary to popular belief, data center growth has not driven up residential electricity prices. Analysis of EIA data shows top data center states have the lowest rates. This article also debunks myths about water usage, AI energy efficiency, and disaster risks.

Agentic Systems

When Should AI Agents Ask for Clarification? Timing Matters

A forced-injection framework across 6,000+ runs shows that the value of clarification depends sharply on information type and timing. Goal clarification loses nearly all value after 10% of execution, while input clarification retains value through 50%. Current frontier models fail to ask within optimal windows.

Agentic Systems

Verifiable Proofs for Auditing AI Agents on Solana

Explore how verifiable proofs enable transparent auditing of AI agents on the Solana blockchain, combining cryptographic guarantees with decentralized trust to ensure accountability and reliability in autonomous systems.

Agentic Systems

Kimi WebBridge: AI-Powered Browser Automation for Tedious Web Tasks

Kimi WebBridge is a browser extension that enables AI agents to automate web browsing tasks like clicking, form filling, and data extraction. It runs locally via Chrome DevTools Protocol, ensuring login sessions and page content never leave your device. Compatible with Kimi Code, Claude Code, Cursor, and more.

Agentic Systems

xAI Launches Grok Build Beta: CLI with Multi-Agent Coordination

xAI releases Grok Build Beta, a command-line interface for SuperGrok Heavy subscribers. Features include multi-agent coordination, skills adaptation, a plan viewer, marketplaces, and design polish commands. It can be installed via curl.

Agentic Systems

Grok Skills: Reusable Instruction Sets for Task Automation

xAI is developing a Skills feature for its Grok chatbot that stores reusable instruction sets for automation. Leaked screenshots and code references indicate modular templates for scheduled workflows, similar to recent moves by Anthropic and OpenAI.

Agentic Systems

2026 Agentic Coding Trends: The Era of AI Collaboration

The 2026 Agentic Coding Trends Report reveals how AI coding agents evolve from experimental tools to production systems, enabling multi-agent teams, long-running autonomous builds, and intelligent human oversight. Key trends include collapsed SDLC cycles, orchestration of specialized agents, and the transformation of engineers into strategic collaborators.

Agentic Systems

Interaction Models: Real-Time Human-AI Collaboration at Scale

Thinking Machines Lab introduces TML-Interaction-Small, a 276B MoE model architected for real-time, continuous audio-video-text exchange. It achieves state-of-the-art performance on interactivity benchmarks, enabling seamless turn-taking, interjections, and simultaneous tool use—scaling collaboration alongside intelligence.

Agentic Systems

Fast Byte Latent Transformer: Efficient Byte-Level Generation via Diffusion and Speculation

This paper introduces BLT Diffusion (BLT-D), BLT Self-speculation (BLT-S), and BLT Diffusion+Verification (BLT-DV) to accelerate byte-level language models. By replacing autoregressive decoding with block-wise diffusion and verification, the methods achieve over 50% memory-bandwidth reduction and up to 92% with larger blocks, while maintaining competitive performance on translation and code generation tasks.