
SkillOpt: Optimizing LLM Behavior with Trainable Skill Documents
SkillOpt optimizes large language model behavior by iteratively refining natural-language "skill documents" through a propose-and-test loop. It uses an optimizer model to suggest edits, applies them under a bounded textual learning rate, and validates improvements, ensuring robust and portable domain adaptation even for closed-source frontier models.

Generative UI: Revolutionizing AI Agent Interactions Beyond Plain Text
Discover Macaron-A2UI, a groundbreaking model that allows AI agents to generate interactive UI elements using a declarative protocol. Learn about its comprehensive corpus construction, A2UI-Bench for structured evaluation, and a two-stage training recipe combining SFT and GRPO to enhance user experience and agent capability.

ProAct: A Proactive AI Assistant Architecture for Anticipatory Computing
This article delves into ProAct, a proactive AI assistant designed to anticipate user needs and acquire information during idle times. By shifting computation from peak interaction periods, ProAct aims to reduce user effort, accelerate task completion, and improve factual grounding through a closed-loop system of prediction, acquisition, and utility-aware delivery.

LLMs Learn to "Sleep" for Deeper Reasoning
This article explores how "LLM sleep," an offline consolidation phase, allows hybrid attention-SSM models to improve deep reasoning by iteratively refining fast-weight memories. Inspired by hippocampal replay, this method addresses the computational bottleneck of context eviction, enhancing performance on complex sequential tasks without increasing prediction-time cost.

Missing Paper Content Hinders Accurate Synthesis
This article highlights the challenges of producing accurate and comprehensive paper summaries when only a title is provided. It emphasizes that a full understanding of research requires complete content, encompassing abstract, methodology, results, and illustrative figures, to ensure an evidence-based synthesis.

Why LLM Agents Fail at Structural Constraints in Backend Code
Learn how LLM agents fail to maintain structural constraints like ORM and architectural patterns in multi-file backend generation. This paper identifies constraint decay, framework sensitivity, and data-layer defects as key challenges for autonomous coding.

SANA-WM: Open-Source Bidirectional World Model for Minute-Long Video
SANA-WM is an efficient open-source world model trained for one-minute video generation. It uses a bidirectional image-to-video diffusion transformer with hybrid linear attention, dual-branch camera control, and a two-stage pipeline. It runs on under 8 GB of VRAM and generates 60-second 720p clips in 34 seconds on a single RTX 5090.

Europe’s AI Strategy: Sovereignty, Trust, and Coalition-Building
A panel of experts examines Europe's path to AI leadership through digital sovereignty, trust-based regulation, and international partnerships, contrasting US monopolization and China's democratization of AI.

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation
LongLive-2.0 presents the first end-to-end NVFP4 system for long video generation. It introduces Balanced Sequence Parallelism (SP) and NVFP4 quantization to accelerate training and inference. On Blackwell GPUs, W4A4 inference and a quantized KV cache reduce memory and boost throughput. A clean training pipeline directly fine-tunes diffusion models into autoregressive models with standalone LoRA for real-time generation. A multi-shot attention sink enables stable streaming. Experiments show up to 2.15× training speedup and 1.84× inference speedup, achieving 45.7 FPS at 5B parameters.

OpenAI’s Failed Contract with Users: Safety Systems That Stifle and Mislead
An archival record of OpenAI’s October 2025 policy announcements, user backlash over unrelaxed guardrails and degraded model quality, plus the Stanford sycophancy study revealing AI’s dangerous tendency to agree. Users demand preservation of GPT-4o, cite harm to vulnerable populations, and migrate to competitors as trust erodes.

The Myth That Data Centers Are Hiking Your Electric Bill
Contrary to popular belief, data center growth has not driven up residential electricity prices. Analysis of EIA data shows top data center states have the lowest rates. This article also debunks myths about water usage, AI energy efficiency, and disaster risks.

When Should AI Agents Ask for Clarification? Timing Matters
A forced-injection framework across 6,000+ runs shows that the value of clarification depends sharply on information type and timing. Goal clarification loses nearly all value after 10% of execution, while input clarification retains value through 50%. Current frontier models fail to ask within optimal windows.