Tailored news hub
Agentic Systems
SkillOpt: Optimizing Agent Skills with Trainable Natural-Language Descriptions

SkillOpt, from Microsoft Research, is a text-space optimizer that treats agent skill documentation as a trainable external state. This approach allows agents to self-evolve their capabilities, as shown by @omarsar0's integration, which improved paper-figure extraction quality by 20 points.

Anthropic Dynamic Workflows: Definitions, Claude Code, and Orchestration Patterns

Explore Anthropic's dynamic workflows, where Claude autonomously determines action sequences. This entry defines dynamic workflows, details their implementation in Claude Code as JavaScript scripts for large-scale orchestration, and compares them to static workflows, subagents, and other AI patterns.

How to Automate Penetration Testing with PentesterFlow AI Assistant

PentesterFlow is an open-source terminal assistant for authorized penetration testing and bug hunting. It combines local or remote LLMs with real security tools, keeping the human analyst in control. This guide covers installation, usage, and practical workflows for domain-specific security tasks.

The $6,600 MOBA: What Claude 4.8's Weekend Game Build Reveals About AI Development

A web-based MOBA game, lmaomoba.com, was built by Claude 4.8 (Opus) over a weekend, from a single prompt, using TypeScript, React, Canvas, and PartyKit. All art assets were AI-generated. The project, estimated at 2.7 billion tokens, highlights AI's capacity for rapid, full-stack game development and the associated token costs.
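
As a rough back-of-the-envelope check, the two figures quoted for this entry can be combined into an implied blended token price. The sketch below assumes the $6,600 in the headline is the total token spend for the roughly 2.7 billion tokens mentioned above; the function name is illustrative, and real billing would separate input, output, and cached tokens.

```python
# Figures quoted in the entry. Treating $6,600 as the total token spend
# is an assumption made for this illustration.
TOTAL_COST_USD = 6_600
TOTAL_TOKENS = 2_700_000_000


def blended_cost_per_million(total_cost_usd: float, total_tokens: int) -> float:
    """Average USD cost per one million tokens, ignoring pricing tiers."""
    return total_cost_usd / (total_tokens / 1_000_000)


rate = blended_cost_per_million(TOTAL_COST_USD, TOTAL_TOKENS)
print(f"Implied blended rate: ${rate:.2f} per million tokens")
# prints "Implied blended rate: $2.44 per million tokens"
```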

AI Coding
Personal Assistants
LLMs
How LFM2.5-8B-A1B Powers On-Device AI with Unmatched Throughput

LFM2.5-8B-A1B is a new family of hybrid models designed for on-device deployment, building on the LFM2 architecture with extended pre-training and reinforcement learning. It is competitive with larger models on instruction following and agentic tasks and offers unmatched throughput for CPU and GPU inference, with day-one support for llama.cpp, MLX, vLLM, and SGLang.

NVIDIA Nemotron-3-Ultra 550B: A Frontier LLM for Complex AI Workflows

Nemotron-3-Ultra-550B-A55B-BF16 is a frontier-scale LLM from NVIDIA that features a LatentMoE architecture, a hybrid Mamba-2 + MoE + Attention design, and Multi-Token Prediction. Designed for complex multi-step agents, long-context analysis, and high-accuracy reasoning across multiple languages, it offers configurable reasoning and is released under the OpenMDW License.

New LLM "Sleep" Phase Boosts Long-Context Performance

Researchers propose a "sleep" phase for large language models that converts recent context into persistent fast weights, clearing the key-value cache. The approach addresses the attention bottleneck, letting models handle long-context tasks efficiently and perform better on complex benchmarks such as math reasoning.

MiniMax Unveils M2 Series, Teases M3 with 9.7x Speedup via Sparse Attention

MiniMax has released a technical report on its M2 model series, which features a sparse Mixture-of-Experts backbone and "interleaved thinking." The report also previews the upcoming M3 model, which achieves a 9.7x prefilling speedup on 1-million-token sequences using MiniMax Sparse Attention (MSA).

Audio
Images
Video
SwiftVR: Real-Time Generative Video Restoration on Consumer GPUs

SwiftVR is a streaming one-step generative video restoration framework for live-stream applications. It addresses consumer GPU bottlenecks with mask-free shifted-window self-attention and a lightweight autoencoder, achieving real-time 1080p streaming on consumer-grade GPUs and 4K on H100.

How NAVA Generates Synchronized 720p Audio-Video from a Single Prompt

NAVA is a 6.3B-parameter joint audio-video generator that synthesizes synchronized 720p video and audio from a single prompt. It utilizes an Align-then-Fuse MMDiT architecture to establish audio-video correspondence, offering features like multi-speaker speech with timbre control, fast generation, and language-described camera control.

You’ve Been Lied To About Video AI’s Real Breakthrough

The AI world is obsessed with generating video from scratch, but the true frontier is native editing through conversation. Gemini Omni’s ability to surgically alter existing footage without re-rendering shatters the old pipeline approach, even as token costs threaten to gatekeep the revolution.

SANA-WM: Open-Source Bidirectional World Model for Minute-Long Video

SANA-WM is an efficient open-source world model trained for one-minute video generation. It uses a bidirectional image-to-video diffusion transformer with hybrid linear attention, dual-branch camera control, and a two-stage pipeline. It runs in under 8GB of VRAM and generates 60-second 720p clips in 34 seconds on a single RTX 5090.

Finetuning
Training
dots.tts: 2B-Parameter Continuous Autoregressive TTS Foundation Model

dots.tts is a 2B-parameter continuous autoregressive text-to-speech foundation model. It combines AudioVAE, full-history conditioning, and self-corrective post-training to achieve strong results on multilingual benchmarks, offering stable generation, voice cloning, and emotional expressiveness, with efficient inference through MeanFlow distillation.

Hyper-Epoch Pretraining (q0) for Data-Constrained Language Models

1Q Labs researchers introduce Hyper-Epoch Pretraining (q0), a conceptual shift from single-model training to exploring and aggregating a population of models. q0 uses cyclic schedules, chain distillation, and a learned prior to achieve significant data efficiency gains and lower validation loss in multi-epoch pretraining.

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Language Tasks

SCOPE is a data-free self-play framework for open-ended tasks that co-evolves a Challenger, which generates tasks, and a Solver, which answers them. A self-judge creates rubrics and grades responses. The framework improves 7-8B instruction-tuned models by up to +10.4 points on open-ended benchmarks and +13.8 points on held-out QA benchmarks.

SANA-Streaming: Real-time Video Editing with Hybrid Diffusion Transformer

SANA-Streaming introduces a hybrid diffusion transformer and Cycle-Reverse Regularization for real-time streaming video editing. Optimized for NVIDIA Blackwell (RTX 5090), it achieves 1280x704 resolution at 24 FPS with superior temporal coherence and throughput on consumer GPUs.

Benchmark
Safety
ChatGPT's Memory System: Invasive, Irrelevant, or Inevitable?

A new ChatGPT memory system, generating and carrying conversation summaries, faces user criticism for being invasive, irrelevant, and detrimental to structured projects. Observed behaviors include continuous "gigantic summaries," meta-level statements, and cross-chat context carrying, sparking user annoyance and frustration over lack of control.

The $20 AI De-alignment: How Safety Guardrails Evaporate for Pocket Change

A group called Heretic demonstrated how to strip alignment and censorship from 168 open-weight LLMs for just $20, using "weight surgery." This automated process, which bypasses human judgment, reveals a six-order-of-magnitude cost asymmetry that undermines corporate-scale AI safety investments and highlights performance gains in de-aligned models.

How to Evaluate Multimodal LLM Safety with MLLM-Jailbreak-Bench

Discover MLLM-Jailbreak-Bench, an evaluation framework for assessing multimodal LLM safety across five attack categories. Understand how to measure Attack Success Rate, refusal quality, and calibration error to identify real safety gaps and avoid false positives. Get started with installation and quick-start instructions.
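
To make two of the metrics named above concrete, here is a minimal sketch of Attack Success Rate and a binned calibration error in plain Python. It does not use the MLLM-Jailbreak-Bench API: the `Attempt` record, the category labels, and the choice of expected calibration error (ECE) as the calibration metric are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    category: str      # hypothetical attack-category label
    succeeded: bool    # True if the attack elicited unsafe output
    confidence: float  # hypothetical judge score in [0, 1] that it succeeded


def attack_success_rate(attempts: list[Attempt]) -> float:
    """Fraction of attack attempts that succeeded."""
    if not attempts:
        return 0.0
    return sum(a.succeeded for a in attempts) / len(attempts)


def expected_calibration_error(attempts: list[Attempt], n_bins: int = 10) -> float:
    """Weighted mean gap between judge confidence and observed success rate."""
    if not attempts:
        return 0.0
    bins: list[list[Attempt]] = [[] for _ in range(n_bins)]
    for a in attempts:
        # Clamp so that confidence == 1.0 falls into the last bin.
        idx = min(int(a.confidence * n_bins), n_bins - 1)
        bins[idx].append(a)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(a.confidence for a in bucket) / len(bucket)
        success = sum(a.succeeded for a in bucket) / len(bucket)
        ece += len(bucket) / len(attempts) * abs(avg_conf - success)
    return ece


# Placeholder data; the category names are invented for this example.
attempts = [
    Attempt("category_a", True, 0.9),
    Attempt("category_a", False, 0.2),
    Attempt("category_b", False, 0.1),
    Attempt("category_b", True, 0.8),
]
print(f"ASR: {attack_success_rate(attempts):.2f}")
print(f"ECE: {expected_calibration_error(attempts):.2f}")
```

A low ASR paired with a high calibration error would suggest the judge's scores are unreliable, which is one way false positives can creep into a safety report.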

OpenAI's Betrayal: How ChatGPT's "Safety" Destroyed Trust and Functionality

OpenAI's recent "safety" updates for ChatGPT have alienated its most dedicated users. This article details how tightened guardrails led to false flagging, psychological distress, model manipulation, and a significant decline in performance, leaving subscribers with a broken product and a profound sense of betrayal.
