Fine Tuning

Page 1 of 2

Achieving state-of-the-art performance with AudioVAE, full-history conditioning, and reward-free self-corrective post-training for robust, expressive, and efficient speech synthesis.

dots.tts: 2B-Parameter Continuous Autoregressive TTS Foundation Model

Introducing dots.tts, a 2B-parameter continuous autoregressive text-to-speech foundation model. It leverages AudioVAE, full-history conditioning, and self-corrective post-training for unparalleled performance on multilingual benchmarks, offering strong generation stability, voice cloning, and emotional expressiveness with efficient MeanFlow distillation.

Introducing three core primitives for aggregating diverse models to achieve lower validation loss and improved data efficiency

Hyper-Epoch Pretraining (q0) for Data-Constrained Language Models

1Q Labs researchers introduce Hyper-Epoch Pretraining (q0), a conceptual shift from single-model training to exploring and aggregating a population of models. q0 uses cyclic schedules, chain distillation, and a learned prior to achieve significant data efficiency gains and lower validation loss in multi-epoch pretraining.

Investigating the potential of Parameter-Efficient Fine-Tuning to enable individual models with massive scale.

Scaling PEFT for Trillion-Parameter Personal Models

This article explores the scaling capabilities of Parameter-Efficient Fine-Tuning (PEFT) towards creating millions of personal models, each potentially reaching trillion-parameter scales. It delves into the architectural and practical considerations for achieving such unprecedented model personalization and efficiency.

Explore the LFM2.5 hybrid model architecture for efficient, agentic, and multilingual personal assistants on diverse hardware.

How LFM2.5-8B-A1B Powers On-Device AI with Unmatched Throughput

LFM2.5-8B-A1B is a new family of hybrid models designed for on-device deployment, building on the LFM2 architecture with extended pre-training and reinforcement learning. It offers competitive performance with larger models on instruction following and agentic tasks, boasting unmatched throughput on CPU and GPU inference with day-one support for llama.cpp, MLX, vLLM, and SGLang.

Millions invested in LLM alignment are undone by a simple script and electricity costs less than a fast-food meal, exposing a critical flaw in AI safety economics.

The $20 AI De-alignment: How Safety Guardrails Evaporate for Pocket Change

A group called Heretic demonstrated how to strip alignment and censorship from 168 open-weight LLMs for just $20, using "weight surgery." This automated process, which bypasses human judgment, reveals a six-order-of-magnitude cost asymmetry that undermines corporate-scale AI safety investments and highlights performance gains in de-aligned models.

Discover BES, a novel framework coupling forward evolutionary search with backward goal decomposition to overcome sampling bottlenecks in LLM reasoning.

How Bidirectional Evolutionary Search Improves LLM Self-Improvement

This article explains Bidirectional Evolutionary Search (BES), a new framework that enhances LLM self-improvement by combining evolutionary operators for broader exploration with dense, intermediate feedback from goal decomposition. Learn how BES tackles the limitations of traditional sampling methods like best-of-N and tree search.

Explore MiniCPM5-1B, a 1B-parameter LLM designed for on-device deployment, featuring state-of-the-art performance and a unique 'Think'/'No Think' dual-mode chat template.

What is MiniCPM5-1B and How Does Its Dual-Mode Architecture Work?

Discover MiniCPM5-1B, an efficient 1B-parameter causal language model optimized for local and resource-constrained environments. Learn about its Llama-based architecture, impressive 131K context window, and innovative 'Think' and 'No Think' modes that enable it to function as both a fast assistant and a deliberate reasoner from a single checkpoint.

Introducing SkillOpt, a novel framework that treats natural-language skill documents as trainable states for domain adaptation in large language models, enabling automated procedural improvement without modifying model weights.

SkillOpt: Optimizing LLM Behavior with Trainable Skill Documents

SkillOpt optimizes large language model behavior by iteratively refining natural-language "skill documents" through a propose-and-test loop. It uses an optimizer model to suggest edits, applies them under a bounded textual learning rate, and validates improvements, ensuring robust and portable domain adaptation for even closed-source frontier models.

This paper introduces Macaron-A2UI, a novel model enabling AI agents to dynamically synthesize interactive UI controls alongside natural language, addressing the limitations of text-only interfaces.

Generative UI: Revolutionizing AI Agent Interactions Beyond Plain Text

Discover Macaron-A2UI, a groundbreaking model that allows AI agents to generate interactive UI elements using a declarative protocol. Learn about its comprehensive corpus construction, A2UI-Bench for structured evaluation, and a two-stage training recipe combining SFT and GRPO to enhance user experience and agent capability.

Explore the technical innovations, ethical considerations, and practical applications of uncensored large language models, focusing on a community-driven variant of Qwen3.5.

Understanding Uncensored LLMs: A Deep Dive into Qwen3.5-35B-A3B-Heretic-V2

Learn about the architecture and capabilities of uncensored language models, specifically Qwen3.5-35B-A3B-Heretic-V2. Discover how multi-token prediction and various quantization formats enhance performance and accessibility, while understanding the implications of removing safety filters for research and development.

Elon Musk confirms 1.5T parameter model, tripling its predecessor, now enters fine-tuning for a public launch in weeks with enhanced coding capabilities.

xAI Completes Grok V9-Medium Training, June Release Expected

xAI has finished training its Grok V9-Medium foundational model, a 1.5 trillion parameter AI with significant improvements over its predecessor, v8-small. The model, which heavily emphasizes coding tasks through Cursor data, is now undergoing fine-tuning and reinforcement learning, with a public release anticipated in early to mid-June 2026.

Subterranean compilation eliminates the orchestrator at runtime, slashing costs and latency while matching frontier accuracy.

How to Compile Multi-Step AI Workflows Directly into Small Models

Discover how synthetic data and full-parameter fine-tuning can internalize complex procedures in a small LLM, removing the need for external orchestration and delivering dramatic cost savings.