Finetuning

Investigating whether Parameter-Efficient Fine-Tuning can make individual, per-user models practical at massive scale.

Scaling PEFT for Trillion-Parameter Personal Models

This article explores how far Parameter-Efficient Fine-Tuning (PEFT) can scale toward millions of personal models, each potentially reaching trillion-parameter scale. It covers the architectural and practical considerations for personalization and efficiency at that scale.

Discover BES, a novel framework coupling forward evolutionary search with backward goal decomposition to overcome sampling bottlenecks in LLM reasoning.

How Bidirectional Evolutionary Search Improves LLM Self-Improvement

This article explains Bidirectional Evolutionary Search (BES), a new framework that enhances LLM self-improvement by combining evolutionary operators for broader exploration with dense, intermediate feedback from goal decomposition. Learn how BES tackles the limitations of traditional sampling methods like best-of-N and tree search.

This paper introduces Macaron-A2UI, a novel model enabling AI agents to dynamically synthesize interactive UI controls alongside natural language, addressing the limitations of text-only interfaces.

Generative UI: Revolutionizing AI Agent Interactions Beyond Plain Text

Discover Macaron-A2UI, a model that lets AI agents generate interactive UI elements using a declarative protocol. Learn about its corpus construction, A2UI-Bench for structured evaluation, and a two-stage training recipe that combines supervised fine-tuning (SFT) with GRPO to improve user experience and agent capability.

A CLI tool that estimates VRAM usage for LoRA/QLoRA training on consumer GPUs, with benchmarking and calibration.

Can I Fine-Tune This? — Practical Guide to VRAM Estimation

Learn how to use canifinetune to predict whether your LLM fine-tuning configuration will fit on your GPU before you download any weights. Covers memory estimation, feasibility checks, recommendations, benchmarking, and recipe generation for Hugging Face + PEFT + TRL.