
What is SmallCode? A Terminal-Native AI Coding Agent

Discover how SmallCode leverages small local LLMs for effective programming tasks on consumer hardware, offering advanced context management and interactive features.

May 26, 2026

#Agents #Dev Tools #Development #LLM #Open Source

Explore SmallCode, a terminal-native AI coding agent designed to make 8B–35B parameter local language models powerful for programming. Learn about its context budget management, patch-first editing, TODO-driven planning, and interactive TUI, enabling efficient development fully locally.

What SmallCode Does

SmallCode is a terminal-native coding agent that makes small local language models (8B–35B parameters) effective at programming tasks. It compensates for the limitations of consumer-hardware models by employing context budget management, forgiving tool-call parsing (JSON, YAML, XML, plain text, auto-repair), patch-first editing with search‑and‑replace, TODO-driven planning, persistent shell sessions, optional cloud escalation, and observability tools. The agent works fully locally by default.
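To make the idea of forgiving tool-call parsing concrete, here is a minimal sketch of the JSON path with light auto-repair. It is not SmallCode's actual parser: the function name parseToolCall, the repair steps, and the { name, args } shape are assumptions for illustration, and the YAML, XML, and plain-text paths are left out.

```javascript
// Hypothetical sketch: extract a JSON tool call from messy model output.
// Not SmallCode's real parser; names and repair rules are assumptions.
function parseToolCall(raw) {
  // 1. Drop Markdown code fences the model may wrap around the payload.
  let text = raw.replace(/`{3}[a-zA-Z]*\n?/g, '').trim();

  // 2. Keep only the outermost {...} span, ignoring chatter around it.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  text = text.slice(start, end + 1);

  // 3. Try strict JSON first, then a variant with trailing commas removed.
  //    (This naive repair could also alter a ",}" inside a string value.)
  const candidates = [text, text.replace(/,\s*([}\]])/g, '$1')];
  for (const candidate of candidates) {
    try {
      const parsed = JSON.parse(candidate);
      if (parsed && typeof parsed.name === 'string') {
        return { name: parsed.name, args: parsed.args ?? {} };
      }
    } catch {
      // Fall through to the next repair attempt.
    }
  }
  return null; // Nothing usable; a caller could re-prompt the model here.
}
```

Trying strict parsing before any repair keeps well-formed output untouched, so the repair step only runs when a small model slips.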

Getting Started

Install via npm with npm install -g smallcode (requires Node.js 18+). Prebuilt binaries for Linux, macOS, and Windows are available through a shell script. Verify the install with smallcode --help. You also need a local LLM server that exposes an OpenAI-compatible endpoint, such as LM Studio or Ollama. The optional better-sqlite3 dependency enables the code graph and FTS5 memory; if its native compilation fails, a JSON store is used instead.

Create a .env file with:

SMALLCODE_MODEL=<model-name> (required)
SMALLCODE_BASE_URL=http://localhost:1234/v1 (required)

Optionally add cloud API keys (ANTHROPIC_API_KEY, etc.) for escalation. Run smallcode to start.
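Before the first run, it can help to confirm that both required variables are actually set. The following is a small pre-flight sketch, not part of SmallCode: parseEnv and missingRequired are hypothetical helpers, and the parser is deliberately simple (it does not handle quoted values or multi-line entries).

```javascript
// Hypothetical pre-flight check: parse .env text and report missing required keys.
const REQUIRED_KEYS = ['SMALLCODE_MODEL', 'SMALLCODE_BASE_URL'];

function parseEnv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue; // skip blanks and comments
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue; // not a KEY=VALUE line
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

// Returns the required keys that are absent or empty.
function missingRequired(vars) {
  return REQUIRED_KEYS.filter((key) => !vars[key]);
}
```

Reading the file with fs.readFileSync('.env', 'utf8') and passing the text to parseEnv gives an object that missingRequired can check; an empty array means both required variables are present.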

Interactive TUI and Slash Commands

Run smallcode inside a project directory. The agent auto-detects runtime and test commands, then presents a conversational interface. Complex tasks are decomposed into TODO plans; the agent calls tools automatically and validates each step. Slash commands offer control:

/quit, /clear, /stats, /tokens, /budget (context visual), /trace, /eval, /memory, /plan, /model, /profile, /cognition, /mcp, /skill, /plugin, /sessions, /help.

These enable monitoring, session management, and model switching.

Programmatic Usage and Testing

Embed SmallCode as a library. Require SmallCode, instantiate it with a model and base URL, and call agent.run(). Subscribe to events such as tool_start and error. The returned RunResult includes files created and edited, tool calls, success status, and token usage.

const { SmallCode } = require('smallcode');

// CommonJS has no top-level await, so the calls are wrapped in an async function.
async function main() {
  const agent = new SmallCode({
    model: 'gemma-4-e4b',
    baseUrl: 'http://localhost:1234/v1',
  });

  // Subscribe before calling run() so events from this run are observed.
  agent.on('tool_start', ({ name, args }) => console.log(`Using: ${name}`));
  agent.on('tool_end', ({ name, ms }) => console.log(`Done: ${name} (${ms}ms)`));
  agent.on('error', (err) => console.error(err));

  const result = await agent.run('create hello.py that prints hello world');
  console.log(result.filesCreated);      // e.g. ['hello.py']
  console.log(result.toolCalls.length);  // e.g. 1
  console.log(result.success);           // e.g. true
}

main().catch(console.error);

Run benchmarks with npm run bench:smoke, bench:polyglot, or bench:tools. Results are saved to .smallcode/benchmarks/. Generate regression tests from traces using /trace test <id>.

Configuration Options

Configure via .env or the legacy smallcode.toml. Key variables:

SMALLCODE_MODEL (required) – model name.
SMALLCODE_BASE_URL (required) – endpoint.
ANTHROPIC_API_KEY, OPENAI_API_KEY, DEEPSEEK_API_KEY – opt‑in cloud escalation keys.
SMALLCODE_THINKING_BUDGET – max reasoning tokens (default 2000); disable with THINKING_DISABLE=true.
SMALLCODE_SHELL_PERSIST – keep shell state across turns (default true).
SMALLCODE_WRITE_GUARD – refuse first write to unread file (default true).
SMALLCODE_SNAPSHOT/AUTO_ROLLBACK – pre‑edit snapshots and auto‑revert on hard failure.
SMALLCODE_PLAN – force plan‑then‑execute mode.
SMALLCODE_TEST_RUNNER/TEST_DISABLE – override or disable test‑runner injection.
SMALLCODE_WEB_BROWSE – enable web tools (20B+ model required).
SMALLCODE_MODEL_MEDIUM/STRONG – models for adaptive routing.
TEMP_ADAPT, TRUST_DECAY – adaptive retry and trust‑score flags.
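As a rough illustration of how the documented defaults interact with overrides, the sketch below resolves three of the settings from an env-like object. The defaults (2000, true, true) come from the list above; the function name resolveSettings and the parsing rules (only the string "true" enables a flag) are assumptions, not SmallCode's actual behavior.

```javascript
// Hypothetical sketch: resolve a few settings from an env-like object.
// Defaults mirror the documented ones; the parsing rules are assumptions.
function resolveSettings(env) {
  // An unset flag falls back to its default; otherwise only "true" enables it.
  const flag = (value, fallback) =>
    value === undefined ? fallback : value.trim().toLowerCase() === 'true';

  const budget = Number.parseInt(env.SMALLCODE_THINKING_BUDGET ?? '', 10);
  return {
    thinkingBudget: Number.isNaN(budget) ? 2000 : budget,
    shellPersist: flag(env.SMALLCODE_SHELL_PERSIST, true),
    writeGuard: flag(env.SMALLCODE_WRITE_GUARD, true),
  };
}
```

Calling resolveSettings(process.env) with none of the three variables set yields the defaults; setting SMALLCODE_WRITE_GUARD=false flips only writeGuard.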

Constraints and Limitations

Models at or below 4B parameters struggle with multi-step tool use; results are best at 8B–35B.
Tool results are truncated at 4k characters; older results may be evicted under budget pressure.
better-sqlite3 may need native compilation for code graph; otherwise JSON memory is used.
Web browsing is only reliable with 20B+ models.
Cloud escalation incurs provider charges; the number of escalation calls is limited per session.
Thinking budget can truncate reasoning on models like DeepSeek R1, impacting complex tasks.
Write guard blocks first write to an unread file; second attempt is allowed.
Snapshot auto‑rollback can revert all turn edits; snapshot data persists in .smallcode/snapshots/.
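The truncation and eviction constraints above can be pictured with a short sketch. It is an illustration only: the helper names are hypothetical, "4k characters" is read as 4000 (it could equally be 4096), and the budget is counted in characters rather than tokens to keep the example simple.

```javascript
// Hypothetical sketch of tool-result truncation and budget-driven eviction.
const MAX_RESULT_CHARS = 4000; // assumption: "4k" read as 4000

// Cut a result to the limit and append a short marker noting what was dropped.
// The marker itself makes the returned string slightly longer than the limit.
function truncateResult(text, limit = MAX_RESULT_CHARS) {
  if (text.length <= limit) return text;
  return text.slice(0, limit) + `\n[truncated ${text.length - limit} chars]`;
}

// Drop the oldest results until the total size fits the budget.
// `results` is ordered oldest-first; the newest result is always kept.
function evictToBudget(results, budgetChars) {
  const kept = [...results];
  let total = kept.reduce((sum, r) => sum + r.length, 0);
  while (kept.length > 1 && total > budgetChars) {
    total -= kept.shift().length;
  }
  return kept;
}
```

In practice this means a long command output or file dump may not be fully visible to the model, so narrower reads and targeted searches tend to work better than dumping whole files.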

Best Practices

Allow SmallCode to auto‑detect the project runtime, framework, and tests on the first turn to save tool calls.
Use the default heuristic planning; enable SMALLCODE_PLAN=true only if the model drifts.
Keep evidence store on to capture task outcomes for future learning.
Enable read‑before‑write guard and snapshot auto‑rollback for risky edits.
Set SMALLCODE_TEST_RUNNER if auto‑detection fails, to avoid wasted tool calls.
Use persistent shell sessions (SHELL_PERSIST=true) for multi‑step commands.
Monitor context with /budget and /tokens.
Keep adaptive retry temperature on so failed edits vary.
Web browsing is recommended only for 20B+ models; disable for smaller ones.
Generate regression tests from traces after successful complex tasks.
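The read-before-write guard is described as refusing the first write to an unread file and allowing the second attempt. A minimal sketch of that behavior follows; the WriteGuard class and its method names are hypothetical, and the sketch does not distinguish new files from existing ones, which a real implementation presumably would.

```javascript
// Hypothetical sketch of a read-before-write guard.
class WriteGuard {
  constructor() {
    this.readPaths = new Set();   // files the agent has read this session
    this.warnedPaths = new Set(); // files whose first write was refused
  }

  recordRead(path) {
    this.readPaths.add(path);
  }

  // Returns true if the write may proceed.
  allowWrite(path) {
    if (this.readPaths.has(path) || this.warnedPaths.has(path)) return true;
    this.warnedPaths.add(path); // refuse once, allow the retry
    return false;
  }
}
```

The single refusal nudges the model to read the file first, while the allowed retry avoids blocking a deliberate overwrite.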

Advanced Features

Snapshot & Rollback: SmallCode creates file checkpoints before each turn. With AUTO_ROLLBACK=true, all edits made during a turn are reverted when validation hard-fails. Snapshots persist in .smallcode/snapshots/.
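The snapshot-then-rollback pattern can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not SmallCode's implementation (which persists snapshots on disk): takeSnapshot and rollback are hypothetical names, and the caller is assumed to know which paths a turn may touch.

```javascript
const fs = require('fs');

// Hypothetical sketch: capture file contents before a turn and restore them on failure.
function takeSnapshot(paths) {
  const snapshot = new Map();
  for (const p of paths) {
    // null marks a file that did not exist before the turn.
    snapshot.set(p, fs.existsSync(p) ? fs.readFileSync(p, 'utf8') : null);
  }
  return snapshot;
}

function rollback(snapshot) {
  for (const [p, content] of snapshot) {
    if (content === null) {
      fs.rmSync(p, { force: true }); // file was created during the turn
    } else {
      fs.writeFileSync(p, content);  // restore the pre-turn contents
    }
  }
}
```

Recording non-existent files as null matters: without it, a rollback would restore edited files but leave behind any files the failed turn created.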

Sessions & Memory: Sessions auto‑save; use /sessions list and resume. A persistent scratchpad serves as working memory; the evidence store captures what was tried and failed, aiding future runs.

Skills & Plugins: Six built‑in skills (brainstorming, debugging, tdd, etc.) load from skills/. Manage with /skill list, /skill use. Plugins can be installed via /plugin.

Adaptive Routing: Tracks per‑model failure rates; auto‑switches to SMALLCODE_MODEL_MEDIUM/STRONG when thresholds (0.3/0.6) are crossed.
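One way to read the 0.3/0.6 thresholds is that a failure rate of 0.3 or more moves work to the medium model and 0.6 or more moves it to the strong model. The sketch below implements that reading; it is an assumption, as are the ModelRouter class, its method names, and the simplification of tracking a single failure rate rather than one per model.

```javascript
// Hypothetical sketch of failure-rate routing between three model tiers.
class ModelRouter {
  constructor({ small, medium, strong }, { mediumAt = 0.3, strongAt = 0.6 } = {}) {
    this.models = { small, medium, strong };
    this.thresholds = { mediumAt, strongAt };
    this.stats = { attempts: 0, failures: 0 };
  }

  // Record the outcome of one attempt.
  record(success) {
    this.stats.attempts += 1;
    if (!success) this.stats.failures += 1;
  }

  failureRate() {
    return this.stats.attempts === 0 ? 0 : this.stats.failures / this.stats.attempts;
  }

  // Pick the weakest tier whose threshold has not been crossed.
  // A tier that is not configured is skipped.
  pick() {
    const rate = this.failureRate();
    if (rate >= this.thresholds.strongAt && this.models.strong) return this.models.strong;
    if (rate >= this.thresholds.mediumAt && this.models.medium) return this.models.medium;
    return this.models.small;
  }
}
```

With one failure in three attempts the rate is about 0.33, so pick() returns the medium model; after four failures in six attempts it returns the strong one.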

Cognition & Scaffolding: MarrowScript compiles to TypeScript for caching, retries, and validation. BoneScript scaffolds a complete Node.js backend from a single .bone file.

Configuration & Upgrades: SmallCode supports .env and the legacy .smallcode.toml. Benchmarks run with npm run bench:smoke, bench:polyglot, or bench:tools.
