What SmallCode Does
SmallCode is a terminal-native coding agent that makes small local language models (8B–35B parameters) effective at programming tasks. It compensates for the limitations of consumer-hardware models by employing context budget management, forgiving tool-call parsing (JSON, YAML, XML, plain text, auto-repair), patch-first editing with search‑and‑replace, TODO-driven planning, persistent shell sessions, optional cloud escalation, and observability tools. The agent works fully locally by default.
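To give a feel for what forgiving tool-call parsing with auto-repair can look like, here is a minimal sketch. It is not SmallCode's actual parser: the function name parseToolCall, the { name, args } shape, and the particular repair steps are assumptions made for illustration, and only the JSON path is covered.

```javascript
// Hypothetical sketch, not SmallCode's real parser.
// Tries strict JSON first, then a few common repairs for small-model output.
function parseToolCall(raw) {
  // Keep only the outermost {...} span, dropping any chatter around it.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  const text = raw.slice(start, end + 1);

  const dropTrailingCommas = (s) => s.replace(/,\s*([}\]])/g, '$1');
  const attempts = [
    (s) => s,                                          // 1. strict JSON
    (s) => dropTrailingCommas(s),                      // 2. remove trailing commas
    (s) => dropTrailingCommas(s).replace(/'/g, '"'),   // 3. also swap single quotes (naive)
  ];

  for (const repair of attempts) {
    try {
      const obj = JSON.parse(repair(text));
      if (obj && typeof obj.name === 'string') {
        return { name: obj.name, args: obj.args ?? {} };
      }
    } catch {
      // This repair was not enough; try the next one.
    }
  }
  return null; // Caller could fall back to YAML, XML, or plain-text parsing.
}
```

The ordering matters in a sketch like this: well-formed output takes the strict path untouched, and the lossy repairs only run after it fails.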
Getting Started
Install via npm: npm install -g smallcode (requires Node.js 18+).
Prebuilt binaries for Linux, macOS, and Windows are available through a shell script.
Verify with smallcode --help.
You need a local LLM server with an OpenAI‑compatible endpoint (LM Studio, Ollama).
The optional better-sqlite3 dependency enables the code graph and FTS5 memory; if its native compilation fails, a JSON store is used instead.
Create a .env file with:
    SMALLCODE_MODEL=<model-name>
    SMALLCODE_BASE_URL=http://localhost:1234/v1
Both variables are required.
Optionally add cloud API keys (ANTHROPIC_API_KEY, etc.) for escalation.
Run smallcode to start.
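A quick pre-flight check can catch a missing required variable before the first run. The sketch below is illustrative only: parseEnv and missingRequired are hypothetical helpers, not part of SmallCode, and the parser handles only simple KEY=VALUE lines rather than full dotenv syntax.

```javascript
// Hypothetical pre-flight check; SmallCode performs its own config loading.
const REQUIRED = ['SMALLCODE_MODEL', 'SMALLCODE_BASE_URL'];

// Parse simple KEY=VALUE lines; blank lines and # comments are skipped.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

// Return the names of required variables that are missing or empty.
function missingRequired(vars) {
  return REQUIRED.filter((key) => !vars[key]);
}

const example = [
  '# local LM Studio server',
  'SMALLCODE_MODEL=gemma-4-e4b',
  'SMALLCODE_BASE_URL=http://localhost:1234/v1',
].join('\n');

console.log(missingRequired(parseEnv(example))); // []
```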
Interactive TUI and Slash Commands
Run smallcode inside a project directory.
The agent auto-detects runtime and test commands, then presents a conversational interface.
The agent decomposes complex tasks into TODO plans, calls tools automatically, and validates each step.
Slash commands offer control:
/quit, /clear, /stats, /tokens, /budget (context visual), /trace, /eval, /memory, /plan, /model, /profile, /cognition, /mcp, /skill, /plugin, /sessions, /help.
These enable monitoring, session management, and model switching.
Programmatic Usage and Testing
Embed SmallCode as a library.
Require the smallcode package, instantiate SmallCode with a model and base URL, and call agent.run().
Subscribe to events like tool_start and error.
The returned RunResult includes files created/edited, tool calls, success status, and token usage.
    const { SmallCode } = require('smallcode');

    async function main() {
      const agent = new SmallCode({
        model: 'gemma-4-e4b',
        baseUrl: 'http://localhost:1234/v1',
      });

      // Subscribe before run() so events emitted during this run are observed.
      agent.on('tool_start', ({ name, args }) => console.log(`Using: ${name}`));
      agent.on('tool_end', ({ name, ms }) => console.log(`Done: ${name} (${ms}ms)`));
      agent.on('error', (err) => console.error(err));

      const result = await agent.run('create hello.py that prints hello world');
      console.log(result.filesCreated);      // ['hello.py']
      console.log(result.toolCalls.length);  // 1
      console.log(result.success);           // true
    }

    // CommonJS has no top-level await, so the call is wrapped in an async function.
    main().catch(console.error);
Run benchmarks with npm run bench:smoke, bench:polyglot, or bench:tools.
Results are saved to .smallcode/benchmarks/.
Generate regression tests from traces using /trace test <id>.
Configuration Options
Configure via .env or the legacy smallcode.toml.
Key variables:
- SMALLCODE_MODEL (required) – model name.
- SMALLCODE_BASE_URL (required) – endpoint.
- ANTHROPIC_API_KEY, OPENAI_API_KEY, DEEPSEEK_API_KEY – opt‑in cloud escalation keys.
- SMALLCODE_THINKING_BUDGET – max reasoning tokens (default 2000); disable with THINKING_DISABLE=true.
- SMALLCODE_SHELL_PERSIST – keep shell state across turns (default true).
- SMALLCODE_WRITE_GUARD – refuse first write to unread file (default true).
- SMALLCODE_SNAPSHOT/AUTO_ROLLBACK – pre‑edit snapshots and auto‑revert on hard failure.
- SMALLCODE_PLAN – force plan‑then‑execute mode.
- SMALLCODE_TEST_RUNNER/TEST_DISABLE – override or disable test‑runner injection.
- SMALLCODE_WEB_BROWSE – enable web tools (20B+ model required).
- SMALLCODE_MODEL_MEDIUM/STRONG – models for adaptive routing.
- TEMP_ADAPT, TRUST_DECAY – adaptive retry and trust‑score flags.
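The write-guard rule (refuse the first write to an unread file, allow the second attempt, as noted under Constraints and Limitations) can be modelled in a few lines. This is an in-memory sketch under assumed names (WriteGuard, markRead, allowWrite), not SmallCode's implementation.

```javascript
// Hypothetical in-memory model of the write-guard rule.
class WriteGuard {
  constructor() {
    this.read = new Set();     // paths the agent has already read
    this.refused = new Set();  // paths whose first write was refused
  }

  markRead(path) {
    this.read.add(path);
  }

  // Returns true if the write may proceed.
  allowWrite(path) {
    if (this.read.has(path) || this.refused.has(path)) return true;
    this.refused.add(path); // refuse once; a retry will be allowed
    return false;
  }
}

const guard = new WriteGuard();
console.log(guard.allowWrite('src/app.js')); // false: unread file, first attempt
console.log(guard.allowWrite('src/app.js')); // true: second attempt
guard.markRead('README.md');
console.log(guard.allowWrite('README.md'));  // true: file was read first
```

Refusing only once keeps the guard from deadlocking a model that insists on the write, while still nudging it to read the file first.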
Constraints and Limitations
- Models ≤4B struggle with multi‑step tool use; best at 8B–35B.
- Tool results are truncated at 4k characters; older results may be evicted under budget pressure.
- better-sqlite3 may need native compilation for code graph; otherwise JSON memory is used.
- Web browsing is only reliable with 20B+ models.
- Cloud escalation charges apply; calls are session‑limited.
- Thinking budget can truncate reasoning on models like DeepSeek R1, impacting complex tasks.
- Write guard blocks first write to an unread file; second attempt is allowed.
- Snapshot auto‑rollback can revert all turn edits; snapshot data persists in .smallcode/snapshots/.
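The truncation and eviction behaviour in the list above can be sketched as follows. This is an assumption-laden illustration: the helper names are hypothetical, "4k" is taken as 4000 characters, and the budget is counted in characters for simplicity, whereas a real context budget would more likely be counted in tokens.

```javascript
// Hypothetical sketch of result truncation plus oldest-first eviction.
const MAX_RESULT_CHARS = 4000; // "4k characters" from the list above; exact value assumed
const MARKER = '\n[truncated]';

// Cap a single tool result, leaving a visible marker so the model knows text was cut.
function truncateResult(text, limit = MAX_RESULT_CHARS) {
  if (text.length <= limit) return text;
  return text.slice(0, limit - MARKER.length) + MARKER;
}

// Keep the newest results whose combined length fits the budget.
// `results` is ordered oldest first.
function fitToBudget(results, budgetChars) {
  const kept = [];
  let used = 0;
  for (let i = results.length - 1; i >= 0; i--) {
    const r = truncateResult(results[i]);
    if (used + r.length > budgetChars) break; // evict this result and everything older
    kept.unshift(r);
    used += r.length;
  }
  return kept;
}

console.log(truncateResult('a'.repeat(5000)).length); // 4000
console.log(fitToBudget(['aaa', 'bbb', 'ccc'], 6));   // [ 'bbb', 'ccc' ]
```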
Best Practices
- Allow SmallCode to auto‑detect the project runtime, framework, and tests on the first turn to save tool calls.
- Use the default heuristic planning; enable SMALLCODE_PLAN=true only if the model drifts.
- Keep evidence store on to capture task outcomes for future learning.
- Enable read‑before‑write guard and snapshot auto‑rollback for risky edits.
- Set SMALLCODE_TEST_RUNNER if auto‑detection fails, to avoid wasted tool calls.
- Use persistent shell sessions (SHELL_PERSIST=true) for multi‑step commands.
- Monitor context with /budget and /tokens.
- Keep adaptive retry temperature on so failed edits vary.
- Web browsing is recommended only for 20B+ models; disable for smaller ones.
- Generate regression tests from traces after successful complex tasks.
Advanced Features
Snapshot & Rollback: Pre‑turn file checkpoints are created. AUTO_ROLLBACK=true reverts all edits of a turn on validation hard‑fail.
Snapshots persist in .smallcode/snapshots/.
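The checkpoint-then-revert idea can be illustrated with a small sketch. It keeps the snapshot in memory rather than under .smallcode/snapshots/, and takeSnapshot and rollback are hypothetical names, so treat it as a model of the concept rather than SmallCode's mechanism.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Record the current contents of each file (null if it does not exist yet).
function takeSnapshot(paths) {
  const snap = new Map();
  for (const p of paths) {
    snap.set(p, fs.existsSync(p) ? fs.readFileSync(p, 'utf8') : null);
  }
  return snap;
}

// Restore every file to its snapshot state; delete files created during the turn.
function rollback(snap) {
  for (const [p, content] of snap) {
    if (content === null) fs.rmSync(p, { force: true });
    else fs.writeFileSync(p, content);
  }
}

// Usage: simulate a turn whose validation hard-fails.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'snap-'));
const existing = path.join(dir, 'a.txt');
const created = path.join(dir, 'b.txt');
fs.writeFileSync(existing, 'original');

const snap = takeSnapshot([existing, created]);
fs.writeFileSync(existing, 'bad edit');
fs.writeFileSync(created, 'new file');
rollback(snap);

console.log(fs.readFileSync(existing, 'utf8')); // original
console.log(fs.existsSync(created));            // false
```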
Sessions & Memory: Sessions auto‑save; use /sessions list and resume.
A persistent scratchpad serves as working memory; the evidence store captures what was tried and failed, aiding future runs.
Skills & Plugins: Six built‑in skills (brainstorming, debugging, tdd, etc.) load from skills/.
Manage with /skill list, /skill use.
Plugins can be installed via /plugin.
Adaptive Routing: Tracks per‑model failure rates; auto‑switches to SMALLCODE_MODEL_MEDIUM/STRONG when thresholds (0.3/0.6) are crossed.
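One plausible reading of the routing rule is sketched below. Only the 0.3 and 0.6 thresholds come from the text; the Router class, its method names, the use of the base model's lifetime failure rate, and the strict greater-than comparison are all assumptions.

```javascript
// Hypothetical router; only the 0.3 / 0.6 thresholds are taken from the text.
class Router {
  constructor({ base, medium, strong }) {
    this.models = { base, medium, strong };
    this.stats = new Map(); // model name -> { calls, failures }
  }

  // Record the outcome of one call to `model`.
  record(model, ok) {
    const s = this.stats.get(model) ?? { calls: 0, failures: 0 };
    s.calls += 1;
    if (!ok) s.failures += 1;
    this.stats.set(model, s);
  }

  failureRate(model) {
    const s = this.stats.get(model);
    return s && s.calls > 0 ? s.failures / s.calls : 0;
  }

  // Choose a tier from the base model's observed failure rate.
  pick() {
    const rate = this.failureRate(this.models.base);
    if (rate > 0.6 && this.models.strong) return this.models.strong;
    if (rate > 0.3 && this.models.medium) return this.models.medium;
    return this.models.base;
  }
}

const router = new Router({ base: 'small', medium: 'medium', strong: 'strong' });
[true, false, true, false, true].forEach((ok) => router.record('small', ok)); // 2 of 5 failed
console.log(router.pick()); // medium
```

A production router would probably use a sliding window or decay so that early failures do not pin the session to a larger model indefinitely.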
Cognition & Scaffolding: MarrowScript compiles to TypeScript for caching, retries, and validation.
BoneScript scaffolds a complete Node.js backend from a single .bone file.
Configuration & Upgrades: Support for .env and legacy .smallcode.toml.
Benchmarks run with npm run bench:smoke, bench:polyglot, bench:tools.



