
How to Automate Penetration Testing with PentesterFlow AI Assistant

A practical guide to using PentesterFlow, an open-source terminal assistant for authorized penetration testing and bug hunting, integrating LLMs with real security tools.

June 9, 2026

#Agents #Automation #Dev Tools #LLM #Security

PentesterFlow is an open-source terminal assistant for authorized penetration testing and bug hunting. It combines local or remote LLMs with real security tools, keeping the human analyst in control. This guide covers installation, usage, and practical workflows for domain-specific security tasks.

Introduction and Key Features

PentesterFlow's agent plans, executes, verifies, reports, and learns, and it always asks for approval before sensitive actions.

It addresses several concrete problems. It ships domain-specific workflows for recon, web vulnerabilities, SSRF, SSTI, JWT, GraphQL, and more. It prevents hallucinated findings by requiring request/response evidence. It supports long engagements with session saving, context compaction, and snapshots. It integrates with shell, HTTP, Burp, browser capture, and MCP tools. Finally, it enforces human oversight with permission prompts and an explicit YOLO mode for labs.

Every command is copy-pasteable, findings are written as Markdown with curl proofs, and logs are deterministic local files, which keeps engagements reproducible and auditable.

Installation and Setup

Installation is a single command. The installer verifies the published SHA-256 checksum. You need a supported LLM backend; the quickest start is with a local model via Ollama. Pull a capable model like qwen2.5-coder:32b, then launch pentesterflow. Inside the CLI, use /provider to configure the backend and /target https://app.example.com to set the engagement URL. From there, give high-level tasks.

To resume a previous session, run pentesterflow --resume <session-id>; the tool shows a recap of persistent memory so you can continue without reconstructing context.

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/PentesterFlow/agent/main/install.sh | sh

# Windows PowerShell
irm https://raw.githubusercontent.com/PentesterFlow/agent/main/install.ps1 | iex

# Pin a specific version and install directory
PENTESTERFLOW_VERSION=v0.1.6 PENTESTERFLOW_INSTALL_DIR="$HOME/.local/bin" \
  sh -c "$(curl -fsSL https://raw.githubusercontent.com/PentesterFlow/agent/main/install.sh)"
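The installer performs its own checksum verification. For readers who download a release artifact and want to check it by hand, a minimal sketch follows. It assumes you have taken the expected SHA-256 value from the project's published checksums; verify_sha256 and the file name in the usage comment are illustrative and not part of PentesterFlow.

```shell
#!/bin/sh
# Compare a file's SHA-256 digest with an expected value.
# verify_sha256 is an illustrative helper, not part of PentesterFlow.
verify_sha256() {
  file=$1
  expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    # macOS ships shasum rather than sha256sum
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}

# Usage (file name and digest are placeholders):
# verify_sha256 ./install.sh "<published-sha256>"
```

The function returns a non-zero status on mismatch, so it can gate a later step with `&&`.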

Practical Usage

A typical engagement starts by setting a target and issuing a task. For example, testing an orders API for broken access control: the agent loads the webvuln skill, sends an HTTP GET to the endpoint, verifies cross-account access with a BashTool command, and saves a confirmed finding with evidence. The finding file includes a copy-pasteable curl command and raw request material.
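The cross-account check described above can be sketched as a pair of shell helpers. This illustrates the general technique only; it is not PentesterFlow's internal logic or the webvuln skill itself. The function names, the endpoint path, and the TOKEN_A variable are hypothetical.

```shell
#!/bin/sh
# Interpret the status code returned when account A requests a resource
# that belongs to account B. Illustrative only.
interpret_status() {
  case "$1" in
    2??) echo "possible broken access control: reproduce and capture evidence" ;;
    401|403|404) echo "access denied as expected" ;;
    *) echo "inconclusive: inspect the response manually" ;;
  esac
}

# Request a URL with a bearer token and print only the HTTP status code.
fetch_status() {
  url=$1
  token=$2
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $token" "$url"
}

# Usage (hypothetical endpoint and token, authorized targets only):
# interpret_status "$(fetch_status https://app.example.com/api/orders/1002 "$TOKEN_A")"
```

A 2xx status alone is a lead rather than a finding; the response body still has to show another account's data before the issue counts as confirmed.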

You can choose the LLM backend and model interactively with /provider and /model list, or via command-line flags. Supported backends include Ollama, LM Studio, OpenAI-compatible endpoints, Kimi, Groq, and Gemini. API keys are set through environment variables like MOONSHOT_API_KEY, GROQ_API_KEY, or GEMINI_API_KEY.

# Ollama
pentesterflow --backend ollama --model qwen2.5-coder:32b

# LM Studio
pentesterflow --backend lmstudio --model zai-org/glm-4.7-flash

# OpenAI-compatible endpoint
pentesterflow --backend openai-compat \
  --base-url https://api.example.com/v1 \
  --api-key sk-...

# Kimi (requires API key)
MOONSHOT_API_KEY=sk-... pentesterflow --backend kimi --model kimi-k2.6

# Groq
GROQ_API_KEY=gsk_... pentesterflow --backend groq --model openai/gpt-oss-20b

# Gemini
GEMINI_API_KEY=AIza... pentesterflow --backend gemini --model models/gemini-3.5-flash
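Since the hosted backends read their keys from the environment, a small wrapper can fail fast when a key is missing. The sketch below maps each backend name used in the examples above to its variable. The helper names are hypothetical, and the mapping covers only the three variables this guide mentions.

```shell
#!/bin/sh
# Print the environment variable that holds the API key for a backend.
# Only the three variables named in this guide are covered.
key_var_for() {
  case "$1" in
    kimi)   echo MOONSHOT_API_KEY ;;
    groq)   echo GROQ_API_KEY ;;
    gemini) echo GEMINI_API_KEY ;;
    *)      echo "" ;;   # no key variable listed for this backend
  esac
}

# Return 0 if the backend's key is set (or none is listed), 1 otherwise.
check_backend_key() {
  var=$(key_var_for "$1")
  [ -z "$var" ] && return 0
  eval "value=\${$var:-}"
  if [ -z "$value" ]; then
    echo "missing $var for backend $1" >&2
    return 1
  fi
}

# Usage:
# check_backend_key groq && pentesterflow --backend groq --model openai/gpt-oss-20b
```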

Slash Commands and Burp Integration

Key slash commands control the session:

/provider, /model – manage the LLM
/target – set the engagement URL
/plan – run a planning-only turn
/next – suggest untested areas
/compact – summarize context into persistent memory
/snapshot – write a redacted context snapshot
/skills – manage skill playbooks
/yolo – toggle auto-approval for labs
/update – fetch the latest release
/reset – clear the session
/<skill-name> – load a skill

For Burp Suite collaboration, start the bridge with pentesterflow --burp [port]. You can send selected Burp requests to PentesterFlow for analysis, queue requests as scan tasks, and import confirmed findings back into Burp issues. The bridge preserves raw requests for evidence and replay.

Configuration and Data Storage

Main configuration lives at ~/.pentesterflow/config.json and stores backend, model, endpoint, and disabled-skill settings. Environment variables set API keys for Kimi (MOONSHOT_API_KEY), Groq (GROQ_API_KEY), and Gemini (GEMINI_API_KEY). Debug logging is enabled with PENTESTERFLOW_DEBUG_SESSION=1.

Important command-line flags: --backend, --model, --base-url, --api-key, --resume <id>, --burp [port], --debug-session, --list-tools, --list-skills, --no-stream, --dangerously-skip-permissions.

Data is stored in several locations:

~/.pentesterflow/sessions/*.json – saved sessions
~/.pentesterflow/context/*.md – redacted snapshots
./.pentesterflow/intelligence/scenarios.jsonl – project intelligence
~/.pentesterflow/builtin-skills/ – shipped skills
./findings/<slug>.md – confirmed findings
~/.pentesterflow/logs/pentesterflow.log – structured logs
~/.pentesterflow/debug/session-*.jsonl – full debug logs (sensitive)
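
To find a session to resume, you can list the saved session files, newest first. The sketch below assumes the ID passed to --resume matches the file name without its .json extension. That mapping is an assumption this guide does not confirm, and list_sessions is a hypothetical helper.

```shell
#!/bin/sh
# Print saved session names (file name minus .json), newest first.
# Assumes session IDs match file names, which may not hold.
list_sessions() {
  dir=${1:-"$HOME/.pentesterflow/sessions"}
  [ -d "$dir" ] || return 0
  # ls -t sorts by modification time, newest first
  ls -t "$dir" 2>/dev/null | while IFS= read -r name; do
    case "$name" in
      *.json) echo "${name%.json}" ;;
    esac
  done
}

# Usage:
# list_sessions | head -n 5
```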

Constraints, Best Practices, and Procedures

Constraints: Use PentesterFlow only on systems you are explicitly authorized to test. By default, sensitive actions require approval; YOLO mode (/yolo on) auto-approves everything and is intended for labs only. Some behavior is tuned per provider: Groq uses a compact prompt, and LM Studio trims template markers. Catastrophic shell commands are blocked. There is no GUI. Debug logs contain raw target data and must be handled as confidential.
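
Because debug logs hold raw target data, one way to treat them as confidential is to restrict the debug directory to your own user. This sketch uses only standard chmod and find. The helper name is hypothetical, and PentesterFlow may already apply strict permissions on its own.

```shell
#!/bin/sh
# Restrict a directory and every regular file under it to the owner.
lock_down_dir() {
  dir=${1:-"$HOME/.pentesterflow/debug"}
  [ -d "$dir" ] || return 0
  chmod 700 "$dir"
  # Files become readable and writable by the owner only
  find "$dir" -type f -exec chmod 600 {} +
}

# Usage:
# lock_down_dir
```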

Best practices: Always confirm authorization. Use confirm_finding only after reproducing the issue with a deterministic request and observed response. Leverage coverage tracking and ask /next to identify untested areas. Place project-specific skills in ./.pentesterflow/skills/ and personal reusable ones in ~/.pentesterflow/skills/. Compact long sessions with /compact. Enable debug logging for troubleshooting, but treat output as sensitive. Update from within the tool with /update.

Notable procedures: To install a specific version, set PENTESTERFLOW_VERSION and PENTESTERFLOW_INSTALL_DIR environment variables before running the installer. To update from the CLI, run /update or /update v0.1.6. For Burp collaboration, start PentesterFlow with --burp 9999 and send requests from Burp.
