Automation & AI agents

SecVF's secvf-cli is JSON-native and side-effect-explicit. That makes it a natural target for AI agents — an LLM on the host can spin up VMs, detonate samples, harvest PCAPs, and roll back to clean snapshots, all without touching a GUI. This page is the integration playbook.

Distinction first: SecVF has two AI surfaces and they're easy to confuse. The AI sandbox page covers running AI agents inside ephemeral macOS guest VMs (the guest is the agent's box). This page covers AI agents on the host driving the SecVF tooling itself — orchestrating analysis workflows across many VMs.

Why this matters for malware analysis

Sample triage is the perfect agent task: lots of repetitive setup, well-defined steps, clear success criteria, and a strong feedback signal from logs. A human can carefully analyse three samples a day; an agent can sweep three hundred a day and surface the ten that need a human.

What that looks like in practice:

  • Drop a hash, get a verdict. Agent grabs the sample, picks a guest, configures network mode, detonates, captures, runs Suricata over the PCAP, summarises.
  • Compare two samples behaviourally. Agent detonates each in parallel sessions, diffs DNS queries, JA3 hashes, and process trees.
  • FakeNet sweep for IOCs. Agent runs each sample through FakeNet mode, extracts domains/URIs/SNI, deduplicates against existing IOC corpus, files net-new for human review.
  • Continuous integration for analysis tooling. Agent re-runs a fixed sample set on every router-script change to detect regressions in capture pipelines.

The standout property: SecVF doesn't gate this behind a paid API tier. The CLI is the same one you'd type by hand. Anything in the GUI is in the CLI. There's no separate "automation tier."

The JSON contract

Every secvf-cli subcommand accepts --json. Output schema is stable:

// Success
{
  "success": true,
  "message": "optional human-readable message",
  "data":    <subcommand-specific payload>
}

// Failure
{
  "success": false,
  "message": "VM 'foo' not found"
}

Examples:

$ secvf vm list --json | jq '.data[] | {name, status, networkMode}'
{ "name": "kali-router",    "status": "running", "networkMode": "virtual" }
{ "name": "ubuntu-malware", "status": "stopped", "networkMode": "router"  }

$ secvf switch stats --json | jq '.data.packetsForwarded'
14732

$ secvf capture status --json | jq '.data.capturing'
true

Exit codes

Code  Meaning
0     Success
1     Generic failure
2     Argument parsing error
3     SecVF.app not running but the operation requires it
4     Resource not found (VM, USB device)
5     Permission / entitlement failure

Agents can branch on exit code without parsing strings — a small but important property.
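As a sketch, the exit-code table can become a retry policy the agent consults before its next move. The mapping below is illustrative, not something SecVF prescribes:

```python
RETRYABLE = {3}   # SecVF.app not running: launch it, then retry
SKIPPABLE = {4}   # VM or USB device vanished: log it and move on

def next_action(returncode: int) -> str:
    """Turn a secvf exit code into the agent's next step, no string parsing."""
    if returncode == 0:
        return "continue"
    if returncode in RETRYABLE:
        return "retry"
    if returncode in SKIPPABLE:
        return "skip"
    return "abort"   # 1, 2, 5: generic failure, bad args, or permissions
```

The same table can gate how loudly the orchestrator alerts: an `abort` on exit code 5 usually means an entitlement problem a human has to fix.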

Three integration patterns

1. Shell-out from a Python orchestrator

Simplest. Use whatever shell library you like.

import subprocess, json

def secvf(*args):
    r = subprocess.run(
        ["secvf", *args, "--json"],
        capture_output=True, text=True, timeout=120
    )
    if r.returncode != 0:
        raise RuntimeError(f"secvf {args} failed: {r.stderr}")
    return json.loads(r.stdout)["data"]

vms = secvf("vm", "list")
for vm in vms:
    if vm["status"] == "running":
        print(vm["name"])

2. LLM tool-calls

Wrap each CLI verb as a tool the model can call. The tool name maps 1:1 to a subcommand; the parameters map to flags.

// Example: Anthropic tool-use schema
{
  "name": "secvf_vm_create",
  "description": "Create a new VM in SecVF.",
  "input_schema": {
    "type": "object",
    "properties": {
      "name":    { "type": "string" },
      "distro":  { "enum": ["kali","ubuntu","debian","fedora","arch","manjaro"] },
      "cpu":     { "type": "integer", "minimum": 1, "maximum": 8 },
      "ram":     { "type": "integer", "minimum": 1024 },
      "network": { "enum": ["nat","virtual","isolated"] }
    },
    "required": ["name", "distro"]
  }
}

The tool's implementation just shells out:

def secvf_vm_create(name, distro, cpu=2, ram=4096, network="nat"):
    return secvf("vm", "create", name,
                 "--distro", distro,
                 "--cpu", str(cpu),
                 "--ram", str(ram),
                 "--network", network)

3. As an MCP server

See MCP integration below — gives the agent a single connector instead of N tools.

The canonical agent loop

For a typical "analyse this sample" task, the agent does roughly this:

┌─ inputs ─────────────────────────────────┐
│ sample_path, expected_outputs            │
└──────────┬───────────────────────────────┘
           │
           ▼
   secvf vm list  --status stopped
   secvf vm snapshot <clean-vm> --as <session-vm>
   secvf vm start <session-vm>
   secvf capture start -o /tmp/<id>.pcap
   secvf vm copy-to <session-vm> <sample> /tmp/sample
   secvf vm exec   <session-vm> -- /bin/bash /tmp/run.sh
   ▼  ( sleep ~30s, sample runs )
   secvf capture stop
   secvf vm stop  <session-vm> --force
   secvf vm delete <session-vm> --force
           │
           ▼
   suricata -r /tmp/<id>.pcap ...    # external analysis
   summarise IOCs, return verdict

This entire loop is 9 CLI invocations + 1 external tool. Every step has a JSON return shape; failure modes are explicit. An LLM can plan, execute, and recover from each step.

Snapshot-detonate-restore

Pattern in detail. This is the bread and butter of automated triage.

#!/usr/bin/env python3
"""Detonate one sample, harvest the PCAP, restore the lab."""

import json, subprocess, time, uuid, pathlib

def secvf(*args, timeout=300):
    r = subprocess.run(["secvf", *args, "--json"],
                       capture_output=True, text=True, timeout=timeout)
    return json.loads(r.stdout) if r.stdout else {"success": False}

def detonate(sample_path: pathlib.Path, base_vm: str = "ubuntu-malware-clean"):
    sid = uuid.uuid4().hex[:8]
    session_vm = f"sess-{sid}"
    pcap = pathlib.Path(f"/tmp/{sid}.pcap")

    # 1. Snapshot the clean VM; abort early if the base VM is missing
    snap = secvf("vm", "snapshot", base_vm, "--as", session_vm)
    if not snap.get("success"):
        raise RuntimeError(f"snapshot failed: {snap.get('message', 'unknown')}")

    # 2. Start everything
    secvf("vm", "start", session_vm)
    secvf("capture", "start", "-o", str(pcap))

    try:
        # 3. Push sample in
        secvf("vm", "copy-to", session_vm, str(sample_path), "/tmp/sample.bin")

        # 4. Run with timeout
        secvf("vm", "exec", session_vm, "--",
              "chmod +x /tmp/sample.bin && /tmp/sample.bin & sleep 60; true",
              timeout=120)
    finally:
        # 5. Tear down, no matter what
        secvf("capture", "stop")
        secvf("vm", "stop", session_vm, "--force")
        secvf("vm", "delete", session_vm, "--force")

    return pcap

if __name__ == "__main__":
    import sys
    pcap = detonate(pathlib.Path(sys.argv[1]))
    print(f"PCAP saved: {pcap}")

Now wrap this in an LLM agent that:

  1. Watches an inbox directory.
  2. For each new file, calls detonate().
  3. Runs Suricata + Zeek over the PCAP.
  4. Asks the model: "Is this malicious? What's the verdict, what IOCs did you find, confidence?"
  5. Files the verdict to a triage queue.

Batch triage from a watchfolder

import json, time, pathlib
inbox = pathlib.Path("~/malware-inbox").expanduser()
processed = pathlib.Path("~/malware-done").expanduser()

while True:
    for sample in inbox.glob("*"):
        if sample.is_file():
            try:
                pcap = detonate(sample)
                verdict = ask_llm_to_analyze(pcap)         # your LLM call
                (processed / f"{sample.name}.verdict.json").write_text(
                    json.dumps(verdict, indent=2)
                )
                sample.rename(processed / sample.name)
            except Exception as e:
                print(f"failed on {sample}: {e}")
    time.sleep(10)

Run this on a Mac dedicated to analysis, leave it running, drop samples into the inbox folder.
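The loop above leaves `ask_llm_to_analyze` as a placeholder. The half you can write today is the reduction step: boiling Suricata's eve.json down to the IOCs the model should reason over. A sketch, assuming Suricata's standard EVE fields (`dns.rrname`, `tls.ja3.hash`, `alert.signature`); the LLM call itself stays your own:

```python
import collections
import json
import pathlib

def summarise_eve(eve_path: pathlib.Path) -> dict:
    """Reduce a Suricata eve.json to the fields a verdict prompt needs."""
    domains, ja3, alerts = set(), set(), collections.Counter()
    for line in eve_path.read_text().splitlines():
        if not line.strip():
            continue
        ev = json.loads(line)
        if ev.get("event_type") == "dns" and "rrname" in ev.get("dns", {}):
            domains.add(ev["dns"]["rrname"])              # queried domains
        elif ev.get("event_type") == "tls" and ev.get("tls", {}).get("ja3", {}).get("hash"):
            ja3.add(ev["tls"]["ja3"]["hash"])             # client fingerprints
        elif ev.get("event_type") == "alert":
            alerts[ev["alert"]["signature"]] += 1          # rule hits, counted
    return {"domains": sorted(domains), "ja3": sorted(ja3), "alerts": dict(alerts)}
```

Feed the returned dict into the prompt verbatim; a few hundred bytes of structured IOCs beats attaching a raw PCAP.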

MCP integration (Claude Code & friends)

The Model Context Protocol standardises how AI agents discover and call external tools. SecVF doesn't ship an MCP server today, but the CLI is the right shape for one — every subcommand becomes a tool, every JSON output a structured response.

A minimal MCP server (sketch)

// Node.js, using the official MCP SDK
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { execFile } from "child_process";
import { promisify } from "util";

const execFileAsync = promisify(execFile);

const tools = [
  { name: "secvf_vm_list",   description: "List all VMs in SecVF." },
  { name: "secvf_vm_create", description: "Create a new Linux VM.", inputSchema: {...} },
  { name: "secvf_vm_start",  description: "Start a stopped VM.",    inputSchema: {...} },
  // … all subcommands …
];

const server = new Server({ name: "secvf", version: "0.1.0" }, { capabilities: { tools: {} }});

server.setRequestHandler(/* CallTool */, async ({ params }) => {
  const args = mapToolToArgv(params.name, params.arguments);
  const { stdout } = await execFileAsync("secvf", [...args, "--json"]);
  return { content: [{ type: "text", text: stdout }] };
});

Register that server in Claude Code's MCP configuration (claude mcp add, or your client's equivalent) and the model can call SecVF operations directly. (See the issue tracker — an official MCP server is on the roadmap.)

Safety rails

Letting an LLM drive a VM platform is powerful and a little dangerous. The rails to put in place:

1. Constrain the VM space

Have the agent operate on a fixed set of named VMs, never the whole library. Don't expose vm delete to the agent without a name allowlist.

2. Always snapshot before mutating

Make vm snapshot a required prelude to any destructive operation. The agent should never edit a "clean" VM — only clones.

3. Network-mode allowlist

Don't let the agent put a sample VM in NAT mode (which lets it reach the public internet through your host). Hardcode --network virtual or --network router.
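Rails 1–3 belong in code, not in the prompt: wrap the shell-out helper with a guard that rejects bad argv before anything runs. A sketch, where the session-VM naming matches the detonate() script above and the allowed network modes are rail 3's; the exact patterns are yours to choose:

```python
import re

SESSION_VM = re.compile(r"^sess-[0-9a-f]{8}$")   # clones created by detonate()
ALLOWED_NETWORKS = {"virtual", "router"}         # rail 3: never "nat"

def guard(args: tuple) -> None:
    """Raise before secvf is ever invoked if the argv breaks rails 1-3."""
    if args[:2] == ("vm", "delete"):
        target = args[2] if len(args) > 2 else ""
        if not SESSION_VM.match(target):
            # rail 1: the agent may only delete its own session clones
            raise PermissionError(f"refusing to delete non-session VM {target!r}")
    if "--network" in args:
        mode = args[args.index("--network") + 1]
        if mode not in ALLOWED_NETWORKS:
            # rail 3: sample VMs never get a path to the public internet
            raise PermissionError(f"network mode {mode!r} not allowed for samples")
```

Call guard(args) at the top of the secvf() helper from pattern 1 and every integration path, tool-call or MCP, inherits the rails for free.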

4. Timeout everything

Every CLI call has a wallclock timeout. Every detonation has a kill-switch. The agent should never enter an unbounded wait.

5. Human-in-the-loop for state-changing operations on shared infra

If the agent's actions affect a router VM that other workflows depend on, gate with a human approval. Detonations on throwaway clones, fine; iptables changes on the live router, ask first.

6. Audit log is non-optional

Every action the agent takes lands in ~/.avf/logs/security-*.log. Tail those during agent runs — see Logging & telemetry. If something goes off-rails, you have the trail.

Don't run agent-driven detonations on a Mac that has anything you care about. Hardware isolation is real, but agent-driven mistakes can still consume disk, lock up RAM, or leave VMs in odd states. Use a dedicated analysis machine.

Observability for agent runs

You want to know what your agent actually did, not what it claimed. The audit logs make this easy.

# Live: every operation the agent performs lands here
tail -F ~/.avf/logs/security-$(date +%F).log \
  | jq -c 'select(.subsystem == "cli" or .subsystem == "vm-lifecycle")'

# Post-hoc: which operations did agent-XYZ do?
jq -c 'select(.data.client == "agent-xyz")' \
  ~/.avf/logs/security-*.log

For agents that should run unattended overnight, set up a smoke check that periodically asserts:

  • Number of running VMs is within expected range
  • Disk usage on ~/.avf hasn't exploded
  • No EMERGENCY events in the last hour
  • Switch packet drops are zero

#!/usr/bin/env bash
# secvf-agent-watchdog.sh — kill the agent if any of these fail
running=$(secvf vm list --status running --json | jq '.data | length')
[ "$running" -gt 8 ] && { echo "too many VMs ($running)"; kill_agent; exit 1; }

emerg=$(jq -c 'select(.severity == "EMERGENCY")' \
  ~/.avf/logs/security-$(date +%F).log | wc -l)
[ "$emerg" -gt 0 ] && { echo "EMERGENCY events"; kill_agent; exit 1; }

# … etc

Future surface (planned)

Items on the roadmap that strengthen this story:

  • Official MCP server. No more bespoke wrappers — install one connector, get the entire CLI as tools.
  • Streaming events. A secvf events --follow command emitting structured events as they happen, for low-latency agent reactions to lifecycle changes.
  • Pre-warmed AI sandbox pool. Sub-second session start by keeping N booted guests waiting on a queue — see AI sandbox perf.
  • YARA hooks during capture. Rules fire as packets are decoded, surfacing matches to the agent in real time.
  • Batch verdicts CLI. Single command to detonate-many, harvest-many, emit-verdicts as JSON.

If you're building on top of SecVF and one of these would unlock something for you, file an issue or open a PR. Agent-driven security tooling is one of the most interesting things we can build here.