Home / Course / Operator architecture

Operator

How the operator works under the hood

8 min read · for anyone who wants to modify, extend, or just understand the moving parts.

The operator is intentionally small and boring. ~1,500 lines of Python across five clear layers. You can read the whole codebase in an afternoon. This page is the map.

The five layers

┌──────────────────────────────────────────────────────────┐
│  ORCHESTRATOR                                            │
│  Claude tool-use loop · plans, picks tools, self-critiques│
└──────────────────────────────────────────────────────────┘
                          │
        ┌─────────────┬───┴────┬──────────────┐
        ▼             ▼        ▼              ▼
     RAG layer   Browser   Tool registry   Policy
     (Chroma)    (human-   (per-module     (dry-run
                 motion)    Python fns)     gating)

1 · The orchestrator (the brain)

orchestrator/agent.py is one file, ~120 lines. It does exactly four things:

Calls rag.retrieve() for the user's task to grab the 8 most relevant chunks of playbook.
Constructs a system prompt with: behavior rules + module overview + the retrieved chunks (the chunks are marked cache_control: ephemeral for prompt caching).
Calls Claude with the system prompt, the tool schema, and the user's task.
Loops: if Claude returns tool_use, execute the tool, append the result, ask again. If it returns text only, that's the final answer.

That's the whole loop. No agents-of-agents, no planning sub-models, no fancy frameworks. One model, one loop, in the same file you're reading.

Why prompt caching matters

The system prompt + retrieved chunks are ~3–6k tokens. Without caching, every tool turn re-bills them. With Anthropic's cache_control, they're billed full the first time and ~10% on subsequent turns within 5 minutes. A 20-step run costs ~$0.10–0.30 instead of ~$1–3.

2 · The RAG layer

orchestrator/rag.py wraps ChromaDB. The transcripts (each JSON has lesson text + timestamps) get chunked into ~400-word windows, embedded with bge-small-en-v1.5 (good, small, free, runs on CPU), and stored locally.

Retrieval supports module filtering: rag.retrieve("scaling rule", module="09") returns Module 09 chunks first. The agent uses this to stay loyal to this playbook's specific moves (e.g. the exact +20% / 48h rule from Module 09) rather than generic "scaling best practices" from training data.

To re-embed after editing the playbook:

python -m orchestrator.rag build

3 · The browser layer (the "click like a human" part)

An abstract BrowserSession with two adapters behind it:

3a · PlaywrightHumanSession

browser/playwright_human.py + browser/humanize.py. Headed Chromium with a persistent profile directory. The interesting code lives in humanize.py:

Mouse paths — every click resolves the target's bounding box, picks an aim point with Gaussian spread around center, then moves there along a cubic Bezier with 3–5 control points. Per-step delay drawn from a log-normal distribution.
Overshoot — 1-in-6 clicks overshoot the target by 5–15 px and correct.
Hover dwell + click gap — 80–250 ms wait at target, 40–120 ms between mousedown and mouseup.
Typing — per-character Gaussian delay (μ=110 ms, σ=35 ms), occasional 300–900 ms thinking pauses at word boundaries, 1-in-40 chars typed wrong and corrected with backspace.
Scroll — wheel events in 60–160 px increments with 50–130 ms gaps, occasional back-scroll re-reads.

These values came from a few published studies of pointer behavior plus open-source projects like Ghost-Cursor. Tweak them in humanize.py; the defaults are conservative.

3b · Stealth patches

browser/stealth_patches.py is one big STEALTH_JS string injected via Playwright's add_init_script before any page loads. It:

Removes navigator.webdriver (the biggest bot tell)
Restores a realistic navigator.plugins array
Sets navigator.languages to ["en-US", "en"]
Fixes the chrome.runtime stub headless drops
Fakes WebGL vendor/renderer to an Intel combo
Injects 1-bit canvas noise (defeats canvas fingerprinting)
Injects audio-context noise (same trick for audio fingerprinting)

3c · MCPChromeSession (the fallback)

For pages where CSS selectors aren't enough — captchas, account-recovery flows, weird interactive UIs — the browser/mcp_chrome.py adapter speaks to a Claude-in-Chrome MCP server. Claude takes a screenshot, reasons about where to click in pixel coordinates, and the MCP server fires real OS-level mouse events on the user's actual Chrome.

Same BrowserSession interface, so any tool can switch adapters per-call without changing its own code.

4 · The tool registry

tools/registry.py exposes a single @tool(...) decorator. Each Python function gets wrapped, its signature converted to a JSONSchema, and registered with metadata:

@tool(
    description="Generate a long-form SEO article in the house style.",
    risk="low",
    params={
        "keyword":   {"type": "string", "required": True},
        "offer_url": {"type": "string", "required": True},
    },
)
def seo_write_article(keyword, offer_url, intent="commercial"):
    ...

The registry auto-generates the tool schema Claude consumes via tool_use. Adding a tool is one decorated function. The agent immediately knows it exists and when to use it (the description is the only hint).

Risk levels

"low" — always allowed (read-only or local file ops).
"high" — gated through policy.requires_confirmation().
"irreversible" — always gated; --auto-approve still prompts.

5 · The policy layer

orchestrator/policy.py is ~30 lines. Before any high-or-irreversible tool fires, it:

Dumps the proposed call (tool name + args) to stderr.
If --dry-run: auto-approves but tags the result so the tool returns "would have called X with Y" instead of actually running.
If --auto-approve and risk < irreversible: approves.
Else: blocking y/N prompt on the terminal.

To plug in Slack or email notifications instead of terminal prompts, replace the stdin.readline() call with whatever you want. Single function. Easy.

Runners

The runners/ folder contains thin entry points that bundle a multi-module task into one CLI invocation:

full_pipeline.py — niche → spy → lander → tracking → traffic test → conversion → scale.
content_machine.py — daily SEO post + YT script.
ban_recovery.py — module 10 unattended.

Each runner is ~30 lines: build the prompt, call Operator().run(AgentRun(...)). To add a new high-level task, copy a runner and rewrite the PROMPT.

Adding your own tool

Say you want to add a Twitter posting tool for organic distribution. Create tools/twitter/__init__.py:

from ..registry import tool

@tool(
    description="Post a thread to the configured Twitter account.",
    risk="high",
    params={
        "tweets": {"type": "array", "required": True,
                   "description": "list of tweet strings; first is the parent"},
    },
)
def twitter_post_thread(tweets):
    # your impl using tweepy or httpx + the v2 API
    return {"posted": True, "thread_url": "..."}

Add twitter to the import list in tools/registry.py. Done. The next agent run will see the new tool and use it when relevant.

Why it's built this way

Decision	Why
One model, no agent frameworks	LangChain/AutoGen-style abstractions are net-negative below ~3 agents. We have one agent. Plain SDK is clearer.
Headed browser default	Headless Chrome has 10× the bot signal of headed. You don't save much CPU running headless on a single-machine deployment.
Persistent profile dir	Cookies, cache, IndexedDB survive runs. Mimics a returning user. Critical for Facebook trust.
Local Chroma, not Pinecone	3MB of transcripts doesn't need a hosted vector DB. Local index is faster and free.
Dry-run as default	The cost of "ran a $30/day campaign you didn't authorize" is much higher than the cost of "didn't run anything yet."
MIT license	The moat in this space is taste in offers + angles, not the automation. Open the automation, lower the floor.

Either dive into the code (the README.md inside the download points at every file), or jump back to the course and read the modules you skipped.

Operator quickstart

Course

All modules