Operator architecture

How the operator works under the hood
The operator is intentionally small and boring. ~1,500 lines of Python across five clear layers. You can read the whole codebase in an afternoon. This page is the map.
The five layers
┌──────────────────────────────────────────────────────────┐
│                       ORCHESTRATOR                       │
│ Claude tool-use loop · plans, picks tools, self-critiques│
└──────────────────────────────────────────────────────────┘
                             │
         ┌───────────┬───────┴───────┬──────────────┐
         ▼           ▼               ▼              ▼
     RAG layer    Browser      Tool registry     Policy
     (Chroma)     (human-      (per-module       (dry-run
                  motion)       Python fns)       gating)
1 · The orchestrator (the brain)
orchestrator/agent.py is one file, ~120 lines. It does exactly four things:
- Calls rag.retrieve() with the user's task to grab the 8 most relevant chunks of playbook.
- Constructs a system prompt with: behavior rules + module overview + the retrieved chunks (the chunks are marked cache_control: ephemeral for prompt caching).
- Calls Claude with the system prompt, the tool schema, and the user's task.
- Loops: if Claude returns tool_use, execute the tool, append the result, and ask again. If it returns text only, that's the final answer.
That's the whole loop. No agents-of-agents, no planning sub-models, no fancy frameworks. One model, one loop, in the same file you're reading.
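The control flow above fits in a few lines. Here is a runnable sketch of that loop with a stub standing in for the real SDK client; the function and stub names (run_agent, StubClient) are illustrative, not the repo's actual API.

```python
# Minimal sketch of the orchestrator's tool-use loop. `client` stands in
# for the real Anthropic SDK client; a stub keeps the control flow runnable.
def run_agent(client, system_prompt, tools, task, max_turns=20):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        resp = client.create(system=system_prompt, tools=tools, messages=messages)
        if resp["stop_reason"] != "tool_use":
            return resp["text"]                       # text only -> final answer
        call = resp["tool_call"]
        result = tools[call["name"]](**call["args"])  # execute the requested tool
        messages.append({"role": "assistant", "content": call})
        messages.append({"role": "user", "content": {"tool_result": result}})
    raise RuntimeError("max turns exceeded")

class StubClient:
    """Returns one tool call, then a final text answer."""
    def __init__(self):
        self.turn = 0
    def create(self, **kwargs):
        self.turn += 1
        if self.turn == 1:
            return {"stop_reason": "tool_use",
                    "tool_call": {"name": "echo", "args": {"text": "hi"}}}
        return {"stop_reason": "end_turn", "text": "done"}

answer = run_agent(StubClient(), "rules", {"echo": lambda text: text}, "say hi")
```

Swapping StubClient for the real client is the only change needed to go live; the loop itself never branches on which model is behind it.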
Why prompt caching matters
The system prompt + retrieved chunks are ~3–6k tokens. Without caching, every tool turn re-bills them. With Anthropic's cache_control, they're billed in full the first time and at ~10% on subsequent turns within the 5-minute cache window. A 20-step run costs ~$0.10–0.30 instead of ~$1–3.
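Concretely, the system prompt is sent as a list of blocks, with the large retrieved-chunks block carrying the cache_control marker. A sketch of that shape (build_system is an illustrative helper, not the repo's):

```python
# Illustrative shape of a cacheable system prompt for the Anthropic Messages
# API: static rules first, then the big retrieved block marked ephemeral so
# repeat turns inside the cache window hit the prompt cache.
def build_system(rules, chunks):
    return [
        {"type": "text", "text": rules},
        {"type": "text",
         "text": "\n\n".join(chunks),
         "cache_control": {"type": "ephemeral"}},  # cached after the first turn
    ]

system = build_system("Follow the playbook.", ["chunk A", "chunk B"])
```

Only the last, largest block needs the marker; everything up to and including it is cached as a prefix.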
2 · The RAG layer
orchestrator/rag.py wraps ChromaDB. The transcripts (each JSON has lesson text + timestamps) get chunked into ~400-word windows, embedded with bge-small-en-v1.5 (good, small, free, runs on CPU), and stored locally.
Retrieval supports module filtering: rag.retrieve("scaling rule", module="09") returns Module 09 chunks first. The agent uses this to stay loyal to this playbook's specific moves (e.g. the exact +20% / 48h rule from Module 09) rather than generic "scaling best practices" from training data.
To re-embed after editing the playbook:
python -m orchestrator.rag build
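The ~400-word windowing can be sketched in a few lines. This is a minimal illustration of the technique, not the repo's exact chunker; the overlap value is an assumed knob.

```python
# Sketch of ~400-word sliding-window chunking with a small overlap so no
# sentence is stranded at a window boundary. Window/overlap are illustrative.
def chunk_words(text, window=400, overlap=50):
    words = text.split()
    step = window - overlap
    return [" ".join(words[i:i + window])
            for i in range(0, max(len(words) - overlap, 1), step)]

chunks = chunk_words("word " * 1000)  # 1000-word transcript -> 3 chunks
```

Each chunk then gets embedded once and stored with its module ID as metadata, which is what makes the module="09" filter possible at query time.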
3 · The browser layer (the "click like a human" part)
An abstract BrowserSession with two adapters behind it:
3a · PlaywrightHumanSession
browser/playwright_human.py + browser/humanize.py. Headed Chromium with a persistent profile directory. The interesting code lives in humanize.py:
- Mouse paths — every click resolves the target's bounding box, picks an aim point with Gaussian spread around center, then moves there along a cubic Bezier with 3–5 control points. Per-step delay drawn from a log-normal distribution.
- Overshoot — 1-in-6 clicks overshoot the target by 5–15 px and correct.
- Hover dwell + click gap — 80–250 ms wait at target, 40–120 ms between mousedown and mouseup.
- Typing — per-character Gaussian delay (μ=110 ms, σ=35 ms), occasional 300–900 ms thinking pauses at word boundaries, 1-in-40 chars typed wrong and corrected with backspace.
- Scroll — wheel events in 60–160 px increments with 50–130 ms gaps, occasional back-scroll re-reads.
These values came from a few published studies of pointer behavior plus open-source projects like Ghost-Cursor. Tweak them in humanize.py; the defaults are conservative.
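The mouse-path idea is simple enough to show directly. Here is a runnable sketch of a cubic Bezier with jittered interior control points and log-normal per-step delays; the spread and delay parameters are illustrative, not the repo's exact defaults.

```python
import math
import random

# Sketch of the humanized mouse path: a cubic Bezier whose two interior
# control points are jittered off the straight line, sampled along t, with
# a log-normal delay per step. Parameters here are illustrative.
def bezier_path(start, end, steps=40, spread=60.0, seed=7):
    rng = random.Random(seed)
    (x0, y0), (x3, y3) = start, end
    # interior control points at 1/3 and 2/3 of the way, with Gaussian-ish jitter
    x1 = x0 + (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y1 = y0 + (y3 - y0) / 3 + rng.uniform(-spread, spread)
    x2 = x0 + 2 * (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y2 = y0 + 2 * (y3 - y0) / 3 + rng.uniform(-spread, spread)
    path = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        delay = rng.lognormvariate(math.log(0.008), 0.4)  # ~8 ms median step
        path.append((x, y, delay))
    return path

path = bezier_path((0, 0), (500, 300))
```

The path starts and ends exactly on the endpoints while bowing unpredictably in between, which is what straight-line bot movement lacks.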
3b · Stealth patches
browser/stealth_patches.py is one big STEALTH_JS string injected via Playwright's add_init_script before any page loads. It:
- Removes navigator.webdriver (the biggest bot tell)
- Restores a realistic navigator.plugins array
- Sets navigator.languages to ["en-US", "en"]
- Fixes the chrome.runtime stub that headless drops
- Fakes WebGL vendor/renderer to an Intel combo
- Injects 1-bit canvas noise (defeats canvas fingerprinting)
- Injects audio-context noise (same trick for audio fingerprinting)
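The pattern is a single JS string registered before any page script runs. A minimal sketch showing just the first and third patches (the full string in the repo covers all seven):

```python
# Sketch of the init-script pattern: one JS string injected before any page
# loads. Only the webdriver and languages patches are shown; the repo's
# STEALTH_JS covers all seven tells listed above.
STEALTH_JS = """
Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
Object.defineProperty(navigator, 'languages', { get: () => ['en-US', 'en'] });
"""

# With Playwright this is registered once per context, before navigation:
#   context.add_init_script(STEALTH_JS)
```

Because add_init_script runs before the page's own scripts, fingerprinting code never sees the unpatched values.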
3c · MCPChromeSession (the fallback)
For pages where CSS selectors aren't enough — captchas, account-recovery flows, weird interactive UIs — the browser/mcp_chrome.py adapter speaks to a Claude-in-Chrome MCP server. Claude takes a screenshot, reasons about where to click in pixel coordinates, and the MCP server fires real OS-level mouse events on the user's actual Chrome.
Same BrowserSession interface, so any tool can switch adapters per-call without changing its own code.
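That interface boundary is the whole trick. A sketch of the shape (class and method names here are illustrative stand-ins for the repo's actual API):

```python
from abc import ABC, abstractmethod

# Sketch of the shared adapter interface: tools depend only on
# BrowserSession, so either implementation can be swapped in per call.
class BrowserSession(ABC):
    @abstractmethod
    def click(self, target: str) -> str: ...

class PlaywrightHumanSession(BrowserSession):
    def click(self, target):
        return f"playwright: humanized click on selector {target}"

class MCPChromeSession(BrowserSession):
    def click(self, target):
        return f"mcp: OS-level click near {target}"

def press_button(session: BrowserSession, target: str) -> str:
    # tool code is adapter-agnostic; it never asks which session it got
    return session.click(target)
```

A tool that hits a captcha can retry the same step with MCPChromeSession without any change to its own logic.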
4 · The tool registry
tools/registry.py exposes a single @tool(...) decorator. Each Python function gets wrapped, its signature converted to a JSONSchema, and registered with metadata:
@tool(
    description="Generate a long-form SEO article in the house style.",
    risk="low",
    params={
        "keyword": {"type": "string", "required": True},
        "offer_url": {"type": "string", "required": True},
    },
)
def seo_write_article(keyword, offer_url, intent="commercial"):
    ...
The registry auto-generates the tool schema Claude consumes via tool_use. Adding a tool is one decorated function. The agent immediately knows it exists and when to use it (the description is the only hint).
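A minimal version of such a decorator fits in a screenful. This is a sketch of the technique, not the repo's registry.py; the REGISTRY name and schema shape are assumptions.

```python
# Minimal sketch of a @tool registry: wrap a function, derive a
# JSON-schema-style entry from the declared params, and collect everything
# the model needs to pick and call the tool.
REGISTRY = {}

def tool(description, risk="low", params=None):
    params = params or {}
    def wrap(fn):
        REGISTRY[fn.__name__] = {
            "name": fn.__name__,
            "description": description,
            "risk": risk,
            "input_schema": {
                "type": "object",
                "properties": {k: {"type": v["type"]} for k, v in params.items()},
                "required": [k for k, v in params.items() if v.get("required")],
            },
            "fn": fn,
        }
        return fn
    return wrap

@tool(description="Echo a string.", risk="low",
      params={"text": {"type": "string", "required": True}})
def echo(text):
    return text
```

The list of REGISTRY entries (minus the "fn" key) is exactly the tools array the API call consumes, which is why a new decorated function is visible on the very next run.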
Risk levels
- "low" — always allowed (read-only or local file ops).
- "high" — gated through policy.requires_confirmation().
- "irreversible" — always gated; --auto-approve still prompts.
5 · The policy layer
orchestrator/policy.py is ~30 lines. Before any high-or-irreversible tool fires, it:
- Dumps the proposed call (tool name + args) to stderr.
- If --dry-run: auto-approves but tags the result so the tool returns "would have called X with Y" instead of actually running.
- If --auto-approve and risk < irreversible: approves.
- Else: blocking y/N prompt on the terminal.
To plug in Slack or email notifications instead of terminal prompts, replace the stdin.readline() call with whatever you want. Single function. Easy.
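The decision table above reduces to one small function. A runnable sketch of that gating logic (check is an illustrative name; the repo calls it requires_confirmation):

```python
# Sketch of the policy gate: dry-run tags instead of runs, auto-approve
# passes everything below irreversible, otherwise block on a prompt.
def check(tool_name, args, risk, dry_run=False, auto_approve=False, ask=input):
    if risk == "low":
        return "run"
    print(f"proposed: {tool_name}({args})")       # the repo dumps this to stderr
    if dry_run:
        return "dry-run"                          # tool reports "would have called"
    if auto_approve and risk != "irreversible":
        return "run"
    return "run" if ask("approve? [y/N] ").strip().lower() == "y" else "skip"
```

Passing a different `ask` callable is exactly the Slack/email swap described above: the rest of the gate never changes.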
Runners
The runners/ folder contains thin entry points that bundle a multi-module task into one CLI invocation:
- full_pipeline.py — niche → spy → lander → tracking → traffic test → conversion → scale.
- content_machine.py — daily SEO post + YT script.
- ban_recovery.py — Module 10, unattended.
Each runner is ~30 lines: build the prompt, call Operator().run(AgentRun(...)). To add a new high-level task, copy a runner and rewrite the PROMPT.
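A runner's entire skeleton looks like this; Operator and AgentRun are stubbed here so the shape is runnable on its own, and the PROMPT text is illustrative.

```python
# Sketch of a runner: build one prompt, hand it to the operator. The real
# Operator and AgentRun live in orchestrator/; these are stand-in stubs.
PROMPT = "Run the daily content machine: one SEO post, one YT script."

class AgentRun:
    def __init__(self, prompt):
        self.prompt = prompt

class Operator:
    def run(self, agent_run):
        return f"executed: {agent_run.prompt}"

if __name__ == "__main__":
    print(Operator().run(AgentRun(PROMPT)))
```

Everything task-specific lives in PROMPT; the plumbing below it is identical across all three runners.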
Adding your own tool
Say you want to add a Twitter posting tool for organic distribution. Create tools/twitter/__init__.py:
from ..registry import tool

@tool(
    description="Post a thread to the configured Twitter account.",
    risk="high",
    params={
        "tweets": {"type": "array", "required": True,
                   "description": "list of tweet strings; first is the parent"},
    },
)
def twitter_post_thread(tweets):
    # your impl using tweepy or httpx + the v2 API
    return {"posted": True, "thread_url": "..."}
Add twitter to the import list in tools/registry.py. Done. The next agent run will see the new tool and use it when relevant.
Why it's built this way
| Decision | Why |
|---|---|
| One model, no agent frameworks | LangChain/AutoGen-style abstractions are net-negative below ~3 agents. We have one agent. Plain SDK is clearer. |
| Headed browser default | Headless Chrome has 10× the bot signal of headed. You don't save much CPU running headless on a single-machine deployment. |
| Persistent profile dir | Cookies, cache, IndexedDB survive runs. Mimics a returning user. Critical for Facebook trust. |
| Local Chroma, not Pinecone | 3MB of transcripts doesn't need a hosted vector DB. Local index is faster and free. |
| Dry-run as default | The cost of "ran a $30/day campaign you didn't authorize" is much higher than the cost of "didn't run anything yet." |
| MIT license | The moat in this space is taste in offers + angles, not the automation. Open the automation, lower the floor. |
Next
Either dive into the code (the README.md inside the download points at every file), or jump back to the course and read the modules you skipped.