co browser
Drive one real browser from the shell — call browser functions directly, or hand a task to the AI agent. The browser stays open between commands.
Quick Start (60 seconds)
co browser go_to news.ycombinator.com opens a browser, and the next command drives the same window. Every co browser ... call reuses that window, so your navigation, cookies, and logged-in session persist until you close it.
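A minimal session might look like the sketch below. It uses only commands that appear on this page (go_to, click, take_screenshot, close); the link text "new" and the screenshot path are illustrative assumptions, and the wrapper function name is hypothetical.

```shell
# Quick-start sketch: four commands, one browser window.
# ASSUMPTION: the link text "new" and the path /tmp/hn.png are illustrative only.
quick_start() {
  # The first call starts the browser.
  co browser go_to news.ycombinator.com &&
    # Later calls drive the same window and session.
    co browser click "new" &&
    co browser take_screenshot /tmp/hn.png &&
    # Closing the browser also stops the background daemon.
    co browser close
}

# Usage: quick_start
```

Wrapping the steps in a function is only a convenience; typing the four commands one by one at the prompt should behave the same way.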
Two Ways to Drive It
Direct function call
co browser go_to x.com
Deterministic, instant, and free (no LLM). Use it for scripting and for exact steps you already know.
Natural language
co browser do "find the cheapest flight"
The AI agent figures out the steps. Use it when you don't want to spell them out.
Both drive the same live browser, so you can mix them: script the boring parts and let the agent handle the hard part.
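One way to combine the two styles is sketched here. The helper name scripted_then_agent is hypothetical, and the URL and instruction in the usage line are placeholders.

```shell
# Sketch: do the known step directly, then hand the open-ended step to the agent.
# scripted_then_agent is a hypothetical helper, not part of the CLI.
scripted_then_agent() {
  url=$1
  task=$2
  # Deterministic step: a direct function call, no LLM involved.
  co browser go_to "$url" || return 1
  # Open-ended step: the agent works in the same live browser.
  co browser do "$task"
}

# Usage (placeholder arguments):
# scripted_then_agent example.com "find the cheapest flight"
```

Stopping when the direct step fails keeps the agent from starting on a page that never loaded.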
How It Works
The first co browser command starts a small background daemon that owns one browser. Every later command connects to it over a local socket and drives that same browser. The daemon lives exactly as long as the browser:
co browser go_to x.com      ──► starts daemon ──► opens browser ─┐
co browser click "Login"    ──────────────────► same browser     │ state persists
co browser screenshot       ──────────────────► same browser     │
co browser close            ──► browser closes ──► daemon exits ─┘
You never manage the daemon directly — the first command starts it, and close (or closing the window) stops it. There is no separate "start" step.
How a command is dispatched
The first word is compared against the browser's function names:
| You type | What happens |
|---|---|
| co browser go_to x.com | go_to is a function → runs it directly |
| co browser do "..." | do → hands the instruction to the AI agent |
| co browser frobnicate | matches nothing → unknown command (exit 1) |
Quote natural-language instructions: co browser do "click the blue button". A bare word that happens to be a function name (like click) is treated as a direct call, not language.
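The dispatch rule can be modelled in a few lines of shell. This is an illustrative model only, not the CLI's implementation, and KNOWN_FUNCTIONS is a hypothetical subset of what co browser help would list.

```shell
# Illustrative model of the dispatch rule; NOT the CLI's actual code.
# ASSUMPTION: KNOWN_FUNCTIONS is a hypothetical subset of the real function list.
KNOWN_FUNCTIONS="go_to click take_screenshot"

dispatch() {
  first_word=$1
  # "do" hands the rest of the line to the AI agent.
  if [ "$first_word" = "do" ]; then
    echo "agent"
    return 0
  fi
  # Any other first word must match a function name exactly.
  for name in $KNOWN_FUNCTIONS; do
    if [ "$first_word" = "$name" ]; then
      echo "direct"
      return 0
    fi
  done
  # Nothing matched: report on stderr and fail with exit code 1.
  echo "unknown command: $first_word" >&2
  return 1
}

dispatch go_to   # prints "direct"
dispatch do      # prints "agent"
dispatch click   # prints "direct": a bare function name is never read as language
```

The model shows why quoting matters: only the first word decides the route, so an instruction that starts with a function name has to go through do.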
Discovering Functions
The CLI describes itself — run help to list every callable function with its arguments and a one-line summary (no browser is launched). This is the fastest way, for a person or an AI agent, to find the exact function name and arguments before calling it.
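A script or agent could check for a function before calling it, roughly as follows. The helper has_function is hypothetical, and the sketch assumes only that each function name appears as a whole word somewhere in the help output; the exact help layout is not specified on this page.

```shell
# Sketch: check whether a function name appears in the help listing.
# ASSUMPTION: names appear as whole words in the output of `co browser help`.
# has_function is a hypothetical helper, not part of the CLI.
has_function() {
  co browser help | grep -qw -- "$1"
}

# Usage:
# has_function take_screenshot && echo "take_screenshot is available"
```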
Common Functions
Any function listed by co browser help is callable. The ones you'll reach for most:
Use absolute paths for files. The daemon resolves relative paths against its own working directory (where it was first started), not the directory you run each command from. take_screenshot /tmp/shot.png is predictable; a bare shot.png lands in the daemon's .tmp/ folder.
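To avoid the working-directory surprise, a script can resolve paths against its own directory before passing them along. The helper abs_path below is a hypothetical convenience, not part of the CLI.

```shell
# Sketch: make a path absolute relative to the caller's current directory.
# abs_path is a hypothetical helper, not part of the CLI.
abs_path() {
  case $1 in
    /*) printf '%s\n' "$1" ;;             # already absolute: pass through
    *)  printf '%s/%s\n' "$PWD" "$1" ;;   # relative: anchor at the caller's directory
  esac
}

# Usage:
# co browser take_screenshot "$(abs_path shot.png)"
```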
Scripting
Output is clean stdout, errors go to stderr, and the exit code is 0 on success / 1 on failure — so commands compose like any Unix tool:
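As a sketch of that composition, the hypothetical helper below chains two direct calls with && so the sequence stops at the first failure. The arguments in the usage line are placeholders.

```shell
# Sketch: chain steps on exit codes, as with any Unix tool.
# snapshot is a hypothetical helper, not part of the CLI.
snapshot() {
  # Each step runs only if the previous one exited with 0.
  co browser go_to "$1" &&
    co browser take_screenshot "$2" &&
    echo "saved $2"
}

# Usage (placeholder arguments):
# snapshot news.ycombinator.com /tmp/hn.png || echo "snapshot failed" >&2
```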
Headless vs GUI
By default the browser is visible (a real Chrome window you can watch). Add --headless for scripts/CI:
The mode is fixed when the daemon starts (the first command). To switch modes, co browser close first, then start again with the mode you want.
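Switching an existing session to headless could be scripted along these lines. Where the --headless flag goes on the command line is an assumption in this sketch (it is placed after the function arguments); confirm the placement with co browser help. The helper name is hypothetical.

```shell
# Sketch: restart the browser in headless mode.
# ASSUMPTION: --headless is accepted after the function arguments.
# restart_headless is a hypothetical helper, not part of the CLI.
restart_headless() {
  # The mode is fixed when the daemon starts, so close the current browser first.
  co browser close
  # This first command after closing starts a new daemon in the requested mode.
  co browser go_to "$1" --headless
}

# Usage (placeholder URL):
# restart_headless example.com
```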
Natural Language Agent
do runs the full AI browser agent on the live browser and prints its final answer:
This path uses managed keys — run co auth once if you see an authentication message.
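Because the final answer goes to stdout, a script can capture it in a variable. The sketch below is one way to do that; ask_agent is a hypothetical helper and the instruction in the usage line is a placeholder.

```shell
# Sketch: capture the agent's final answer and fail loudly if the step fails.
# ask_agent is a hypothetical helper, not part of the CLI.
ask_agent() {
  answer=$(co browser do "$1") || {
    # The CLI's own error text has already gone to stderr; add a hint.
    echo "agent step failed; if the message mentions authentication, run: co auth" >&2
    return 1
  }
  printf '%s\n' "$answer"
}

# Usage (placeholder instruction):
# title=$(ask_agent "what is the title of the top story?")
```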
Installation
The browser needs Playwright:
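The install command quoted in the "Playwright not installed" error message can be wrapped for reuse in setup scripts. Only the function name install_browser_tools is an addition here.

```shell
# Install command as quoted in the "Browser tools not installed" message.
# install_browser_tools is a hypothetical wrapper name.
install_browser_tools() {
  # Install the Playwright package, then download the Chromium build it drives.
  pip install playwright && playwright install chromium
}

# Usage: install_browser_tools
```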
Error Messages
Errors print to stderr and exit with code 1. Each one tells you the next step — handy when an AI agent is driving the CLI and needs to self-correct.
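A driver script or agent that wants to self-correct needs both the exit code and the stderr text. The sketch below shows one way to collect them; run_step is a hypothetical helper and the report format is invented for illustration.

```shell
# Sketch: run one co browser command and report its stderr text on failure.
# run_step is a hypothetical helper, not part of the CLI.
run_step() {
  err_file=$(mktemp) || return 1
  if co browser "$@" 2>"$err_file"; then
    status=0
  else
    status=$?
    # Pass the CLI's own message along so the caller can decide the next step.
    echo "step failed (exit $status): $(cat "$err_file")" >&2
  fi
  rm -f "$err_file"
  return "$status"
}

# Usage:
# run_step click "Login" || echo "adjust the call and retry" >&2
```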
Unknown function
Wrong arguments
The usage: line shows the exact signature — pass the missing argument.
Authentication required (only for do)
Playwright not installed
Browser tools not installed. Run: pip install playwright && playwright install chromium
See Also
- co auth: managed keys for the do agent
- BrowserAutomation: the browser tools used in your own agents
- Browser agent: full browser automation in code
ConnectOnion