co browser

Drive one real browser from the shell — call browser functions directly, or hand a task to the AI agent. The browser stays open between commands.

Quick Start (60 seconds)

code
co browser go_to news.ycombinator.com      # opens a browser, navigates
co browser get_current_url                 # → https://news.ycombinator.com/
co browser take_screenshot /tmp/shot.png   # saves a PNG
co browser close                           # done
output
Navigated to https://news.ycombinator.com/
https://news.ycombinator.com/
Browser closed

The browser stays open between commands. Each co browser ... call drives the same window — your navigation, cookies, and logged-in session persist until you close.

Two Ways to Drive It

Direct function call

co browser go_to x.com

Deterministic, instant, free (no LLM). For scripting and exact steps you already know.

Natural language

co browser do "find the cheapest flight"

The AI agent figures out the steps. For when you don't want to spell them out.

Both drive the same live browser, so you can mix them — script the boring parts, let the agent handle the hard part:

code
co browser go_to myapp.com/login
co browser do "log me in and open the billing page"   # agent takes over the same window
co browser take_screenshot /tmp/billing.png           # back to a direct call

How It Works

The first co browser command starts a small background daemon that owns one browser. Every later command connects to it over a local socket and drives that same browser. The daemon lives exactly as long as the browser:

co browser go_to x.com   ──► starts daemon ──► opens browser ─┐
co browser click "Login" ──────────────────► same browser    │  state persists
co browser take_screenshot ────────────────► same browser    │
co browser close         ──► browser closes ──► daemon exits ─┘

You never manage the daemon directly — the first command starts it, and close (or closing the window) stops it. There is no separate "start" step.
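
Because the daemon outlives any single command, a script that fails part-way can leave the browser open. A minimal sketch of a guard against that, assuming a POSIX shell; run_then_close is a hypothetical helper name, not part of the CLI:

```shell
# Hypothetical helper: run the given command, then always close the browser
# (which also stops the daemon), and return the command's own exit status.
run_then_close() {
    "$@"
    rc=$?
    co browser close
    return "$rc"
}
```

For example, run_then_close co browser take_screenshot /tmp/shot.png closes the browser whether or not the screenshot succeeds.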

How a command is dispatched

The first word is compared against the browser's function names:

You type                  What happens
co browser go_to x.com    go_to is a function → runs it directly
co browser do "..."       do → hands the instruction to the AI agent
co browser frobnicate     matches nothing → unknown command (exit 1)

Quote natural-language instructions: co browser do "click the blue button". A bare first word that matches a function name (like click) is treated as a direct call, not as natural language.
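
The dispatch rule above can be modeled in a few lines of shell. This is an illustrative sketch, not the CLI's actual implementation: classify_command is a hypothetical name, and FUNCTIONS is a hand-written, abbreviated list standing in for whatever co browser help reports.

```shell
# Illustrative model of the dispatch rule; not the CLI's real code.
# FUNCTIONS is a hypothetical, abbreviated list of function names.
FUNCTIONS="go_to get_current_url take_screenshot click close"

classify_command() {
    word=$1
    # "do" hands the rest of the line to the AI agent.
    if [ "$word" = "do" ]; then
        echo "agent"
        return 0
    fi
    # Any word equal to a known function name is a direct call.
    for name in $FUNCTIONS; do
        if [ "$word" = "$name" ]; then
            echo "function"
            return 0
        fi
    done
    # Everything else is rejected with exit code 1.
    echo "unknown command: $word" >&2
    return 1
}
```

Under this model, classify_command click prints function, which is why a bare click is never read as an instruction.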

Discovering Functions

The CLI describes itself — run help to list every callable function with its arguments and a one-line summary (no browser is launched). This is the fastest way, for a person or an AI agent, to find the exact function name and arguments before calling it.

code
co browser help
output
Functions:
go_to(url) — Navigate to a URL.
take_screenshot(path=None, full_page=False) — Take a screenshot of the current page...
click(description) — Click on an element using natural language description.
get_links_from_page(domain_filter='') — Extract all unique links from the current page...
...
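
A script can use the help listing to check that a function exists before calling it. A minimal sketch, assuming each function appears at the start of a line as name(...), as in the sample output above; has_function is a hypothetical helper name:

```shell
# Succeeds if `co browser help` lists a function with the given name.
# Assumes the "name(args)" line format shown in the sample help output.
has_function() {
    co browser help | grep -q "^[[:space:]]*$1("
}
```

Usage might look like: has_function get_text && co browser get_text.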

Common Functions

Any function listed by co browser help is callable. The ones you'll reach for most:

code
co browser go_to <url>                                   # navigate
co browser get_current_url                               # print the current URL
co browser get_text                                      # print visible page text
co browser take_screenshot /tmp/shot.png [--full-page]
co browser click "<description or selector>"
co browser type_text_by_selector <css> "<text>"
co browser get_links_from_page                           # one link per line
co browser scroll                                        # scroll the main content
co browser close                                         # close browser, stop daemon

Use absolute paths for files. The daemon resolves relative paths against its own working directory (where it was first started), not the directory you run each command from. take_screenshot /tmp/shot.png is predictable; a bare shot.png lands in the daemon's .tmp/ folder.
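
One way to avoid the relative-path surprise is to make the path absolute in your own shell before the daemon sees it. A small sketch; abs_path is a hypothetical helper, not a CLI function:

```shell
# Turn a possibly-relative path into an absolute one, resolved against the
# caller's current directory rather than the daemon's.
abs_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}
```

With this in place, co browser take_screenshot "$(abs_path shot.png)" saves next to where you ran the command.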

Scripting

Output is clean stdout, errors go to stderr, and the exit code is 0 on success / 1 on failure — so commands compose like any Unix tool:

code
# Capture a value
url=$(co browser get_current_url)

# Pipe list output (one item per line)
co browser get_links_from_page | grep github | wc -l

# Fail-fast in a script
co browser go_to "$DEPLOY_URL" && co browser take_screenshot /tmp/deployed.png
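
Because values come back on stdout, they can also drive loops. The following sketch polls get_current_url until the URL contains a given substring, for instance after an agent step that triggers a redirect. wait_for_url is a hypothetical helper, and the one-second interval and default of 10 attempts are arbitrary choices:

```shell
# Poll the current URL until it contains the given substring, or give up
# after a number of attempts (default 10), pausing one second between tries.
wait_for_url() {
    want=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        case $(co browser get_current_url) in
            *"$want"*) return 0 ;;
        esac
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}
```

Usage might look like: wait_for_url /billing 5 && co browser take_screenshot /tmp/billing.png.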

Headless vs GUI

By default the browser is visible (a real Chrome window you can watch). Add --headless for scripts/CI:

code
co browser --headless go_to example.com   # no window
co browser go_to example.com              # visible window (default)

The mode is fixed when the daemon starts (the first command). To switch modes, run co browser close first, then start again with the mode you want.
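
The close-then-restart sequence can be wrapped in a small function. This is a sketch under one assumption: what co browser close does when no browser is running is not described here, so its output and exit status are ignored. restart_headless is a hypothetical name:

```shell
# Restart the browser in headless mode and navigate to the given URL.
# Assumption: any failure from `close` (e.g. nothing to close) is safe to ignore.
restart_headless() {
    co browser close >/dev/null 2>&1 || true
    co browser --headless go_to "$1"
}
```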

Natural Language Agent

do runs the full AI browser agent on the live browser and prints its final answer:

code
co browser do "search for wireless headphones and list the top 3 prices"

This path uses managed keys — run co auth once if you see an authentication message.

Installation

The browser needs Playwright:

code
pip install playwright
playwright install chromium

Error Messages

Errors print to stderr and exit with code 1. Each one tells you the next step — handy when an AI agent is driving the CLI and needs to self-correct.

Unknown function

code
co browser frobnicate
output
unknown command: frobnicate
Run 'co browser help' to list functions, or 'co browser do "<instruction>"' for natural language.

Wrong arguments

code
co browser go_to
output
TypeError: BrowserAutomation.go_to() missing 1 required positional argument: 'url'
usage: go_to(url)

The usage: line shows the exact signature — pass the missing argument.

Authentication required (only for do)

code
co browser do "find the price"
output
Browser agent requires authentication. Run: co auth

Playwright not installed

Browser tools not installed. Run: pip install playwright && playwright install chromium
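
Since every failure exits with code 1 and explains itself on stderr, a wrapper can react to the message. The sketch below handles only the unknown-command case by printing the function list; it assumes the message starts with "unknown command" as shown above, and try_command is a hypothetical helper:

```shell
# Run a browser command. On failure, forward its stderr, and if the error was
# an unknown command, also print the function list so the caller can retry.
try_command() {
    err_file=$(mktemp)
    if co browser "$@" 2>"$err_file"; then
        rm -f "$err_file"
        return 0
    fi
    cat "$err_file" >&2
    if grep -q '^unknown command' "$err_file"; then
        co browser help >&2
    fi
    rm -f "$err_file"
    return 1
}
```

The same pattern extends to the other messages, for example matching "Run: co auth" before retrying a do instruction.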
