# Browser Service

> Run one warm shared browser that inspect, crawl, and Cloudflare bypass reuse instead of cold-starting a browser per call

The browser service is an optional background process that keeps **one warm browser** running. When it is up, `inspect` routes through it automatically, including the `--browser` and `--screenshot` paths and the automatic escalation from HTTP to browser. Without the service, each of those calls launches its own browser and re-solves Cloudflare.

<Note>
  Cold-starting a browser for a Cloudflare-protected site is expensive (launch + solve the challenge, roughly 10-15s each time). The service pays that cost once and keeps the browser warm, so repeated inspects — and several agents inspecting different sites at once — are far faster.
</Note>

## When to run it

Start it when a session will do **many browser inspects**:

* Repeated `inspect --screenshot` across a site's sections and sample pages.
* Cloudflare/JS sites where every `inspect` would otherwise cold-start a browser.
* **Parallel processing** — several agents each inspecting a *different* site.

<Tip>
  For a one-off inspect you don't need it. When the service is not running, `inspect` cold-starts its own browser exactly as before — output (`page.html`, `page.png`, the transport report) is identical either way. It is a pure speed-up.
</Tip>

## Commands

<CodeGroup>
  ```bash Start theme={null}
  ./scrapai browser start
  ```

  ```bash Status theme={null}
  ./scrapai browser status
  ```

  ```bash Stop theme={null}
  ./scrapai browser stop
  ```

  ```bash Restart theme={null}
  ./scrapai browser restart
  ```
</CodeGroup>

### start

Launches the background browser and waits until it answers pings.

```bash theme={null}
./scrapai browser start --pool 5 --proxy-type auto
```

<ParamField path="--pool" default="5">
  Max concurrent lanes — one lane per site (see [Parallel crawling with lanes](#parallel-crawling-with-lanes) below).
</ParamField>

<ParamField path="--proxy-type" default="auto">
  Proxy for the service: `auto`, `none`, or any proxy name configured in `.env`.
</ParamField>

<Note>
  On a headless server the browser runs under Xvfb automatically — no windows, no `xvfb-run` needed. If a display is required but Xvfb is missing, `start` tells you to install it: `sudo apt-get install -y xvfb`.
</Note>

If a service is already running, `start` reports its pid and does nothing.

### status

```bash theme={null}
./scrapai browser status
```

Prints `Running (pid ..., port ...).` or `Not running.`

### stop

```bash theme={null}
./scrapai browser stop
```

Gracefully shuts the service down and drops its state file.

### restart

```bash theme={null}
./scrapai browser restart
```

Stops the service and starts it again with its **previous** `--proxy-type` and `--pool` settings. Pass either flag to override just that value:

```bash theme={null}
./scrapai browser restart --pool 10
```

### shot

Screenshot a URL through the running service, reusing the warm browser.

```bash theme={null}
./scrapai browser shot https://example.com --project myproj --screens 2
```

<ParamField path="url" required>
  The page to capture (positional argument).
</ParamField>

<ParamField path="--project" default="default">
  Project name — determines where the screenshot is saved.
</ParamField>

<ParamField path="--screens" default="2">
  Screen-heights to capture. `0` captures the full page.
</ParamField>

The image is written to `<DATA_DIR>/<project>/<domain>/analysis/page.png`, where `<domain>` is the URL host with `www.` stripped and dots replaced by underscores.
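
To locate the screenshot from a script, the naming rule above can be reproduced in a few lines of shell. This is a minimal sketch, not scrapai's own code: `shot_path` is a hypothetical helper, the `./data` fallback for `DATA_DIR` is an assumption, and it only handles plain `scheme://host/path` URLs (no ports, credentials, or bare query strings).

```bash
# Hypothetical helper: reproduces the documented naming rule only.
shot_path() {  # usage: shot_path <url> [project]
  local url="$1" project="${2:-default}"
  local data_dir="${DATA_DIR:-./data}"   # assumed fallback; use your real DATA_DIR
  local host="${url#*://}"               # drop the scheme
  host="${host%%/*}"                     # drop the path
  host="${host#www.}"                    # strip a leading "www."
  # Dots in the host become underscores in the directory name.
  printf '%s/%s/%s/analysis/page.png\n' "$data_dir" "$project" "${host//./_}"
}
```

With `DATA_DIR=/srv/scrapai`, `shot_path https://www.example.com/pricing myproj` prints `/srv/scrapai/myproj/example_com/analysis/page.png`.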

<Warning>
  `shot` requires a running service. If none is up it exits with: `No browser service running. Start it: ./scrapai browser start`
</Warning>

## State file

The service records its pid and port in a per-user state file so any scrapai process — `browser start` and every `inspect` caller — can find the one running service:

```bash theme={null}
~/.scrapai/browser_service.json
```

<Note>
  The path is anchored on your home directory (not `$TMPDIR`) on purpose: `$TMPDIR` differs per shell/sandbox on macOS, which would make a service started in one terminal invisible from another. `stop` removes this file; stale files are handled gracefully.
</Note>
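
If a wrapper script needs to know whether a service is up without calling `browser status`, it could read the state file directly. The sketch below is only an illustration: it assumes the file is JSON with numeric `pid` and `port` keys (the exact schema is not documented here), and `state_field` and `service_alive` are hypothetical helpers. Treating a file whose pid is no longer alive as "not running" is one plausible way to handle a stale file, not necessarily what scrapai does.

```bash
# Assumption: the state file is JSON with numeric "pid" and "port" keys.
state_field() {  # usage: state_field <file> <key>
  # Print the first integer value found for the given key.
  sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | head -n 1
}

service_alive() {  # usage: service_alive [state-file]
  local file="${1:-$HOME/.scrapai/browser_service.json}" pid
  [ -f "$file" ] || return 1                  # no state file: nothing running
  pid="$(state_field "$file" pid)"
  # Signal 0 sends nothing; it only checks that the process exists.
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

For example, `service_alive || ./scrapai browser start` starts the service only when no live one is recorded. `./scrapai browser status` remains the supported way to check.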

## How it works

* **One browser, one window.** The service launches a single browser. Each site gets its own **tab** in that one window.
* **One tab per site (domain-sticky).** A site reuses its tab and its already-solved Cloudflare session, so the second inspect of a site skips the challenge and is much faster. Different sites get different tabs and solve Cloudflare concurrently without interfering.
* **LRU eviction.** When more than `--pool` sites are in play, the least-recently-used tab is closed.

Memory: sharing one browser across, say, 5 sites uses roughly **half** the memory of 5 separate browsers, because the browser's baseline cost is paid once instead of five times.

## Parallel crawling with lanes

Under the hood the service runs a **lane pool** over the one shared browser. A "lane" is an isolated browser context + page that solves Cloudflare on its own. The pool maps each domain to a lane:

* **Domain-sticky.** The same domain reuses its lane (and its solved CF session); different domains get different lanes and run **in parallel**.
* **LRU eviction.** At most `--pool` lanes exist at once (default 5). When the number of domains exceeds the cap, the least-recently-used lane is closed.
* **Per-domain navigation locks.** A lane has a single page, so two requests for the same domain are serialized on that lane while different domains proceed concurrently.
* **Sessioned lanes.** A lane is tied to the session it was opened with. If the same domain is later requested with a different session (or none), the lane is torn down and reopened, so a logged-in lane is never reused for a logged-out request, or vice versa.
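
The first two rules above (domain-sticky reuse and LRU eviction) can be modeled in a few lines of bash. This is an illustrative sketch of the policy only, not the service's implementation: `lane_touch`, `LANES`, `EVICTED`, and `POOL_SIZE` are hypothetical names, and navigation locks and sessions are left out.

```bash
POOL_SIZE="${POOL_SIZE:-5}"   # mirrors the --pool default
LANES=()                      # domains with an open lane, least recently used first
EVICTED=""                    # domain closed by the most recent call, if any

lane_touch() {  # usage: lane_touch <domain>
  local domain="$1" d kept=()
  EVICTED=""
  # Keep every other domain in order; the requested one is re-added at the end.
  for d in "${LANES[@]}"; do
    [ "$d" = "$domain" ] || kept+=("$d")
  done
  # A new domain arriving at a full pool closes the least recently used lane.
  if [ "${#kept[@]}" -ge "$POOL_SIZE" ]; then
    EVICTED="${kept[0]}"
    kept=("${kept[@]:1}")
  fi
  LANES=("${kept[@]}" "$domain")   # most recently used lane goes last
}
```

With `POOL_SIZE=2`, touching `a`, `b`, `a`, `c` in that order leaves lanes for `a` and `c` and evicts `b`: the second request for `a` reuses its lane and makes `b` the least recently used.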

Before processing multiple sites in parallel, start the service once:

```bash theme={null}
./scrapai browser start
```

Each agent's `inspect` then shares the one browser (one lane per site) instead of launching its own. Run `./scrapai browser stop` when the batch is done.
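
A batch run along these lines could be scripted as follows. Treat it as a sketch under assumptions: `inspect_batch` is a hypothetical wrapper, and the `inspect <url> --screenshot` invocation assumes `inspect` takes the URL as a positional argument the way `shot` does, so check it against your version of the CLI.

```bash
SCRAPAI="${SCRAPAI:-./scrapai}"   # path to the CLI; override if it lives elsewhere

inspect_batch() {  # usage: inspect_batch <url>...
  local url pid failed=0 pids=()
  "$SCRAPAI" browser start || return 1        # one warm browser for the whole batch
  for url in "$@"; do
    "$SCRAPAI" inspect "$url" --screenshot &  # assumed invocation; one job per site
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=$((failed + 1))     # count inspects that exited non-zero
  done
  "$SCRAPAI" browser stop                     # shut the service down after the batch
  [ "$failed" -eq 0 ]
}
```

Passing URLs from different domains lets the inspects run in parallel on separate lanes. URLs on the same domain still work, but they are serialized on that domain's lane.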

<Warning>
  Because all lanes share one browser session, a site that needs to **switch proxies mid-solve** (only when Cloudflare blocks *and* a proxy chain is configured) can disturb other lanes. With direct connections — the normal case — this does not happen.
</Warning>

## Related Guides

<CardGroup cols={2}>
  <Card title="Cloudflare Bypass" icon="shield-halved" href="/guides/cloudflare-bypass">
    Handle Cloudflare-protected sites with browser verification and cookie caching
  </Card>

  <Card title="Proxy Escalation" icon="network-wired" href="/guides/proxy-escalation">
    Combine the service with smart proxy usage
  </Card>
</CardGroup>
