Lesson 04 — Browser Automation: Let AI Control Web Pages for You

Goal: Use OpenClaw's built-in browser tools to let AI automatically browse websites, take screenshots, extract information, and fill out forms.

How It Works

OpenClaw integrates a dedicated Chrome/Chromium instance. AI can use tool calls to:

- Navigate to a specified URL
- Take screenshots and analyze pages
- Click elements and fill out forms
- Extract structured data

Your instruction → AI → browser_navigate / browser_action → Chrome → Screenshot/Result → AI analysis → Reply

Prerequisites

Make sure Chrome or Chromium is installed:

# macOS
brew install --cask google-chrome
 
# Or use Chromium
brew install --cask chromium
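If you're not sure whether a browser is already installed, a quick check like the one below can help. This is a minimal sketch: the command names and app bundle paths are common defaults assumed for illustration, not a list OpenClaw itself is known to consult, and `find_browser` is a hypothetical helper name.

```python
import os
import shutil
from typing import Optional

# Executable names often found on PATH (assumed list, not taken from OpenClaw).
CANDIDATE_COMMANDS = [
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
]

# Default macOS install locations for the app bundles (assumed defaults).
CANDIDATE_APP_PATHS = [
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "/Applications/Chromium.app/Contents/MacOS/Chromium",
]


def find_browser() -> Optional[str]:
    """Return the path of the first Chrome/Chromium binary found, or None."""
    # First look for a matching executable on PATH.
    for command in CANDIDATE_COMMANDS:
        path = shutil.which(command)
        if path:
            return path
    # Then fall back to the usual macOS app bundle locations.
    for path in CANDIDATE_APP_PATHS:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    found = find_browser()
    print(found if found else "No Chrome or Chromium binary found")
```

If the script prints a path, the prerequisite is most likely satisfied. If it finds nothing, the browser may still be installed somewhere this sketch doesn't look.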

Basic Usage: Let AI Scrape Web Information

Just describe the task in WebChat or Telegram:

Example 1: Check the Weather

Open https://wttr.in/Shanghai, take a screenshot, and tell me Shanghai's weather today

Example 2: Scrape GitHub Repository Info

Open https://github.com/trending and list today's 5 hottest GitHub projects, including language and star count

Example 3: Read Online Documentation

Open https://docs.python.org/3/library/asyncio.html and summarize the 5 most commonly used asyncio functions

Advanced: Automated Form Filling

Open https://httpbin.org/forms/post
Fill in the following:
- custname: John Doe
- custtel: 555-0100
- custemail: johndoe@example.com
- comments: This is a test order

Then take a screenshot so I can see the result
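If you later ask AI to submit the form as well, you may want to cross-check what was actually sent. The sketch below assumes the form submits a standard URL-encoded body using the field names listed above. `encode_form` and `missing_fields` are hypothetical helpers written for this lesson, not part of OpenClaw.

```python
from urllib.parse import parse_qs, urlencode

# The values from the instruction above.
form_fields = {
    "custname": "John Doe",
    "custtel": "555-0100",
    "custemail": "johndoe@example.com",
    "comments": "This is a test order",
}


def encode_form(fields: dict) -> str:
    """Encode fields the way a browser encodes a URL-encoded form body."""
    return urlencode(fields)


def missing_fields(expected: dict, submitted_body: str) -> list:
    """Return names of expected fields that are absent or differ in the body."""
    submitted = parse_qs(submitted_body, keep_blank_values=True)
    return [
        name
        for name, value in expected.items()
        if submitted.get(name) != [value]
    ]


if __name__ == "__main__":
    body = encode_form(form_fields)
    print(body)
    print("Missing or wrong:", missing_fields(form_fields, body))
```

Comparing the expected fields against a submitted body makes it easy to spot a field that AI skipped or mistyped, which is harder to notice in a screenshot alone.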

Advanced: Bulk Data Extraction

Pair the browser tools with a custom Skill to build a reusable data scraper:

~/.openclaw/workspace/skills/data-scraper/SKILL.md:

# Web Data Scraper
 
The user will provide a URL and the data fields they need extracted. You should:
 
1. Use browser_navigate to open the page
2. Use browser_snapshot to get the page content
3. Extract structured data as requested by the user
4. Output in JSON or Markdown table format
 
Notes:
- If the page requires scrolling, take multiple screenshots
- If there's pagination, ask the user whether to scrape additional pages
- Notify the user if you encounter a login wall

Usage:

/data-scraper

URL: https://news.ycombinator.com
Extract: title, link, score, number of comments (first 10 items)
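The skill asks for output as JSON or a Markdown table. To give a feel for the two formats, here is a small sketch that renders the same records both ways. The rows are placeholders invented for illustration, not real Hacker News data, and `to_markdown_table` is a hypothetical helper rather than anything OpenClaw provides.

```python
import json


def to_markdown_table(rows: list, columns: list) -> str:
    """Render a list of dicts as a Markdown table with the given columns."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = []
    for row in rows:
        # Escape pipes so a title containing "|" does not break the table.
        cells = [str(row.get(col, "")).replace("|", "\\|") for col in columns]
        body.append("| " + " | ".join(cells) + " |")
    return "\n".join([header, divider] + body)


COLUMNS = ["title", "link", "score", "comments"]

# Placeholder rows for illustration only; not scraped from any real page.
rows = [
    {"title": "Example story A", "link": "https://example.com/a", "score": 120, "comments": 45},
    {"title": "Example story B", "link": "https://example.com/b", "score": 87, "comments": 12},
]

if __name__ == "__main__":
    print(json.dumps(rows, indent=2))
    print(to_markdown_table(rows, COLUMNS))
```

JSON is the better choice when another program will consume the result. The Markdown table is easier to read directly in a chat window.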

Advanced: Monitor Page Changes

Combined with Cron scheduled tasks (requires cron channel configuration), you can periodically check web pages:

{
  "channels": {
    "cron": {
      "enabled": true,
      "jobs": [
        {
          "cron": "0 9 * * 1-5",
          "message": "Open https://github.com/trending, take a screenshot, summarize today's trending projects and send it to me",
          "channel": "telegram"
        }
      ]
    }
  }
}

This way, every weekday at 9 AM, AI will automatically scrape GitHub Trending and send you the results via Telegram.
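To see why `0 9 * * 1-5` means "weekdays at 9 AM", you can check timestamps against the expression. The matcher below is a simplified sketch: it handles only `*`, single numbers, ranges, and comma lists, and it requires all five fields to match. That agrees with standard cron for this expression because the day-of-month field is `*`. `cron_matches` is a hypothetical helper, and which time zone the schedule runs in depends on your setup.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', a number, a range (a-b), or a comma list."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if the moment matches a five-field cron expression."""
    minute, hour, day, month, weekday = expression.split()
    # Python counts Monday as 0; cron counts Sunday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(day, moment.day)
        and _field_matches(month, moment.month)
        and _field_matches(weekday, cron_weekday)
    )


if __name__ == "__main__":
    schedule = "0 9 * * 1-5"
    print(cron_matches(schedule, datetime(2024, 1, 1, 9, 0)))  # a Monday
    print(cron_matches(schedule, datetime(2024, 1, 6, 9, 0)))  # a Saturday
```

Reading the fields left to right: minute 0, hour 9, any day of the month, any month, and days 1 through 5 of the week (Monday through Friday).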

Tool Reference

Tools AI can use for browser tasks:

Tool	Function
`browser_navigate`	Open a URL
`browser_snapshot`	Capture the current page (a screenshot or its content)
`browser_action`	Click, type, scroll, and other interactions
`browser_upload`	Upload a file to a page

Important Notes

- Browser tools run in a sandboxed mode and won't affect your everyday browser
- Don't use them to log into websites with sensitive accounts (AI will see screenshot contents)
- Complex SPA pages (React/Vue apps) may need extra time to load; you can mention this in your instructions

FAQ

Do I need to install ChromeDriver or Selenium separately?

No. OpenClaw uses the Playwright engine to directly control Chrome/Chromium — no manual ChromeDriver installation needed. Just make sure Chrome or Chromium is installed on your system; everything else is handled by OpenClaw automatically.

How do I debug a failed browser task?

Add "take a screenshot for me" to your instructions, and AI will take screenshots at key steps to show you what's happening. You can also follow the gateway logs to see detailed tool call records: `tail -f /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep browser`
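If you'd rather review the browser-related log lines after the fact instead of following them live, a short script can do the same filtering. This sketch reuses the log path pattern from the tail command and, like the grep, treats any line containing "browser" as relevant. `log_path_for` and `browser_lines` are hypothetical helper names, and no particular log line format is assumed.

```python
from datetime import date
from pathlib import Path
from typing import Iterable, Iterator, Optional

# Directory taken from the tail command in this lesson.
LOG_DIR = Path("/tmp/openclaw")


def log_path_for(day: Optional[date] = None) -> Path:
    """Build the log path for a given day (defaults to today)."""
    day = day or date.today()
    return LOG_DIR / f"openclaw-{day.isoformat()}.log"


def browser_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that mention 'browser', like `grep browser`."""
    for line in lines:
        if "browser" in line:
            yield line.rstrip("\n")


if __name__ == "__main__":
    path = log_path_for()
    if path.exists():
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in browser_lines(handle):
                print(line)
    else:
        print(f"No log file at {path}")
```

Passing a specific `date` to `log_path_for` lets you inspect an earlier day's run, which is handy when a scheduled task failed overnight.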

Can it access websites that require login?

Yes, but be mindful of security: AI will see all content in page screenshots, including username/password input fields and private data. Only use login automation on websites you control — don't use it for banking, payment, or other sensitive accounts.

Can it handle React/Vue SPA pages with dynamic loading?

Yes, but you may need to specify wait timings in your instructions, such as "wait 3 seconds after the page opens before taking a screenshot" or "wait until the page finishes loading before extracting data." For pages with scroll-to-load content, you can tell AI to "scroll down 3 times, then take a screenshot."

Will browser automation affect my regular Chrome?

No. OpenClaw launches an independent Chrome/Chromium instance (in headless mode) that is completely isolated from your everyday browser, so cookies and account state don't cross over.

Next Steps

Lesson 05 — Configure multiple models, automatically fall back from expensive Claude to affordable MiniMax