
Lesson 04 — Browser Automation: Let AI Control Web Pages for You

Goal: Use OpenClaw's built-in browser tools to let AI automatically browse websites, take screenshots, extract information, and fill out forms.


How It Works

OpenClaw integrates a dedicated Chrome/Chromium instance. AI can use tool calls to:

  • Navigate to a specified URL
  • Take screenshots and analyze pages
  • Click elements and fill out forms
  • Extract structured data

A request flows through the system like this:

Your instruction → AI → browser_navigate / browser_action → Chrome → Screenshot/Result → AI analysis → Reply
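
As a rough illustration of this flow, the sketch below dispatches a short list of tool calls to handlers and collects the results. Only the tool names come from this lesson. The handler bodies and the `run_tool_calls` helper are hypothetical stand-ins, and OpenClaw's real internals may look quite different.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the real tools: they only echo what was requested.
def browser_navigate(url: str) -> str:
    return f"loaded {url}"

def browser_snapshot() -> str:
    return "snapshot of current page"

# Map tool names (as used in this lesson) to the stand-in handlers.
TOOLS: Dict[str, Callable[..., str]] = {
    "browser_navigate": browser_navigate,
    "browser_snapshot": browser_snapshot,
}

def run_tool_calls(calls: List[Tuple[str, dict]]) -> List[str]:
    """Dispatch each (tool_name, arguments) pair and collect the results."""
    results = []
    for name, args in calls:
        handler = TOOLS.get(name)
        if handler is None:
            results.append(f"error: unknown tool {name}")
            continue
        results.append(handler(**args))
    return results

# The kind of plan an AI might produce for "open the page and look at it".
plan = [
    ("browser_navigate", {"url": "https://wttr.in/Shanghai"}),
    ("browser_snapshot", {}),
]

for result in run_tool_calls(plan):
    print(result)
```

In the real system the results would be fed back to the AI for analysis before it writes a reply.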

Prerequisites

Make sure Chrome or Chromium is installed:

# macOS
brew install --cask google-chrome
 
# Or use Chromium
brew install --cask chromium
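
To confirm that a browser is actually present before running any tasks, a small check like the one below can help. It is a sketch under assumptions: the executable names and the macOS application paths are common defaults, not a list OpenClaw is known to search.

```python
import shutil
from pathlib import Path
from typing import List

# Executable names commonly found on PATH (assumed, not exhaustive).
CANDIDATE_COMMANDS = [
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
]

# Default macOS app bundle locations (assumed install paths).
CANDIDATE_APPS = [
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "/Applications/Chromium.app/Contents/MacOS/Chromium",
]

def find_browsers() -> List[str]:
    """Return the paths of any Chrome/Chromium executables that were found."""
    found = []
    for command in CANDIDATE_COMMANDS:
        path = shutil.which(command)
        if path:
            found.append(path)
    for app in CANDIDATE_APPS:
        if Path(app).is_file():
            found.append(app)
    return found

if __name__ == "__main__":
    browsers = find_browsers()
    if browsers:
        print("Found:", ", ".join(browsers))
    else:
        print("No Chrome or Chromium found; install one of them first.")
```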

Basic Usage: Let AI Scrape Web Information

Just describe the task in WebChat or Telegram:

Example 1: Check the Weather

Open https://wttr.in/Shanghai, take a screenshot, and tell me Shanghai's weather today

Example 2: Scrape GitHub Repository Info

Open https://github.com/trending and list today's 5 hottest GitHub projects, including language and star count

Example 3: Read Online Documentation

Open https://docs.python.org/3/library/asyncio.html and summarize the 5 most commonly used asyncio functions

Advanced: Automated Form Filling

Open https://httpbin.org/forms/post
Fill in the following:
- custname: John Doe
- custtel: 555-0100
- custemail: johndoe@example.com
- comments: This is a test order

Then take a screenshot so I can see the result
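
If you fill in forms like this often, you can generate the instruction from data instead of typing it each time. The helper below is a hypothetical convenience, not part of OpenClaw. It just assembles the same prompt text shown above from a URL and a dictionary of field values.

```python
from typing import Dict

def build_form_prompt(url: str, fields: Dict[str, str]) -> str:
    """Assemble a form-filling instruction in the same shape as the example above."""
    lines = [f"Open {url}", "Fill in the following:"]
    lines += [f"- {name}: {value}" for name, value in fields.items()]
    lines += ["", "Then take a screenshot so I can see the result"]
    return "\n".join(lines)

prompt = build_form_prompt(
    "https://httpbin.org/forms/post",
    {
        "custname": "John Doe",
        "custtel": "555-0100",
        "custemail": "johndoe@example.com",
        "comments": "This is a test order",
    },
)
print(prompt)
```

Paste the printed text into WebChat or Telegram as you would any other instruction.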

Advanced: Bulk Data Extraction

Pair the browser tools with a custom Skill to build a reusable data scraper:

~/.openclaw/workspace/skills/data-scraper/SKILL.md:

# Web Data Scraper
 
The user will provide a URL and the data fields they need extracted. You should:
 
1. Use browser_navigate to open the page
2. Use browser_snapshot to get the page content
3. Extract structured data as requested by the user
4. Output in JSON or Markdown table format
 
Notes:
- If the page requires scrolling, take multiple screenshots
- If there's pagination, ask the user whether to scrape additional pages
- Notify the user if you encounter a login wall

Usage:

/data-scraper

URL: https://news.ycombinator.com
Extract: title, link, score, number of comments (first 10 items)
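
If you ask for JSON output and later want a Markdown table (or the other way round), the conversion is easy to do yourself. The sketch below assumes the AI returned a list of records with the four requested fields. The sample rows are made-up placeholders, not real Hacker News data.

```python
from typing import Dict, List

def to_markdown_table(rows: List[Dict[str, object]], columns: List[str]) -> str:
    """Render a list of records as a Markdown table with the given column order."""
    def cell(value: object) -> str:
        # Escape pipes so a value cannot break the table layout.
        return str(value).replace("|", "\\|")

    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(cell(row.get(col, "")) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])

# Placeholder records in the shape the skill is asked to produce.
rows = [
    {"title": "Example story A", "link": "https://example.com/a", "score": 120, "comments": 45},
    {"title": "Example story B", "link": "https://example.com/b", "score": 98, "comments": 12},
]
columns = ["title", "link", "score", "comments"]

print(to_markdown_table(rows, columns))
```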

Advanced: Monitor Page Changes

With the cron channel enabled in your configuration, you can schedule the AI to check web pages periodically:

{
  "channels": {
    "cron": {
      "enabled": true,
      "jobs": [
        {
          "cron": "0 9 * * 1-5",
          "message": "Open https://github.com/trending, take a screenshot, summarize today's trending projects and send it to me",
          "channel": "telegram"
        }
      ]
    }
  }
}

With this job, the AI scrapes GitHub Trending at 9:00 AM every weekday (Monday through Friday) and sends you the results via Telegram.
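
To double-check when a schedule such as "0 9 * * 1-5" fires, you can expand it by hand. The sketch below assumes standard five-field cron syntax (minute, hour, day of month, month, day of week, with 0 meaning Sunday) and supports only `*`, single numbers, ranges, and comma lists. It requires every field to match, which is simpler than real cron's day-of-month/day-of-week rule, and it does not model time zones. How OpenClaw itself evaluates the expression is not covered here.

```python
from datetime import datetime, timedelta
from typing import List, Set

def parse_field(field: str, low: int, high: int) -> Set[int]:
    """Expand one cron field: '*', single numbers, 'a-b' ranges, comma lists."""
    values: Set[int] = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values

def next_runs(expr: str, start: datetime, count: int) -> List[datetime]:
    """Return the next `count` matching minutes strictly after `start`.

    Note: loops forever if the expression can never match (e.g. February 31).
    """
    minute_f, hour_f, dom_f, month_f, dow_f = expr.split()
    minutes = parse_field(minute_f, 0, 59)
    hours = parse_field(hour_f, 0, 23)
    days = parse_field(dom_f, 1, 31)
    months = parse_field(month_f, 1, 12)
    weekdays = parse_field(dow_f, 0, 6)  # cron convention: 0 = Sunday

    runs: List[datetime] = []
    t = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    while len(runs) < count:
        cron_dow = (t.weekday() + 1) % 7  # Python uses Monday = 0
        if (t.minute in minutes and t.hour in hours and t.day in days
                and t.month in months and cron_dow in weekdays):
            runs.append(t)
        t += timedelta(minutes=1)
    return runs

# Starting on a Saturday, the first three runs fall on Mon, Tue, and Wed at 09:00.
for run in next_runs("0 9 * * 1-5", datetime(2024, 1, 6, 10, 0), 3):
    print(run.strftime("%a %Y-%m-%d %H:%M"))
```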


Tool Reference

Tools AI can use for browser tasks:

Tool                Function
browser_navigate    Open a URL
browser_snapshot    Screenshot the current page
browser_action      Click, type, scroll, and other interactions
browser_upload      Upload a file to a page

Important Notes

  • Browser tools run in a sandboxed mode and won't affect your everyday browser
  • Don't use them to log into websites with sensitive accounts (AI will see screenshot contents)
  • Complex SPA pages (React/Vue apps) may need extra time to load — you can mention this in your instructions

FAQ

Do I need to install ChromeDriver or Selenium separately?

No. OpenClaw uses the Playwright engine to directly control Chrome/Chromium — no manual ChromeDriver installation needed. Just make sure Chrome or Chromium is installed on your system; everything else is handled by OpenClaw automatically.

How do I debug a failed browser task?

Add "take a screenshot for me" to your instructions, and the AI will take screenshots at key steps to show you what's happening. You can also follow the gateway logs and filter for detailed browser tool call records:

tail -f /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep browser
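
If you would rather inspect the log after the fact than follow it live, the same filter can be sketched in Python. The log location and file naming follow the command in this answer. The helper names are hypothetical, and the filter simply keeps lines containing "browser", as grep does.

```python
from datetime import date
from pathlib import Path
from typing import Iterable, List, Optional

def log_path_for(day: Optional[date] = None) -> Path:
    """Build the dated gateway log path, defaulting to today's file."""
    day = day or date.today()
    return Path("/tmp/openclaw") / f"openclaw-{day:%Y-%m-%d}.log"

def browser_lines(lines: Iterable[str]) -> List[str]:
    """Keep only lines that mention 'browser', without trailing newlines."""
    return [line.rstrip("\n") for line in lines if "browser" in line]

if __name__ == "__main__":
    path = log_path_for()
    if path.exists():
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in browser_lines(handle):
                print(line)
    else:
        print(f"No log file at {path}")
```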

Can it access websites that require login?

Yes, but be mindful of security: AI will see all content in page screenshots, including username/password input fields and private data. Only use login automation on websites you control — don't use it for banking, payment, or other sensitive accounts.

Can it handle React/Vue SPA pages with dynamic loading?

Yes, but you need to specify wait timings in your instructions, such as "wait 3 seconds after the page opens before taking a screenshot" or "wait until the page finishes loading before extracting data." For pages with scroll-to-load content, you can tell AI "scroll down 3 times then take a screenshot."
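
The two styles of instruction differ in the same way a fixed sleep differs from a conditional wait. The sketch below is a generic polling helper that illustrates the second style: it checks a condition repeatedly until it holds or a timeout passes. It is not how OpenClaw is known to implement waiting, and `page_is_ready` is a simulated stand-in for a real readiness check.

```python
import time
from typing import Callable

def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.5) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Simulated page that reports "ready" on the third check.
checks = {"count": 0}

def page_is_ready() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3

if wait_until(page_is_ready, timeout=2.0, interval=0.01):
    print("Page ready, safe to take the screenshot")
else:
    print("Timed out waiting for the page")
```

A conditional wait returns as soon as the page is ready, while a fixed "wait 3 seconds" may be too short on a slow connection or needlessly long on a fast one.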

Will browser automation affect my regular Chrome?

No. OpenClaw launches an independent Chromium instance (in headless mode) that is completely isolated from your everyday browser — cookies and account state don't cross over.


Next Steps

  • Lesson 05 — Configure multiple models, automatically fall back from expensive Claude to affordable MiniMax
