Lesson 04 — Browser Automation: Let AI Control Web Pages for You
Goal: Use OpenClaw's built-in browser tools to let AI automatically browse websites, take screenshots, extract information, and fill out forms.
How It Works
OpenClaw integrates a dedicated Chrome/Chromium instance. AI can use tool calls to:
- Navigate to a specified URL
- Take screenshots and analyze pages
- Click elements and fill out forms
- Extract structured data
Your instruction → AI → browser_navigate / browser_action → Chrome → Screenshot/Result → AI analysis → Reply
Prerequisites
Make sure Chrome or Chromium is installed:
# macOS
brew install --cask google-chrome
# Or use Chromium
brew install --cask chromium
Basic Usage: Let AI Scrape Web Information
Just describe the task in WebChat or Telegram:
Example 1: Check the Weather
Open https://wttr.in/Shanghai, take a screenshot, and tell me Shanghai's weather today
Example 2: Scrape GitHub Repository Info
Open https://github.com/trending and list today's 5 hottest GitHub projects, including language and star count
Example 3: Read Online Documentation
Open https://docs.python.org/3/library/asyncio.html and summarize the 5 most commonly used asyncio functions
Advanced: Automated Form Filling
Open https://httpbin.org/forms/post
Fill in the following:
- custname: John Doe
- custtel: 555-0100
- custemail: johndoe@example.com
- comments: This is a test order
Then take a screenshot so I can see the result
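To judge whether the AI filled the form correctly, it helps to know what a correct submission looks like on the wire. The sketch below encodes the same four fields the way a browser encodes a plain HTML form POST. It assumes the form posts to https://httpbin.org/post, which echoes the submitted fields back as JSON. The names encode_form and submit_form are illustrative helpers, not part of OpenClaw.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Field names and values copied from the instruction above.
ORDER_FIELDS = {
    "custname": "John Doe",
    "custtel": "555-0100",
    "custemail": "johndoe@example.com",
    "comments": "This is a test order",
}


def encode_form(fields):
    """Encode fields as application/x-www-form-urlencoded bytes."""
    return urlencode(fields).encode("ascii")


def submit_form(fields, url="https://httpbin.org/post"):
    """POST the fields and return the 'form' dict echoed by the server.

    Assumption: the form's action resolves to /post on the same host.
    Needs network access, so it is not called at import time.
    """
    request = Request(url, data=encode_form(fields), method="POST")
    with urlopen(request, timeout=10) as response:
        return json.load(response)["form"]


print(encode_form(ORDER_FIELDS).decode("ascii"))
```

Calling submit_form(ORDER_FIELDS) on a machine with network access should return the same four values, which you can compare against what appears in the AI's screenshot.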
Advanced: Bulk Data Extraction
Combine the browser tools with a custom Skill to build a reusable data scraper:
~/.openclaw/workspace/skills/data-scraper/SKILL.md:
# Web Data Scraper
The user will provide a URL and the data fields they need extracted. You should:
1. Use browser_navigate to open the page
2. Use browser_snapshot to get the page content
3. Extract structured data as requested by the user
4. Output in JSON or Markdown table format
Notes:
- If the page requires scrolling, take multiple screenshots
- If there's pagination, ask the user whether to scrape additional pages
- Notify the user if you encounter a login wall
Usage:
/data-scraper
URL: https://news.ycombinator.com
Extract: title, link, score, number of comments (first 10 items)
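Step 4 of the skill asks for JSON or Markdown table output. If you post-process the results yourself, the conversion could look like the sketch below. The records are placeholders rather than real Hacker News data, and the function names are illustrative.

```python
import json


def to_markdown_table(rows, columns):
    """Render a list of dicts as a Markdown table in the given column order."""

    def cell(value):
        # Escape pipes so a title containing '|' cannot break the row.
        return str(value).replace("|", "\\|")

    header = "| " + " | ".join(columns) + " |"
    divider = "|" + "|".join("---" for _ in columns) + "|"
    body = [
        "| " + " | ".join(cell(row.get(col, "")) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


def to_json(rows, columns):
    """Keep only the requested fields, in the requested order."""
    return json.dumps([{col: row.get(col) for col in columns} for row in rows], indent=2)


# Placeholder records for illustration only.
sample = [
    {"title": "Example story A", "link": "https://example.com/a", "score": 120, "comments": 45},
    {"title": "Example | story B", "link": "https://example.com/b", "score": 87, "comments": 12},
]
columns = ["title", "link", "score", "comments"]

print(to_markdown_table(sample, columns))
print(to_json(sample, ["title", "score"]))
```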
Advanced: Monitor Page Changes
Combined with cron scheduled tasks (which require the cron channel to be configured), the browser tools can check web pages periodically:
{
"channels": {
"cron": {
"enabled": true,
"jobs": [
{
"cron": "0 9 * * 1-5",
"message": "Open https://github.com/trending, take a screenshot, summarize today's trending projects and send it to me",
"channel": "telegram"
}
]
}
}
}
This way, every weekday at 9 AM, AI will automatically scrape GitHub Trending and send you the results via Telegram.
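To check that an expression like "0 9 * * 1-5" fires when you expect, you can evaluate it against a few timestamps. The sketch below is a simplified matcher for five-field cron expressions and assumes standard cron semantics (Sunday is 0). Whether OpenClaw's scheduler supports the same syntax, and which timezone it uses, is not stated in this lesson, so treat this as a way to reason about the expression rather than a copy of the scheduler.

```python
from datetime import datetime


def _expand(field, low, high):
    """Expand one cron field ('*', 'a', 'a-b', or a comma list) into a set of ints."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


def cron_matches(expr, when):
    """Return True if `when` falls on a minute selected by `expr`.

    Simplifications: no step values (*/5), no names (MON), and all five
    fields must match. Standard cron ORs day-of-month and day-of-week
    when both are restricted; this sketch does not.
    """
    minute, hour, dom, month, dow = expr.split()
    # Python: Monday=0..Sunday=6. Cron: Sunday=0..Saturday=6.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        when.minute in _expand(minute, 0, 59)
        and when.hour in _expand(hour, 0, 23)
        and when.day in _expand(dom, 1, 31)
        and when.month in _expand(month, 1, 12)
        and cron_weekday in _expand(dow, 0, 6)
    )


# 2024-01-01 was a Monday, 2024-01-06 a Saturday.
print(cron_matches("0 9 * * 1-5", datetime(2024, 1, 1, 9, 0)))
print(cron_matches("0 9 * * 1-5", datetime(2024, 1, 6, 9, 0)))
```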
Tool Reference
Tools AI can use for browser tasks:
| Tool | Function |
|---|---|
| browser_navigate | Open a URL |
| browser_snapshot | Screenshot the current page |
| browser_action | Click, type, scroll, and other interactions |
| browser_upload | Upload a file to a page |
Important Notes
- Browser tools run in a sandboxed mode and won't affect your everyday browser
- Don't use them to log into websites with sensitive accounts (AI will see screenshot contents)
- Complex SPA pages (React/Vue apps) may need extra time to load — you can mention this in your instructions
FAQ
Do I need to install ChromeDriver or Selenium separately?
No. OpenClaw uses the Playwright engine to directly control Chrome/Chromium — no manual ChromeDriver installation needed. Just make sure Chrome or Chromium is installed on your system; everything else is handled by OpenClaw automatically.
How do I debug a failed browser task?
Add "take a screenshot for me" to your instructions, and AI will take screenshots at key steps to show you what's happening. You can also check the gateway logs: tail -f /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep browser to see detailed tool call records.
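If you would rather filter the log from a script than from the shell, the same path and filter can be written in Python. This is a minimal sketch that mirrors the tail and grep command above without the live follow. The log line format is not documented here, so the filter only checks for the substring "browser".

```python
from datetime import date


def log_path(day=None):
    """Build the gateway log path for a day, using the same date format as the shell command."""
    day = day or date.today()
    return f"/tmp/openclaw/openclaw-{day:%Y-%m-%d}.log"


def browser_lines(lines, keyword="browser"):
    """Yield lines containing the keyword, like `grep browser`."""
    for line in lines:
        if keyword in line:
            yield line.rstrip("\n")


if __name__ == "__main__":
    try:
        with open(log_path(), encoding="utf-8", errors="replace") as handle:
            for line in browser_lines(handle):
                print(line)
    except OSError:
        print(f"Could not read {log_path()}")
```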
Can it access websites that require login?
Yes, but be mindful of security: AI will see all content in page screenshots, including username/password input fields and private data. Only use login automation on websites you control — don't use it for banking, payment, or other sensitive accounts.
Can it handle React/Vue SPA pages with dynamic loading?
Yes, but you need to specify wait timings in your instructions, such as "wait 3 seconds after the page opens before taking a screenshot" or "wait until the page finishes loading before extracting data." For pages with scroll-to-load content, you can tell AI "scroll down 3 times then take a screenshot."
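The two kinds of instruction above correspond to two waiting strategies: a fixed delay ("wait 3 seconds") and a condition-based wait ("wait until the page finishes loading"). The generic sketch below shows the difference. It does not describe how OpenClaw waits internally, and wait_until is a hypothetical helper.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or None if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in for "has the page finished loading?": becomes true on the third check.
checks = []


def page_ready():
    checks.append(1)
    return len(checks) >= 3


print(wait_until(page_ready, timeout=2, interval=0.01))
```

A condition-based wait returns as soon as the page is ready and gives up after the timeout, while a fixed delay always costs the full time and can still be too short on a slow connection.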
Will browser automation affect my regular Chrome?
No. OpenClaw launches an independent Chrome/Chromium instance (in headless mode) that is completely isolated from your everyday browser, so cookies and account state don't cross over.
Next Steps
- Lesson 05 — Configure multiple models, automatically fall back from expensive Claude to affordable MiniMax