💖 Browser4: a lightning-fast, coroutine-safe browser engine for your AI 💖
- 👽 Browser Agents — Fully autonomous browser agents that reason, plan, and execute end-to-end tasks.
- 🤖 Browser Automation — High-performance automation for workflows, navigation, and data extraction.
- ⚙️ Machine Learning Agent — Learns field structures across complex pages without consuming tokens.
- ⚡ Extreme Performance — Fully coroutine-safe; supports 100k ~ 200k complex page visits per machine per day.
- 🧬 Data Extraction — Hybrid of LLM, ML, and selectors for clean data across chaotic pages.
```kotlin
// Give your Agent a mission, not just a script.
val agent = AgenticContexts.getOrCreateAgent()

// The Agent plans, navigates, and executes using Browser4 as its hands and eyes.
val result = agent.run("""
    1. Go to amazon.com
    2. Search for '4k monitors'
    3. Analyze the top 5 results for price/performance ratio
    4. Return the best option as JSON
""")
```

📺 Bilibili: https://www.bilibili.com/video/BV1fXUzBFE4L
Prerequisites: Java 17+

1. Clone the repository

```shell
git clone https://github.com/platonai/browser4.git
cd browser4
```

2. Configure your LLM API key

Edit `application.properties` and add your API key.

3. Build the project

```shell
./mvnw -DskipTests
```

4. Run examples

```shell
./mvnw -pl examples/browser4-examples exec:java -D"exec.mainClass=ai.platon.pulsar.examples.agent.Browser4AgentKt"
```

If you have encoding problems on Windows:

```shell
./bin/run-examples.ps1
```
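For step 2, the entry in `application.properties` might look like the sketch below. The property names here are placeholders, not confirmed Browser4 settings — check the `application.properties` template shipped with the repository for the exact keys your version expects:

```properties
# Hypothetical property names -- consult the project's application.properties
# template for the exact keys used by your Browser4 version and LLM provider.
llm.provider=deepseek
llm.apiKey=sk-your-api-key-here
```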
Explore and run examples in the `browser4-examples` module to see Browser4 in action.
For Docker deployment, see our Docker Hub repository.
Windows Users: You can also build Browser4 as a standalone Windows installer. See the Windows Installer Guide for details.
Autonomous agents that understand natural language instructions and execute complex browser workflows.
```kotlin
val agent = AgenticContexts.getOrCreateAgent()
val task = """
    1. go to amazon.com
    2. search for pens to draw on whiteboards
    3. compare the first 4 results
    4. write the result to a markdown file
"""
agent.run(task)
```

Low-level browser automation & data extraction with fine-grained control.
Features:
- Both live DOM access and offline snapshot parsing
- Direct and full Chrome DevTools Protocol (CDP) control, coroutine safe
- Precise element interactions (click, scroll, input)
- Fast data extraction using CSS selectors/XPath
```kotlin
val session = AgenticContexts.getOrCreateSession()
val agent = session.companionAgent
val driver = session.getOrCreateBoundDriver()

// Load the initial page referenced by your input URL
var page = session.open(url)

// Drive the browser with natural-language instructions
agent.act("scroll to the comment section")

// Read the first matching comment node directly from the live DOM
val content = driver.selectFirstTextOrNull("#comments")

// Snapshot the page to an in-memory document for offline parsing
var document = session.parse(page)

// Map CSS selectors to structured fields in one call
var fields = session.extract(document, mapOf("title" to "#title"))

// Let the companion agent execute a multi-step navigation/search flow
val history = agent.run(
    "Go to amazon.com, search for 'smart phone', open the product page with the highest ratings"
)

// Capture the updated browser state back into a PageSnapshot
page = session.capture(driver)
document = session.parse(page)

// Extract additional attributes from the captured snapshot
fields = session.extract(document, mapOf("ratings" to "#ratings"))
```

Ideal for high-complexity data-extraction pipelines with dozens of entities and several hundred fields per entity.
Benefits:
- Extract 10x more entities and 100x more fields compared to traditional methods
- Combine LLM intelligence with precise CSS selectors/XPath
- SQL-like syntax for familiar data queries
```kotlin
val context = AgenticContexts.create()
val sql = """
    select
        llm_extract(dom, 'product name, price, ratings') as llm_extracted_data,
        dom_first_text(dom, '#productTitle') as title,
        dom_first_text(dom, '#bylineInfo') as brand,
        dom_first_text(dom, '#price tr td:matches(^Price) ~ td, #corePrice_desktop tr td:matches(^Price) ~ td') as price,
        dom_first_text(dom, '#acrCustomerReviewText') as ratings,
        str_first_float(dom_first_text(dom, '#reviewsMedley .AverageCustomerReviews span:contains(out of)'), 0.0) as score
    from load_and_select('https://www.amazon.com/dp/B08PP5MSVB -i 1s -njr 3', 'body');
"""
val rs = context.executeQuery(sql)
println(ResultSetFormatter(rs, withHeader = true))
```

Example code:
- X-SQL to scrape 100+ fields from an Amazon product page
- X-SQLs to crawl all types of Amazon webpages
Achieve extreme throughput with parallel browser control and smart resource optimization.
Performance:
- 100k ~ 200k complex page visits per machine per day
- Concurrent session management
- Resource blocking for faster page loads
```kotlin
// Acquire a session, as in the automation example above
val session = AgenticContexts.getOrCreateSession()

val args = "-refresh -dropContent -interactLevel fastest"
val blockingUrls = listOf("*.png", "*.jpg")

val links = LinkExtractors.fromResource("urls.txt")
    .map { ListenableHyperlink(it, "", args = args) }
    .onEach {
        it.eventHandlers.browseEventHandlers.onWillNavigate.addLast { page, driver ->
            driver.addBlockedURLs(blockingUrls)
        }
    }

session.submitAll(links)
```

📺 Bilibili: https://www.bilibili.com/video/BV1kM2rYrEFC
Automatic, large-scale, high-precision field discovery and extraction powered by self-/unsupervised machine learning — no LLM API calls, no tokens, deterministic and fast.
What it does:
- Learns every extractable field on item/detail pages (often dozens to hundreds) with high precision.
- Will be open-sourced when Browser4 reaches 10K stars on GitHub.
Why not just LLMs?
- LLM extraction adds latency, cost, and token limits.
- ML-based auto extraction is local, reproducible, and scalable to 100k+ ~ 200k pages/day.
- You can still combine both: use Auto Extraction for structured baseline + LLM for semantic enrichment.
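The last bullet's hybrid approach can be sketched as a simple merge policy: deterministic selector/ML values form the baseline, and LLM output only fills the gaps. This is an illustrative, library-free sketch — the function and field names are ours, not a Browser4 API:

```kotlin
// Sketch of the hybrid strategy: deterministic extraction supplies the
// baseline, and LLM output is used only where the baseline has no value.
// (Illustrative only; not part of the Browser4 API.)
fun mergeFields(
    baseline: Map<String, String?>,   // from selectors / ML auto extraction
    enrichment: Map<String, String?>  // from an LLM pass
): Map<String, String> {
    val merged = mutableMapOf<String, String>()
    (baseline.keys + enrichment.keys).forEach { key ->
        // Deterministic values win; fall back to the LLM value when blank.
        val value = baseline[key]?.takeIf { it.isNotBlank() } ?: enrichment[key]
        if (!value.isNullOrBlank()) merged[key] = value
    }
    return merged
}

fun main() {
    val baseline = mapOf("title" to "4K Monitor 27\"", "price" to "\$299.99", "summary" to null)
    val enrichment = mapOf("summary" to "A budget 27-inch 4K display", "price" to "\$305")
    println(mergeFields(baseline, enrichment))
}
```

This keeps the pipeline reproducible: re-running the deterministic pass never changes results, and LLM variance is confined to fields the selectors could not resolve.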
Quick Commands (PulsarRPAPro):
```shell
# NOTE: MongoDB required
curl -L -o PulsarRPAPro.jar https://github.com/platonai/PulsarRPAPro/releases/download/v4.6.0/PulsarRPAPro.jar
```

Integration Status:
- Available today via the companion project PulsarRPAPro.
- Native Browser4 API exposure is planned; follow releases for updates.
Key Advantages:
- High precision: >95% fields discovered; majority with >99% accuracy (indicative on tested domains).
- Resilient to selector churn & HTML noise.
- Zero external dependency (no API key) → cost-efficient at scale.
- Explainable: generated selectors & SQL are transparent and auditable.
👽 Extract data with machine learning agents:
(Coming soon: richer in-repo examples and direct API hooks.)
| Module | Description |
|---|---|
| `pulsar-core` | Core engine: sessions, scheduling, DOM, browser control |
| `pulsar-agentic` | Agent implementation, MCP, and skill registration |
| `pulsar-rest` | Spring Boot REST layer & command endpoints |
| `browser4-spa` | Single Page Application for browser agents |
| `browser4-agents` | Agent & crawler orchestration with product packaging |
| `sdks` | Kotlin/Python SDKs plus tests and examples |
| `examples` | Runnable examples and demos |
| `pulsar-tests` | E2E, heavy integration, and scenario tests |
SDKs are available under `sdks/`; current language support includes Kotlin and Python.
Status legend: [Available] = in the repo, [Experimental] = in active iteration, [Planned] = not yet in the repo, [Indicative] = performance target.
- [Available] Problem-solving autonomous browser agents
- [Available] Parallel agent sessions
- [Experimental] LLM-assisted page understanding & extraction
- [Available] Workflow-based browser actions
- [Available] Precise coroutine-safe control (scroll, click, extract)
- [Available] Flexible event handlers & lifecycle management
- [Available] One-line data extraction commands
- [Available] X-SQL extended query language for DOM/content
- [Experimental] Structured + unstructured hybrid extraction (LLM & ML & selectors)
- [Available] High-efficiency parallel page rendering
- [Available] Block-resistant design & smart retries
- [Indicative] 100,000+ complex pages/day on modest hardware
- [Experimental] Advanced anti-bot techniques
- [Available] Proxy rotation via `PROXY_ROTATION_URL`
- [Available] Resilient scheduling & quality assurance
- [Available] Simple API integration (REST, native, text commands)
- [Available] Rich configuration layering
- [Available] Clear structured logging & metrics
- [Available] Local FS & MongoDB support (extensible)
- [Available] Comprehensive logs & transparency
Join our community for support, feedback, and collaboration!
- GitHub Discussions: Engage with developers and users.
- Issue Tracker: Report bugs or request features.
- Social Media: Follow us for updates and news.
We welcome contributions! See CONTRIBUTING.md for details.
Comprehensive documentation is available in the docs/ directory and on our GitHub Pages site.
Browser4 supports proxy rotation and management to access geo-restricted content.
Quick Start:
- Obtain a list of proxy URLs (e.g., from a proxy provider).
- Configure `PROXY_ROTATION_URL` in `application.properties`.
- Use the `rotateProxies` command in your agent scripts.
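The configuration step might look like the line below; `PROXY_ROTATION_URL` is the key named above, while the endpoint value is a placeholder for your provider's rotation URL:

```properties
# PROXY_ROTATION_URL should point at an endpoint that returns fresh proxies.
# The URL below is a placeholder -- substitute your provider's rotation endpoint.
PROXY_ROTATION_URL=https://your-proxy-provider.example.com/rotate
```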
Example:
```kotlin
agent.run("""
    1. Go to a blocked website
    2. If blocked, rotate proxy and retry
""")
```

Note: Respect website terms of service and robots.txt rules when scraping.
Apache 2.0 License. See LICENSE for details.
- 💬 WeChat: galaxyeye
- 🌐 Weibo: galaxyeye
- 📧 Email: galaxyeye@live.cn, ivincent.zhang@gmail.com
- 🐦 Twitter: galaxyeye8
- 🌍 Website: browser4.io



