Summary
This is a feature parity issue — Crawlee JS already has the @crawlee/stagehand package that integrates Stagehand (the AI Browser Automation Framework by Browserbase) with Crawlee.
Note: This is specifically about browser-based AI automation (Stagehand + Playwright), not about AI/LLM-based HTML parsing for HTTP clients — that is tracked separately in #1593.
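Crawlee JS references
- crawlee#3331 (feat: add @crawlee/stagehand package for AI-powered browser automation)
- Stagehand guide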
What was implemented in Crawlee JS
- StagehandCrawler extending BrowserCrawler
- Stagehand BrowserPlugin wrapping Stagehand for BrowserPool integration
- AI methods on the page object:
  - page.act(): natural language browser interactions
  - page.extract(): structured data extraction with Zod schemas
  - page.observe(): get available page actions
  - page.agent(): multi-step autonomous agents
- Full anti-blocking support via BrowserPool integration
- Browser fingerprinting automatically applied
- Support for LOCAL and BROWSERBASE environments
- Session-based fingerprint caching
- Automatic proxy rotation on blocking
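To make the "AI methods on the page object" idea concrete, here is a minimal Python sketch of one way such a page wrapper could look. It is not the Crawlee JS implementation and not a proposed final API: `AIPage`, `FakePage`, `FakeAIBackend`, and `Article` are hypothetical names, the backend returns canned values instead of calling an LLM, and a dataclass stands in for the Zod schema used on the JS side.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class Article:
    """Hypothetical extraction schema (plays the role of a Zod schema)."""
    title: str


class FakePage:
    """Stand-in for a Playwright page; hypothetical, for illustration only."""

    async def goto(self, url: str) -> str:
        return f"navigated to {url}"


class FakeAIBackend:
    """Stand-in for a Stagehand-like client; returns canned results."""

    async def act(self, page: Any, instruction: str) -> dict:
        return {"action": instruction, "success": True}

    async def extract(self, page: Any, instruction: str, schema: type) -> Any:
        # A real backend would query an LLM; this sketch builds a fixed record.
        return schema(title="Example Domain")

    async def observe(self, page: Any) -> list:
        return ["click 'More information...'"]


class AIPage:
    """Adds AI methods to a page and forwards every other attribute to it."""

    def __init__(self, page: Any, backend: Any) -> None:
        self._page = page
        self._backend = backend

    async def act(self, instruction: str) -> dict:
        return await self._backend.act(self._page, instruction)

    async def extract(self, instruction: str, schema: type) -> Any:
        return await self._backend.extract(self._page, instruction, schema)

    async def observe(self) -> list:
        return await self._backend.observe(self._page)

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails, so regular page methods
        # such as goto() still work on the wrapper.
        return getattr(self._page, name)


async def main() -> None:
    page = AIPage(FakePage(), FakeAIBackend())
    print(await page.goto("https://example.com"))
    print(await page.act("click the first link"))
    print(await page.extract("get the page title", Article))


if __name__ == "__main__":
    asyncio.run(main())
```

Delegating unknown attributes to the wrapped page keeps existing request handlers working unchanged while the AI methods are layered on top. Whether the Python port should wrap the page or extend it is an open design question.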
Current state in Crawlee Python
In #1278 we explored Stagehand integration and added a documentation guide showing how to use Stagehand with PlaywrightCrawler. However, this is just a guide — there is no dedicated StagehandCrawler or Stagehand browser plugin in the codebase itself.
Goal
To align with the JS implementation, we should have a dedicated StagehandCrawler and the corresponding Stagehand browser plugin directly in the Crawlee Python codebase — extending BrowserCrawler and integrating Stagehand through the BrowserPool / browser plugin system, rather than relying only on the documentation guide.
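As a rough illustration of the plugin-based integration described in the goal, the sketch below shows how a Stagehand-specific plugin might slot into a pool that only knows about a generic plugin interface. All class names carry a `Sketch` prefix to make clear they are hypothetical stand-ins, not the actual Crawlee Python classes; the real `BrowserPool` and browser plugin interfaces may differ, and no real browser is launched here.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Literal


@dataclass
class SketchBrowser:
    """Hypothetical browser handle that only counts the pages it opens."""
    env: str
    pages_opened: int = 0

    async def new_page(self) -> str:
        self.pages_opened += 1
        return f"{self.env}-page-{self.pages_opened}"


class SketchBrowserPlugin(ABC):
    """Hypothetical minimal plugin interface a pool could depend on."""

    @abstractmethod
    async def new_browser(self) -> SketchBrowser:
        ...


@dataclass
class SketchStagehandPlugin(SketchBrowserPlugin):
    """Hypothetical plugin; env mirrors the LOCAL / BROWSERBASE choice."""
    env: Literal["LOCAL", "BROWSERBASE"] = "LOCAL"

    async def new_browser(self) -> SketchBrowser:
        # A real plugin would start Stagehand here; this returns a stub.
        return SketchBrowser(env=self.env)


class SketchBrowserPool:
    """Hypothetical pool that lazily creates one browser per plugin."""

    def __init__(self, plugins: list) -> None:
        self._plugins = list(plugins)
        self._browsers: dict = {}

    async def new_page(self) -> str:
        plugin = self._plugins[0]  # A real pool would rotate between plugins.
        browser = self._browsers.get(id(plugin))
        if browser is None:
            browser = await plugin.new_browser()
            self._browsers[id(plugin)] = browser
        return await browser.new_page()


async def main() -> None:
    pool = SketchBrowserPool([SketchStagehandPlugin(env="LOCAL")])
    print(await pool.new_page())
    print(await pool.new_page())


if __name__ == "__main__":
    asyncio.run(main())
```

The point of this shape is that the pool never refers to Stagehand directly, so a dedicated crawler could reuse the existing pool machinery by swapping in a different plugin.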