|
4 weeks ago | |
---|---|---|
.. | ||
tests | 4 weeks ago | |
1_stagehand_wrapper_agent.py | 4 weeks ago | |
2_stagehand_tools_agent.py | 4 weeks ago | |
3_stagehand_mcp_agent.py | 4 weeks ago | |
4_stagehand_multi_agent_workflow.py | 4 weeks ago | |
README.md | 4 weeks ago | |
requirements.txt | 4 weeks ago |
README.md
Stagehand Browser Automation Integration for Swarms
This directory contains examples demonstrating how to integrate Stagehand, an AI-powered browser automation framework, with the Swarms multi-agent framework.
Overview
Stagehand provides natural language browser automation capabilities that can be seamlessly integrated into Swarms agents. This integration enables:
- 🌐 Natural Language Web Automation: Use simple commands like "click the submit button" or "extract product prices"
- 🤖 Multi-Agent Browser Workflows: Multiple agents can automate different websites simultaneously
- 🔧 Flexible Integration Options: Use as a wrapped agent, individual tools, or via MCP server
- 📊 Complex Automation Scenarios: E-commerce monitoring, competitive analysis, automated testing, and more
Examples
1. Stagehand Wrapper Agent (1_stagehand_wrapper_agent.py
)
The simplest integration - wraps Stagehand as a Swarms-compatible agent.
from examples.stagehand.stagehand_wrapper_agent import StagehandAgent
# Create a browser automation agent
browser_agent = StagehandAgent(
agent_name="WebScraperAgent",
model_name="gpt-4o-mini",
env="LOCAL", # or "BROWSERBASE" for cloud execution
)
# Use natural language to control the browser
result = browser_agent.run(
"Navigate to news.ycombinator.com and extract the top 5 story titles"
)
Features:
- Inherits from Swarms
Agent
base class - Automatic browser lifecycle management
- Natural language task interpretation
- Support for both local (Playwright) and cloud (Browserbase) execution
2. Stagehand as Tools (2_stagehand_tools_agent.py
)
Provides fine-grained control by exposing Stagehand methods as individual tools.
from swarms import Agent
from examples.stagehand.stagehand_tools_agent import (
NavigateTool, ActTool, ExtractTool, ObserveTool, ScreenshotTool
)
# Create agent with browser tools
browser_agent = Agent(
agent_name="BrowserAutomationAgent",
model_name="gpt-4o-mini",
tools=[
NavigateTool(),
ActTool(),
ExtractTool(),
ObserveTool(),
ScreenshotTool(),
],
)
# Agent can now use tools strategically
result = browser_agent.run(
"Go to google.com, search for 'Python tutorials', and extract the first 3 results"
)
Available Tools:
NavigateTool
: Navigate to URLsActTool
: Perform actions (click, type, scroll)ExtractTool
: Extract data from pagesObserveTool
: Find elements on pagesScreenshotTool
: Capture screenshotsCloseBrowserTool
: Clean up browser resources
3. Stagehand MCP Server (3_stagehand_mcp_agent.py
)
Integrates with Stagehand's Model Context Protocol (MCP) server for standardized tool access.
from examples.stagehand.stagehand_mcp_agent import StagehandMCPAgent
# Connect to Stagehand MCP server
mcp_agent = StagehandMCPAgent(
agent_name="WebResearchAgent",
mcp_server_url="http://localhost:3000/sse",
)
# Use MCP tools including multi-session management
result = mcp_agent.run("""
Create 3 browser sessions and:
1. Session 1: Check Python.org for latest version
2. Session 2: Check PyPI for trending packages
3. Session 3: Check GitHub Python trending repos
Compile a Python ecosystem status report.
""")
MCP Features:
- Automatic tool discovery
- Multi-session browser management
- Built-in screenshot resources
- Prompt templates for common tasks
4. Multi-Agent Workflows (4_stagehand_multi_agent_workflow.py
)
Demonstrates complex multi-agent browser automation scenarios.
from examples.stagehand.stagehand_multi_agent_workflow import (
create_price_comparison_workflow,
create_competitive_analysis_workflow,
create_automated_testing_workflow,
create_news_aggregation_workflow
)
# Price comparison across multiple e-commerce sites
price_workflow = create_price_comparison_workflow()
result = price_workflow.run(
"Compare prices for iPhone 15 Pro on Amazon and eBay"
)
# Competitive analysis of multiple companies
competitive_workflow = create_competitive_analysis_workflow()
result = competitive_workflow.run(
"Analyze OpenAI, Anthropic, and DeepMind websites and social media"
)
Workflow Examples:
- E-commerce Monitoring: Track prices across multiple sites
- Competitive Analysis: Research competitors' websites and social media
- Automated Testing: UI, form validation, and accessibility testing
- News Aggregation: Collect and analyze news from multiple sources
Setup
Prerequisites
- Install Swarms and Stagehand:
pip install swarms stagehand
- Set up environment variables:
# For local browser automation (using Playwright)
export OPENAI_API_KEY="your-openai-key"
# For cloud browser automation (using Browserbase)
export BROWSERBASE_API_KEY="your-browserbase-key"
export BROWSERBASE_PROJECT_ID="your-project-id"
- For MCP Server examples:
# Install and run the Stagehand MCP server
cd stagehand-mcp-server
npm install
npm run build
npm start
Use Cases
E-commerce Automation
- Price monitoring and comparison
- Inventory tracking
- Automated purchasing workflows
- Review aggregation
Research and Analysis
- Competitive intelligence gathering
- Market research automation
- Social media monitoring
- News and trend analysis
Quality Assurance
- Automated UI testing
- Cross-browser compatibility testing
- Form validation testing
- Accessibility compliance checking
Data Collection
- Web scraping at scale
- Real-time data monitoring
- Structured data extraction
- Screenshot documentation
Best Practices
- Resource Management: Always clean up browser instances when done
browser_agent.cleanup() # For wrapper agents
-
Error Handling: Stagehand includes self-healing capabilities, but wrap critical operations in try-except blocks
-
Parallel Execution: Use
ConcurrentWorkflow
for simultaneous browser automation across multiple sites -
Session Management: For complex multi-page workflows, use the MCP server's session management capabilities
-
Rate Limiting: Be respectful of websites - add delays between requests when necessary
Testing
Run the test suite to verify the integration:
pytest tests/stagehand/test_stagehand_integration.py -v
Troubleshooting
Common Issues
- Browser not starting: Ensure Playwright is properly installed
playwright install
-
MCP connection failed: Verify the MCP server is running on the correct port
-
Timeout errors: Increase timeout in StagehandConfig or agent initialization
Debug Mode
Enable verbose logging:
agent = StagehandAgent(
agent_name="DebugAgent",
verbose=True, # Enable detailed logging
)
Contributing
We welcome contributions! Please:
- Follow the existing code style
- Add tests for new features
- Update documentation
- Submit PRs with clear descriptions
License
These examples are provided under the same license as the Swarms framework. Stagehand is licensed separately - see Stagehand's repository for details.