Stagehand Browser Automation Integration for Swarms

This directory contains examples demonstrating how to integrate Stagehand, an AI-powered browser automation framework, with the Swarms multi-agent framework.

Overview

Stagehand drives a browser from natural language instructions, and Swarms agents can call it directly. This integration enables:

🌐 Natural Language Web Automation: Use simple commands like "click the submit button" or "extract product prices"
🤖 Multi-Agent Browser Workflows: Multiple agents can automate different websites simultaneously
🔧 Flexible Integration Options: Use as a wrapped agent, individual tools, or via MCP server
📊 Complex Automation Scenarios: E-commerce monitoring, competitive analysis, automated testing, and more

Examples

1. Stagehand Wrapper Agent (`1_stagehand_wrapper_agent.py`)

The simplest integration: it wraps Stagehand as a Swarms-compatible agent.

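# Adjust this import path if it does not resolve in your checkout;
# the example file itself is named 1_stagehand_wrapper_agent.py.
from examples.stagehand.stagehand_wrapper_agent import StagehandAgent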

# Create a browser automation agent
browser_agent = StagehandAgent(
    agent_name="WebScraperAgent",
    model_name="gpt-4o-mini",
    env="LOCAL",  # or "BROWSERBASE" for cloud execution
)

# Use natural language to control the browser
result = browser_agent.run(
    "Navigate to news.ycombinator.com and extract the top 5 story titles"
)

Features:

Inherits from Swarms Agent base class
Automatic browser lifecycle management
Natural language task interpretation
Support for both local (Playwright) and cloud (Browserbase) execution
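The example file names in this directory begin with a digit, and a plain `import` statement cannot name such a file. If the import paths shown in this README do not resolve in your checkout, one option is to load the file by path with the standard library. The following is a minimal sketch of that approach; `load_example_module` is a hypothetical helper, not part of Swarms or Stagehand.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_example_module(path: str, module_name: str) -> ModuleType:
    """Load a Python file as a module, even if its file name is not a valid identifier.

    `module_name` is the name the loaded module will report as `__name__`;
    it does not have to match the file name.
    """
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"No such example file: {file_path}")

    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {file_path}")

    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the file's top-level code
    return module


# Possible usage, assuming the file sits in the current directory:
# wrapper = load_example_module(
#     "1_stagehand_wrapper_agent.py", "stagehand_wrapper_agent"
# )
# StagehandAgent = wrapper.StagehandAgent
```

Loading by path runs the file's top-level code, so any demo code in the example that is not guarded by `if __name__ == "__main__":` will execute on load.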

2. Stagehand as Tools (`2_stagehand_tools_agent.py`)

Provides fine-grained control by exposing Stagehand methods as individual tools.

from swarms import Agent
from examples.stagehand.stagehand_tools_agent import (
    NavigateTool, ActTool, ExtractTool, ObserveTool, ScreenshotTool
)

# Create agent with browser tools
browser_agent = Agent(
    agent_name="BrowserAutomationAgent",
    model_name="gpt-4o-mini",
    tools=[
        NavigateTool(),
        ActTool(),
        ExtractTool(),
        ObserveTool(),
        ScreenshotTool(),
    ],
)

# The agent now decides which tools to call, and in what order, to complete the task
result = browser_agent.run(
    "Go to google.com, search for 'Python tutorials', and extract the first 3 results"
)

Available Tools:

NavigateTool: Navigate to URLs
ActTool: Perform actions (click, type, scroll)
ExtractTool: Extract data from pages
ObserveTool: Find elements on pages
ScreenshotTool: Capture screenshots
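CloseBrowserTool: Clean up browser resources (not imported in the example above; add it to the import and to the tools list if the agent should close the browser itself)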

3. Stagehand MCP Server (`3_stagehand_mcp_agent.py`)

Integrates with Stagehand's Model Context Protocol (MCP) server for standardized tool access.

from examples.stagehand.stagehand_mcp_agent import StagehandMCPAgent

# Connect to Stagehand MCP server
mcp_agent = StagehandMCPAgent(
    agent_name="WebResearchAgent",
    mcp_server_url="http://localhost:3000/mcp",
)

# Use MCP tools including multi-session management
result = mcp_agent.run("""
    Create 3 browser sessions and:
    1. Session 1: Check Python.org for latest version
    2. Session 2: Check PyPI for trending packages
    3. Session 3: Check GitHub Python trending repos
    Compile a Python ecosystem status report.
""")

MCP Features:

Automatic tool discovery
Multi-session browser management
Built-in screenshot resources
Prompt templates for common tasks

4. Multi-Agent Workflows (`4_stagehand_multi_agent_workflow.py`)

Demonstrates multi-agent browser automation workflows, each built by its own factory function.

from examples.stagehand.stagehand_multi_agent_workflow import (
    create_price_comparison_workflow,
    create_competitive_analysis_workflow,
    create_automated_testing_workflow,
    create_news_aggregation_workflow
)

# Price comparison across multiple e-commerce sites
price_workflow = create_price_comparison_workflow()
result = price_workflow.run(
    "Compare prices for iPhone 15 Pro on Amazon and eBay"
)

# Competitive analysis of multiple companies
competitive_workflow = create_competitive_analysis_workflow()
result = competitive_workflow.run(
    "Analyze OpenAI, Anthropic, and DeepMind websites and social media"
)

Workflow Examples:

E-commerce Monitoring: Track prices across multiple sites
Competitive Analysis: Research competitors' websites and social media
Automated Testing: UI, form validation, and accessibility testing
News Aggregation: Collect and analyze news from multiple sources
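How the workflow factories distribute work internally is not shown in this README. As a framework-independent sketch of the underlying idea (one browser task per site, run concurrently, with failures isolated per site), the standard library is enough. Here `run_per_site` and the `run_task` callable are hypothetical names; in practice `run_task` might wrap a call such as `browser_agent.run(...)`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_per_site(
    sites: List[str],
    run_task: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run one task per site concurrently and collect the outcome for each site.

    A failure on one site is recorded as an "ERROR: ..." string instead of
    aborting the other sites.
    """
    if not sites:
        return {}

    def safe_run(site: str) -> str:
        try:
            return run_task(site)
        except Exception as exc:  # keep one bad site from failing the batch
            return f"ERROR: {exc}"

    workers = max(1, min(max_workers, len(sites)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with `sites`
        results = list(pool.map(safe_run, sites))
    return dict(zip(sites, results))
```

Threads suit this case because each task spends most of its time waiting on the browser and the network. Keeping `max_workers` small also limits how many browser instances are open at once.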

Setup

Prerequisites

Install Swarms and Stagehand:

pip install swarms stagehand

Set up environment variables:

# For local browser automation (using Playwright)
export OPENAI_API_KEY="your-openai-key"

# For cloud browser automation (using Browserbase)
export BROWSERBASE_API_KEY="your-browserbase-key"
export BROWSERBASE_PROJECT_ID="your-project-id"

For MCP Server examples:

# Install and run the Stagehand MCP server
cd stagehand-mcp-server
npm install
npm run build
npm start
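The environment variables listed in this section can be checked before any browser is launched, which tends to give a clearer error than a failure deep inside a run. The sketch below is one way to do that. `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical names, and the grouping per mode simply mirrors the comments in the variable list above. It is an assumption that a cloud run using an OpenAI model also needs `OPENAI_API_KEY`; extend the mapping if so.

```python
import os
from typing import Dict, List, Mapping, Optional

# Assumption: grouping mirrors the README's variable list; adjust as needed.
REQUIRED_ENV_VARS: Dict[str, List[str]] = {
    "LOCAL": ["OPENAI_API_KEY"],
    "BROWSERBASE": ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"],
}


def missing_env_vars(
    env: str, environ: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the required variables that are unset or empty for the given mode."""
    environ = os.environ if environ is None else environ
    try:
        required = REQUIRED_ENV_VARS[env]
    except KeyError:
        raise ValueError(
            f"Unknown env {env!r}; expected one of {sorted(REQUIRED_ENV_VARS)}"
        ) from None
    return [name for name in required if not environ.get(name)]


# Possible usage before constructing an agent:
# missing = missing_env_vars("LOCAL")
# if missing:
#     raise SystemExit(f"Set these variables first: {', '.join(missing)}")
```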

Use Cases

E-commerce Automation

Price monitoring and comparison
Inventory tracking
Automated purchasing workflows
Review aggregation

Research and Analysis

Competitive intelligence gathering
Market research automation
Social media monitoring
News and trend analysis

Quality Assurance

Automated UI testing
Cross-browser compatibility testing
Form validation testing
Accessibility compliance checking

Data Collection

Web scraping at scale
Real-time data monitoring
Structured data extraction
Screenshot documentation

Best Practices

Resource Management: Always clean up browser instances when done

browser_agent.cleanup()  # For wrapper agents

Error Handling: Stagehand includes self-healing capabilities, but these do not cover every failure, so wrap critical operations in try-except blocks
Parallel Execution: Use ConcurrentWorkflow for simultaneous browser automation across multiple sites
Session Management: For complex multi-page workflows, use the MCP server's session management capabilities
Rate Limiting: Be respectful of websites and add delays between requests when necessary
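The resource management and error handling advice above can be combined so that `cleanup()` runs even when a task raises. The following is a minimal sketch, assuming only that the agent exposes the `run()` and `cleanup()` methods used elsewhere in this README; `managed_browser_agent` and `run_with_cleanup` are hypothetical helper names.

```python
from contextlib import contextmanager
from typing import Any, Iterator, Optional, Tuple


@contextmanager
def managed_browser_agent(agent: Any) -> Iterator[Any]:
    """Yield the agent and always call its cleanup() afterwards."""
    try:
        yield agent
    finally:
        agent.cleanup()  # runs on success and on failure


def run_with_cleanup(
    agent: Any, task: str
) -> Tuple[Optional[Any], Optional[Exception]]:
    """Run one task and release the browser.

    Returns (result, None) on success or (None, exception) on failure.
    """
    with managed_browser_agent(agent):
        try:
            return agent.run(task), None
        except Exception as exc:
            return None, exc


# Possible usage:
# result, error = run_with_cleanup(browser_agent, "Open example.com and read the title")
# if error is not None:
#     print(f"Task failed: {error}")
```

Returning the exception instead of re-raising it is a design choice that suits batch runs, where one failed page should not stop the rest. Re-raise inside the `except` block if the caller should fail fast.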

Testing

Run the test suite to verify the integration:

pytest tests/stagehand/test_stagehand_integration.py -v

Troubleshooting

Common Issues

Browser not starting: Make sure the Playwright browser binaries are installed

playwright install

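MCP connection failed: Verify that the MCP server is running and that the port matches the URL passed to the agent (the example above uses http://localhost:3000/mcp)
Timeout errors: Increase the timeout in StagehandConfig or when initializing the agent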
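For the MCP connection case, a quick first check is whether anything is listening on the host and port in the server URL. The sketch below uses only the standard library; `mcp_port_is_open` is a hypothetical helper. It confirms that a TCP connection can be opened and nothing more: it does not verify that the listener is a Stagehand MCP server or that the `/mcp` path is correct.

```python
import socket
from urllib.parse import urlparse


def mcp_port_is_open(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    # Fall back to the scheme's default port when the URL omits one
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or DNS failure
        return False


# Possible usage with the URL from the MCP example:
# if not mcp_port_is_open("http://localhost:3000/mcp"):
#     print("Nothing is listening on localhost:3000; is the MCP server running?")
```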

Debug Mode

Enable verbose logging:

agent = StagehandAgent(
    agent_name="DebugAgent",
    verbose=True,  # Enable detailed logging
)

Contributing

We welcome contributions! Please:

Follow the existing code style
Add tests for new features
Update documentation
Submit PRs with clear descriptions

License

These examples are provided under the same license as the Swarms framework. Stagehand is licensed separately; see Stagehand's repository for details.