You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
swarms/examples/tools/stagehand
Kye Gomez c890b0430b
[IMPROVE][AUTOSWARMBUILDER] [Fix][CouncilOfJudges]
4 weeks ago
..
tests [FORMAT][Stagehand cleanup into examples/tools/folder] 4 weeks ago
1_stagehand_wrapper_agent.py [FORMAT][Stagehand cleanup into examples/tools/folder] 4 weeks ago
2_stagehand_tools_agent.py [IMPROVE][AUTOSWARMBUILDER] [Fix][CouncilOfJudges] 4 weeks ago
3_stagehand_mcp_agent.py [FORMAT][Stagehand cleanup into examples/tools/folder] 4 weeks ago
4_stagehand_multi_agent_workflow.py [FORMAT][Stagehand cleanup into examples/tools/folder] 4 weeks ago
README.md [FORMAT][Stagehand cleanup into examples/tools/folder] 4 weeks ago
requirements.txt [FORMAT][Stagehand cleanup into examples/tools/folder] 4 weeks ago

README.md

Stagehand Browser Automation Integration for Swarms

This directory contains examples demonstrating how to integrate Stagehand, an AI-powered browser automation framework, with the Swarms multi-agent framework.

Overview

Stagehand provides natural language browser automation capabilities that can be seamlessly integrated into Swarms agents. This integration enables:

  • 🌐 Natural Language Web Automation: Use simple commands like "click the submit button" or "extract product prices"
  • 🤖 Multi-Agent Browser Workflows: Multiple agents can automate different websites simultaneously
  • 🔧 Flexible Integration Options: Use as a wrapped agent, individual tools, or via MCP server
  • 📊 Complex Automation Scenarios: E-commerce monitoring, competitive analysis, automated testing, and more

Examples

1. Stagehand Wrapper Agent (1_stagehand_wrapper_agent.py)

The simplest integration - wraps Stagehand as a Swarms-compatible agent.

from examples.stagehand.stagehand_wrapper_agent import StagehandAgent

# Create a browser automation agent
browser_agent = StagehandAgent(
    agent_name="WebScraperAgent",
    model_name="gpt-4o-mini",
    env="LOCAL",  # or "BROWSERBASE" for cloud execution
)

# Use natural language to control the browser
result = browser_agent.run(
    "Navigate to news.ycombinator.com and extract the top 5 story titles"
)

Features:

  • Inherits from Swarms Agent base class
  • Automatic browser lifecycle management
  • Natural language task interpretation
  • Support for both local (Playwright) and cloud (Browserbase) execution

2. Stagehand as Tools (2_stagehand_tools_agent.py)

Provides fine-grained control by exposing Stagehand methods as individual tools.

from swarms import Agent
from examples.stagehand.stagehand_tools_agent import (
    NavigateTool, ActTool, ExtractTool, ObserveTool, ScreenshotTool
)

# Create agent with browser tools
browser_agent = Agent(
    agent_name="BrowserAutomationAgent",
    model_name="gpt-4o-mini",
    tools=[
        NavigateTool(),
        ActTool(),
        ExtractTool(),
        ObserveTool(),
        ScreenshotTool(),
    ],
)

# Agent can now use tools strategically
result = browser_agent.run(
    "Go to google.com, search for 'Python tutorials', and extract the first 3 results"
)

Available Tools:

  • NavigateTool: Navigate to URLs
  • ActTool: Perform actions (click, type, scroll)
  • ExtractTool: Extract data from pages
  • ObserveTool: Find elements on pages
  • ScreenshotTool: Capture screenshots
  • CloseBrowserTool: Clean up browser resources

3. Stagehand MCP Server (3_stagehand_mcp_agent.py)

Integrates with Stagehand's Model Context Protocol (MCP) server for standardized tool access.

from examples.stagehand.stagehand_mcp_agent import StagehandMCPAgent

# Connect to Stagehand MCP server
mcp_agent = StagehandMCPAgent(
    agent_name="WebResearchAgent",
    mcp_server_url="http://localhost:3000/sse",
)

# Use MCP tools including multi-session management
result = mcp_agent.run("""
    Create 3 browser sessions and:
    1. Session 1: Check Python.org for latest version
    2. Session 2: Check PyPI for trending packages  
    3. Session 3: Check GitHub Python trending repos
    Compile a Python ecosystem status report.
""")

MCP Features:

  • Automatic tool discovery
  • Multi-session browser management
  • Built-in screenshot resources
  • Prompt templates for common tasks

4. Multi-Agent Workflows (4_stagehand_multi_agent_workflow.py)

Demonstrates complex multi-agent browser automation scenarios.

from examples.stagehand.stagehand_multi_agent_workflow import (
    create_price_comparison_workflow,
    create_competitive_analysis_workflow,
    create_automated_testing_workflow,
    create_news_aggregation_workflow
)

# Price comparison across multiple e-commerce sites
price_workflow = create_price_comparison_workflow()
result = price_workflow.run(
    "Compare prices for iPhone 15 Pro on Amazon and eBay"
)

# Competitive analysis of multiple companies
competitive_workflow = create_competitive_analysis_workflow()
result = competitive_workflow.run(
    "Analyze OpenAI, Anthropic, and DeepMind websites and social media"
)

Workflow Examples:

  • E-commerce Monitoring: Track prices across multiple sites
  • Competitive Analysis: Research competitors' websites and social media
  • Automated Testing: UI, form validation, and accessibility testing
  • News Aggregation: Collect and analyze news from multiple sources

Setup

Prerequisites

  1. Install Swarms and Stagehand:
pip install swarms stagehand
  1. Set up environment variables:
# For local browser automation (using Playwright)
export OPENAI_API_KEY="your-openai-key"

# For cloud browser automation (using Browserbase)
export BROWSERBASE_API_KEY="your-browserbase-key"
export BROWSERBASE_PROJECT_ID="your-project-id"
  1. For MCP Server examples:
# Install and run the Stagehand MCP server
cd stagehand-mcp-server
npm install
npm run build
npm start

Use Cases

E-commerce Automation

  • Price monitoring and comparison
  • Inventory tracking
  • Automated purchasing workflows
  • Review aggregation

Research and Analysis

  • Competitive intelligence gathering
  • Market research automation
  • Social media monitoring
  • News and trend analysis

Quality Assurance

  • Automated UI testing
  • Cross-browser compatibility testing
  • Form validation testing
  • Accessibility compliance checking

Data Collection

  • Web scraping at scale
  • Real-time data monitoring
  • Structured data extraction
  • Screenshot documentation

Best Practices

  1. Resource Management: Always clean up browser instances when done
browser_agent.cleanup()  # For wrapper agents
  1. Error Handling: Stagehand includes self-healing capabilities, but wrap critical operations in try-except blocks

  2. Parallel Execution: Use ConcurrentWorkflow for simultaneous browser automation across multiple sites

  3. Session Management: For complex multi-page workflows, use the MCP server's session management capabilities

  4. Rate Limiting: Be respectful of websites - add delays between requests when necessary

Testing

Run the test suite to verify the integration:

pytest tests/stagehand/test_stagehand_integration.py -v

Troubleshooting

Common Issues

  1. Browser not starting: Ensure Playwright is properly installed
playwright install
  1. MCP connection failed: Verify the MCP server is running on the correct port

  2. Timeout errors: Increase timeout in StagehandConfig or agent initialization

Debug Mode

Enable verbose logging:

agent = StagehandAgent(
    agent_name="DebugAgent",
    verbose=True,  # Enable detailed logging
)

Contributing

We welcome contributions! Please:

  1. Follow the existing code style
  2. Add tests for new features
  3. Update documentation
  4. Submit PRs with clear descriptions

License

These examples are provided under the same license as the Swarms framework. Stagehand is licensed separately - see Stagehand's repository for details.