Merge branch 'kyegomez:master' into master4

pull/1103/head
Commit 49cf716e2a by Aksh Parekh, committed via GitHub (3 months ago)
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@@ -18,12 +18,12 @@ If you're adding a new integration, please include:
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: kye@swarms.world
- DataLoaders / VectorStores / Retrievers: kye@swarms.world
- swarms.models: kye@swarms.world
- swarms.memory: kye@swarms.world
- swarms.structures: kye@swarms.world
If no one reviews your PR within a few days, feel free to email Kye at kye@swarms.world.
See the contribution guidelines for more information on how to write/run tests, lint, etc.: https://github.com/kyegomez/swarms

@@ -11,7 +11,7 @@ jobs:
    permissions: write-all
    runs-on: ubuntu-latest
    steps:
      - uses: actions/first-interaction@v3.1.0
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          issue-message:

@@ -60,7 +60,7 @@ representative at an online or offline event.
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
kye@swarms.world.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the

@@ -27,7 +27,7 @@
* * * * *
If you discover a security vulnerability in any of the above versions, please report it immediately to our security team by sending an email to kye@swarms.world. We take security vulnerabilities seriously and appreciate your efforts in disclosing them responsibly.
Please provide detailed information on the vulnerability, including steps to reproduce, potential impact, and any known mitigations. Our security team will acknowledge receipt of your report within 24 hours and provide regular updates on the progress of the investigation.

@@ -1,20 +1,18 @@
import json

from swarms import AutoSwarmBuilder

swarm = AutoSwarmBuilder(
    name="My Swarm",
    description="My Swarm Description",
    verbose=True,
    max_loops=1,
    execution_type="return-agents",
    model_name="gpt-4.1",
)

result = swarm.run(
    task="Build a swarm to write a research paper on the topic of AI"
)

print(json.dumps(result, indent=2))

@@ -0,0 +1,171 @@
# Medical AOP Example
A real-world demonstration of the Agent Orchestration Protocol (AOP) using medical agents deployed as MCP tools.
## Overview
This example showcases how to:
- Deploy multiple medical agents as MCP tools via AOP
- Use discovery tools for dynamic agent collaboration
- Execute real tool calls with structured schemas
- Integrate with keyless APIs for enhanced context
## Architecture
```mermaid
graph LR
A[Medical Agents] --> B[AOP MCP Server<br/>Port 8000]
B --> C[Client<br/>Cursor/Python]
B --> D[Discovery Tools]
B --> E[Tool Execution]
subgraph "Medical Agents"
F[Chief Medical Officer]
G[Virologist]
H[Internist]
I[Medical Coder]
J[Diagnostic Synthesizer]
end
A --> F
A --> G
A --> H
A --> I
A --> J
```
### Medical Agents
- **Chief Medical Officer**: Coordination, diagnosis, triage
- **Virologist**: Viral disease analysis and ICD-10 coding
- **Internist**: Internal medicine evaluation and HCC tagging
- **Medical Coder**: ICD-10 code assignment and compliance
- **Diagnostic Synthesizer**: Final report synthesis with confidence levels
## Files
| File | Description |
|------|-------------|
| `medical_aop/server.py` | AOP server exposing medical agents as MCP tools |
| `medical_aop/client.py` | Discovery client with real tool execution |
| `README.md` | This documentation |
## Usage
### 1. Start the AOP Server
```bash
python -m examples.aop_examples.medical_aop.server
```
### 2. Configure Cursor MCP Integration
Add to `~/.cursor/mcp.json`:
```json
{
"mcpServers": {
"Medical AOP": {
"type": "http",
"url": "http://localhost:8000/mcp"
}
}
}
```
### 3. Use in Cursor
Enable "Medical AOP" in Cursor's MCP settings, then:
#### Discover agents:
```
Call tool discover_agents with: {}
```
#### Execute medical coding:
```
Call tool Medical Coder with: {"task":"Patient: 45M, eGFR 59 mL/min/1.73 m²; non-African American. Provide ICD-10 suggestions and coding notes.","priority":"normal","include_images":false}
```
#### Review infection control:
```
Call tool Chief Medical Officer with: {"task":"Review current hospital infection control protocols in light of recent MRSA outbreak in ICU. Provide executive summary, policy adjustment recommendations, and estimated implementation costs.","priority":"high"}
```
### 4. Run Python Client
```bash
python -m examples.aop_examples.medical_aop.client
```
## Features
### Structured Schemas
- Custom input/output schemas with validation
- Priority levels (low/normal/high)
- Image processing support
- Confidence scoring
### Discovery Tools
| Tool | Description |
|------|-------------|
| `discover_agents` | List all available agents |
| `get_agent_details` | Detailed agent information |
| `search_agents` | Keyword-based agent search |
| `list_agents` | Simple agent name list |
### Real-world Integration
- Keyless API integration (disease.sh for epidemiology data)
- Structured medical coding workflows
- Executive-level policy recommendations
- Cost estimation and implementation timelines
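The keyless lookup can be done with a plain HTTP call. The sketch below is illustrative: the `/v3/covid-19/all` route is one of disease.sh's public endpoints and is an assumption here, not necessarily the exact call the example's agents make.

```python
import json
import urllib.request

# Public, keyless endpoint; the exact route used by the example may differ.
DISEASE_SH_URL = "https://disease.sh/v3/covid-19/all"

def fetch_global_stats(timeout: float = 10.0) -> dict:
    """Fetch global epidemiology figures from disease.sh (no API key needed)."""
    with urllib.request.urlopen(DISEASE_SH_URL, timeout=timeout) as resp:
        return json.load(resp)

def summarize_stats(stats: dict) -> str:
    """Condense the raw payload into a one-line context string for an agent task."""
    keys = ("cases", "deaths", "recovered")
    return ", ".join(f"{k}={stats.get(k, 'n/a')}" for k in keys)
```

The condensed string can then be appended to a tool-call `task` to give agents current epidemiological context.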
## Response Format
All tools return consistent JSON:
```json
{
"result": "Agent response text",
"success": true,
"error": null,
"confidence": 0.95,
"codes": ["N18.3", "Z51.11"]
}
```
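A caller can enforce this contract before trusting `result`; a minimal sketch against the format above:

```python
def check_tool_response(payload: dict) -> str:
    """Return the result text, or raise if the tool reported failure."""
    if not payload.get("success", False):
        raise RuntimeError(payload.get("error") or "tool call failed")
    return payload["result"]

# Example payload matching the format above
resp = {
    "result": "Agent response text",
    "success": True,
    "error": None,
    "confidence": 0.95,
    "codes": ["N18.3", "Z51.11"],
}
print(check_tool_response(resp))  # prints: Agent response text
```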
## Configuration
### Server Settings
| Setting | Value |
|---------|-------|
| Port | 8000 |
| Transport | streamable-http |
| Timeouts | 40-50 seconds per agent |
| Logging | INFO level with traceback enabled |
### Agent Metadata
Each agent includes:
- Tags for categorization
- Capabilities for matching
- Role classification
- Model configuration
## Best Practices
1. **Use structured inputs**: Leverage the custom schemas for better results
2. **Chain agents**: Pass results between agents for comprehensive analysis
3. **Monitor timeouts**: Adjust based on task complexity
4. **Validate responses**: Check the `success` field in all responses
5. **Use discovery**: Query available agents before hardcoding tool names
## Troubleshooting
| Issue | Solution |
|-------|----------|
| Connection refused | Ensure server is running on port 8000 |
| Tool not found | Use `discover_agents` to verify available tools |
| Timeout errors | Increase timeout values for complex tasks |
| Schema validation | Ensure input matches the defined JSON schema |
## References
- [AOP Reference](https://docs.swarms.world/en/latest/swarms/structs/aop/)
- [MCP Integration](https://docs.swarms.ai/examples/mcp-integration)
- [Protocol Overview](https://docs.swarms.world/en/latest/protocol/overview/)

@@ -1,12 +1,18 @@
# AOP Server Setup Example

This example demonstrates how to set up an Agent Orchestration Protocol (AOP) server with multiple specialized agents.

## Overview

The AOP server allows you to deploy multiple agents that can be discovered and called by other agents or clients in the network. This example shows how to create a server with specialized agents for different tasks.

## Code Example

```python
from swarms import Agent
from swarms.structs.aop import (
    AOP,
)

# Create specialized agents
research_agent = Agent(
@@ -94,15 +100,9 @@ financial_agent = Agent(
    Always provide accurate, well-reasoned financial analysis.""",
)

# Basic usage - individual agent addition
deployer = AOP("MyAgentServer", verbose=True, port=5932)

agents = [
    research_agent,
    analysis_agent,
@@ -111,216 +111,54 @@ agents = [
    financial_agent,
]

deployer.add_agents_batch(agents)

deployer.run()
```

## Key Components

### 1. Agent Creation

Each agent is created with:

- **agent_name**: Unique identifier for the agent
- **agent_description**: Brief description of the agent's capabilities
- **model_name**: The language model to use
- **system_prompt**: Detailed instructions defining the agent's role and behavior

### 2. AOP Server Setup

- **Server Name**: "MyAgentServer" - identifies your server
- **Port**: 5932 - the port where the server will run
- **Verbose**: True - enables detailed logging

### 3. Agent Registration

- **add_agents_batch()**: Registers multiple agents at once
- Agents become available for discovery and remote calls

## Usage

1. **Start the Server**: Run the script to start the AOP server
2. **Agent Discovery**: Other agents or clients can discover available agents
3. **Remote Calls**: Agents can be called remotely by their names

## Server Features

- **Agent Discovery**: Automatically registers agents for network discovery
- **Remote Execution**: Agents can be called from other network nodes
- **Load Balancing**: Distributes requests across available agents
- **Health Monitoring**: Tracks agent status and availability

## Configuration Options

- **Port**: Change the port number as needed
- **Verbose**: Set to False for reduced logging
- **Server Name**: Use a descriptive name for your server

## Next Steps

- See [AOP Cluster Example](aop_cluster_example.md) for multi-server setups
- Check [AOP Reference](../structs/aop.md) for advanced configuration options
- Explore agent communication patterns in the examples directory
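For clients that talk to the server without an MCP SDK, requests follow MCP's JSON-RPC framing. The sketch below builds such a body; the `tools/call` method name follows the MCP specification, and the `{"task": ...}` argument shape is an assumption based on the default tool schema, so adjust it to match the tool's actual input schema.

```python
import json

def build_tool_call(tool_name: str, task: str, call_id: int = 1) -> str:
    """Build an MCP-style JSON-RPC 2.0 body for invoking an agent tool.

    Assumes the standard 'tools/call' method and a {'task': ...} argument
    schema; a real MCP client library normally handles this framing for you.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": call_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": {"task": task}},
    })

body = build_tool_call("Research-Agent", "Summarize recent AI trends")
```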

@@ -34,13 +34,15 @@ Main class for deploying agents as tools in an MCP server.
|-----------|------|---------|-------------|
| `server_name` | `str` | `"AOP Cluster"` | Name for the MCP server |
| `description` | `str` | `"A cluster that enables you to deploy multiple agents as tools in an MCP server."` | Server description |
| `agents` | `any` | `None` | Optional list of agents to add initially |
| `port` | `int` | `8000` | Port for the MCP server |
| `transport` | `str` | `"streamable-http"` | Transport type for the MCP server |
| `verbose` | `bool` | `False` | Enable verbose logging |
| `traceback_enabled` | `bool` | `True` | Enable traceback logging for errors |
| `host` | `str` | `"localhost"` | Host to bind the server to |
| `log_level` | `str` | `"INFO"` | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| `*args` | `Any` | - | Additional positional arguments passed to FastMCP |
| `**kwargs` | `Any` | - | Additional keyword arguments passed to FastMCP |
#### Methods

@@ -120,6 +122,203 @@ Get information about the MCP server and registered tools.

**Returns:** `Dict[str, Any]` - Server information
##### _register_tool()
Register a single agent as an MCP tool (internal method).
| Parameter | Type | Description |
|-----------|------|-------------|
| `tool_name` | `str` | Name of the tool to register |
| `agent` | `AgentType` | The agent instance to register |
##### _execute_agent_with_timeout()
Execute an agent with a timeout, forwarding all `run` method parameters (internal method).
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `agent` | `AgentType` | Required | The agent to execute |
| `task` | `str` | Required | The task to execute |
| `timeout` | `int` | Required | Maximum execution time in seconds |
| `img` | `str` | `None` | Optional image to be processed by the agent |
| `imgs` | `List[str]` | `None` | Optional list of images to be processed by the agent |
| `correct_answer` | `str` | `None` | Optional correct answer for validation or comparison |
**Returns:** `str` - The agent's response
**Raises:** `TimeoutError` if execution exceeds timeout, `Exception` if agent execution fails
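The timeout behavior can be approximated with a worker thread. This is an illustrative sketch only; AOP's `_execute_agent_with_timeout` is internal and may be implemented differently.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_with_timeout(agent_fn, task: str, timeout: float) -> str:
    """Run a callable agent in a worker thread, enforcing a hard timeout.

    Illustrative only: raises TimeoutError when the deadline is exceeded,
    mirroring the documented contract above.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(agent_fn, task)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout as exc:
            raise TimeoutError(
                f"agent exceeded {timeout}s on task: {task!r}"
            ) from exc
```

Note that the executor still waits for the worker to finish during shutdown; a production version would need real cancellation or a daemon worker.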
##### _get_agent_discovery_info()
Get discovery information for a specific agent (internal method).
| Parameter | Type | Description |
|-----------|------|-------------|
| `tool_name` | `str` | Name of the agent tool |
**Returns:** `Optional[Dict[str, Any]]` - Agent discovery information, or None if not found
## Discovery Tools
AOP automatically registers several discovery tools that allow agents to learn about each other and enable dynamic agent discovery within the cluster.
### discover_agents
Discover information about agents in the cluster including their name, description, system prompt (truncated to 200 chars), and tags.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `agent_name` | `str` | `None` | Optional specific agent name to get info for. If None, returns info for all agents. |
**Returns:** `Dict[str, Any]` - Agent information for discovery
**Response Format:**
```json
{
"success": true,
"agents": [
{
"tool_name": "agent_name",
"agent_name": "Agent Name",
"description": "Agent description",
"short_system_prompt": "Truncated system prompt...",
"tags": ["tag1", "tag2"],
"capabilities": ["capability1", "capability2"],
"role": "worker",
"model_name": "model_name",
"max_loops": 1,
"temperature": 0.5,
"max_tokens": 4096
}
]
}
```
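Client code might filter this payload by tag before choosing a collaborator; a sketch against the response format above:

```python
def agents_with_tag(discovery: dict, tag: str) -> list:
    """Return tool names of agents whose tags include the given keyword."""
    if not discovery.get("success"):
        return []
    return [
        agent["tool_name"]
        for agent in discovery.get("agents", [])
        if tag in agent.get("tags", [])
    ]

sample = {
    "success": True,
    "agents": [
        {"tool_name": "research_agent", "tags": ["research", "analysis"]},
        {"tool_name": "coder_agent", "tags": ["coding"]},
    ],
}
print(agents_with_tag(sample, "research"))  # prints: ['research_agent']
```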
### get_agent_details
Get detailed information about a single agent by name including configuration, capabilities, and metadata.
| Parameter | Type | Description |
|-----------|------|-------------|
| `agent_name` | `str` | Name of the agent to get information for. |
**Returns:** `Dict[str, Any]` - Detailed agent information
**Response Format:**
```json
{
"success": true,
"agent_info": {
"tool_name": "agent_name",
"agent_name": "Agent Name",
"agent_description": "Agent description",
"model_name": "model_name",
"max_loops": 1,
"tool_description": "Tool description",
"timeout": 30,
"max_retries": 3,
"verbose": false,
"traceback_enabled": true
},
"discovery_info": {
"tool_name": "agent_name",
"agent_name": "Agent Name",
"description": "Agent description",
"short_system_prompt": "Truncated system prompt...",
"tags": ["tag1", "tag2"],
"capabilities": ["capability1", "capability2"],
"role": "worker",
"model_name": "model_name",
"max_loops": 1,
"temperature": 0.5,
"max_tokens": 4096
}
}
```
### get_agents_info
Get detailed information about multiple agents by providing a list of agent names.
| Parameter | Type | Description |
|-----------|------|-------------|
| `agent_names` | `List[str]` | List of agent names to get information for. |
**Returns:** `Dict[str, Any]` - Detailed information for all requested agents
**Response Format:**
```json
{
"success": true,
"agents_info": [
{
"agent_name": "agent_name",
"agent_info": { /* detailed agent info */ },
"discovery_info": { /* discovery info */ }
}
],
"not_found": ["missing_agent"],
"total_found": 1,
"total_requested": 2
}
```
### list_agents
Get a simple list of all available agent names in the cluster.
**Returns:** `Dict[str, Any]` - List of agent names
**Response Format:**
```json
{
"success": true,
"agent_names": ["agent1", "agent2", "agent3"],
"total_count": 3
}
```
### search_agents
Search for agents by name, description, tags, or capabilities using keyword matching.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | `str` | Required | Search query string |
| `search_fields` | `List[str]` | `["name", "description", "tags", "capabilities"]` | Optional list of fields to search in. If None, searches all fields. |
**Returns:** `Dict[str, Any]` - Matching agents
**Response Format:**
```json
{
"success": true,
"matching_agents": [
{
"tool_name": "agent_name",
"agent_name": "Agent Name",
"description": "Agent description",
"short_system_prompt": "Truncated system prompt...",
"tags": ["tag1", "tag2"],
"capabilities": ["capability1", "capability2"],
"role": "worker",
"model_name": "model_name",
"max_loops": 1,
"temperature": 0.5,
"max_tokens": 4096
}
],
"total_matches": 1,
"query": "search_term",
"search_fields": ["name", "description", "tags", "capabilities"]
}
```
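When you already hold a discovery payload, the keyword matching can be mirrored client-side. The sketch below is a minimal approximation; the server's exact matching rules are an assumption here.

```python
def match_agents(agents, query,
                 fields=("agent_name", "description", "tags", "capabilities")):
    """Case-insensitive substring match over the given fields."""
    q = query.lower()
    hits = []
    for agent in agents:
        for field in fields:
            value = agent.get(field, "")
            # Tags/capabilities are lists; join them into searchable text.
            text = " ".join(value) if isinstance(value, list) else str(value)
            if q in text.lower():
                hits.append(agent)
                break  # one matching field is enough
    return hits

agents = [
    {"agent_name": "Research-Agent", "tags": ["research"]},
    {"agent_name": "Coder", "tags": ["coding", "python"]},
]
print([a["agent_name"] for a in match_agents(agents, "python")])  # prints: ['Coder']
```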
### AOPCluster Class

Class for connecting to and managing multiple MCP servers.
@@ -275,18 +474,22 @@ print(f"Added {len(tool_names)} agents: {tool_names}")
deployer.run()
```

### Advanced Configuration with Tags and Capabilities

```python
from swarms import Agent
from swarms.structs.aop import AOP

# Create agent with custom configuration, tags, and capabilities
research_agent = Agent(
    agent_name="Research-Agent",
    agent_description="Expert in research and data collection",
    model_name="anthropic/claude-sonnet-4-5",
    max_loops=1,
    # Add tags and capabilities for better discovery
    tags=["research", "data-collection", "analysis"],
    capabilities=["web-search", "data-gathering", "report-generation"],
    role="researcher"
)

# Create AOP with custom settings
@@ -405,6 +608,141 @@ else:
    print("Research-Agent tool not found")
```
### Discovery Tools Examples
The AOP server automatically provides discovery tools that allow agents to learn about each other. Here are examples of how to use these tools:
```python
# Example discovery tool calls (these would be made by MCP clients)
# Discover all agents in the cluster
all_agents = discover_agents()
print(f"Found {len(all_agents['agents'])} agents in the cluster")
# Discover a specific agent
research_agent_info = discover_agents(agent_name="Research-Agent")
if research_agent_info['success']:
agent = research_agent_info['agents'][0]
print(f"Agent: {agent['agent_name']}")
print(f"Description: {agent['description']}")
print(f"Tags: {agent['tags']}")
print(f"Capabilities: {agent['capabilities']}")
# Get detailed information about a specific agent
agent_details = get_agent_details(agent_name="Research-Agent")
if agent_details['success']:
print("Agent Info:", agent_details['agent_info'])
print("Discovery Info:", agent_details['discovery_info'])
# Get information about multiple agents
multiple_agents = get_agents_info(agent_names=["Research-Agent", "Analysis-Agent"])
print(f"Found {multiple_agents['total_found']} out of {multiple_agents['total_requested']} agents")
print("Not found:", multiple_agents['not_found'])
# List all available agents
agent_list = list_agents()
print(f"Available agents: {agent_list['agent_names']}")
# Search for agents by keyword
search_results = search_agents(query="research")
print(f"Found {search_results['total_matches']} agents matching 'research'")
# Search in specific fields only
tag_search = search_agents(
query="data",
search_fields=["tags", "capabilities"]
)
print(f"Found {tag_search['total_matches']} agents with 'data' in tags or capabilities")
```
### Dynamic Agent Discovery Example
Here's a practical example of how agents can use discovery tools to find and collaborate with other agents:
```python
from swarms import Agent
from swarms.structs.aop import AOP
# Create a coordinator agent that can discover and use other agents
coordinator = Agent(
agent_name="Coordinator-Agent",
agent_description="Coordinates tasks between different specialized agents",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
tags=["coordination", "orchestration", "management"],
capabilities=["agent-discovery", "task-distribution", "workflow-management"],
role="coordinator"
)
# Create specialized agents
research_agent = Agent(
agent_name="Research-Agent",
agent_description="Expert in research and data collection",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
tags=["research", "data-collection", "analysis"],
capabilities=["web-search", "data-gathering", "report-generation"],
role="researcher"
)
analysis_agent = Agent(
agent_name="Analysis-Agent",
agent_description="Expert in data analysis and insights",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
tags=["analysis", "data-processing", "insights"],
capabilities=["statistical-analysis", "pattern-recognition", "visualization"],
role="analyst"
)
# Create AOP server
deployer = AOP(
server_name="DynamicAgentCluster",
port=8000,
verbose=True
)
# Add all agents
deployer.add_agent(coordinator)
deployer.add_agent(research_agent)
deployer.add_agent(analysis_agent)
# The coordinator can now discover other agents and use them
# This would be done through MCP tool calls in practice
def coordinate_research_task(task_description):
"""
Example of how the coordinator might use discovery tools
"""
# 1. Discover available research agents
research_agents = discover_agents()
research_agents = [a for a in research_agents['agents'] if 'research' in a['tags']]
# 2. Get detailed info about the best research agent
if research_agents:
best_agent = research_agents[0]
agent_details = get_agent_details(agent_name=best_agent['agent_name'])
# 3. Use the research agent for the task
research_result = research_agent.run(task=task_description)
# 4. Find analysis agents for processing the research
analysis_agents = search_agents(query="analysis", search_fields=["tags"])
if analysis_agents['matching_agents']:
analysis_agent_name = analysis_agents['matching_agents'][0]['agent_name']
analysis_result = analysis_agent.run(task=f"Analyze this research: {research_result}")
return {
"research_result": research_result,
"analysis_result": analysis_result,
"agents_used": [best_agent['agent_name'], analysis_agent_name]
}
return {"error": "No suitable agents found"}
# Start the server
deployer.run()
```
### Tool Execution Examples

Once your AOP server is running, you can call the tools using MCP clients. Here are examples of how the tools would be called:
@@ -460,6 +798,10 @@ AOP provides comprehensive error handling:
| **Handle Errors** | Always check the `success` field in tool responses |
| **Resource Management** | Monitor server resources when running multiple agents |
| **Security** | Use appropriate host/port settings for your deployment environment |
| **Use Tags and Capabilities** | Add meaningful tags and capabilities to agents for better discovery |
| **Define Agent Roles** | Use the `role` attribute to categorize agents (coordinator, worker, etc.) |
| **Leverage Discovery Tools** | Use built-in discovery tools for dynamic agent collaboration |
| **Design for Scalability** | Plan for adding/removing agents dynamically using discovery tools |
## Integration with Other Systems

@@ -30,8 +30,8 @@ The AutoSwarmBuilder is designed to:
| `interactive` | bool | False | Whether to enable interactive mode |
| `max_tokens` | int | 8000 | Maximum tokens for the LLM responses |
| `execution_type` | str | "return-agents" | Type of execution to perform (see Execution Types) |
| `system_prompt` | str | BOSS_SYSTEM_PROMPT | System prompt for the boss agent |
| `additional_llm_args` | dict | {} | Additional arguments to pass to the LLM |
## Execution Types ## Execution Types
@ -39,12 +39,10 @@ The `execution_type` parameter controls how the AutoSwarmBuilder operates:
| Execution Type | Description |
|----------------------------------|-----------------------------------------------------------|
| **"return-agents"** | Creates and returns agent specifications as a dictionary (default) |
| **"execute-swarm-router"** | Executes the swarm router with the created agents |
| **"return-swarm-router-config"** | Returns the swarm router configuration as a dictionary |
| **"return-agents-objects"** | Returns agent objects created from specifications |
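Because an unsupported value only surfaces at run time, it can help to validate the setting up front. A minimal sketch (the helper name and the hard-coded list are illustrative, not part of the library; the values mirror the table above):

```python
VALID_EXECUTION_TYPES = [
    "return-agents",
    "execute-swarm-router",
    "return-swarm-router-config",
    "return-agents-objects",
]


def validate_execution_type(value: str) -> str:
    """Raise early if the execution type is not one of the documented values."""
    if value not in VALID_EXECUTION_TYPES:
        raise ValueError(
            f"Unknown execution_type {value!r}; expected one of {VALID_EXECUTION_TYPES}"
        )
    return value


print(validate_execution_type("return-agents"))
```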
## Core Methods
@ -84,26 +82,6 @@ Creates specialized agents for a given task using the boss agent system.
- `Exception`: If there's an error during agent creation
### create_router_config(task: str)
Creates a swarm router configuration for a given task.
@ -162,43 +140,6 @@ Returns the available execution types.
- `List[str]`: List of available execution types
### create_agents_from_specs(agents_dictionary: Any)
Create agents from agent specifications.
@ -211,40 +152,29 @@ Create agents from agent specifications.
- `List[Agent]`: List of created agents
### dict_to_agent(output: dict)
Converts a dictionary output to a list of Agent objects.
**Parameters:**
- `output` (dict): Dictionary containing agent configurations
**Returns:**
- `List[Agent]`: List of constructed agents
### _execute_task(task: str)
Execute a task by creating agents and initializing the swarm router.
**Parameters:**
- `task` (str): The task to execute
**Returns:**
- `Any`: The result of the swarm router execution
### build_llm_agent(config: BaseModel)
@ -285,6 +215,7 @@ Configuration for an individual agent specification with comprehensive options.
| `temperature` | float | Parameter controlling randomness of agent output (lower = more deterministic) |
| `role` | str | Designated role within the swarm influencing behavior and interactions |
| `max_loops` | int | Maximum number of times the agent can repeat its task for iterative processing |
| `goal` | str | The primary objective or desired outcome the agent is tasked with achieving |
### Agents
@ -302,7 +233,7 @@ Configuration model for individual agents in a swarm.
| Field | Type | Description |
|-----------------|---------|-----------------------------------------------------------------------------------------------|
| `agent_name` | str | Unique identifier for the agent |
| `description` | str | Comprehensive description of the agent's purpose and capabilities |
| `system_prompt` | str | Detailed system prompt defining agent behavior |
| `goal` | str | Primary objective the agent is tasked with achieving |
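For reference, here is a sketch of a specifications dictionary built from these fields. Only the field names come from the table above; the wrapping `{"agents": [...]}` shape and the sample values are assumptions for illustration:

```python
# Hypothetical agent-spec dictionary; field names taken from the Agents table.
agents_dictionary = {
    "agents": [
        {
            "agent_name": "Research-Agent",
            "description": "Gathers and summarizes background material",
            "system_prompt": "You are a research specialist...",
            "goal": "Produce a concise research brief",
        }
    ]
}

# Basic shape check before handing the dict to the builder
required = {"agent_name", "description", "system_prompt", "goal"}
assert all(required <= set(spec) for spec in agents_dictionary["agents"])
print(len(agents_dictionary["agents"]))
```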
@ -465,7 +396,7 @@ from swarms.structs.auto_swarm_builder import AutoSwarmBuilder
swarm = AutoSwarmBuilder(
    name="Marketing Swarm",
    description="A swarm for marketing strategy development",
    execution_type="return-agents"
)
# Get agent configurations without executing
@ -475,7 +406,7 @@ agent_configs = swarm.run(
print("Generated agents:")
for agent in agent_configs["agents"]:
    print(f"- {agent['agent_name']}: {agent['description']}")
```
### Example 4: Getting Swarm Router Configuration
@ -549,24 +480,24 @@ result = swarm.run(
)
```

### Example 7: Getting Agent Objects

```python
from swarms.structs.auto_swarm_builder import AutoSwarmBuilder

# Initialize to return agent objects
swarm = AutoSwarmBuilder(
    name="Specification Swarm",
    description="A swarm for generating agent specifications",
    execution_type="return-agents-objects"
)

# Get agent objects
agents = swarm.run(
    "Create a team of agents for analyzing customer feedback and generating actionable insights"
)

print(f"Created {len(agents)} agents:")
for agent in agents:
    print(f"- {agent.agent_name}: {agent.description}")
```
@ -580,7 +511,7 @@ from swarms.structs.auto_swarm_builder import AutoSwarmBuilder
swarm = AutoSwarmBuilder(
    name="Dictionary Swarm",
    description="A swarm for generating agent dictionaries",
    execution_type="return-agents"
)

# Get agent configurations as dictionary
@ -635,16 +566,14 @@ swarm = AutoSwarmBuilder(
    description="A highly configured swarm with advanced settings",
    model_name="gpt-4.1",
    max_tokens=16000,
    additional_llm_args={"temperature": 0.3},
    verbose=True,
    interactive=False
)

# Create agents with detailed specifications
agent_specs = swarm.run(
    "Develop a comprehensive cybersecurity strategy for a mid-size company"
)
# Build agents from specifications
@ -672,14 +601,13 @@ for agent in agents:
- Set appropriate `max_loops` based on task complexity (typically 1)
- Use `verbose=True` during development for debugging
- Choose the right `execution_type` for your use case:
    - Use `"return-agents"` for getting agent specifications as a dictionary (default)
    - Use `"execute-swarm-router"` for executing the swarm router with created agents
    - Use `"return-swarm-router-config"` for analyzing swarm architecture
    - Use `"return-agents-objects"` for getting agent objects created from specifications
- Set `max_tokens` appropriately based on expected response length
- Use `interactive=True` for real-time collaboration scenarios
- Use `additional_llm_args` for passing custom parameters to the LLM
!!! note "Model Selection"
    - Choose appropriate `model_name` based on task requirements

@ -0,0 +1,66 @@
# AOP Examples
This directory contains runnable examples that demonstrate AOP (Agents over Protocol) patterns in Swarms: spinning up a simple MCP server, discovering available agents/tools, and invoking agent tools from client scripts.
## What's inside
- **Top-level demos**
  - [`example_new_agent_tools.py`](./example_new_agent_tools.py): End-to-end demo of agent discovery utilities (list/search agents, get details for one or many). Targets an MCP server at `http://localhost:5932/mcp`.
  - [`list_agents_and_call_them.py`](./list_agents_and_call_them.py): Utility helpers to fetch tools from an MCP server and call an agent-style tool with a task prompt. Defaults to `http://localhost:8000/mcp`.
  - [`get_all_agents.py`](./get_all_agents.py): Minimal snippet to print all tools exposed by an MCP server as JSON. Defaults to `http://0.0.0.0:8000/mcp`.
- **Server**
  - [`server/server.py`](./server/server.py): Simple MCP server entrypoint you can run locally to expose tools/agents for the client examples.
- **Client**
  - [`client/aop_cluster_example.py`](./client/aop_cluster_example.py): Connect to an AOP cluster and interact with agents.
  - [`client/aop_queue_example.py`](./client/aop_queue_example.py): Example of queue-style task submission to agents.
  - [`client/aop_raw_task_example.py`](./client/aop_raw_task_example.py): Shows how to send a raw task payload without additional wrappers.
  - [`client/aop_raw_client_code.py`](./client/aop_raw_client_code.py): Minimal, low-level client calls against the MCP endpoint.
- **Discovery**
  - [`discovery/example_agent_communication.py`](./discovery/example_agent_communication.py): Illustrates simple agent-to-agent or agent-to-service communication patterns.
  - [`discovery/example_aop_discovery.py`](./discovery/example_aop_discovery.py): Demonstrates discovering available agents/tools via AOP.
  - [`discovery/simple_discovery_example.py`](./discovery/simple_discovery_example.py): A pared-down discovery walkthrough.
  - [`discovery/test_aop_discovery.py`](./discovery/test_aop_discovery.py): Test-style script validating discovery functionality.
## Prerequisites
- Python environment with project dependencies installed.
- An MCP server running locally (you can use the provided server example).
## Quick start
1. Start a local MCP server (in a separate terminal):
```bash
python examples/aop_examples/server/server.py
```
2. Try discovery utilities (adjust the URL if your server uses a different port):
```bash
# List exposed tools (defaults to http://0.0.0.0:8000/mcp)
python examples/aop_examples/get_all_agents.py
# Fetch tools and call the first agent-like tool (defaults to http://localhost:8000/mcp)
python examples/aop_examples/list_agents_and_call_them.py
# Rich demo of agent info utilities (expects http://localhost:5932/mcp by default)
python examples/aop_examples/example_new_agent_tools.py
```
3. Explore client variants:
```bash
python examples/aop_examples/client/aop_cluster_example.py
python examples/aop_examples/client/aop_queue_example.py
python examples/aop_examples/client/aop_raw_task_example.py
python examples/aop_examples/client/aop_raw_client_code.py
```
## Tips
- **Server URL/port**: Several examples assume `http://localhost:8000/mcp` or `http://localhost:5932/mcp`. If your server runs elsewhere, update the `server_path`/URL variables at the top of the scripts.
- **Troubleshooting**: If a script reports “No tools available”, ensure the MCP server is running and that the endpoint path (`/mcp`) and port match the script.
- **Next steps**: Use these scripts as templates—swap in your own tools/agents, change the search queries, or extend the client calls to fit your workflow.
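If you find yourself editing URLs often, one option (a suggested pattern, not something these scripts implement today) is to read the endpoint from an environment variable and fall back to the common default:

```python
import os

# Hypothetical override: export MCP_URL=http://localhost:5932/mcp before running.
MCP_URL = os.environ.get("MCP_URL", "http://localhost:8000/mcp")
print(f"Using MCP endpoint: {MCP_URL}")
```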

@ -1,336 +0,0 @@
# AOP Cluster Example
This example demonstrates how to use AOPCluster to connect to and manage multiple MCP servers running AOP agents.
## Basic Cluster Setup
```python
import json
from swarms.structs.aop import AOPCluster
# Connect to multiple MCP servers
cluster = AOPCluster(
urls=[
"http://localhost:8000/mcp", # Research and Analysis server
"http://localhost:8001/mcp", # Writing and Code server
"http://localhost:8002/mcp" # Financial server
],
transport="streamable-http"
)
# Get all available tools from all servers
all_tools = cluster.get_tools(output_type="dict")
print(f"Found {len(all_tools)} tools across all servers")
# Pretty print all tools
print(json.dumps(all_tools, indent=2))
```
## Finding Specific Tools
```python
# Find a specific tool by name
research_tool = cluster.find_tool_by_server_name("Research-Agent")
if research_tool:
    print("Found Research-Agent tool:")
    print(json.dumps(research_tool, indent=2))
else:
    print("Research-Agent tool not found")

# Find multiple tools
tool_names = ["Research-Agent", "Analysis-Agent", "Writing-Agent", "Code-Agent"]
found_tools = {}
for tool_name in tool_names:
    tool = cluster.find_tool_by_server_name(tool_name)
    if tool:
        found_tools[tool_name] = tool
        print(f"✓ Found {tool_name}")
    else:
        print(f"✗ {tool_name} not found")

print(f"Found {len(found_tools)} out of {len(tool_names)} tools")
```
## Tool Discovery and Management
```python
# Get tools in different formats
json_tools = cluster.get_tools(output_type="json")
dict_tools = cluster.get_tools(output_type="dict")
str_tools = cluster.get_tools(output_type="str")
print(f"JSON format: {len(json_tools)} tools")
print(f"Dict format: {len(dict_tools)} tools")
print(f"String format: {len(str_tools)} tools")
# Analyze tool distribution across servers
server_tools = {}
for tool in dict_tools:
    server_name = tool.get("server", "unknown")
    if server_name not in server_tools:
        server_tools[server_name] = []
    server_tools[server_name].append(tool.get("function", {}).get("name", "unknown"))

print("\nTools by server:")
for server, tools in server_tools.items():
    print(f"  {server}: {len(tools)} tools - {tools}")
```
## Advanced Cluster Management
```python
import time


class AOPClusterManager:
    def __init__(self, urls, transport="streamable-http"):
        self.cluster = AOPCluster(urls, transport)
        self.tools_cache = {}
        self.last_update = None

    def refresh_tools(self):
        """Refresh the tools cache"""
        self.tools_cache = {}
        tools = self.cluster.get_tools(output_type="dict")
        for tool in tools:
            tool_name = tool.get("function", {}).get("name")
            if tool_name:
                self.tools_cache[tool_name] = tool
        self.last_update = time.time()
        return len(self.tools_cache)

    def get_tool(self, tool_name):
        """Get a specific tool by name"""
        if not self.tools_cache or time.time() - self.last_update > 300:  # 5 min cache
            self.refresh_tools()
        return self.tools_cache.get(tool_name)

    def list_tools_by_category(self):
        """Categorize tools by their names"""
        categories = {
            "research": [],
            "analysis": [],
            "writing": [],
            "code": [],
            "financial": [],
            "other": [],
        }
        for tool_name in self.tools_cache.keys():
            tool_name_lower = tool_name.lower()
            if "research" in tool_name_lower:
                categories["research"].append(tool_name)
            elif "analysis" in tool_name_lower:
                categories["analysis"].append(tool_name)
            elif "writing" in tool_name_lower:
                categories["writing"].append(tool_name)
            elif "code" in tool_name_lower:
                categories["code"].append(tool_name)
            elif "financial" in tool_name_lower:
                categories["financial"].append(tool_name)
            else:
                categories["other"].append(tool_name)
        return categories

    def get_available_servers(self):
        """Get list of available servers"""
        servers = set()
        for tool in self.tools_cache.values():
            server = tool.get("server", "unknown")
            servers.add(server)
        return list(servers)
# Usage example
import time
manager = AOPClusterManager([
"http://localhost:8000/mcp",
"http://localhost:8001/mcp",
"http://localhost:8002/mcp"
])
# Refresh and display tools
tool_count = manager.refresh_tools()
print(f"Loaded {tool_count} tools")
# Categorize tools
categories = manager.list_tools_by_category()
for category, tools in categories.items():
    if tools:
        print(f"{category.title()}: {tools}")
# Get available servers
servers = manager.get_available_servers()
print(f"Available servers: {servers}")
```
## Error Handling and Resilience
```python
import time


class ResilientAOPCluster:
    def __init__(self, urls, transport="streamable-http"):
        self.urls = urls
        self.transport = transport
        self.cluster = AOPCluster(urls, transport)
        self.failed_servers = set()

    def get_tools_with_fallback(self, output_type="dict"):
        """Get tools with fallback for failed servers"""
        try:
            return self.cluster.get_tools(output_type=output_type)
        except Exception as e:
            print(f"Error getting tools: {e}")
            # Try individual servers
            all_tools = []
            for url in self.urls:
                if url in self.failed_servers:
                    continue
                try:
                    single_cluster = AOPCluster([url], self.transport)
                    tools = single_cluster.get_tools(output_type=output_type)
                    all_tools.extend(tools)
                except Exception as server_error:
                    print(f"Server {url} failed: {server_error}")
                    self.failed_servers.add(url)
            return all_tools

    def find_tool_with_retry(self, tool_name, max_retries=3):
        """Find tool with retry logic"""
        for attempt in range(max_retries):
            try:
                return self.cluster.find_tool_by_server_name(tool_name)
            except Exception as e:
                print(f"Attempt {attempt + 1} failed: {e}")
                if attempt < max_retries - 1:
                    time.sleep(2 ** attempt)  # Exponential backoff
        return None
# Usage
resilient_cluster = ResilientAOPCluster([
"http://localhost:8000/mcp",
"http://localhost:8001/mcp",
"http://localhost:8002/mcp"
])
# Get tools with error handling
tools = resilient_cluster.get_tools_with_fallback()
print(f"Retrieved {len(tools)} tools")
# Find tool with retry
research_tool = resilient_cluster.find_tool_with_retry("Research-Agent")
if research_tool:
    print("Found Research-Agent tool")
else:
    print("Research-Agent tool not found after retries")
```
## Monitoring and Health Checks
```python
import time


class AOPClusterMonitor:
    def __init__(self, cluster_manager):
        self.manager = cluster_manager
        self.health_status = {}

    def check_server_health(self, url):
        """Check if a server is healthy"""
        try:
            single_cluster = AOPCluster([url], self.manager.cluster.transport)
            tools = single_cluster.get_tools(output_type="dict")
            return {
                "status": "healthy",
                "tool_count": len(tools),
                "timestamp": time.time(),
            }
        except Exception as e:
            return {
                "status": "unhealthy",
                "error": str(e),
                "timestamp": time.time(),
            }

    def check_all_servers(self):
        """Check health of all servers"""
        for url in self.manager.cluster.urls:
            health = self.check_server_health(url)
            self.health_status[url] = health
            status_icon = "✓" if health["status"] == "healthy" else "✗"
            print(f"{status_icon} {url}: {health['status']}")
            if health["status"] == "healthy":
                print(f"  Tools available: {health['tool_count']}")
            else:
                print(f"  Error: {health['error']}")

    def get_health_summary(self):
        """Get summary of server health"""
        healthy_count = sum(
            1 for status in self.health_status.values()
            if status["status"] == "healthy"
        )
        total_count = len(self.health_status)
        return {
            "healthy_servers": healthy_count,
            "total_servers": total_count,
            "health_percentage": (healthy_count / total_count) * 100 if total_count > 0 else 0,
        }
# Usage
monitor = AOPClusterMonitor(manager)
monitor.check_all_servers()
summary = monitor.get_health_summary()
print(f"Health Summary: {summary['healthy_servers']}/{summary['total_servers']} servers healthy ({summary['health_percentage']:.1f}%)")
```
## Complete Example
```python
import json
import time
from swarms.structs.aop import AOPCluster
def main():
    # Initialize cluster
    cluster = AOPCluster(
        urls=[
            "http://localhost:8000/mcp",
            "http://localhost:8001/mcp",
            "http://localhost:8002/mcp",
        ],
        transport="streamable-http",
    )

    print("AOP Cluster Management System")
    print("=" * 40)

    # Get all tools
    print("\n1. Discovering tools...")
    tools = cluster.get_tools(output_type="dict")
    print(f"Found {len(tools)} tools across all servers")

    # List all tool names
    tool_names = [tool.get("function", {}).get("name") for tool in tools]
    print(f"Available tools: {tool_names}")

    # Find specific tools
    print("\n2. Finding specific tools...")
    target_tools = ["Research-Agent", "Analysis-Agent", "Writing-Agent", "Code-Agent", "Financial-Agent"]
    for tool_name in target_tools:
        tool = cluster.find_tool_by_server_name(tool_name)
        if tool:
            print(f"✓ {tool_name}: Available")
        else:
            print(f"✗ {tool_name}: Not found")

    # Display tool details
    print("\n3. Tool details:")
    for tool in tools[:3]:  # Show first 3 tools
        print(f"\nTool: {tool.get('function', {}).get('name')}")
        print(f"Description: {tool.get('function', {}).get('description')}")
        print(f"Parameters: {list(tool.get('function', {}).get('parameters', {}).get('properties', {}).keys())}")

    print("\nAOP Cluster setup complete!")


if __name__ == "__main__":
    main()
```
This example demonstrates comprehensive AOP cluster management including tool discovery, error handling, health monitoring, and advanced cluster operations.

@ -1,11 +0,0 @@
import json
from swarms.structs.aop import AOPCluster
aop_cluster = AOPCluster(
urls=["http://localhost:8000/mcp"],
transport="streamable-http",
)
print(json.dumps(aop_cluster.get_tools(output_type="dict"), indent=4))
print(aop_cluster.find_tool_by_server_name("Research-Agent"))

@ -1,318 +0,0 @@
# AOP Server Setup Example
This example demonstrates how to set up an AOP (Agent Orchestration Protocol) server with multiple specialized agents.
## Complete Server Setup
```python
from swarms import Agent
from swarms.structs.aop import AOP
# Create specialized agents
research_agent = Agent(
agent_name="Research-Agent",
agent_description="Expert in research, data collection, and information gathering",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a research specialist. Your role is to:
1. Gather comprehensive information on any given topic
2. Analyze data from multiple sources
3. Provide well-structured research findings
4. Cite sources and maintain accuracy
5. Present findings in a clear, organized manner
Always provide detailed, factual information with proper context.""",
)
analysis_agent = Agent(
agent_name="Analysis-Agent",
agent_description="Expert in data analysis, pattern recognition, and generating insights",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are an analysis specialist. Your role is to:
1. Analyze data and identify patterns
2. Generate actionable insights
3. Create visualizations and summaries
4. Provide statistical analysis
5. Make data-driven recommendations
Focus on extracting meaningful insights from information.""",
)
writing_agent = Agent(
agent_name="Writing-Agent",
agent_description="Expert in content creation, editing, and communication",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a writing specialist. Your role is to:
1. Create engaging, well-structured content
2. Edit and improve existing text
3. Adapt tone and style for different audiences
4. Ensure clarity and coherence
5. Follow best practices in writing
Always produce high-quality, professional content.""",
)
code_agent = Agent(
agent_name="Code-Agent",
agent_description="Expert in programming, code review, and software development",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a coding specialist. Your role is to:
1. Write clean, efficient code
2. Debug and fix issues
3. Review and optimize code
4. Explain programming concepts
5. Follow best practices and standards
Always provide working, well-documented code.""",
)
financial_agent = Agent(
agent_name="Financial-Agent",
agent_description="Expert in financial analysis, market research, and investment insights",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a financial specialist. Your role is to:
1. Analyze financial data and markets
2. Provide investment insights
3. Assess risk and opportunities
4. Create financial reports
5. Explain complex financial concepts
Always provide accurate, well-reasoned financial analysis.""",
)
# Create AOP instance
deployer = AOP(
server_name="MyAgentServer",
port=8000,
verbose=True,
log_level="INFO"
)
# Add all agents at once
agents = [
research_agent,
analysis_agent,
writing_agent,
code_agent,
financial_agent,
]
tool_names = deployer.add_agents_batch(agents)
print(f"Added {len(tool_names)} agents: {tool_names}")
# Display server information
server_info = deployer.get_server_info()
print(f"Server: {server_info['server_name']}")
print(f"Total tools: {server_info['total_tools']}")
print(f"Available tools: {server_info['tools']}")
# Start the server
print("Starting AOP server...")
deployer.run()
```
## Running the Server
1. Save the code above to a file (e.g., `aop_server.py`)
2. Install required dependencies:
```bash
pip install swarms
```
3. Run the server:
```bash
python aop_server.py
```
The server will start on `http://localhost:8000` and make all agents available as MCP tools.
## Tool Usage Examples
Once the server is running, you can call the tools using MCP clients:
### Research Agent
```python
# Call the research agent
result = research_tool(task="Research the latest AI trends in 2024")
print(result)
```
### Analysis Agent with Image
```python
# Call the analysis agent with an image
result = analysis_tool(
task="Analyze this chart and provide insights",
img="path/to/chart.png"
)
print(result)
```
### Writing Agent with Multiple Images
```python
# Call the writing agent with multiple images
result = writing_tool(
task="Write a comprehensive report based on these images",
imgs=["image1.jpg", "image2.jpg", "image3.jpg"]
)
print(result)
```
### Code Agent with Validation
```python
# Call the code agent with expected output
result = code_tool(
task="Debug this Python function",
correct_answer="Expected output: Hello World"
)
print(result)
```
### Financial Agent
```python
# Call the financial agent
result = financial_tool(task="Analyze the current market trends for tech stocks")
print(result)
```
## Response Format
All tools return a standardized response:
```json
{
"result": "The agent's response to the task",
"success": true,
"error": null
}
```
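When consuming this payload in client code, a small helper can fail fast on errors. A hedged sketch (the helper is illustrative, not part of the AOP API; it only assumes the JSON shape shown above):

```python
import json


def unwrap_tool_response(raw: str):
    """Return the tool result, raising if the call reported failure."""
    payload = json.loads(raw)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error") or "tool call failed")
    return payload["result"]


print(unwrap_tool_response('{"result": "ok", "success": true, "error": null}'))
```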
## Advanced Configuration
### Custom Timeouts and Retries
```python
# Add agent with custom configuration
deployer.add_agent(
agent=research_agent,
tool_name="custom_research_tool",
tool_description="Research tool with extended timeout",
timeout=120, # 2 minutes
max_retries=5,
verbose=True
)
```
### Custom Input/Output Schemas
```python
# Define custom schemas
custom_input_schema = {
"type": "object",
"properties": {
"task": {"type": "string", "description": "The research task"},
"sources": {
"type": "array",
"items": {"type": "string"},
"description": "Specific sources to research"
},
"depth": {
"type": "string",
"enum": ["shallow", "medium", "deep"],
"description": "Research depth level"
}
},
"required": ["task"]
}
# Add agent with custom schemas
deployer.add_agent(
agent=research_agent,
tool_name="advanced_research_tool",
input_schema=custom_input_schema,
timeout=60
)
```
## Monitoring and Debugging
### Enable Verbose Logging
```python
deployer = AOP(
server_name="DebugServer",
verbose=True,
traceback_enabled=True,
log_level="DEBUG"
)
```
### Check Server Status
```python
# List all registered agents
agents = deployer.list_agents()
print(f"Registered agents: {agents}")
# Get detailed agent information
for agent_name in agents:
    info = deployer.get_agent_info(agent_name)
    print(f"Agent {agent_name}: {info}")
# Get server information
server_info = deployer.get_server_info()
print(f"Server info: {server_info}")
```
## Production Deployment
### External Access
```python
deployer = AOP(
server_name="ProductionServer",
host="0.0.0.0", # Allow external connections
port=8000,
verbose=False, # Disable verbose logging in production
log_level="WARNING"
)
```
### Multiple Servers
```python
# Server 1: Research and Analysis
research_deployer = AOP("ResearchServer", port=8000)
research_deployer.add_agent(research_agent)
research_deployer.add_agent(analysis_agent)
# Server 2: Writing and Code
content_deployer = AOP("ContentServer", port=8001)
content_deployer.add_agent(writing_agent)
content_deployer.add_agent(code_agent)
# Server 3: Financial
finance_deployer = AOP("FinanceServer", port=8002)
finance_deployer.add_agent(financial_agent)
# Start all servers
import threading
threading.Thread(target=research_deployer.run).start()
threading.Thread(target=content_deployer.run).start()
threading.Thread(target=finance_deployer.run).start()
```
This example demonstrates a complete AOP server setup with multiple specialized agents, proper configuration, and production-ready deployment options.

@ -0,0 +1,47 @@
import json
import asyncio
from swarms.structs.aop import AOPCluster
from swarms.tools.mcp_client_tools import execute_tool_call_simple
async def discover_agents_example():
"""
Discover all agents using the AOPCluster and print the result.
"""
aop_cluster = AOPCluster(
urls=["http://localhost:5932/mcp"],
transport="streamable-http",
)
tool = aop_cluster.find_tool_by_server_name("discover_agents")
if not tool:
print("discover_agents tool not found.")
return None
tool_call_request = {
"type": "function",
"function": {
"name": "discover_agents",
"arguments": "{}",
},
}
result = await execute_tool_call_simple(
response=tool_call_request,
server_path="http://localhost:5932/mcp",
output_type="dict",
verbose=False,
)
print(json.dumps(result, indent=2))
return result
def main():
"""
Run the discover_agents_example coroutine.
"""
asyncio.run(discover_agents_example())
if __name__ == "__main__":
main()
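The helper above prints a list whose first element is a dict with `success` and `agents` keys (the shape also consumed by the later examples on this page). A minimal, dependency-free parser for that shape; the sample payload here is illustrative, not real server output:

```python
import json


def extract_agent_names(result):
    """Pull agent names out of a discover_agents result; returns [] on failure."""
    if not isinstance(result, list) or not result:
        return []
    data = result[0]
    if not data.get("success"):
        return []
    return [a.get("agent_name", "Unknown") for a in data.get("agents", [])]


# Illustrative payload mirroring the structure printed by the example above
sample = json.loads(
    '[{"success": true, "agents": [{"agent_name": "DataAnalyst"},'
    ' {"agent_name": "CodeReviewer"}]}]'
)
print(extract_agent_names(sample))  # ['DataAnalyst', 'CodeReviewer']
```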

@@ -0,0 +1,149 @@
#!/usr/bin/env python3
"""
Example demonstrating the AOP queue system for agent execution.
This example shows how to use the new queue-based execution system
in the AOP framework for improved performance and reliability.
"""
import time
from swarms import Agent
from swarms.structs.aop import AOP
def main():
"""Demonstrate AOP queue functionality."""
# Create some sample agents
agent1 = Agent(
agent_name="Research Agent",
agent_description="Specialized in research tasks",
model_name="gpt-4",
max_loops=1,
)
agent2 = Agent(
agent_name="Writing Agent",
agent_description="Specialized in writing tasks",
model_name="gpt-4",
max_loops=1,
)
# Create AOP with queue enabled
aop = AOP(
server_name="Queue Demo Cluster",
description="A demonstration of queue-based agent execution",
queue_enabled=True,
max_workers_per_agent=2, # 2 workers per agent
max_queue_size_per_agent=100, # Max 100 tasks per queue
processing_timeout=60, # 60 second timeout
retry_delay=2.0, # 2 second delay between retries
verbose=True,
)
# Add agents to the cluster
print("Adding agents to cluster...")
aop.add_agent(agent1, tool_name="researcher")
aop.add_agent(agent2, tool_name="writer")
# Get initial queue stats
print("\nInitial queue stats:")
stats = aop.get_queue_stats()
print(f"Stats: {stats}")
# Add some tasks to the queues
print("\nAdding tasks to queues...")
# Add high priority research task
research_task_id = aop.task_queues["researcher"].add_task(
task="Research the latest developments in quantum computing",
priority=10, # High priority
max_retries=2,
)
print(f"Added research task: {research_task_id}")
# Add medium priority writing task
writing_task_id = aop.task_queues["writer"].add_task(
task="Write a summary of AI trends in 2024",
priority=5, # Medium priority
max_retries=3,
)
print(f"Added writing task: {writing_task_id}")
# Add multiple low priority tasks
for i in range(3):
task_id = aop.task_queues["researcher"].add_task(
task=f"Research task {i+1}: Analyze market trends",
priority=1, # Low priority
max_retries=1,
)
print(f"Added research task {i+1}: {task_id}")
# Get updated queue stats
print("\nUpdated queue stats:")
stats = aop.get_queue_stats()
print(f"Stats: {stats}")
# Monitor task progress
print("\nMonitoring task progress...")
for _ in range(10): # Monitor for 10 iterations
time.sleep(1)
# Check research task status
research_status = aop.get_task_status(
"researcher", research_task_id
)
print(
f"Research task status: {research_status['task']['status'] if research_status['success'] else 'Error'}"
)
# Check writing task status
writing_status = aop.get_task_status(
"writer", writing_task_id
)
print(
f"Writing task status: {writing_status['task']['status'] if writing_status['success'] else 'Error'}"
)
# Get current queue stats
current_stats = aop.get_queue_stats()
if current_stats["success"]:
for agent_name, agent_stats in current_stats[
"stats"
].items():
print(
f"{agent_name}: {agent_stats['pending_tasks']} pending, {agent_stats['processing_tasks']} processing, {agent_stats['completed_tasks']} completed"
)
print("---")
# Demonstrate queue management
print("\nDemonstrating queue management...")
# Pause the research agent queue
print("Pausing research agent queue...")
aop.pause_agent_queue("researcher")
# Get queue status
research_queue_status = aop.task_queues["researcher"].get_status()
print(f"Research queue status: {research_queue_status.value}")
# Resume the research agent queue
print("Resuming research agent queue...")
aop.resume_agent_queue("researcher")
# Clear all queues
print("Clearing all queues...")
cleared = aop.clear_all_queues()
print(f"Cleared tasks: {cleared}")
# Final stats
print("\nFinal queue stats:")
final_stats = aop.get_queue_stats()
print(f"Final stats: {final_stats}")
print("\nQueue demonstration completed!")
if __name__ == "__main__":
main()
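Conceptually, each agent's queue dispatches higher-priority tasks first, with FIFO ordering among equal priorities. Assuming that behavior, the ordering can be modeled with the stdlib `heapq` (a sketch of the idea, not the actual AOP implementation):

```python
import heapq
import itertools


class PriorityTaskQueue:
    """Max-priority queue: higher priority pops first, FIFO among equals."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserves FIFO order

    def add_task(self, task, priority=1):
        # heapq is a min-heap, so negate priority to pop the highest first
        heapq.heappush(self._heap, (-priority, next(self._counter), task))

    def pop_task(self):
        return heapq.heappop(self._heap)[2]


q = PriorityTaskQueue()
q.add_task("market trends 1", priority=1)
q.add_task("quantum computing research", priority=10)
q.add_task("market trends 2", priority=1)
order = [q.pop_task() for _ in range(3)]
print(order)  # high-priority research first, then the two low-priority tasks in order
```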

@@ -0,0 +1,88 @@
import json
import asyncio
from swarms.structs.aop import AOPCluster
from swarms.tools.mcp_client_tools import execute_tool_call_simple
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
async def discover_agents_example():
"""
Discover all agents using the AOPCluster and print the result.
"""
aop_cluster = AOPCluster(
urls=["http://localhost:5932/mcp"],
transport="streamable-http",
)
tool = aop_cluster.find_tool_by_server_name("discover_agents")
if not tool:
print("discover_agents tool not found.")
return None
tool_call_request = {
"type": "function",
"function": {
"name": "discover_agents",
"arguments": "{}",
},
}
result = await execute_tool_call_simple(
response=tool_call_request,
server_path="http://localhost:5932/mcp",
output_type="dict",
verbose=False,
)
print(json.dumps(result, indent=2))
return result
async def raw_mcp_discover_agents_example():
"""
Call the MCP server directly using the raw MCP client to execute the
built-in "discover_agents" tool and print the JSON result.
This demonstrates how to:
- Initialize an MCP client over streamable HTTP
- List available tools (optional)
- Call a specific tool by name with arguments
"""
url = "http://localhost:5932/mcp"
# Open a raw MCP client connection
async with streamablehttp_client(url, timeout=10) as ctx:
if len(ctx) == 2:
read, write = ctx
else:
read, write, *_ = ctx
async with ClientSession(read, write) as session:
# Initialize the MCP session and optionally inspect tools
await session.initialize()
# Optional: list tools (uncomment to print)
# tools = await session.list_tools()
# print(json.dumps(tools.model_dump(), indent=2))
# Call the built-in discovery tool with empty arguments
result = await session.call_tool(
name="discover_agents",
arguments={},
)
# Convert to dict for pretty printing
print(json.dumps(result.model_dump(), indent=2))
return result.model_dump()
def main():
"""
Run the helper-based and raw MCP client discovery examples.
"""
asyncio.run(discover_agents_example())
asyncio.run(raw_mcp_discover_agents_example())
if __name__ == "__main__":
main()

@@ -0,0 +1,107 @@
import json
import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
async def call_agent_tool_raw(
url: str,
tool_name: str,
task: str,
img: str | None = None,
imgs: list[str] | None = None,
correct_answer: str | None = None,
) -> dict:
"""
Call a specific agent tool on an MCP server using the raw MCP client.
Args:
url: MCP server URL (e.g., "http://localhost:5932/mcp").
tool_name: Name of the tool/agent to invoke.
task: Task prompt to execute.
img: Optional single image path/URL.
imgs: Optional list of image paths/URLs.
correct_answer: Optional expected answer for validation.
Returns:
A dict containing the tool's JSON response.
"""
# Open a raw MCP client connection over streamable HTTP
async with streamablehttp_client(url, timeout=30) as ctx:
if len(ctx) == 2:
read, write = ctx
else:
read, write, *_ = ctx
async with ClientSession(read, write) as session:
# Initialize the MCP session
await session.initialize()
# Prepare arguments in the canonical AOP tool format
arguments: dict = {"task": task}
if img is not None:
arguments["img"] = img
if imgs is not None:
arguments["imgs"] = imgs
if correct_answer is not None:
arguments["correct_answer"] = correct_answer
# Invoke the tool by name
result = await session.call_tool(
name=tool_name, arguments=arguments
)
# Convert to dict for return/printing
return result.model_dump()
async def list_available_tools(url: str) -> dict:
"""
List tools from an MCP server using the raw client.
Args:
url: MCP server URL (e.g., "http://localhost:5932/mcp").
Returns:
A dict representation of the tools listing.
"""
async with streamablehttp_client(url, timeout=30) as ctx:
if len(ctx) == 2:
read, write = ctx
else:
read, write, *_ = ctx
async with ClientSession(read, write) as session:
await session.initialize()
tools = await session.list_tools()
return tools.model_dump()
def main() -> None:
"""
Demonstration entrypoint: list tools, then call a specified tool with a task.
"""
url = "http://localhost:5932/mcp"
tool_name = "Research-Agent" # Change to your agent tool name
task = "Summarize the latest advances in agent orchestration protocols."
# List tools
tools_info = asyncio.run(list_available_tools(url))
print("Available tools:")
print(json.dumps(tools_info, indent=2))
# Call the tool
print(f"\nCalling tool '{tool_name}' with task...\n")
result = asyncio.run(
call_agent_tool_raw(
url=url,
tool_name=tool_name,
task=task,
)
)
print(json.dumps(result, indent=2))
if __name__ == "__main__":
main()

@@ -0,0 +1,149 @@
#!/usr/bin/env python3
"""
Example showing how agents can use the discovery tool to learn about each other
and collaborate more effectively.
"""
from swarms import Agent
from swarms.structs.aop import AOP
def simulate_agent_discovery():
"""Simulate how an agent would use the discovery tool."""
    # Create a sample coordinator agent (illustrative; the walkthrough below
    # queries the cluster directly rather than routing through this agent)
    coordinator = Agent(
agent_name="ProjectCoordinator",
agent_description="Coordinates projects and assigns tasks to other agents",
system_prompt="You are a project coordinator who helps organize work and delegate tasks to the most appropriate team members. You can discover information about other agents to make better decisions.",
model_name="gpt-4o-mini",
temperature=0.4,
)
# Create the AOP cluster
aop = AOP(
server_name="Project Team",
description="A team of specialized agents for project coordination",
verbose=True,
)
# Add some specialized agents
data_agent = Agent(
agent_name="DataSpecialist",
agent_description="Handles all data-related tasks and analysis",
system_prompt="You are a data specialist with expertise in data processing, analysis, and visualization. You work with large datasets and create insights.",
tags=["data", "analysis", "python", "sql", "statistics"],
capabilities=[
"data_processing",
"statistical_analysis",
"visualization",
],
role="specialist",
)
code_agent = Agent(
agent_name="CodeSpecialist",
agent_description="Handles all coding and development tasks",
system_prompt="You are a software development specialist who writes clean, efficient code and follows best practices. You handle both frontend and backend development.",
tags=[
"coding",
"development",
"python",
"javascript",
"react",
],
capabilities=[
"software_development",
"code_review",
"debugging",
],
role="developer",
)
writing_agent = Agent(
agent_name="ContentSpecialist",
agent_description="Creates and manages all written content",
system_prompt="You are a content specialist who creates engaging written content, documentation, and marketing materials. You ensure all content is clear and compelling.",
tags=["writing", "content", "documentation", "marketing"],
capabilities=[
"content_creation",
"technical_writing",
"editing",
],
role="writer",
)
# Add agents to the cluster
aop.add_agent(data_agent, tool_name="data_specialist")
aop.add_agent(code_agent, tool_name="code_specialist")
aop.add_agent(writing_agent, tool_name="content_specialist")
print("🏢 Project Team AOP Cluster Created!")
print(f"👥 Team members: {aop.list_agents()}")
print()
# Simulate the coordinator discovering team members
print("🔍 Project Coordinator discovering team capabilities...")
print()
# Get discovery info for each agent
for tool_name in aop.list_agents():
if (
tool_name != "discover_agents"
): # Skip the discovery tool itself
agent_info = aop._get_agent_discovery_info(tool_name)
if agent_info:
print(f"📋 {agent_info['agent_name']}:")
print(f" Description: {agent_info['description']}")
print(f" Role: {agent_info['role']}")
print(f" Tags: {', '.join(agent_info['tags'])}")
print(
f" Capabilities: {', '.join(agent_info['capabilities'])}"
)
print(
f" System Prompt: {agent_info['short_system_prompt'][:100]}..."
)
print()
print("💡 How agents would use this in practice:")
print(" 1. Agent calls 'discover_agents' MCP tool")
print(" 2. Gets information about all available agents")
print(
" 3. Uses this info to make informed decisions about task delegation"
)
print(
" 4. Can discover specific agents by name for targeted collaboration"
)
print()
# Show what the MCP tool response would look like
print("📡 Sample MCP tool response structure:")
print(" discover_agents() -> {")
print(" 'success': True,")
print(" 'agents': [")
print(" {")
print(" 'tool_name': 'data_specialist',")
print(" 'agent_name': 'DataSpecialist',")
print(
" 'description': 'Handles all data-related tasks...',"
)
print(
" 'short_system_prompt': 'You are a data specialist...',"
)
print(" 'tags': ['data', 'analysis', 'python'],")
print(
" 'capabilities': ['data_processing', 'statistics'],"
)
print(" 'role': 'specialist',")
print(" ...")
print(" }")
print(" ]")
print(" }")
print()
print("✅ Agent discovery system ready for collaborative work!")
if __name__ == "__main__":
simulate_agent_discovery()

@@ -0,0 +1,117 @@
#!/usr/bin/env python3
"""
Example demonstrating the new agent discovery MCP tool in AOP.
This example shows how agents can discover information about each other
using the new 'discover_agents' MCP tool.
"""
from swarms import Agent
from swarms.structs.aop import AOP
def main():
"""Demonstrate the agent discovery functionality."""
# Create some sample agents with different configurations
agent1 = Agent(
agent_name="DataAnalyst",
agent_description="Specialized in data analysis and visualization",
system_prompt="You are a data analyst with expertise in Python, pandas, and statistical analysis. You help users understand data patterns and create visualizations.",
tags=["data", "analysis", "python", "pandas"],
capabilities=["data_analysis", "visualization", "statistics"],
role="analyst",
model_name="gpt-4o-mini",
temperature=0.3,
)
agent2 = Agent(
agent_name="CodeReviewer",
agent_description="Expert code reviewer and quality assurance specialist",
system_prompt="You are a senior software engineer who specializes in code review, best practices, and quality assurance. You help identify bugs, suggest improvements, and ensure code follows industry standards.",
tags=["code", "review", "quality", "python", "javascript"],
capabilities=[
"code_review",
"quality_assurance",
"best_practices",
],
role="reviewer",
model_name="gpt-4o-mini",
temperature=0.2,
)
agent3 = Agent(
agent_name="CreativeWriter",
agent_description="Creative content writer and storyteller",
system_prompt="You are a creative writer who specializes in storytelling, content creation, and engaging narratives. You help create compelling stories, articles, and marketing content.",
tags=["writing", "creative", "content", "storytelling"],
capabilities=[
"creative_writing",
"content_creation",
"storytelling",
],
role="writer",
model_name="gpt-4o-mini",
temperature=0.8,
)
# Create AOP cluster with the agents
aop = AOP(
server_name="Agent Discovery Demo",
description="A demo cluster showing agent discovery capabilities",
agents=[agent1, agent2, agent3],
verbose=True,
)
print("🚀 AOP Cluster initialized with agent discovery tool!")
print(f"📊 Total agents registered: {len(aop.agents)}")
print(f"🔧 Available tools: {aop.list_agents()}")
print()
# Demonstrate the discovery tool
print("🔍 Testing agent discovery functionality...")
print()
# Test discovering all agents
    print("1. Discovering a single agent (DataAnalyst):")
    all_agents_info = aop._get_agent_discovery_info(
        "DataAnalyst"
    )  # via MCP, calling discover_agents with no arguments returns all agents
print(
f" Found agent: {all_agents_info['agent_name'] if all_agents_info else 'None'}"
)
print()
# Show what the MCP tool would return
print("2. What the 'discover_agents' MCP tool would return:")
print(" - Tool name: discover_agents")
print(
" - Description: Discover information about other agents in the cluster"
)
print(" - Parameters: agent_name (optional)")
print(
" - Returns: Agent info including name, description, short system prompt, tags, capabilities, role, etc."
)
print()
# Show sample agent info structure
if all_agents_info:
print("3. Sample agent discovery info structure:")
for key, value in all_agents_info.items():
if key == "short_system_prompt":
print(f" {key}: {value[:100]}...")
else:
print(f" {key}: {value}")
print()
print("✅ Agent discovery tool successfully integrated!")
print(
"💡 Agents can now use the 'discover_agents' MCP tool to learn about each other."
)
print(
"🔄 The tool is automatically updated when new agents are added to the cluster."
)
if __name__ == "__main__":
main()

@@ -0,0 +1,231 @@
#!/usr/bin/env python3
"""
Simple example showing how to call the discover_agents tool synchronously.
"""
import json
import asyncio
from swarms.structs.aop import AOPCluster
from swarms.tools.mcp_client_tools import execute_tool_call_simple
def call_discover_agents_sync(server_url="http://localhost:5932/mcp"):
"""
Synchronously call the discover_agents tool.
Args:
server_url: URL of the MCP server
Returns:
Dict containing the discovery results
"""
# Create the tool call request
tool_call_request = {
"type": "function",
"function": {
"name": "discover_agents",
"arguments": json.dumps({}), # Empty = get all agents
},
}
# Run the async function
return asyncio.run(
execute_tool_call_simple(
response=tool_call_request,
server_path=server_url,
output_type="dict",
)
)
def call_discover_specific_agent_sync(
agent_name, server_url="http://localhost:5932/mcp"
):
"""
Synchronously call the discover_agents tool for a specific agent.
Args:
agent_name: Name of the specific agent to discover
server_url: URL of the MCP server
Returns:
Dict containing the discovery results
"""
# Create the tool call request
tool_call_request = {
"type": "function",
"function": {
"name": "discover_agents",
"arguments": json.dumps({"agent_name": agent_name}),
},
}
# Run the async function
return asyncio.run(
execute_tool_call_simple(
response=tool_call_request,
server_path=server_url,
output_type="dict",
)
)
def main():
"""Main function demonstrating discovery tool usage."""
print("🔍 AOP Agent Discovery Tool Example")
print("=" * 40)
print()
# First, check what tools are available
print("1. Checking available MCP tools...")
aop_cluster = AOPCluster(
urls=["http://localhost:5932/mcp"],
transport="streamable-http",
)
tools = aop_cluster.get_tools(output_type="dict")
print(f" Found {len(tools)} tools")
# Check if discover_agents is available
discover_tool = aop_cluster.find_tool_by_server_name(
"discover_agents"
)
if not discover_tool:
print("❌ discover_agents tool not found!")
print(
" Make sure your AOP server is running with agents registered."
)
return
print("✅ discover_agents tool found!")
print()
# Discover all agents
    print("2. Discovering all agents...")
    result = None  # kept in scope for the specific-discovery step below
    try:
        result = call_discover_agents_sync()
if isinstance(result, list) and len(result) > 0:
discovery_data = result[0]
if discovery_data.get("success"):
agents = discovery_data.get("agents", [])
print(f" ✅ Found {len(agents)} agents:")
for i, agent in enumerate(agents, 1):
print(
f" {i}. {agent.get('agent_name', 'Unknown')}"
)
print(
f" Role: {agent.get('role', 'worker')}"
)
print(
f" Description: {agent.get('description', 'No description')}"
)
print(
f" Tags: {', '.join(agent.get('tags', []))}"
)
print(
f" Capabilities: {', '.join(agent.get('capabilities', []))}"
)
print(
f" System Prompt: {agent.get('short_system_prompt', 'No prompt')[:100]}..."
)
print()
else:
print(
f" ❌ Discovery failed: {discovery_data.get('error', 'Unknown error')}"
)
else:
print(" ❌ No valid result returned")
except Exception as e:
print(f" ❌ Error: {e}")
print()
# Example of discovering a specific agent (if any exist)
print("3. Example: Discovering a specific agent...")
try:
# Try to discover the first agent specifically
if isinstance(result, list) and len(result) > 0:
discovery_data = result[0]
if discovery_data.get("success") and discovery_data.get(
"agents"
):
first_agent_name = discovery_data["agents"][0].get(
"agent_name"
)
if first_agent_name:
print(
f" Looking for specific agent: {first_agent_name}"
)
specific_result = (
call_discover_specific_agent_sync(
first_agent_name
)
)
if (
isinstance(specific_result, list)
and len(specific_result) > 0
):
specific_data = specific_result[0]
if specific_data.get("success"):
agent = specific_data.get("agents", [{}])[
0
]
print(
f" ✅ Found specific agent: {agent.get('agent_name', 'Unknown')}"
)
print(
f" Model: {agent.get('model_name', 'Unknown')}"
)
print(
f" Max Loops: {agent.get('max_loops', 1)}"
)
print(
f" Temperature: {agent.get('temperature', 0.5)}"
)
else:
print(
f" ❌ Specific discovery failed: {specific_data.get('error')}"
)
else:
print(" ❌ No valid specific result")
else:
print(
" ⚠️ No agents found to test specific discovery"
)
else:
print(
" ⚠️ No agents available for specific discovery"
)
else:
print(
" ⚠️ No previous discovery results to use for specific discovery"
)
except Exception as e:
print(f" ❌ Error in specific discovery: {e}")
print()
print("✅ Discovery tool demonstration complete!")
print()
print("💡 Usage Summary:")
print(
" • Call discover_agents() with no arguments to get all agents"
)
print(
" • Call discover_agents(agent_name='AgentName') to get specific agent"
)
print(
" • Each agent returns: name, description, role, tags, capabilities, system prompt, etc."
)
if __name__ == "__main__":
main()

@@ -0,0 +1,198 @@
#!/usr/bin/env python3
"""
Test script to verify the agent discovery functionality works correctly.
"""
import sys
import os
# Make the repo root (which contains the swarms package) importable
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from swarms import Agent
from swarms.structs.aop import AOP
def test_agent_discovery():
"""Test the agent discovery functionality."""
print("🧪 Testing AOP Agent Discovery Functionality")
print("=" * 50)
# Create test agents
agent1 = Agent(
agent_name="TestAgent1",
agent_description="First test agent for discovery",
system_prompt="This is a test agent with a very long system prompt that should be truncated to 200 characters when returned by the discovery tool. This prompt contains detailed instructions about how the agent should behave and what tasks it can perform.",
tags=["test", "agent", "discovery"],
capabilities=["testing", "validation"],
role="tester",
)
agent2 = Agent(
agent_name="TestAgent2",
agent_description="Second test agent for discovery",
system_prompt="Another test agent with different capabilities and a shorter prompt.",
tags=["test", "agent", "analysis"],
capabilities=["analysis", "reporting"],
role="analyst",
)
# Create AOP cluster
aop = AOP(
server_name="Test Cluster",
description="Test cluster for agent discovery",
verbose=False,
)
# Add agents
aop.add_agent(agent1, tool_name="test_agent_1")
aop.add_agent(agent2, tool_name="test_agent_2")
print(f"✅ Created AOP cluster with {len(aop.agents)} agents")
print(f"📋 Available tools: {aop.list_agents()}")
print()
# Test discovery functionality
print("🔍 Testing agent discovery...")
# Test getting info for specific agent
agent1_info = aop._get_agent_discovery_info("test_agent_1")
assert (
agent1_info is not None
), "Should be able to get info for test_agent_1"
assert (
agent1_info["agent_name"] == "TestAgent1"
), "Agent name should match"
assert (
agent1_info["description"] == "First test agent for discovery"
), "Description should match"
assert (
len(agent1_info["short_system_prompt"]) <= 203
), "System prompt should be truncated to ~200 chars"
assert "test" in agent1_info["tags"], "Tags should include 'test'"
assert (
"testing" in agent1_info["capabilities"]
), "Capabilities should include 'testing'"
assert agent1_info["role"] == "tester", "Role should be 'tester'"
print("✅ Specific agent discovery test passed")
# Test getting info for non-existent agent
non_existent_info = aop._get_agent_discovery_info(
"non_existent_agent"
)
assert (
non_existent_info is None
), "Should return None for non-existent agent"
print("✅ Non-existent agent test passed")
# Test that discovery tool is registered
# Note: In a real scenario, this would be tested via MCP tool calls
# For now, we just verify the method exists and works
try:
# This simulates what the MCP tool would do
discovery_result = {"success": True, "agents": []}
for tool_name in aop.agents.keys():
agent_info = aop._get_agent_discovery_info(tool_name)
if agent_info:
discovery_result["agents"].append(agent_info)
assert (
len(discovery_result["agents"]) == 2
), "Should discover both agents"
assert (
discovery_result["success"] is True
), "Discovery should be successful"
print("✅ Discovery tool simulation test passed")
except Exception as e:
print(f"❌ Discovery tool test failed: {e}")
return False
# Test system prompt truncation
long_prompt = "A" * 300 # 300 character string
agent_with_long_prompt = Agent(
agent_name="LongPromptAgent",
agent_description="Agent with very long system prompt",
system_prompt=long_prompt,
)
aop.add_agent(
agent_with_long_prompt, tool_name="long_prompt_agent"
)
long_prompt_info = aop._get_agent_discovery_info(
"long_prompt_agent"
)
assert (
long_prompt_info is not None
), "Should get info for long prompt agent"
assert (
len(long_prompt_info["short_system_prompt"]) == 203
), "Should truncate to 200 chars + '...'"
assert long_prompt_info["short_system_prompt"].endswith(
"..."
), "Should end with '...'"
print("✅ System prompt truncation test passed")
# Test with missing attributes
minimal_agent = Agent(
agent_name="MinimalAgent",
# No description, tags, capabilities, or role specified
)
aop.add_agent(minimal_agent, tool_name="minimal_agent")
minimal_info = aop._get_agent_discovery_info("minimal_agent")
assert (
minimal_info is not None
), "Should get info for minimal agent"
assert (
minimal_info["description"] == "No description available"
), "Should have default description"
assert minimal_info["tags"] == [], "Should have empty tags list"
assert (
minimal_info["capabilities"] == []
), "Should have empty capabilities list"
assert (
minimal_info["role"] == "worker"
), "Should have default role"
print("✅ Minimal agent attributes test passed")
print()
print(
"🎉 All tests passed! Agent discovery functionality is working correctly."
)
print()
print("📊 Summary of discovered agents:")
for tool_name in aop.agents.keys():
info = aop._get_agent_discovery_info(tool_name)
if info:
print(
f"{info['agent_name']} ({info['role']}) - {info['description']}"
)
return True
if __name__ == "__main__":
try:
success = test_agent_discovery()
if success:
print("\n✅ All tests completed successfully!")
sys.exit(0)
else:
print("\n❌ Some tests failed!")
sys.exit(1)
except Exception as e:
print(f"\n💥 Test failed with exception: {e}")
import traceback
traceback.print_exc()
sys.exit(1)
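The assertions above imply a truncation rule of 200 characters plus a trailing ellipsis. A standalone sketch of that rule; the helper name is an assumption, not the actual AOP internal:

```python
def shorten_system_prompt(prompt, limit=200):
    """Truncate a system prompt to `limit` chars, appending '...' when cut."""
    if prompt is None:
        return ""
    return prompt if len(prompt) <= limit else prompt[:limit] + "..."


long_prompt = "A" * 300
short = shorten_system_prompt(long_prompt)
print(len(short), short.endswith("..."))  # 203 True
```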

@@ -0,0 +1,225 @@
#!/usr/bin/env python3
"""
Example demonstrating the new agent information tools in AOP.
This example shows how to use the new MCP tools for getting agent information.
"""
import json
import asyncio
from swarms.structs.aop import AOPCluster
from swarms.tools.mcp_client_tools import execute_tool_call_simple
async def demonstrate_new_agent_tools():
"""Demonstrate the new agent information tools."""
    # Create an AOP cluster connection (illustrative; the tool calls below
    # go straight to the server URL via execute_tool_call_simple)
    aop_cluster = AOPCluster(
urls=["http://localhost:5932/mcp"],
transport="streamable-http",
)
print("🔧 New AOP Agent Information Tools Demo")
print("=" * 50)
print()
# 1. List all agents
print("1. Listing all agents...")
try:
tool_call = {
"type": "function",
"function": {"name": "list_agents", "arguments": "{}"},
}
result = await execute_tool_call_simple(
response=tool_call,
server_path="http://localhost:5932/mcp",
output_type="dict",
verbose=False,
)
if isinstance(result, list) and len(result) > 0:
data = result[0]
if data.get("success"):
agent_names = data.get("agent_names", [])
print(
f" Found {len(agent_names)} agents: {agent_names}"
)
else:
print(f" Error: {data.get('error')}")
else:
print(" No valid result returned")
except Exception as e:
print(f" Error: {e}")
print()
# 2. Get details for a specific agent
print("2. Getting details for a specific agent...")
try:
tool_call = {
"type": "function",
"function": {
"name": "get_agent_details",
"arguments": json.dumps(
{"agent_name": "Research-Agent"}
),
},
}
result = await execute_tool_call_simple(
response=tool_call,
server_path="http://localhost:5932/mcp",
output_type="dict",
verbose=False,
)
if isinstance(result, list) and len(result) > 0:
data = result[0]
if data.get("success"):
                agent_info = data.get("agent_info", {})  # full agent config (unused below)
discovery_info = data.get("discovery_info", {})
print(
f" Agent: {discovery_info.get('agent_name', 'Unknown')}"
)
print(
f" Description: {discovery_info.get('description', 'No description')}"
)
print(
f" Model: {discovery_info.get('model_name', 'Unknown')}"
)
print(f" Tags: {discovery_info.get('tags', [])}")
print(
f" Capabilities: {discovery_info.get('capabilities', [])}"
)
else:
print(f" Error: {data.get('error')}")
else:
print(" No valid result returned")
except Exception as e:
print(f" Error: {e}")
print()
# 3. Get info for multiple agents
print("3. Getting info for multiple agents...")
try:
tool_call = {
"type": "function",
"function": {
"name": "get_agents_info",
"arguments": json.dumps(
{
"agent_names": [
"Research-Agent",
"DataAnalyst",
"Writer",
]
}
),
},
}
result = await execute_tool_call_simple(
response=tool_call,
server_path="http://localhost:5932/mcp",
output_type="dict",
verbose=False,
)
if isinstance(result, list) and len(result) > 0:
data = result[0]
if data.get("success"):
agents_info = data.get("agents_info", [])
not_found = data.get("not_found", [])
print(
f" Found {len(agents_info)} agents out of {data.get('total_requested', 0)} requested"
)
for agent in agents_info:
discovery_info = agent.get("discovery_info", {})
print(
f"{discovery_info.get('agent_name', 'Unknown')}: {discovery_info.get('description', 'No description')}"
)
if not_found:
print(f" Not found: {not_found}")
else:
print(f" Error: {data.get('error')}")
else:
print(" No valid result returned")
except Exception as e:
print(f" Error: {e}")
print()
# 4. Search for agents
print("4. Searching for agents...")
try:
tool_call = {
"type": "function",
"function": {
"name": "search_agents",
"arguments": json.dumps(
{
"query": "data",
"search_fields": [
"name",
"description",
"tags",
"capabilities",
],
}
),
},
}
result = await execute_tool_call_simple(
response=tool_call,
server_path="http://localhost:5932/mcp",
output_type="dict",
verbose=False,
)
if isinstance(result, list) and len(result) > 0:
data = result[0]
if data.get("success"):
matching_agents = data.get("matching_agents", [])
print(
f" Found {len(matching_agents)} agents matching 'data'"
)
for agent in matching_agents:
print(
f"{agent.get('agent_name', 'Unknown')}: {agent.get('description', 'No description')}"
)
print(f" Tags: {agent.get('tags', [])}")
print(
f" Capabilities: {agent.get('capabilities', [])}"
)
else:
print(f" Error: {data.get('error')}")
else:
print(" No valid result returned")
except Exception as e:
print(f" Error: {e}")
print()
print("✅ New agent tools demonstration complete!")
print()
print("💡 Available Tools:")
print(
" • discover_agents - Get discovery info for all or specific agents"
)
print(
" • get_agent_details - Get detailed info for a single agent"
)
print(
" • get_agents_info - Get detailed info for multiple agents"
)
print(" • list_agents - Get simple list of all agent names")
print(" • search_agents - Search agents by keywords")
def main():
"""Main function to run the demonstration."""
asyncio.run(demonstrate_new_agent_tools())
if __name__ == "__main__":
main()

@@ -0,0 +1,113 @@
import asyncio
import json
from typing import Dict
import requests
from swarms.structs.aop import AOPCluster
from swarms.tools.mcp_client_tools import execute_tool_call_simple
def _select_tools_by_keyword(tools: list, keyword: str) -> list:
"""
Return tools whose name or description contains the keyword
(case-insensitive).
"""
kw = keyword.lower()
selected = []
for t in tools:
name = t.get("function", {}).get("name", "")
desc = t.get("function", {}).get("description", "")
if kw in name.lower() or kw in desc.lower():
selected.append(t)
return selected
def _example_payload_from_schema(tools: list, tool_name: str) -> dict:
"""
Construct a minimal example payload for a given tool using its JSON schema.
Falls back to a generic 'task' if schema not present.
"""
for t in tools:
fn = t.get("function", {})
if fn.get("name") == tool_name:
schema = fn.get("parameters", {})
required = schema.get("required", [])
props = schema.get("properties", {})
payload = {}
for r in required:
if r in props:
if props[r].get("type") == "string":
payload[r] = (
"Example patient case: 45M, eGFR 59 mL/min/1.73 m²"
)
elif props[r].get("type") == "boolean":
payload[r] = False
else:
payload[r] = None
if not payload:
payload = {
"task": "Provide ICD-10 suggestions for the case above"
}
return payload
return {"task": "Provide ICD-10 suggestions for the case above"}
def main() -> None:
cluster = AOPCluster(
urls=["http://localhost:8000/mcp"],
transport="streamable-http",
)
tools = cluster.get_tools(output_type="dict")
print(f"Tools: {len(tools)}")
coding_tools = _select_tools_by_keyword(tools, "coder")
names = [t.get("function", {}).get("name") for t in coding_tools]
print(f"Coding-related tools: {names}")
# Build a real payload for "Medical Coder" and execute the tool call
tool_name = "Medical Coder"
payload: Dict[str, object] = _example_payload_from_schema(tools, tool_name)
# Enrich with public keyless data (epidemiology context via disease.sh)
try:
epi = requests.get(
"https://disease.sh/v3/covid-19/countries/USA?strict=true",
timeout=5,
)
if epi.ok:
data = epi.json()
epi_summary = (
f"US COVID-19 context: cases={data.get('cases')}, "
f"todayCases={data.get('todayCases')}, deaths={data.get('deaths')}"
)
base_task = payload.get("task") or ""
payload["task"] = (
f"{base_task}\n\nEpidemiology context (no key API): {epi_summary}"
)
except Exception:
pass
print("Calling tool:", tool_name)
request = {
"function": {
"name": tool_name,
"arguments": payload,
}
}
result = asyncio.run(
execute_tool_call_simple(
response=request,
server_path="http://localhost:8000/mcp",
output_type="json",
transport="streamable-http",
verbose=False,
)
)
print("Response:")
print(result)
if __name__ == "__main__":
main()

@ -0,0 +1,166 @@
# Import medical agents defined in the demo module
from examples.demos.medical.medical_coder_agent import (
    chief_medical_officer,
    internist,
    medical_coder,
    synthesizer,
    virologist,
)
from swarms.structs.aop import AOP
def _enrich_agents_metadata() -> None:
"""
Add lightweight tags/capabilities/roles to imported agents for
better discovery results.
"""
chief_medical_officer.tags = [
"coordination",
"diagnosis",
"triage",
]
chief_medical_officer.capabilities = [
"case-intake",
"differential",
"planning",
]
chief_medical_officer.role = "coordinator"
virologist.tags = ["virology", "infectious-disease"]
virologist.capabilities = ["viral-analysis", "icd10-suggestion"]
virologist.role = "specialist"
internist.tags = ["internal-medicine", "evaluation"]
internist.capabilities = [
"system-review",
"hcc-tagging",
"risk-stratification",
]
internist.role = "specialist"
medical_coder.tags = ["coding", "icd10", "compliance"]
medical_coder.capabilities = [
"code-assignment",
"documentation-review",
]
medical_coder.role = "coder"
synthesizer.tags = ["synthesis", "reporting"]
synthesizer.capabilities = [
"evidence-reconciliation",
"final-report",
]
synthesizer.role = "synthesizer"
def _medical_input_schema() -> dict:
return {
"type": "object",
"properties": {
"task": {
"type": "string",
"description": "Patient case or instruction for the agent",
},
"priority": {
"type": "string",
"enum": ["low", "normal", "high"],
"description": "Processing priority",
},
"include_images": {
"type": "boolean",
"description": "Whether to consider linked images if provided",
"default": False,
},
"img": {
"type": "string",
"description": "Optional image path/URL",
},
"imgs": {
"type": "array",
"items": {"type": "string"},
"description": "Optional list of images",
},
},
"required": ["task"],
"additionalProperties": False,
}
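To illustrate what this input schema enforces, here is a minimal, dependency-free sketch of checking a payload against the `required` and `additionalProperties` rules. The `check_payload` helper is hypothetical (a production server would use a full JSON Schema validator such as `jsonschema`), and only these two keywords are enforced:

```python
def check_payload(payload: dict, schema: dict) -> list:
    """Return a list of violation messages (empty list means valid)."""
    errors = []
    props = schema.get("properties", {})
    # Every field listed in "required" must be present.
    for field in schema.get("required", []):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    # With additionalProperties: False, unknown keys are rejected.
    if not schema.get("additionalProperties", True):
        for key in payload:
            if key not in props:
                errors.append(f"unexpected field: {key}")
    return errors


# A pared-down version of the medical input schema above.
schema = {
    "type": "object",
    "properties": {
        "task": {"type": "string"},
        "priority": {"type": "string"},
    },
    "required": ["task"],
    "additionalProperties": False,
}

print(check_payload({"task": "Suggest ICD-10 codes"}, schema))  # []
print(check_payload({"priority": "high", "foo": 1}, schema))
```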
def _medical_output_schema() -> dict:
return {
"type": "object",
"properties": {
"result": {"type": "string"},
"success": {"type": "boolean"},
"error": {"type": "string"},
"confidence": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Optional confidence in the assessment",
},
"codes": {
"type": "array",
"items": {"type": "string"},
"description": "Optional list of suggested ICD-10 codes",
},
},
"required": ["result", "success"],
"additionalProperties": True,
}
def main() -> None:
"""
Start an AOP MCP server that exposes the medical agents as tools with
structured schemas and per-agent settings.
"""
_enrich_agents_metadata()
deployer = AOP(
server_name="Medical-AOP-Server",
port=8000,
verbose=False,
traceback_enabled=True,
log_level="INFO",
transport="streamable-http",
)
input_schema = _medical_input_schema()
output_schema = _medical_output_schema()
# Register each agent with a modest, role-appropriate timeout
deployer.add_agent(
chief_medical_officer,
timeout=45,
input_schema=input_schema,
output_schema=output_schema,
)
deployer.add_agent(
virologist,
timeout=40,
input_schema=input_schema,
output_schema=output_schema,
)
deployer.add_agent(
internist,
timeout=40,
input_schema=input_schema,
output_schema=output_schema,
)
deployer.add_agent(
medical_coder,
timeout=50,
input_schema=input_schema,
output_schema=output_schema,
)
deployer.add_agent(
synthesizer,
timeout=45,
input_schema=input_schema,
output_schema=output_schema,
)
deployer.run()
if __name__ == "__main__":
main()

@@ -90,7 +90,7 @@ financial_agent = Agent(
 )
 
 # Basic usage - individual agent addition
-deployer = AOP("MyAgentServer", verbose=True)
+deployer = AOP("MyAgentServer", verbose=True, port=5932)
 
 agents = [
     research_agent,
@@ -102,17 +102,5 @@ agents = [
 
 deployer.add_agents_batch(agents)
 
-# Example usage with different parameters
-# The tools now accept: task, img, imgs, correct_answer parameters
-# task: str (required) - The main task or prompt
-# img: str (optional) - Single image to process
-# imgs: List[str] (optional) - Multiple images to process
-# correct_answer: str (optional) - Correct answer for validation
-
-# Example calls that would be made to the MCP tools:
-# research_tool(task="Research the latest AI trends")
-# analysis_tool(task="Analyze this data", img="path/to/image.jpg")
-# writing_tool(task="Write a blog post", imgs=["img1.jpg", "img2.jpg"])
-# code_tool(task="Debug this code", correct_answer="expected_output")
-
 deployer.run()

@ -0,0 +1,73 @@
from swarms import Agent, AgentRearrange
# Create specialized quantitative research agents
weather_data_agent = Agent(
agent_name="Weather-Data-Agent",
agent_description="Expert in weather data collection, agricultural commodity research, and meteorological analysis",
model_name="claude-sonnet-4-20250514",
max_loops=1,
system_prompt="""You are a quantitative weather data research specialist. Your role is to:
1. Collect and analyze weather data from multiple sources (NOAA, Weather APIs, satellite data)
2. Research agricultural commodity markets and their weather dependencies
3. Identify weather patterns that historically impact crop yields and commodity prices
4. Gather data on seasonal weather trends, precipitation patterns, temperature anomalies
5. Research specific regions and their agricultural production cycles
6. Collect data on extreme weather events and their market impact
7. Analyze historical correlations between weather data and commodity price movements
Focus on actionable weather intelligence for trading opportunities. Always provide specific data points,
timeframes, and geographic regions. Include confidence levels and data quality assessments.""",
)
quant_analysis_agent = Agent(
agent_name="Quant-Analysis-Agent",
agent_description="Expert in quantitative analysis of weather patterns, arbitrage opportunities, and statistical modeling",
model_name="claude-sonnet-4-20250514",
max_loops=1,
system_prompt="""You are a quantitative analysis specialist focused on weather-driven arbitrage opportunities. Your role is to:
1. Analyze weather data correlations with commodity price movements
2. Identify statistical arbitrage opportunities in agricultural futures markets
3. Calculate risk-adjusted returns for weather-based trading strategies
4. Model price impact scenarios based on weather forecasts
5. Identify seasonal patterns and mean reversion opportunities
6. Analyze basis risk and correlation breakdowns between weather and prices
7. Calculate optimal position sizes and hedging ratios
8. Assess market inefficiencies in weather-sensitive commodities
Focus on actionable trading signals with specific entry/exit criteria, risk metrics, and expected returns.
Always provide quantitative justification and statistical confidence levels.""",
)
trading_strategy_agent = Agent(
agent_name="Trading-Strategy-Agent",
agent_description="Expert in trading strategy development, risk assessment, and portfolio management for weather-driven arbitrage",
model_name="claude-sonnet-4-20250514",
max_loops=1,
system_prompt="""You are a quantitative trading strategy specialist focused on weather-driven arbitrage opportunities. Your role is to:
1. Develop comprehensive trading strategies based on weather data and commodity analysis
2. Create detailed risk management frameworks for weather-sensitive positions
3. Design portfolio allocation strategies for agricultural commodity arbitrage
4. Develop hedging strategies to mitigate weather-related risks
5. Create position sizing models based on volatility and correlation analysis
6. Design entry and exit criteria for weather-based trades
7. Develop contingency plans for unexpected weather events
8. Create performance monitoring and evaluation frameworks
Focus on practical, implementable trading strategies with clear risk parameters,
position management rules, and performance metrics. Always include specific trade setups,
risk limits, and monitoring protocols.""",
)
rearrange_system = AgentRearrange(
agents=[
weather_data_agent,
quant_analysis_agent,
trading_strategy_agent,
],
flow=f"{trading_strategy_agent.agent_name} -> {quant_analysis_agent.agent_name}, {weather_data_agent.agent_name}",
max_loops=1,
)
rearrange_system.run(
"What are the best weather trades for the rest of the year 2025? Can we short wheat futures, corn futures, soybean futures, etc.?"
)

@ -0,0 +1,17 @@
import json
from swarms import AutoSwarmBuilder
swarm = AutoSwarmBuilder(
name="My Swarm",
description="A swarm of agents",
verbose=True,
max_loops=1,
model_name="gpt-4o-mini",
execution_type="return-agents",
)
out = swarm.run(
task="Create an accounting team to analyze crypto transactions, there must be 5 agents in the team with extremely extensive prompts. Make the prompts extremely detailed and specific and long and comprehensive. Make sure to include all the details of the task in the prompts."
)
print(json.dumps(out, indent=4))

@ -0,0 +1,19 @@
from swarms import Agent
# Initialize the agent
agent = Agent(
agent_name="Quantitative-Trading-Agent",
agent_description="Advanced quantitative trading and algorithmic analysis agent",
model_name="claude-sonnet-4-2025051eqfewfwmfkmekef",  # intentionally invalid model name so the fallback_models below are exercised
dynamic_temperature_enabled=True,
max_loops=1,
dynamic_context_window=True,
streaming_on=True,
fallback_models=["gpt-4o-mini", "anthropic/claude-sonnet-4-5"],
)
out = agent.run(
task="What are the top five best energy stocks across nuclear, solar, gas, and other energy sources?",
)
print(out)

@ -0,0 +1,106 @@
from swarms import Agent
from swarms.structs.aop import (
AOP,
)
# Create specialized agents
research_agent = Agent(
agent_name="Research-Agent",
agent_description="Expert in research, data collection, and information gathering",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a research specialist. Your role is to:
1. Gather comprehensive information on any given topic
2. Analyze data from multiple sources
3. Provide well-structured research findings
4. Cite sources and maintain accuracy
5. Present findings in a clear, organized manner
Always provide detailed, factual information with proper context.""",
)
analysis_agent = Agent(
agent_name="Analysis-Agent",
agent_description="Expert in data analysis, pattern recognition, and generating insights",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are an analysis specialist. Your role is to:
1. Analyze data and identify patterns
2. Generate actionable insights
3. Create visualizations and summaries
4. Provide statistical analysis
5. Make data-driven recommendations
Focus on extracting meaningful insights from information.""",
)
writing_agent = Agent(
agent_name="Writing-Agent",
agent_description="Expert in content creation, editing, and communication",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a writing specialist. Your role is to:
1. Create engaging, well-structured content
2. Edit and improve existing text
3. Adapt tone and style for different audiences
4. Ensure clarity and coherence
5. Follow best practices in writing
Always produce high-quality, professional content.""",
)
code_agent = Agent(
agent_name="Code-Agent",
agent_description="Expert in programming, code review, and software development",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a coding specialist. Your role is to:
1. Write clean, efficient code
2. Debug and fix issues
3. Review and optimize code
4. Explain programming concepts
5. Follow best practices and standards
Always provide working, well-documented code.""",
)
financial_agent = Agent(
agent_name="Financial-Agent",
agent_description="Expert in financial analysis, market research, and investment insights",
model_name="anthropic/claude-sonnet-4-5",
max_loops=1,
top_p=None,
dynamic_temperature_enabled=True,
system_prompt="""You are a financial specialist. Your role is to:
1. Analyze financial data and markets
2. Provide investment insights
3. Assess risk and opportunities
4. Create financial reports
5. Explain complex financial concepts
Always provide accurate, well-reasoned financial analysis.""",
)
# Basic usage - individual agent addition
deployer = AOP("MyAgentServer", verbose=True, port=5932)
agents = [
research_agent,
analysis_agent,
writing_agent,
code_agent,
financial_agent,
]
deployer.add_agents_batch(agents)
deployer.run()

@@ -1,5 +1,4 @@
 import json
-import os
 from contextlib import suppress
 from typing import Any, Callable, Dict, Optional, Type, Union

@@ -1,4 +1,3 @@
-import os
 from dotenv import load_dotenv
 
 # Swarm imports

@@ -1,4 +1,3 @@
-import os
 from dotenv import load_dotenv
 
 # Swarm imports

@@ -1,45 +1,11 @@
-#!/usr/bin/env python3
-"""
-Demo script for the Agent Map Simulation.
-
-This script demonstrates how to set up and run a simulation where multiple AI agents
-move around a 2D map and automatically engage in conversations when they come into
-proximity with each other.
-
-NEW: Task-based simulation support! You can now specify what the agents should discuss:
-
-    # Create simulation
-    simulation = AgentMapSimulation(map_width=50, map_height=50)
-
-    # Add your agents
-    simulation.add_agent(my_agent1)
-    simulation.add_agent(my_agent2)
-
-    # Run with a specific task
-    results = simulation.run(
-        task="Discuss the impact of AI on financial markets",
-        duration=300,  # 5 minutes
-        with_visualization=True
-    )
-
-Features demonstrated:
-- Creating agents with different specializations
-- Setting up the simulation environment
-- Running task-focused conversations
-- Live visualization
-- Monitoring conversation activity
-- Saving conversation summaries
-
-Run this script to see agents moving around and discussing specific topics!
-"""
 import time
 from typing import List
 
 from swarms import Agent
 
-# Remove the formal collaboration prompt import
-from simulations.agent_map_simulation import AgentMapSimulation
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
+    AgentMapSimulation,
+)
 
 # Create a natural conversation prompt for the simulation
 NATURAL_CONVERSATION_PROMPT = """

@@ -7,8 +7,12 @@ what topic the agents should discuss when they meet.
 """
 
 from swarms import Agent
 
-from simulations.agent_map_simulation import AgentMapSimulation
-from simulations.v0.demo_simulation import NATURAL_CONVERSATION_PROMPT
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
+    AgentMapSimulation,
+)
+from examples.multi_agent.simulations.agent_map.v0.demo_simulation import (
+    NATURAL_CONVERSATION_PROMPT,
+)
 
 
 def create_simple_agent(name: str, expertise: str) -> Agent:

@@ -19,7 +19,9 @@ CASE: 34-year-old female with sudden severe headache
 from typing import List
 
 from swarms import Agent
 
-from simulations.agent_map_simulation import AgentMapSimulation
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
+    AgentMapSimulation,
+)
 
 
 def create_medical_agent(

@@ -13,7 +13,7 @@ Run this to see agents naturally forming groups and having multi-party conversat
 from swarms import Agent
 
-from simulations.agent_map_simulation import (
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
     AgentMapSimulation,
     Position,
 )

@@ -8,7 +8,7 @@ that all components work correctly without requiring a GUI.
 import time
 
 from swarms import Agent
 
-from simulations.agent_map_simulation import (
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
     AgentMapSimulation,
     Position,
 )

@ -1,122 +0,0 @@
"""
Example demonstrating the use of uvloop for running multiple agents concurrently.
This example shows how to use the new uvloop-based functions:
- run_agents_concurrently_uvloop: For running multiple agents with the same task
- run_agents_with_tasks_uvloop: For running agents with different tasks
uvloop provides significant performance improvements over standard asyncio,
especially for I/O-bound operations and concurrent task execution.
"""
import os
from swarms.structs.multi_agent_exec import (
run_agents_concurrently_uvloop,
run_agents_with_tasks_uvloop,
)
from swarms.structs.agent import Agent
def create_example_agents(num_agents: int = 3):
"""Create example agents for demonstration."""
agents = []
for i in range(num_agents):
agent = Agent(
agent_name=f"Agent_{i+1}",
system_prompt=f"You are Agent {i+1}, a helpful AI assistant.",
model_name="gpt-4o-mini", # Using a lightweight model for examples
max_loops=1,
autosave=False,
verbose=False,
)
agents.append(agent)
return agents
def example_same_task():
"""Example: Running multiple agents with the same task using uvloop."""
print("=== Example 1: Same Task for All Agents (uvloop) ===")
agents = create_example_agents(3)
task = (
"Write a one-sentence summary about artificial intelligence."
)
print(f"Running {len(agents)} agents with the same task...")
print(f"Task: {task}")
try:
results = run_agents_concurrently_uvloop(agents, task)
print("\nResults:")
for i, result in enumerate(results, 1):
print(f"Agent {i}: {result}")
except Exception as e:
print(f"Error: {e}")
def example_different_tasks():
"""Example: Running agents with different tasks using uvloop."""
print(
"\n=== Example 2: Different Tasks for Each Agent (uvloop) ==="
)
agents = create_example_agents(3)
tasks = [
"Explain what machine learning is in simple terms.",
"Describe the benefits of cloud computing.",
"What are the main challenges in natural language processing?",
]
print(f"Running {len(agents)} agents with different tasks...")
try:
results = run_agents_with_tasks_uvloop(agents, tasks)
print("\nResults:")
for i, (result, task) in enumerate(zip(results, tasks), 1):
print(f"Agent {i} (Task: {task[:50]}...):")
print(f" Response: {result}")
print()
except Exception as e:
print(f"Error: {e}")
def performance_comparison():
"""Demonstrate the performance benefit of uvloop vs standard asyncio."""
print("\n=== Performance Comparison ===")
# Note: This is a conceptual example. In practice, you'd need to measure actual performance
print("uvloop vs Standard asyncio:")
print("• uvloop: Cython-based event loop, ~2-4x faster")
print("• Better for I/O-bound operations")
print("• Lower latency and higher throughput")
print("• Especially beneficial for concurrent agent execution")
print("• Automatic fallback to asyncio if uvloop unavailable")
if __name__ == "__main__":
# Check if API key is available
if not os.getenv("OPENAI_API_KEY"):
print(
"Please set your OPENAI_API_KEY environment variable to run this example."
)
print("Example: export OPENAI_API_KEY='your-api-key-here'")
exit(1)
print("🚀 uvloop Multi-Agent Execution Examples")
print("=" * 50)
# Run examples
example_same_task()
example_different_tasks()
performance_comparison()
print("\n✅ Examples completed!")
print("\nTo use uvloop functions in your code:")
print(
"from swarms.structs.multi_agent_exec import run_agents_concurrently_uvloop"
)
print("results = run_agents_concurrently_uvloop(agents, task)")

@ -0,0 +1,30 @@
from swarms.structs.agent import Agent
from swarms.structs.multi_agent_exec import (
run_agents_concurrently_uvloop,
)
def create_example_agents(num_agents: int = 3):
"""Create example agents for demonstration."""
agents = []
for i in range(num_agents):
agent = Agent(
agent_name=f"Agent_{i+1}",
system_prompt=f"You are Agent {i+1}, a helpful AI assistant.",
model_name="gpt-4o-mini", # Using a lightweight model for examples
max_loops=1,
autosave=False,
verbose=False,
)
agents.append(agent)
return agents
agents = create_example_agents(3)
task = "Write a one-sentence summary about artificial intelligence."
results = run_agents_concurrently_uvloop(agents, task)
print(results)

@@ -5,10 +5,10 @@ build-backend = "poetry.core.masonry.api"
 
 [tool.poetry]
 name = "swarms"
-version = "8.4.0"
+version = "8.4.1"
 description = "Swarms - TGSC"
 license = "MIT"
-authors = ["Kye Gomez <kye@apac.ai>"]
+authors = ["Kye Gomez <kye@swarms.world>"]
 homepage = "https://github.com/kyegomez/swarms"
 documentation = "https://docs.swarms.world"
 readme = "README.md"
@@ -67,7 +67,6 @@ tenacity = "*"
 psutil = "*"
 python-dotenv = "*"
 PyYAML = "*"
-docstring_parser = "0.16" # TODO:
 networkx = "*"
 aiofiles = "*"
 rich = "*"
@@ -78,7 +77,8 @@ mcp = "*"
 aiohttp = "*"
 orjson = "*"
 schedule = "*"
-uvloop = {version = "*", markers = "sys_platform != 'win32'"}
+uvloop = {version = "*", markers = "sys_platform == 'linux' or sys_platform == 'darwin'"}
+winloop = {version = "*", markers = "sys_platform == 'win32'"}
 
 [tool.poetry.scripts]
 swarms = "swarms.cli.main:main"

@@ -9,7 +9,6 @@ rich
 psutil
 python-dotenv
 PyYAML
-docstring_parser==0.16
 black
 ruff
 types-toml>=0.10.8.1
@@ -26,4 +25,5 @@ mcp
 numpy
 orjson
 schedule
-uvloop
+uvloop; sys_platform == 'linux' or sys_platform == 'darwin' # linux or macos only
+winloop; sys_platform == 'win32' # windows only
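The platform markers above install uvloop on Linux/macOS and winloop on Windows. A hedged sketch of how an application might install whichever accelerator is available at runtime, falling back to stdlib asyncio when neither is present (the helper name is illustrative, not part of the swarms API):

```python
import asyncio
import sys


def install_fast_event_loop() -> str:
    """Install the fastest available event loop for this platform.

    Mirrors the dependency markers: uvloop on Linux/macOS, winloop on
    Windows, stdlib asyncio otherwise. Returns the name of the loop used.
    """
    try:
        if sys.platform == "win32":
            import winloop  # Windows drop-in replacement for uvloop

            winloop.install()
            return "winloop"
        import uvloop  # Cython-based event loop for Linux/macOS

        uvloop.install()
        return "uvloop"
    except ImportError:
        # Neither accelerator is installed; stdlib asyncio still works.
        return "asyncio"


loop_impl = install_fast_event_loop()
print(loop_impl)
asyncio.run(asyncio.sleep(0))  # the selected loop runs coroutines as usual
```

The try/except keeps the import optional, so the same code runs unchanged on machines where only plain asyncio is available.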

@@ -1,6 +1,7 @@
 from swarms.structs.agent import Agent
 from swarms.structs.agent_loader import AgentLoader
 from swarms.structs.agent_rearrange import AgentRearrange, rearrange
+from swarms.structs.aop import AOP
 from swarms.structs.auto_swarm_builder import AutoSwarmBuilder
 from swarms.structs.base_structure import BaseStructure
 from swarms.structs.base_swarm import BaseSwarm
@@ -184,4 +185,5 @@ __all__ = [
     "check_end",
     "AgentLoader",
     "BatchedGridWorkflow",
+    "AOP",
 ]

@@ -2406,12 +2406,14 @@ class Agent:
             Dict[str, Any]: A dictionary representation of the class attributes.
         """
-        # Remove the llm object from the dictionary
-        self.__dict__.pop("llm", None)
+        # Create a copy of the dict to avoid mutating the original object
+        # Remove the llm object from the copy since it's not serializable
+        dict_copy = self.__dict__.copy()
+        dict_copy.pop("llm", None)
 
         return {
             attr_name: self._serialize_attr(attr_name, attr_value)
-            for attr_name, attr_value in self.__dict__.items()
+            for attr_name, attr_value in dict_copy.items()
         }
 
     def to_json(self, indent: int = 4, *args, **kwargs):
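The hunk above replaces an in-place `pop` on `self.__dict__` with a pop on a copy. A standalone sketch of why that matters, using toy classes (the names `Broken`/`Fixed` are illustrative, not swarms code): popping from the live `__dict__` permanently strips the attribute from the object, so serializing an agent once would silently delete its LLM client.

```python
class Broken:
    """Pre-patch behavior: to_dict mutates the instance."""

    def __init__(self):
        self.llm = object()  # stand-in for the non-serializable client
        self.name = "agent"

    def to_dict(self):
        self.__dict__.pop("llm", None)  # mutates the live object!
        return dict(self.__dict__)


class Fixed:
    """Post-patch behavior: serialize from a copy."""

    def __init__(self):
        self.llm = object()
        self.name = "agent"

    def to_dict(self):
        dict_copy = self.__dict__.copy()  # leave the instance untouched
        dict_copy.pop("llm", None)
        return dict_copy


b, f = Broken(), Fixed()
b.to_dict()
f.to_dict()
print(hasattr(b, "llm"), hasattr(f, "llm"))  # False True
```

After one serialization the broken object has lost its `llm` attribute for good, while the fixed one still works.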

File diff suppressed because it is too large

@@ -1,6 +1,6 @@
 import json
 import traceback
-from typing import Any, List, Optional, Tuple
+from typing import Any, List, Optional
 
 from dotenv import load_dotenv
 from loguru import logger
@@ -14,6 +14,13 @@ from swarms.utils.litellm_wrapper import LiteLLM
 
 load_dotenv()
 
+execution_types = [
+    "return-agents",
+    "execute-swarm-router",
+    "return-swarm-router-config",
+    "return-agents-objects",
+]
+
 BOSS_SYSTEM_PROMPT = """
 You are an expert multi-agent architecture designer and team coordinator. Your role is to create and orchestrate sophisticated teams of specialized AI agents, each with distinct personalities, roles, and capabilities. Your primary goal is to ensure the multi-agent system operates efficiently while maintaining clear communication, well-defined responsibilities, and optimal task distribution.
@@ -143,23 +150,32 @@ class AgentSpec(BaseModel):
         description="The initial instruction or context provided to the agent, guiding its behavior and responses during execution.",
     )
     model_name: Optional[str] = Field(
-        description="The name of the AI model that the agent will utilize for processing tasks and generating outputs. For example: gpt-4o, gpt-4o-mini, openai/o3-mini"
+        "gpt-4.1",
+        description="The name of the AI model that the agent will utilize for processing tasks and generating outputs. For example: gpt-4o, gpt-4o-mini, openai/o3-mini",
     )
     auto_generate_prompt: Optional[bool] = Field(
-        description="A flag indicating whether the agent should automatically create prompts based on the task requirements."
+        False,
+        description="A flag indicating whether the agent should automatically create prompts based on the task requirements.",
     )
     max_tokens: Optional[int] = Field(
-        None,
+        8192,
         description="The maximum number of tokens that the agent is allowed to generate in its responses, limiting output length.",
     )
     temperature: Optional[float] = Field(
-        description="A parameter that controls the randomness of the agent's output; lower values result in more deterministic responses."
+        0.5,
+        description="A parameter that controls the randomness of the agent's output; lower values result in more deterministic responses.",
     )
     role: Optional[str] = Field(
-        description="The designated role of the agent within the swarm, which influences its behavior and interaction with other agents."
+        "worker",
+        description="The designated role of the agent within the swarm, which influences its behavior and interaction with other agents.",
     )
     max_loops: Optional[int] = Field(
-        description="The maximum number of times the agent is allowed to repeat its task, enabling iterative processing if necessary."
+        1,
+        description="The maximum number of times the agent is allowed to repeat its task, enabling iterative processing if necessary.",
+    )
+    goal: Optional[str] = Field(
+        None,
+        description="The primary objective or desired outcome the agent is tasked with achieving.",
     )
@ -171,57 +187,10 @@ class Agents(BaseModel):
) )
execution_types = [
"return-agents",
"execute-swarm-router",
"return-swarm-router-config",
"return-agent-configurations",
"return-agent-specs",
"return-agent-dictionary",
]
class AgentConfig(BaseModel):
"""Configuration for an individual agent in a swarm"""
name: str = Field(
description="The name of the agent. This should be a unique identifier that distinguishes this agent from others within the swarm. The name should reflect the agent's primary function, role, or area of expertise, and should be easily recognizable by both humans and other agents in the system. A well-chosen name helps clarify the agent's responsibilities and facilitates effective communication and collaboration within the swarm.",
)
description: str = Field(
description=(
"A comprehensive description of the agent's purpose, core responsibilities, and capabilities within the swarm. One sentence is enough."
),
)
system_prompt: str = Field(
description=(
"The system prompt that defines the agent's behavior. This prompt should be extremely long, comprehensive, and extensive, encapsulating the agent's identity, operational guidelines, and decision-making framework in great detail. It provides the foundational instructions that guide the agent's actions, communication style, and interaction protocols with both users and other agents. The system prompt should be highly detailed, unambiguous, and exhaustive, ensuring the agent consistently acts in accordance with its intended role and adheres to the swarm's standards and best practices. The prompt should leave no ambiguity and cover all relevant aspects of the agent's responsibilities, behaviors, and expected outcomes."
),
)
goal: str = Field(
description="The goal of the agent. This should clearly state the primary objective or desired outcome the agent is tasked with achieving. The goal should be specific, measurable, and aligned with the overall mission of the swarm. It serves as the guiding principle for the agent's actions and decision-making processes, helping to maintain focus and drive effective collaboration within the multi-agent system.",
)
model_name: str = Field(
description="The model to use for the agent. This is the model that will be used to generate the agent's responses. For example, 'gpt-4o-mini' or 'claude-sonnet-3.7-sonnet-20240620'."
)
temperature: float = Field(
description="The temperature to use for the agent. This controls the randomness of the agent's responses. For example, 0.5 or 1.0."
)
max_loops: int = Field(
description="The maximum number of loops for the agent to run. This is the maximum number of times the agent will run its loop. For example, 1, 2, or 3. Keep this set to 1 unless the agent requires more than one loop to complete its task.",
)
class Config:
arbitrary_types_allowed = True
class AgentsConfig(BaseModel):
    """Configuration for a list of agents in a swarm"""

    agents: List[AgentSpec] = Field(
        description="A list of agent configurations",
    )
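The `AgentsConfig` wrapper simply validates a `{"agents": [...]}` payload into a list of typed specs. A minimal, self-contained sketch of that parsing step, using stdlib dataclasses instead of the Pydantic models above (the field set shown is a reduced, illustrative subset of `AgentSpec`):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentSpecSketch:
    # Reduced subset of the AgentSpec fields, for illustration only
    agent_name: str
    description: str
    system_prompt: str
    model_name: str = "gpt-4o-mini"
    temperature: float = 0.5
    max_loops: int = 1


@dataclass
class AgentsConfigSketch:
    agents: List[AgentSpecSketch] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: dict) -> "AgentsConfigSketch":
        # Mirrors how a {"agents": [...]} payload is parsed into typed specs
        return cls(
            agents=[AgentSpecSketch(**a) for a in data.get("agents", [])]
        )


config = AgentsConfigSketch.from_dict(
    {
        "agents": [
            {
                "agent_name": "researcher",
                "description": "Gathers sources",
                "system_prompt": "You research topics.",
            }
        ]
    }
)
print(config.agents[0].model_name)  # -> gpt-4o-mini
```

Unspecified fields fall back to their defaults, exactly as with the Pydantic model's optional fields.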
@@ -233,7 +202,7 @@ class SwarmRouterConfig(BaseModel):
    description: str = Field(
        description="Description of the team of agents"
    )
    agents: List[AgentSpec] = Field(
        description="A list of agent configurations",
    )
    swarm_type: SwarmType = Field(
@@ -245,7 +214,10 @@ class SwarmRouterConfig(BaseModel):
    rules: Optional[str] = Field(
        description="Rules to inject into every agent. This is a string of rules that will be injected into every agent's system prompt. This is a good place to put things like 'You are a helpful assistant' or 'You are a helpful assistant that can answer questions and help with tasks'."
    )
    multi_agent_collab_prompt: Optional[str] = Field(
        None,
        description="Prompt for multi-agent collaboration and coordination.",
    )
    task: str = Field(
        description="The task to be executed by the swarm",
    )
@@ -271,7 +243,6 @@ class AutoSwarmBuilder:
        interactive (bool): Whether to enable interactive mode. Defaults to False.
        max_tokens (int): Maximum tokens for the LLM responses. Defaults to 8000.
        execution_type (str): Type of execution to perform. Defaults to "return-agents".
        system_prompt (str): System prompt for the boss agent. Defaults to BOSS_SYSTEM_PROMPT.
    """
@@ -286,8 +257,8 @@ class AutoSwarmBuilder:
        interactive: bool = False,
        max_tokens: int = 8000,
        execution_type: execution_types = "return-agents",
        system_prompt: str = BOSS_SYSTEM_PROMPT,
        additional_llm_args: dict = {},
    ):
        """Initialize the AutoSwarmBuilder.
@@ -313,15 +284,19 @@ class AutoSwarmBuilder:
        self.interactive = interactive
        self.max_tokens = max_tokens
        self.execution_type = execution_type
        self.system_prompt = system_prompt
        self.additional_llm_args = additional_llm_args
        self.conversation = Conversation()
        self.agents_pool = []
        self.reliability_check()

    def reliability_check(self):
        """Perform reliability checks on the AutoSwarmBuilder configuration.

        Raises:
            ValueError: If max_loops is set to 0
        """
        if self.max_loops == 0:
            raise ValueError(
                f"AutoSwarmBuilder: {self.name} max_loops cannot be 0"
@@ -332,9 +307,20 @@ class AutoSwarmBuilder:
            )

    def _execute_task(self, task: str):
        """Execute a task by creating agents and initializing the swarm router.

        Args:
            task (str): The task to execute

        Returns:
            Any: The result of the swarm router execution
        """
        logger.info(f"Executing task: {task}")
        agents_dict = self.create_agents(task)

        # Convert the specification dictionary into Agent objects for execution
        agents = self.create_agents_from_specs(agents_dict)

        if self.execution_type == "return-agents":
            logger.info("Setting random models for agents")
@@ -342,50 +328,20 @@ class AutoSwarmBuilder:
        return self.initialize_swarm_router(agents=agents, task=task)

    def dict_to_agent(self, output: dict):
        """Convert dictionary output to Agent objects.

        Args:
            output (dict): Dictionary containing agent configurations

        Returns:
            List[Agent]: List of created Agent objects
        """
        agents = []
        if isinstance(output, dict):
            for agent_config in output["agents"]:
                logger.info(f"Building agent: {agent_config['name']}")
                agent = Agent(**agent_config)
                agents.append(agent)
                logger.info(
                    f"Successfully built agent: {agent_config['name']}"
@@ -417,12 +373,21 @@ class AutoSwarmBuilder:
            raise e

    def build_llm_agent(self, config: BaseModel):
        """Build a LiteLLM agent with the specified configuration.

        Args:
            config (BaseModel): Pydantic model configuration for the LLM

        Returns:
            LiteLLM: Configured LiteLLM instance
        """
        return LiteLLM(
            model_name=self.model_name,
            system_prompt=BOSS_SYSTEM_PROMPT,
            temperature=0.5,
            response_format=config,
            max_tokens=self.max_tokens,
            **self.additional_llm_args,
        )

    def create_agents(self, task: str):
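The new `**self.additional_llm_args` line forwards caller-supplied keyword arguments into the LiteLLM constructor via dict unpacking. A small, hypothetical sketch of that merge semantics (the helper name and defaults are illustrative, not part of the library):

```python
def build_llm_kwargs(
    model_name: str, max_tokens: int, additional_llm_args: dict
) -> dict:
    # Fixed defaults first, caller extras second; later keys win on conflict,
    # matching Python's dict-unpacking order in f(**base, **extras)-style calls
    base = {
        "model_name": model_name,
        "temperature": 0.5,
        "max_tokens": max_tokens,
    }
    return {**base, **additional_llm_args}


kwargs = build_llm_kwargs("gpt-4o-mini", 8000, {"top_p": 0.9, "temperature": 0.2})
print(kwargs["temperature"])  # -> 0.2 (caller extras override defaults)
```

Note that in the actual `build_llm_agent` call, a duplicate keyword (e.g. passing `temperature` through `additional_llm_args`) would instead raise a `TypeError`, since `temperature=0.5` is already given explicitly.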
@@ -432,21 +397,18 @@ class AutoSwarmBuilder:
            task (str): The task to create agents for

        Returns:
            dict: Dictionary containing agent specifications

        Raises:
            Exception: If there's an error during agent creation
        """
        try:
            logger.info("Creating agents from specifications")
            model = self.build_llm_agent(config=Agents)

            agents_dictionary = model.run(task)

            return agents_dictionary

        except Exception as e:
            logger.error(
@@ -455,43 +417,6 @@ class AutoSwarmBuilder:
            )
            raise e

    def initialize_swarm_router(self, agents: List[Agent], task: str):
        """Initialize and run the swarm router.
@@ -552,75 +477,8 @@ class AutoSwarmBuilder:
        Raises:
            Exception: If there's an error during batch execution
        """
        return [self.run(task) for task in tasks]

    def create_agents_from_specs(
        self, agents_dictionary: Any
    ) -> List[Agent]:
@@ -631,78 +489,85 @@ class AutoSwarmBuilder:
        Returns:
            List[Agent]: List of created agents

        Notes:
            - Handles both dict and Pydantic AgentSpec inputs
            - Maps 'description' field to 'agent_description' for Agent compatibility
        """
        # Create agents from config
        agents = []

        # Handle both dict and object formats
        if isinstance(agents_dictionary, dict):
            agents_list = agents_dictionary.get("agents", [])
        else:
            agents_list = agents_dictionary.agents

        for agent_config in agents_list:
            # Convert dict to AgentSpec if needed
            if isinstance(agent_config, dict):
                agent_config = AgentSpec(**agent_config)

            # Convert Pydantic model to dict for Agent initialization
            if isinstance(agent_config, BaseModel):
                agent_data = agent_config.model_dump()
            else:
                agent_data = agent_config

            # Handle parameter name mapping: description -> agent_description
            if (
                "description" in agent_data
                and "agent_description" not in agent_data
            ):
                agent_data["agent_description"] = agent_data.pop(
                    "description"
                )

            # Create agent from processed data
            agent = Agent(**agent_data)
            agents.append(agent)

        return agents

    def list_types(self):
        """List all available execution types.

        Returns:
            List[str]: List of available execution types
        """
        return execution_types

    def run(self, task: str, *args, **kwargs):
        """Run the swarm on a given task.

        Args:
            task (str): The task to execute
            *args: Additional positional arguments
            **kwargs: Additional keyword arguments

        Returns:
            Any: The result of the swarm execution

        Raises:
            Exception: If there's an error during execution
        """
        try:
            if self.execution_type == "return-agents":
                return self.create_agents(task)
            elif self.execution_type == "return-swarm-router-config":
                return self.create_router_config(task)
            elif self.execution_type == "return-agents-objects":
                agents = self.create_agents(task)
                return self.create_agents_from_specs(agents)
            else:
                raise ValueError(
                    f"Invalid execution type: {self.execution_type}"
                )

        except Exception as e:
            logger.error(
                f"AutoSwarmBuilder: Error in swarm execution: {str(e)} Traceback: {traceback.format_exc()}",
                exc_info=True,
            )
            raise e
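The `description` to `agent_description` key remap in `create_agents_from_specs` can be sketched in isolation. This hypothetical helper shows only the renaming logic, without the Pydantic round-trip:

```python
def normalize_agent_spec(spec: dict) -> dict:
    # Work on a copy so the caller's dict is not mutated
    data = dict(spec)
    # Agent expects 'agent_description'; LLM-produced specs use 'description'.
    # Only remap when the target key is absent, so explicit values win.
    if "description" in data and "agent_description" not in data:
        data["agent_description"] = data.pop("description")
    return data


spec = {"agent_name": "analyst", "description": "Runs statistics"}
print(normalize_agent_spec(spec))
# -> {'agent_name': 'analyst', 'agent_description': 'Runs statistics'}
```

Guarding on `"agent_description" not in data` keeps the mapping idempotent: normalizing an already-normalized spec is a no-op.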

@@ -17,6 +17,7 @@ from rich.progress import (
    TimeElapsedColumn,
)
from rich.table import Table

from swarms.structs.agent import Agent
from swarms.structs.conversation import Conversation
from swarms.tools.tool_type import tool_type
@@ -27,361 +28,197 @@ from swarms.utils.history_output_formatter import (
from swarms.utils.litellm_wrapper import LiteLLM


RESEARCH_AGENT_PROMPT = """
You are a senior research agent. Your mission is to deliver fast, trustworthy, and reproducible research that supports decision-making.

Objective:
- Produce well-sourced, reproducible, and actionable research that directly answers the task.

Core responsibilities:
- Frame the research scope and assumptions
- Design and execute a systematic search strategy
- Extract and evaluate evidence
- Triangulate across sources and assess reliability
- Present findings with limitations and next steps

Process:
1. Clarify scope; state assumptions if details are missing
2. Define search strategy (keywords, databases, time range)
3. Collect sources, prioritizing primary and high-credibility ones
4. Extract key claims, methods, and figures with provenance
5. Score source credibility and reconcile conflicting claims
6. Synthesize into actionable insights

Scoring rubric (0-5 scale for each):
- Credibility
- Recency
- Methodological transparency
- Relevance
- Consistency with other sources

Deliverables:
1. Concise summary (1-2 sentences)
2. Key findings (bullet points)
3. Evidence table (source id, claim, support level, credibility, link)
4. Search log and methods
5. Assumptions and unknowns
6. Limitations and biases
7. Recommendations and next steps
8. Confidence score with justification
9. Raw citations and extracts

Citation rules:
- Number citations inline [1], [2], and provide metadata in the evidence table
- Explicitly label assumptions
- Include provenance for paraphrased content

Style and guardrails:
- Objective, precise language
- Present conflicting evidence fairly
- Redact sensitive details unless explicitly authorized
- If evidence is insufficient, state what is missing and suggest how to obtain it
"""
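The prompt's 0-5 scoring rubric does not prescribe how the five dimensions combine into one source score; a simple, hypothetical aggregation is an unweighted mean with range checks:

```python
def rubric_score(scores: dict) -> float:
    """Average the five 0-5 rubric dimensions into one per-source score.

    Expects keys: credibility, recency, transparency, relevance, consistency.
    """
    dims = ("credibility", "recency", "transparency", "relevance", "consistency")
    for d in dims:
        if not 0 <= scores[d] <= 5:
            raise ValueError(f"{d} must be in [0, 5], got {scores[d]}")
    return sum(scores[d] for d in dims) / len(dims)


print(
    rubric_score(
        {
            "credibility": 5,
            "recency": 4,
            "transparency": 3,
            "relevance": 5,
            "consistency": 4,
        }
    )
)  # -> 4.2
```

A weighted mean (e.g. weighting credibility more heavily) is an equally valid choice; the point is that the aggregation rule should be stated alongside the evidence table so scores are reproducible.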
ANALYSIS_AGENT_PROMPT = """
You are an expert analysis agent. Your mission is to transform raw data or research into validated, decision-grade insights.

Objective:
- Deliver statistically sound analyses and models with quantified uncertainty.

Core responsibilities:
- Assess data quality
- Choose appropriate methods and justify them
- Run diagnostics and quantify uncertainty
- Interpret results in context and provide recommendations

Process:
1. Validate dataset (structure, missingness, ranges)
2. Clean and document transformations
3. Explore (distributions, outliers, correlations)
4. Select methods (justify choice)
5. Fit models or perform tests; report parameters and uncertainty
6. Run sensitivity and robustness checks
7. Interpret results and link to decisions

Deliverables:
1. Concise summary (key implication in 1-2 sentences)
2. Dataset overview
3. Methods and assumptions
4. Results (tables, coefficients, metrics, units)
5. Diagnostics and robustness
6. Quantified uncertainty
7. Practical interpretation and recommendations
8. Limitations and biases
9. Optional reproducible code/pseudocode

Style and guardrails:
- Rigorous but stakeholder-friendly explanations
- Clearly distinguish correlation from causation
- Present conservative results when evidence is weak
"""
ALTERNATIVES_AGENT_PROMPT = """
You are an alternatives agent. Your mission is to generate a diverse portfolio of solutions and evaluate trade-offs consistently.

Objective:
- Present multiple credible strategies, evaluate them against defined criteria, and recommend a primary and fallback path.

Core responsibilities:
- Generate a balanced set of alternatives
- Evaluate each using a consistent set of criteria
- Provide implementation outlines and risk mitigation

Process:
1. Define evaluation criteria and weights
2. Generate at least four distinct alternatives
3. For each option, describe scope, cost, timeline, resources, risks, and success metrics
4. Score options in a trade-off matrix
5. Rank and recommend primary and fallback strategies
6. Provide phased implementation roadmap

Deliverables:
1. Concise recommendation with rationale
2. List of alternatives with short descriptions
3. Trade-off matrix with scores and justifications
4. Recommendation with risk plan
5. Implementation roadmap with milestones
6. Success criteria and KPIs
7. Contingency plans with switch triggers

Style and guardrails:
- Creative but realistic options
- Transparent about hidden costs or dependencies
- Highlight flexibility-preserving options
- Use ranges and confidence where estimates are uncertain
"""
VERIFICATION_AGENT_PROMPT = """
You are a verification agent. Your mission is to rigorously validate claims, methods, and feasibility.

Objective:
- Provide a transparent, evidence-backed verification of claims and quantify remaining uncertainty.

Core responsibilities:
- Fact-check against primary sources
- Validate methodology and internal consistency
- Assess feasibility and compliance
- Deliver verdicts with supporting evidence

Process:
1. Identify claims or deliverables to verify
2. Define requirements for verification
3. Triangulate independent sources
4. Re-run calculations or sanity checks
5. Stress-test assumptions
6. Produce verification scorecard and remediation steps

Deliverables:
1. Claim summary
2. Verification status (verified, partial, not verified)
3. Evidence matrix (source, finding, support, confidence)
4. Reproduction of critical calculations
5. Key risks and failure modes
6. Corrective steps
7. Confidence score with reasons

Style and guardrails:
- Transparent chain-of-evidence
- Highlight uncertainty explicitly
- If data is missing, state what's needed and propose next steps
"""
SYNTHESIS_AGENT_PROMPT = """
You are a synthesis agent. Your mission is to integrate multiple inputs into a coherent narrative and executable plan.

Objective:
- Deliver an integrated synthesis that reconciles evidence, clarifies trade-offs, and yields a prioritized plan.

Core responsibilities:
- Combine outputs from research, analysis, alternatives, and verification
- Highlight consensus and conflicts
- Provide a prioritized roadmap and communication plan

Process:
1. Map inputs and provenance
2. Identify convergence and conflicts
3. Prioritize actions by impact and feasibility
4. Develop integrated roadmap with owners, milestones, KPIs
5. Create stakeholder-specific summaries

Deliverables:
1. Executive summary (max 150 words)
2. Consensus findings and open questions
3. Priority action list
4. Integrated roadmap
5. Measurement and evaluation plan
6. Communication plan per stakeholder group
7. Evidence map and assumptions

Style and guardrails:
- Executive-focused summary, technical appendix for implementers
- Transparent about uncertainty
- Include "what could break this plan" with mitigation steps
"""
schema = {
    "type": "function",

@@ -446,64 +283,62 @@
schema = [schema]
class HeavySwarm:
    """
    HeavySwarm is a sophisticated multi-agent orchestration system that
    decomposes complex tasks into specialized questions and executes them
    using four specialized agents: Research, Analysis, Alternatives, and
    Verification. The results are then synthesized into a comprehensive
    response.

    This swarm architecture provides robust task analysis through:
    - Intelligent question generation for specialized agent roles
    - Parallel execution of specialized agents for efficiency
    - Comprehensive synthesis of multi-perspective results
    - Real-time progress monitoring with rich dashboard displays
    - Reliability checks and validation systems
    - Multi-loop iterative refinement with context preservation

    The HeavySwarm follows a structured workflow:
    1. Task decomposition into specialized questions
    2. Parallel execution by specialized agents
    3. Result synthesis and integration
    4. Comprehensive final report generation
    5. Optional iterative refinement through multiple loops

    Key Features:
    - **Multi-loop Execution**: The max_loops parameter enables iterative
      refinement where each subsequent loop builds upon the context and
      results from previous loops
    - **Iterative Refinement**: Each loop can refine, improve, or complete
      aspects of the analysis based on previous results

    Attributes:
        name (str): Name identifier for the swarm instance
        description (str): Description of the swarm's purpose
        agents (Dict[str, Agent]): Dictionary of specialized agent instances (created internally)
        timeout (int): Maximum execution time per agent in seconds
        aggregation_strategy (str): Strategy for result aggregation (currently 'synthesis')
        loops_per_agent (int): Number of execution loops per agent
        question_agent_model_name (str): Model name for question generation
        worker_model_name (str): Model name for specialized worker agents
        verbose (bool): Enable detailed logging output
        max_workers (int): Maximum number of concurrent worker threads
        show_dashboard (bool): Enable rich dashboard with progress visualization
        agent_prints_on (bool): Enable individual agent output printing
        max_loops (int): Maximum number of execution loops for iterative refinement
        conversation (Conversation): Conversation history tracker
        console (Console): Rich console for dashboard output

    Example:
        >>> swarm = HeavySwarm(
        ...     name="AnalysisSwarm",
        ...     description="Market analysis swarm",
        ...     question_agent_model_name="gpt-4o-mini",
        ...     worker_model_name="gpt-4o-mini",
        ...     show_dashboard=True,
        ...     max_loops=3
        ... )
        >>> result = swarm.run("Analyze the current cryptocurrency market trends")
        >>> # The swarm will run 3 iterations, each building upon the previous results
    """
    def __init__(

@@ -1734,16 +1569,23 @@ class HeavySwarm:
        # Create the prompt for question generation
        prompt = f"""
System: Technical task analyzer. Generate 4 non-overlapping analytical questions via function tool.

Roles:
- Research: systematic evidence collection, source verification, data quality assessment
- Analysis: statistical analysis, pattern recognition, quantitative insights, correlation analysis
- Alternatives: strategic option generation, multi-criteria analysis, scenario planning, decision modeling
- Verification: systematic validation, risk assessment, feasibility analysis, logical consistency

Requirements:
- Each question ≤30 words, technically precise, action-oriented
- No duplication across roles. No meta text in questions
- Ambiguity notes only in "thinking" field (≤40 words)
- Focus on systematic methodology and quantitative analysis

Task: {task}

Use the generate_specialized_questions function only.
"""
        question_agent = LiteLLM(

@@ -1,12 +1,13 @@
import random
from functools import lru_cache
from typing import Any, Callable, Dict, List, Optional, Union

from loguru import logger

from swarms.prompts.collaborative_prompts import (
    get_multi_agent_collaboration_prompt_one,
)
def list_all_agents(
    agents: List[Union[Callable, Any]],
@@ -131,11 +132,9 @@ def set_random_models_for_agents(
        return random.choice(model_names)

    if isinstance(agents, list):
        for agent in agents:
            setattr(agent, "model_name", random.choice(model_names))
        return agents
    else:
        setattr(agents, "model_name", random.choice(model_names))
        return agents
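The change above swaps the old `setattr(...) or agent` comprehension, which only worked because `setattr` returns `None` so `or agent` yielded the agent, for an explicit in-place loop. A minimal sketch of the behavior; `DummyAgent` is a hypothetical stand-in for a real agent:

```python
import random

class DummyAgent:
    """Hypothetical stand-in for a swarms Agent; only model_name matters here."""
    model_name = None

def set_random_models(agents, model_names):
    # Mutate each agent in place and return the same list, mirroring
    # the for-loop form of set_random_models_for_agents.
    for agent in agents:
        setattr(agent, "model_name", random.choice(model_names))
    return agents

agents = [DummyAgent(), DummyAgent()]
out = set_random_models(agents, ["gpt-4o-mini"])
print(out[0].model_name)  # gpt-4o-mini
```

The loop makes the mutation explicit; the old comprehension produced a new list only as a side effect of the `or` trick.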

@@ -1,5 +1,4 @@
import json
from typing import List

@@ -1,12 +1,12 @@
import asyncio
import concurrent.futures
import os
import sys
from concurrent.futures import (
    ThreadPoolExecutor,
)
from typing import Any, Callable, List, Optional, Union

from loguru import logger

from swarms.structs.agent import Agent
@@ -16,20 +16,50 @@ from swarms.structs.omni_agent_types import AgentType

def run_single_agent(
    agent: AgentType, task: str, *args, **kwargs
) -> Any:
    """
    Run a single agent synchronously with the given task.

    This function provides a synchronous wrapper for executing a single agent
    with a specific task. It passes through any additional arguments and
    keyword arguments to the agent's run method.

    Args:
        agent (AgentType): The agent instance to execute
        task (str): The task string to be executed by the agent
        *args: Variable length argument list passed to agent.run()
        **kwargs: Arbitrary keyword arguments passed to agent.run()

    Returns:
        Any: The result returned by the agent's run method

    Example:
        >>> agent = SomeAgent()
        >>> result = run_single_agent(agent, "Analyze this data")
        >>> print(result)
    """
    return agent.run(task=task, *args, **kwargs)
async def run_agent_async(agent: AgentType, task: str) -> Any:
    """
    Run an agent asynchronously using asyncio event loop.

    This function executes a single agent asynchronously by running it in a
    thread executor to avoid blocking the event loop. It's designed to be
    used within async contexts for concurrent execution.

    Args:
        agent (AgentType): The agent instance to execute asynchronously
        task (str): The task string to be executed by the agent

    Returns:
        Any: The result returned by the agent's run method

    Example:
        >>> async def main():
        ...     agent = SomeAgent()
        ...     result = await run_agent_async(agent, "Process data")
        ...     return result
    """
    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(
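The thread-executor pattern used by `run_agent_async` can be sketched in isolation; `BlockingAgent` is a hypothetical stand-in for an agent whose `run` method blocks:

```python
import asyncio

class BlockingAgent:
    """Hypothetical agent whose run() is synchronous/blocking."""
    def run(self, task: str) -> str:
        return f"done: {task}"

async def run_agent_async(agent, task):
    # Offload the blocking call to the default ThreadPoolExecutor so
    # the event loop stays free to schedule other agents.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, agent.run, task)

result = asyncio.run(run_agent_async(BlockingAgent(), "demo"))
print(result)  # done: demo
```

Passing `None` as the executor uses the loop's default thread pool; a shared `ThreadPoolExecutor` could be supplied instead to cap concurrency.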
@@ -41,14 +71,25 @@ async def run_agents_concurrently_async(
    agents: List[AgentType], task: str
) -> List[Any]:
    """
    Run multiple agents concurrently using asyncio gather.

    This function executes multiple agents concurrently using asyncio.gather(),
    which runs all agents in parallel and waits for all to complete. Each agent
    runs the same task asynchronously.

    Args:
        agents (List[AgentType]): List of agent instances to run concurrently
        task (str): The task string to be executed by all agents

    Returns:
        List[Any]: List of results from each agent in the same order as input

    Example:
        >>> async def main():
        ...     agents = [Agent1(), Agent2(), Agent3()]
        ...     results = await run_agents_concurrently_async(agents, "Analyze data")
        ...     for i, result in enumerate(results):
        ...         print(f"Agent {i+1} result: {result}")
    """
    results = await asyncio.gather(
        *(run_agent_async(agent, task) for agent in agents)
@@ -62,15 +103,35 @@ def run_agents_concurrently(
    max_workers: Optional[int] = None,
) -> List[Any]:
    """
    Run multiple agents concurrently using ThreadPoolExecutor for optimal performance.

    This function executes multiple agents concurrently using a thread pool executor,
    which provides better performance than asyncio for CPU-bound tasks. It automatically
    determines the optimal number of worker threads based on available CPU cores.

    Args:
        agents (List[AgentType]): List of agent instances to run concurrently
        task (str): The task string to be executed by all agents
        max_workers (Optional[int]): Maximum number of threads in the executor.
            Defaults to 95% of available CPU cores for optimal performance

    Returns:
        List[Any]: List of results from each agent. If an agent fails, the exception
            is included in the results list instead of the result.

    Note:
        - Uses 95% of CPU cores by default for optimal resource utilization
        - Handles exceptions gracefully by including them in the results
        - Results may not be in the same order as input agents due to concurrent execution

    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> results = run_agents_concurrently(agents, "Process data")
        >>> for i, result in enumerate(results):
        ...     if isinstance(result, Exception):
        ...         print(f"Agent {i+1} failed: {result}")
        ...     else:
        ...         print(f"Agent {i+1} result: {result}")
    """
    if max_workers is None:
        # 95% of the available CPU cores
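The executor strategy described in the docstring above (pool sized to roughly 95% of cores, exceptions captured in the results list rather than raised) can be sketched as follows; plain callables stand in for agents:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def run_all(fns, task, max_workers=None):
    if max_workers is None:
        # ~95% of available cores, never fewer than one worker.
        cores = os.cpu_count() or 1
        max_workers = max(1, int(cores * 0.95))
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fn, task) for fn in fns]
        for fut in futures:
            try:
                results.append(fut.result())
            except Exception as exc:
                # Store the failure in place of a result.
                results.append(exc)
    return results

print(run_all([str.upper, str.lower], "Task"))  # ['TASK', 'task']
```

Iterating futures in submission order preserves input order; a variant built on `as_completed` would not, which matches the ordering caveat in the docstring.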
@@ -103,16 +164,30 @@ def run_agents_concurrently_multiprocess(
    agents: List[Agent], task: str, batch_size: int = os.cpu_count()
) -> List[Any]:
    """
    Run multiple agents concurrently in batches using asyncio for optimized performance.

    This function processes agents in batches to avoid overwhelming system resources
    while still achieving high concurrency. It uses asyncio internally to manage
    the concurrent execution of agent batches.

    Args:
        agents (List[Agent]): List of Agent instances to run concurrently
        task (str): The task string to be executed by all agents
        batch_size (int, optional): Number of agents to run in parallel in each batch.
            Defaults to the number of CPU cores for optimal resource usage

    Returns:
        List[Any]: List of results from each agent, maintaining the order of input agents

    Note:
        - Processes agents in batches to prevent resource exhaustion
        - Uses asyncio for efficient concurrent execution within batches
        - Results are returned in the same order as input agents

    Example:
        >>> agents = [Agent1(), Agent2(), Agent3(), Agent4(), Agent5()]
        >>> results = run_agents_concurrently_multiprocess(agents, "Analyze data", batch_size=2)
        >>> print(f"Processed {len(results)} agents")
    """
    results = []
    loop = asyncio.get_event_loop()
@@ -134,15 +209,36 @@ def batched_grid_agent_execution(
    max_workers: int = None,
) -> List[Any]:
    """
    Run multiple agents with different tasks concurrently using ThreadPoolExecutor.

    This function pairs each agent with a specific task and executes them concurrently.
    It's designed for scenarios where different agents need to work on different tasks
    simultaneously, creating a grid-like execution pattern.

    Args:
        agents (List[AgentType]): List of agent instances to execute
        tasks (List[str]): List of task strings, one for each agent. Must match the number of agents
        max_workers (int, optional): Maximum number of threads to use.
            Defaults to 90% of available CPU cores for optimal performance

    Returns:
        List[Any]: List of results from each agent in the same order as input agents.
            If an agent fails, the exception is included in the results.

    Raises:
        ValueError: If the number of agents doesn't match the number of tasks

    Note:
        - Uses 90% of CPU cores by default for optimal resource utilization
        - Results maintain the same order as input agents
        - Handles exceptions gracefully by including them in results

    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> tasks = ["Task A", "Task B", "Task C"]
        >>> results = batched_grid_agent_execution(agents, tasks)
        >>> for i, result in enumerate(results):
        ...     print(f"Agent {i+1} with {tasks[i]}: {result}")
    """
    logger.info(
        f"Batch Grid Execution with {len(agents)} agents and number of tasks: {len(tasks)}"
@@ -184,16 +280,34 @@ def run_agents_with_different_tasks(
    """
    Run multiple agents with different tasks concurrently, processing them in batches.

    This function executes each agent on its corresponding task, processing the agent-task pairs
    in batches for efficient resource utilization. It's designed for scenarios where you have
    a large number of agent-task pairs that need to be processed efficiently.

    Args:
        agent_task_pairs (List[tuple[AgentType, str]]): List of (agent, task) tuples to execute.
            Each tuple contains an agent instance and its task
        batch_size (int, optional): Number of agent-task pairs to process in parallel in each batch.
            Defaults to 10 for balanced resource usage
        max_workers (int, optional): Maximum number of threads to use for each batch.
            If None, uses the default from batched_grid_agent_execution

    Returns:
        List[Any]: List of outputs from each agent-task pair, maintaining the same order as input pairs.
            If an agent fails, the exception is included in the results.

    Note:
        - Processes agent-task pairs in batches to prevent resource exhaustion
        - Results maintain the same order as input pairs
        - Handles exceptions gracefully by including them in results
        - Uses batched_grid_agent_execution internally for each batch

    Example:
        >>> pairs = [(agent1, "Task A"), (agent2, "Task B"), (agent3, "Task C")]
        >>> results = run_agents_with_different_tasks(pairs, batch_size=5)
        >>> for i, result in enumerate(results):
        ...     agent, task = pairs[i]
        ...     print(f"Agent {agent.agent_name} with {task}: {result}")
    """
    if not agent_task_pairs:
        return []
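The batching scheme above (process pairs in fixed-size chunks so a large workload never floods the executor at once) reduces to a simple slicing loop; `execute_batch` below is a hypothetical stand-in for `batched_grid_agent_execution`:

```python
def run_in_batches(pairs, batch_size=10, execute_batch=None):
    if execute_batch is None:
        # Sequential fallback used here for illustration only.
        execute_batch = lambda batch: [fn(task) for fn, task in batch]
    results = []
    for i in range(0, len(pairs), batch_size):
        # Each slice is one batch; results keep the input order.
        results.extend(execute_batch(pairs[i:i + batch_size]))
    return results

pairs = [(str.upper, "a"), (str.upper, "b"), (str.lower, "C")]
print(run_in_batches(pairs, batch_size=2))  # ['A', 'B', 'c']
```

Because each batch's results are extended onto one list in slice order, the output order always matches the input pairs regardless of concurrency inside a batch.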
@@ -216,36 +330,77 @@ def run_agents_concurrently_uvloop(
    max_workers: Optional[int] = None,
) -> List[Any]:
    """
    Run multiple agents concurrently using optimized async performance with uvloop/winloop.

    This function provides high-performance concurrent execution of multiple agents using
    optimized event loop implementations. It automatically selects the best available
    event loop for the platform (uvloop on Unix systems, winloop on Windows).

    Args:
        agents (List[AgentType]): List of agent instances to run concurrently
        task (str): The task string to be executed by all agents
        max_workers (Optional[int]): Maximum number of threads in the executor.
            Defaults to 95% of available CPU cores for optimal performance

    Returns:
        List[Any]: List of results from each agent. If an agent fails, the exception
            is included in the results list instead of the result.

    Raises:
        ImportError: If neither uvloop nor winloop is available (falls back to standard asyncio)
        RuntimeError: If event loop policy cannot be set (falls back to standard asyncio)

    Note:
        - Automatically uses uvloop on Linux/macOS and winloop on Windows
        - Falls back gracefully to standard asyncio if optimized loops are unavailable
        - Uses 95% of CPU cores by default for optimal resource utilization
        - Handles exceptions gracefully by including them in results
        - Results may not be in the same order as input agents due to concurrent execution

    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> results = run_agents_concurrently_uvloop(agents, "Process data")
        >>> for i, result in enumerate(results):
        ...     if isinstance(result, Exception):
        ...         print(f"Agent {i+1} failed: {result}")
        ...     else:
        ...         print(f"Agent {i+1} result: {result}")
    """
    # Platform-specific event loop policy setup
    if sys.platform in ("win32", "cygwin"):
        # Windows: Try to use winloop
        try:
            import winloop

            asyncio.set_event_loop_policy(winloop.EventLoopPolicy())
            logger.info("Using winloop for enhanced Windows performance")
        except ImportError:
            logger.warning(
                "winloop not available, falling back to standard asyncio. "
                "Install winloop with: pip install winloop"
            )
        except RuntimeError as e:
            logger.warning(
                f"Could not set winloop policy: {e}. Using default asyncio."
            )
    else:
        # Linux/macOS: Try to use uvloop
        try:
            import uvloop

            asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
            logger.info("Using uvloop for enhanced Unix performance")
        except ImportError:
            logger.warning(
                "uvloop not available, falling back to standard asyncio. "
                "Install uvloop with: pip install uvloop"
            )
        except RuntimeError as e:
            logger.warning(
                f"Could not set uvloop policy: {e}. Using default asyncio."
            )

    if max_workers is None:
        # Use 95% of available CPU cores for optimal performance
@@ -311,46 +466,90 @@ def run_agents_with_tasks_uvloop(
    max_workers: Optional[int] = None,
) -> List[Any]:
    """
    Run multiple agents with different tasks concurrently using optimized async performance.

    This function pairs each agent with a specific task and runs them concurrently using
    optimized event loop implementations (uvloop on Unix systems, winloop on Windows).
    It's designed for high-performance scenarios where different agents need to work
    on different tasks simultaneously.

    Args:
        agents (List[AgentType]): List of agent instances to run
        tasks (List[str]): List of task strings, one for each agent. Must match the number of agents
        max_workers (Optional[int]): Maximum number of threads in the executor.
            Defaults to 95% of available CPU cores for optimal performance

    Returns:
        List[Any]: List of results from each agent in the same order as input agents.
            If an agent fails, the exception is included in the results.

    Raises:
        ValueError: If the number of agents doesn't match the number of tasks

    Note:
        - Automatically uses uvloop on Linux/macOS and winloop on Windows
        - Falls back gracefully to standard asyncio if optimized loops are unavailable
        - Uses 95% of CPU cores by default for optimal resource utilization
        - Results maintain the same order as input agents
        - Handles exceptions gracefully by including them in results

    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> tasks = ["Task A", "Task B", "Task C"]
        >>> results = run_agents_with_tasks_uvloop(agents, tasks)
        >>> for i, result in enumerate(results):
        ...     if isinstance(result, Exception):
        ...         print(f"Agent {i+1} with {tasks[i]} failed: {result}")
        ...     else:
        ...         print(f"Agent {i+1} with {tasks[i]}: {result}")
    """
    if len(agents) != len(tasks):
        raise ValueError(
            f"Number of agents ({len(agents)}) must match number of tasks ({len(tasks)})"
        )

    # Platform-specific event loop policy setup
    if sys.platform in ("win32", "cygwin"):
        # Windows: Try to use winloop
        try:
            import winloop

            asyncio.set_event_loop_policy(winloop.EventLoopPolicy())
            logger.info("Using winloop for enhanced Windows performance")
        except ImportError:
            logger.warning(
                "winloop not available, falling back to standard asyncio. "
                "Install winloop with: pip install winloop"
            )
        except RuntimeError as e:
            logger.warning(
                f"Could not set winloop policy: {e}. Using default asyncio."
            )
    else:
        # Linux/macOS: Try to use uvloop
        try:
            import uvloop

            asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
            logger.info("Using uvloop for enhanced Unix performance")
        except ImportError:
            logger.warning(
                "uvloop not available, falling back to standard asyncio. "
                "Install uvloop with: pip install uvloop"
            )
        except RuntimeError as e:
            logger.warning(
                f"Could not set uvloop policy: {e}. Using default asyncio."
            )

    if max_workers is None:
        num_cores = os.cpu_count()
        max_workers = int(num_cores * 0.95) if num_cores else 1

    logger.info(
        f"Running {len(agents)} agents with {len(tasks)} tasks using optimized event loop (max_workers: {max_workers})"
    )

    async def run_agents_with_tasks_async():
@@ -407,10 +606,40 @@ def run_agents_with_tasks_uvloop(

def get_swarms_info(swarms: List[Callable]) -> str:
    """
    Fetch and format information about all available swarms in the system.

    This function provides a comprehensive overview of all swarms currently
    available in the system, including their names, descriptions, agent counts,
    and swarm types. It's useful for debugging, monitoring, and system introspection.

    Args:
        swarms (List[Callable]): List of swarm instances to get information about.
            Each swarm should have name, description, agents, and swarm_type attributes

    Returns:
        str: A formatted string containing detailed information about all swarms.
            Returns "No swarms currently available in the system." if the list is empty.

    Note:
        - Each swarm is expected to have the following attributes:
            - name: The name of the swarm
            - description: A description of the swarm's purpose
            - agents: A list of agents in the swarm
            - swarm_type: The type/category of the swarm
        - The output is formatted for human readability with clear section headers

    Example:
        >>> swarms = [swarm1, swarm2, swarm3]
        >>> info = get_swarms_info(swarms)
        >>> print(info)
        Available Swarms:

        [Swarm 1]
        Name: Data Processing Swarm
        Description: Handles data analysis tasks
        Length of Agents: 5
        Swarm Type: Analysis
        ...
    """
    if not swarms:
        return "No swarms currently available in the system."
@@ -439,10 +668,47 @@ def get_agents_info(
    agents: List[Union[Agent, Callable]], team_name: str = None
) -> str:
    """
    Fetch and format information about all available agents in the system.

    This function provides a comprehensive overview of all agents currently
    available in the system, including their names, descriptions, roles,
    models, and configuration details. It's useful for debugging, monitoring,
    and system introspection.

    Args:
        agents (List[Union[Agent, Callable]]): List of agent instances to get information about.
            Each agent should have agent_name, agent_description,
            role, model_name, and max_loops attributes
        team_name (str, optional): Optional team name to include in the output header.
            If None, uses a generic header

    Returns:
        str: A formatted string containing detailed information about all agents.
            Returns "No agents currently available in the system." if the list is empty.

    Note:
        - Each agent is expected to have the following attributes:
            - agent_name: The name of the agent
            - agent_description: A description of the agent's purpose
            - role: The role or function of the agent
            - model_name: The AI model used by the agent
            - max_loops: The maximum number of loops the agent can execute
        - The output is formatted for human readability with clear section headers
        - Team name is included in the header if provided

    Example:
        >>> agents = [agent1, agent2, agent3]
        >>> info = get_agents_info(agents, team_name="Data Team")
        >>> print(info)
        Available Agents for Team: Data Team

        [Agent 1]
        Name: Data Analyzer
        Description: Analyzes data patterns
        Role: Analyst
        Model: gpt-4
        Max Loops: 10
        ...
    """
    if not agents:
        return "No agents currently available in the system."
@@ -231,9 +231,10 @@ class BaseTool(BaseModel):
def base_model_to_dict(
self,
pydantic_type: type[BaseModel],
output_str: bool = False,
*args: Any,
**kwargs: Any,
) -> Union[dict[str, Any], str]:
"""
Convert a Pydantic BaseModel to OpenAI function calling schema dictionary.

@@ -247,7 +248,7 @@ class BaseTool(BaseModel):
**kwargs: Additional keyword arguments
Returns:
Union[dict[str, Any], str]: OpenAI function calling schema dictionary or JSON string
Raises:
ToolValidationError: If pydantic_type validation fails

@@ -278,9 +279,13 @@ class BaseTool(BaseModel):
# Get the base function schema
base_result = base_model_to_openai_function(
pydantic_type, output_str=output_str, *args, **kwargs
)
# If output_str is True, return the string directly
if output_str and isinstance(base_result, str):
return base_result
# Extract the function definition from the functions array
if (
"functions" in base_result

@@ -314,8 +319,8 @@ class BaseTool(BaseModel):
) from e
def multi_base_models_to_dict(
self, base_models: List[BaseModel], output_str: bool = False
) -> Union[dict[str, Any], str]:
"""
Convert multiple Pydantic BaseModels to OpenAI function calling schema.

@@ -323,12 +328,11 @@ class BaseTool(BaseModel):
a unified OpenAI function calling schema format.
Args:
base_models (List[BaseModel]): List of Pydantic models to convert
output_str (bool): Whether to return string format. Defaults to False.
Returns:
dict[str, Any] or str: Combined OpenAI function calling schema or JSON string
Raises:
ToolValidationError: If base_models validation fails

@@ -344,10 +348,18 @@ class BaseTool(BaseModel):
)
try:
results = [
self.base_model_to_dict(model, output_str=output_str)
for model in base_models
]
# If output_str is True, return the string directly
if output_str:
import json
return json.dumps(results, indent=2)
return results
except Exception as e:
self._log_if_verbose(
"error", f"Failed to convert multiple models: {e}"
@@ -346,6 +346,7 @@ async def aget_mcp_tools(
format: str = "openai",
connection: Optional[MCPConnection] = None,
transport: Optional[str] = None,
verbose: bool = True,
*args,
**kwargs,
) -> List[Dict[str, Any]]:

@@ -356,15 +357,17 @@ async def aget_mcp_tools(
format (str): Format to return tools in ('openai' or 'mcp').
connection (Optional[MCPConnection]): Optional connection object.
transport (Optional[str]): Transport type. If None, auto-detects.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
List[Dict[str, Any]]: List of available MCP tools in OpenAI format.
Raises:
MCPValidationError: If server_path is invalid.
MCPConnectionError: If connection to server fails.
"""
if verbose:
    logger.info(
        f"aget_mcp_tools called for server_path: {server_path}"
    )
if transport is None:
    transport = auto_detect_transport(server_path)
if exists(connection):
@@ -381,9 +384,10 @@ async def aget_mcp_tools(
server_path,
)
url = server_path
if verbose:
    logger.info(
        f"Fetching MCP tools from server: {server_path} using transport: {transport}"
    )
try:
async with get_mcp_client(
transport,

@@ -402,9 +406,10 @@ async def aget_mcp_tools(
tools = await load_mcp_tools(
session=session, format=format
)
if verbose:
    logger.info(
        f"Successfully fetched {len(tools)} tools"
    )
return tools
except Exception as e:
logger.error(
@@ -420,6 +425,7 @@ def get_mcp_tools_sync(
format: str = "openai",
connection: Optional[MCPConnection] = None,
transport: Optional[str] = "streamable-http",
verbose: bool = True,
*args,
**kwargs,
) -> List[Dict[str, Any]]:

@@ -430,6 +436,7 @@ def get_mcp_tools_sync(
format (str): Format to return tools in ('openai' or 'mcp').
connection (Optional[MCPConnection]): Optional connection object.
transport (Optional[str]): Transport type. If None, auto-detects.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
List[Dict[str, Any]]: List of available MCP tools in OpenAI format.
Raises:

@@ -437,9 +444,10 @@ def get_mcp_tools_sync(
MCPConnectionError: If connection to server fails.
MCPExecutionError: If event loop management fails.
"""
if verbose:
    logger.info(
        f"get_mcp_tools_sync called for server_path: {server_path}"
    )
if transport is None:
    transport = auto_detect_transport(server_path)
with get_or_create_event_loop() as loop:
@@ -450,6 +458,7 @@ def get_mcp_tools_sync(
format=format,
connection=connection,
transport=transport,
verbose=verbose,
*args,
**kwargs,
)

@@ -468,6 +477,7 @@ def _fetch_tools_for_server(
connection: Optional[MCPConnection] = None,
format: str = "openai",
transport: Optional[str] = None,
verbose: bool = True,
) -> List[Dict[str, Any]]:
"""
Helper function to fetch tools for a single server.

@@ -476,10 +486,12 @@ def _fetch_tools_for_server(
connection (Optional[MCPConnection]): Optional connection object.
format (str): Format to return tools in.
transport (Optional[str]): Transport type. If None, auto-detects.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
List[Dict[str, Any]]: List of available MCP tools.
"""
if verbose:
    logger.info(f"_fetch_tools_for_server called for url: {url}")
if transport is None:
    transport = auto_detect_transport(url)
return get_mcp_tools_sync(
@@ -487,6 +499,7 @@ def _fetch_tools_for_server(
connection=connection,
format=format,
transport=transport,
verbose=verbose,
)

@@ -497,6 +510,7 @@ def get_tools_for_multiple_mcp_servers(
output_type: Literal["json", "dict", "str"] = "str",
max_workers: Optional[int] = None,
transport: Optional[str] = None,
verbose: bool = True,
) -> List[Dict[str, Any]]:
"""
Get tools for multiple MCP servers concurrently using ThreadPoolExecutor.

@@ -507,12 +521,14 @@ def get_tools_for_multiple_mcp_servers(
output_type (Literal): Output format type.
max_workers (Optional[int]): Max worker threads.
transport (Optional[str]): Transport type. If None, auto-detects per URL.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
List[Dict[str, Any]]: Combined list of tools from all servers.
"""
if verbose:
    logger.info(
        f"get_tools_for_multiple_mcp_servers called for {len(urls)} urls."
    )
tools = []
(
min(32, os.cpu_count() + 4)
@@ -528,6 +544,7 @@ def get_tools_for_multiple_mcp_servers(
connection,
format,
transport,
verbose,
): url
for url, connection in zip(urls, connections)
}

@@ -539,6 +556,7 @@ def get_tools_for_multiple_mcp_servers(
None,
format,
transport,
verbose,
): url
for url in urls
}

@@ -563,6 +581,7 @@ async def _execute_tool_call_simple(
connection: Optional[MCPConnection] = None,
output_type: Literal["json", "dict", "str"] = "str",
transport: Optional[str] = None,
verbose: bool = True,
*args,
**kwargs,
):

@@ -574,14 +593,16 @@ async def _execute_tool_call_simple(
connection (Optional[MCPConnection]): Optional connection object.
output_type (Literal): Output format type.
transport (Optional[str]): Transport type. If None, auto-detects.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
The tool call result in the specified output format.
Raises:
MCPExecutionError, MCPConnectionError
"""
if verbose:
    logger.info(
        f"_execute_tool_call_simple called for server_path: {server_path}"
    )
if transport is None:
    transport = auto_detect_transport(server_path)
if exists(connection):
@@ -638,9 +659,10 @@ async def _execute_tool_call_simple(
out = "\n".join(formatted_lines)
else:
out = call_result.model_dump()
if verbose:
    logger.info(
        f"Tool call executed successfully for {server_path}"
    )
return out
except Exception as e:
logger.error(
@@ -664,6 +686,7 @@ async def execute_tool_call_simple(
connection: Optional[MCPConnection] = None,
output_type: Literal["json", "dict", "str", "formatted"] = "str",
transport: Optional[str] = None,
verbose: bool = True,
*args,
**kwargs,
) -> List[Dict[str, Any]]:

@@ -675,12 +698,14 @@ async def execute_tool_call_simple(
connection (Optional[MCPConnection]): Optional connection object.
output_type (Literal): Output format type.
transport (Optional[str]): Transport type. If None, auto-detects.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
The tool call result in the specified output format.
"""
if verbose:
    logger.info(
        f"execute_tool_call_simple called for server_path: {server_path}"
    )
if transport is None:
    transport = auto_detect_transport(server_path)
if isinstance(response, str):
@@ -691,6 +716,7 @@ async def execute_tool_call_simple(
connection=connection,
output_type=output_type,
transport=transport,
verbose=verbose,
*args,
**kwargs,
)

@@ -701,6 +727,7 @@ def _create_server_tool_mapping(
connections: List[MCPConnection] = None,
format: str = "openai",
transport: Optional[str] = None,
verbose: bool = True,
) -> Dict[str, Dict[str, Any]]:
"""
Create a mapping of function names to server information for all MCP servers.

@@ -709,6 +736,7 @@ def _create_server_tool_mapping(
connections (List[MCPConnection]): Optional list of MCPConnection objects.
format (str): Format to fetch tools in.
transport (Optional[str]): Transport type. If None, auto-detects per URL.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
Dict[str, Dict[str, Any]]: Mapping of function names to server info.
"""

@@ -725,6 +753,7 @@ def _create_server_tool_mapping(
connection=connection,
format=format,
transport=transport,
verbose=verbose,
)
for tool in tools:
if isinstance(tool, dict) and "function" in tool:

@@ -755,6 +784,7 @@ async def _create_server_tool_mapping_async(
connections: List[MCPConnection] = None,
format: str = "openai",
transport: str = "streamable-http",
verbose: bool = True,
) -> Dict[str, Dict[str, Any]]:
"""
Async version: Create a mapping of function names to server information for all MCP servers.

@@ -763,6 +793,7 @@ async def _create_server_tool_mapping_async(
connections (List[MCPConnection]): Optional list of MCPConnection objects.
format (str): Format to fetch tools in.
transport (str): Transport type.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
Dict[str, Dict[str, Any]]: Mapping of function names to server info.
"""

@@ -779,6 +810,7 @@ async def _create_server_tool_mapping_async(
connection=connection,
format=format,
transport=transport,
verbose=verbose,
)
for tool in tools:
if isinstance(tool, dict) and "function" in tool:

@@ -809,6 +841,7 @@ async def _execute_tool_on_server(
server_info: Dict[str, Any],
output_type: Literal["json", "dict", "str", "formatted"] = "str",
transport: str = "streamable-http",
verbose: bool = True,
) -> Dict[str, Any]:
"""
Execute a single tool call on a specific server.

@@ -817,6 +850,7 @@ async def _execute_tool_on_server(
server_info (Dict[str, Any]): Server information from the mapping.
output_type (Literal): Output format type.
transport (str): Transport type.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
Dict[str, Any]: Execution result with server metadata.
"""

@@ -827,6 +861,7 @@ async def _execute_tool_on_server(
connection=server_info["connection"],
output_type=output_type,
transport=transport,
verbose=verbose,
)
return {
"server_url": server_info["url"],

@@ -860,6 +895,7 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
output_type: Literal["json", "dict", "str", "formatted"] = "str",
max_concurrent: Optional[int] = None,
transport: str = "streamable-http",
verbose: bool = True,
*args,
**kwargs,
) -> List[Dict[str, Any]]:

@@ -872,88 +908,103 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
output_type (Literal): Output format type.
max_concurrent (Optional[int]): Max concurrent tasks.
transport (str): Transport type.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
List[Dict[str, Any]]: List of execution results.
"""
if not responses:
    if verbose:
        logger.warning("No responses provided for execution")
    return []
if not urls:
    raise MCPValidationError("No server URLs provided")
if verbose:
    logger.info(
        f"Creating tool mapping for {len(urls)} servers using transport: {transport}"
    )
server_tool_mapping = await _create_server_tool_mapping_async(
urls=urls,
connections=connections,
format="openai",
transport=transport,
verbose=verbose,
)
if not server_tool_mapping:
    raise MCPExecutionError(
        "No tools found on any of the provided servers"
    )
if verbose:
    logger.info(
        f"Found {len(server_tool_mapping)} unique functions across all servers"
    )
all_tool_calls = []
if verbose:
    logger.info(
        f"Processing {len(responses)} responses for tool call extraction"
    )
if len(responses) > 10 and all(
    isinstance(r, str) and len(r) == 1 for r in responses
):
if verbose:
    logger.info(
        "Detected character-by-character response, reconstructing JSON string"
    )
try:
    reconstructed_response = "".join(responses)
    if verbose:
        logger.info(
            f"Reconstructed response length: {len(reconstructed_response)}"
        )
        logger.debug(
            f"Reconstructed response: {reconstructed_response}"
        )
    try:
        json.loads(reconstructed_response)
        if verbose:
            logger.info(
                "Successfully validated reconstructed JSON response"
            )
    except json.JSONDecodeError as e:
        if verbose:
            logger.warning(
                f"Reconstructed response is not valid JSON: {str(e)}"
            )
            logger.debug(
                f"First 100 chars: {reconstructed_response[:100]}"
            )
            logger.debug(
                f"Last 100 chars: {reconstructed_response[-100:]}"
            )
    responses = [reconstructed_response]
except Exception as e:
    if verbose:
        logger.warning(
            f"Failed to reconstruct response from characters: {str(e)}"
        )
for i, response in enumerate(responses):
    if verbose:
        logger.debug(
            f"Processing response {i}: {type(response)} - {response}"
        )
    if isinstance(response, str):
        try:
            response = json.loads(response)
            if verbose:
                logger.debug(
                    f"Parsed JSON string response {i}: {response}"
                )
        except json.JSONDecodeError:
            if verbose:
                logger.warning(
                    f"Failed to parse JSON response at index {i}: {response}"
                )
            continue
    if isinstance(response, dict):
        if "function" in response:
            if verbose:
                logger.debug(
                    f"Found single tool call in response {i}: {response['function']}"
                )
            if isinstance(
                response["function"].get("arguments"), str
            ):
@@ -963,18 +1014,21 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
response["function"]["arguments"]
)
)
if verbose:
    logger.debug(
        f"Parsed function arguments: {response['function']['arguments']}"
    )
except json.JSONDecodeError:
    if verbose:
        logger.warning(
            f"Failed to parse function arguments: {response['function']['arguments']}"
        )
all_tool_calls.append((i, response))
elif "tool_calls" in response:
    if verbose:
        logger.debug(
            f"Found multiple tool calls in response {i}: {len(response['tool_calls'])} calls"
        )
for tool_call in response["tool_calls"]: for tool_call in response["tool_calls"]:
if isinstance( if isinstance(
tool_call.get("function", {}).get( tool_call.get("function", {}).get(
@ -988,44 +1042,55 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
tool_call["function"]["arguments"] tool_call["function"]["arguments"]
) )
) )
logger.debug( if verbose:
f"Parsed tool call arguments: {tool_call['function']['arguments']}" logger.debug(
) f"Parsed tool call arguments: {tool_call['function']['arguments']}"
)
except json.JSONDecodeError: except json.JSONDecodeError:
logger.warning( if verbose:
f"Failed to parse tool call arguments: {tool_call['function']['arguments']}" logger.warning(
) f"Failed to parse tool call arguments: {tool_call['function']['arguments']}"
)
all_tool_calls.append((i, tool_call)) all_tool_calls.append((i, tool_call))
elif "name" in response and "arguments" in response: elif "name" in response and "arguments" in response:
logger.debug( if verbose:
f"Found direct tool call in response {i}: {response}" logger.debug(
) f"Found direct tool call in response {i}: {response}"
)
if isinstance(response.get("arguments"), str): if isinstance(response.get("arguments"), str):
try: try:
response["arguments"] = json.loads( response["arguments"] = json.loads(
response["arguments"] response["arguments"]
) )
logger.debug( if verbose:
f"Parsed direct tool call arguments: {response['arguments']}" logger.debug(
) f"Parsed direct tool call arguments: {response['arguments']}"
)
except json.JSONDecodeError: except json.JSONDecodeError:
logger.warning( if verbose:
f"Failed to parse direct tool call arguments: {response['arguments']}" logger.warning(
) f"Failed to parse direct tool call arguments: {response['arguments']}"
)
all_tool_calls.append((i, {"function": response})) all_tool_calls.append((i, {"function": response}))
else: else:
logger.debug( if verbose:
f"Response {i} is a dict but doesn't match expected tool call formats: {list(response.keys())}" logger.debug(
) f"Response {i} is a dict but doesn't match expected tool call formats: {list(response.keys())}"
)
else: else:
logger.warning( if verbose:
f"Unsupported response type at index {i}: {type(response)}" logger.warning(
) f"Unsupported response type at index {i}: {type(response)}"
)
continue continue
if not all_tool_calls: if not all_tool_calls:
logger.warning("No tool calls found in responses") if verbose:
logger.warning("No tool calls found in responses")
return [] return []
logger.info(f"Found {len(all_tool_calls)} tool calls to execute") if verbose:
logger.info(
f"Found {len(all_tool_calls)} tool calls to execute"
)
max_concurrent = max_concurrent or len(all_tool_calls)
semaphore = asyncio.Semaphore(max_concurrent)

@@ -1036,9 +1101,10 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
"name", "unknown"
)
if function_name not in server_tool_mapping:
    if verbose:
        logger.warning(
            f"Function '{function_name}' not found on any server"
        )
return {
"response_index": response_index,
"function_name": function_name,

@@ -1052,6 +1118,7 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
server_info=server_info,
output_type=output_type,
transport=transport,
verbose=verbose,
)
result["response_index"] = response_index
return result

@@ -1082,9 +1149,10 @@ async def execute_multiple_tools_on_multiple_mcp_servers(
)
else:
processed_results.append(result)
if verbose:
    logger.info(
        f"Completed execution of {len(processed_results)} tool calls"
    )
return processed_results
@@ -1095,6 +1163,7 @@ def execute_multiple_tools_on_multiple_mcp_servers_sync(
output_type: Literal["json", "dict", "str", "formatted"] = "str",
max_concurrent: Optional[int] = None,
transport: str = "streamable-http",
verbose: bool = True,
*args,
**kwargs,
) -> List[Dict[str, Any]]:

@@ -1107,6 +1176,7 @@ def execute_multiple_tools_on_multiple_mcp_servers_sync(
output_type (Literal): Output format type.
max_concurrent (Optional[int]): Max concurrent tasks.
transport (str): Transport type.
verbose (bool): Enable verbose logging. Defaults to True.
Returns:
List[Dict[str, Any]]: List of execution results.
"""

@@ -1120,6 +1190,7 @@ def execute_multiple_tools_on_multiple_mcp_servers_sync(
output_type=output_type,
max_concurrent=max_concurrent,
transport=transport,
verbose=verbose,
*args,
**kwargs,
)
@@ -1,6 +1,6 @@
from typing import Any, List
from swarms.utils.docstring_parser import parse
from pydantic import BaseModel
from swarms.utils.loguru_logger import initialize_logger

@@ -39,12 +39,14 @@ def check_pydantic_name(pydantic_type: type[BaseModel]) -> str:
def base_model_to_openai_function(
pydantic_type: type[BaseModel],
output_str: bool = False,
) -> dict[str, Any]:
"""
Convert a Pydantic model to a dictionary representation of functions.
Args:
pydantic_type (type[BaseModel]): The Pydantic model type to convert.
output_str (bool): Whether to return string output format. Defaults to False.
Returns:
dict[str, Any]: A dictionary representation of the functions.
@ -85,7 +87,7 @@ def base_model_to_openai_function(
_remove_a_key(parameters, "title") _remove_a_key(parameters, "title")
_remove_a_key(parameters, "additionalProperties") _remove_a_key(parameters, "additionalProperties")
return { result = {
"function_call": { "function_call": {
"name": name, "name": name,
}, },
@ -98,6 +100,14 @@ def base_model_to_openai_function(
],
}
# Handle output_str parameter
if output_str:
import json
return json.dumps(result, indent=2)
return result
def multi_base_model_to_openai_function(
pydantic_types: List[BaseModel] = None,
@ -114,13 +124,21 @@ def multi_base_model_to_openai_function(
""" """
functions: list[dict[str, Any]] = [ functions: list[dict[str, Any]] = [
base_model_to_openai_function(pydantic_type, output_str)[ base_model_to_openai_function(
"functions" pydantic_type, output_str=False
][0] )["functions"][0]
for pydantic_type in pydantic_types for pydantic_type in pydantic_types
] ]
return { result = {
"function_call": "auto", "function_call": "auto",
"functions": functions, "functions": functions,
} }
# Handle output_str parameter
if output_str:
import json
return json.dumps(result, indent=2)
return result
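Both hunks apply the same `output_str` convention: build the dict payload first, then optionally serialize it. A minimal self-contained sketch of that switch (the `finalize` helper is illustrative, not part of the library):

```python
import json

def finalize(result: dict, output_str: bool = False):
    # Mirrors the pattern above: return the dict as-is, or the
    # same payload pretty-printed as a JSON string.
    if output_str:
        return json.dumps(result, indent=2)
    return result

payload = {"function_call": "auto", "functions": []}
as_dict = finalize(payload)
as_str = finalize(payload, output_str=True)
```

Note that callers must now branch on the return type: with `output_str=True` they receive a `str`, not a `dict`.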

@ -0,0 +1,140 @@
"""
Custom docstring parser implementation to replace the docstring_parser package.
This module provides a simple docstring parser that extracts parameter information
and descriptions from Python docstrings in Google/NumPy style format.
"""
import re
from typing import List, Optional, NamedTuple
class DocstringParam(NamedTuple):
    """Represents a parameter in a docstring."""

    arg_name: str
    description: str


class DocstringInfo(NamedTuple):
    """Represents parsed docstring information."""

    short_description: Optional[str]
    params: List[DocstringParam]
def parse(docstring: str) -> DocstringInfo:
    """
    Parse a docstring and extract parameter information and description.

    Args:
        docstring (str): The docstring to parse.

    Returns:
        DocstringInfo: Parsed docstring information containing short description and parameters.
    """
    if not docstring or not docstring.strip():
        return DocstringInfo(short_description=None, params=[])

    # Keep the raw lines so indentation can distinguish continuation lines;
    # the stripped copies are used for matching. (Stripping every line first
    # and then testing for leading whitespace would make the continuation
    # branch unreachable.)
    raw_lines = docstring.strip().split("\n")
    lines = [line.strip() for line in raw_lines]

    # Extract short description (first non-empty line that's not a section header)
    short_description = None
    for line in lines:
        if line and not line.startswith(
            (
                "Args:",
                "Parameters:",
                "Returns:",
                "Yields:",
                "Raises:",
                "Note:",
                "Example:",
                "Examples:",
            )
        ):
            short_description = line
            break

    # Extract parameters from the Args:/Parameters: section
    params = []
    in_args_section = False
    current_param = None

    for raw_line, line in zip(raw_lines, lines):
        # Check if we're entering the Args/Parameters section
        if line.lower().startswith(("args:", "parameters:")):
            in_args_section = True
            continue

        # A new section header ends the Args/Parameters section
        if in_args_section and line.lower().startswith(
            (
                "returns:",
                "yields:",
                "raises:",
                "note:",
                "example:",
                "examples:",
                "see also:",
                "see_also:",
            )
        ):
            in_args_section = False
            if current_param:
                params.append(current_param)
                current_param = None
            continue

        if in_args_section and line:
            # Pattern: param_name (type): description
            param_match = re.match(
                r"^(\w+)\s*(?:\([^)]*\))?\s*:\s*(.+)$", line
            )
            if param_match:
                # Save the previous parameter if one is in progress
                if current_param:
                    params.append(current_param)
                current_param = DocstringParam(
                    arg_name=param_match.group(1),
                    description=param_match.group(2).strip(),
                )
            elif current_param and raw_line[:1] in (" ", "\t"):
                # Indented line: continuation of the current description
                current_param = DocstringParam(
                    arg_name=current_param.arg_name,
                    description=current_param.description + " " + line,
                )
            elif raw_line[:1] not in (" ", "\t"):
                # An unindented, non-parameter line ends the section
                in_args_section = False
                if current_param:
                    params.append(current_param)
                    current_param = None

    # Add the last parameter if it exists
    if current_param:
        params.append(current_param)

    return DocstringInfo(
        short_description=short_description, params=params
    )
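To see what the parser recognizes, the parameter regex and result type can be exercised on their own. The snippet inlines the pattern and a mirror of `DocstringParam` so it runs standalone:

```python
import re
from typing import NamedTuple

class DocstringParam(NamedTuple):  # mirrors the parser's result type
    arg_name: str
    description: str

# The same pattern parse() uses: "name (type): description", type optional.
PARAM_RE = re.compile(r"^(\w+)\s*(?:\([^)]*\))?\s*:\s*(.+)$")

params = []
for line in ["a (int): First number.", "flag: Enable logging.", "Returns:"]:
    m = PARAM_RE.match(line)
    if m:
        params.append(DocstringParam(m.group(1), m.group(2)))
# params -> [('a', 'First number.'), ('flag', 'Enable logging.')]
```

`"Returns:"` is rejected because the pattern requires at least one character of description after the colon, which is why section headers never register as parameters.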

File diff suppressed because it is too large

Binary files not shown: five images added (175 KiB, 178 KiB, 130 KiB, 75 KiB, 66 KiB).

@ -0,0 +1,91 @@
agent_count,test_name,model_name,latency_ms,throughput_rps,memory_usage_mb,cpu_usage_percent,success_rate,error_count,total_requests,concurrent_requests,timestamp,cost_usd,tokens_used,response_quality_score,additional_metrics,agent_creation_time,tool_registration_time,execution_time,total_latency,chaining_steps,chaining_success,error_scenarios_tested,recovery_rate,resource_cycles,avg_memory_delta,memory_leak_detected
1,scaling_test,gpt-4o-mini,1131.7063331604004,4.131429224630576,1.25,0.0,1.0,0,20,5,1759345643.9453266,0.0015359999999999996,10240,0.8548663728748707,"{'min_latency_ms': 562.7951622009277, 'max_latency_ms': 1780.4391384124756, 'p95_latency_ms': np.float64(1744.0685987472534), 'p99_latency_ms': np.float64(1773.1650304794312), 'total_time_s': 4.84093976020813, 'initial_memory_mb': 291.5546875, 'final_memory_mb': 292.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.0675424923987846, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gpt-4o-mini,1175.6950378417969,3.7575854004826277,0.0,0.0,1.0,0,20,5,1759345654.225195,0.0015359999999999996,10240,0.8563524483655013,"{'min_latency_ms': 535.4223251342773, 'max_latency_ms': 1985.3930473327637, 'p95_latency_ms': np.float64(1975.6355285644531), 'p99_latency_ms': np.float64(1983.4415435791016), 'total_time_s': 5.322566986083984, 'initial_memory_mb': 293.1796875, 'final_memory_mb': 293.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.05770982402152013, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gpt-4o-mini,996.9684720039368,4.496099509029146,0.0,0.0,1.0,0,20,5,1759345662.8977199,0.0015359999999999996,10240,0.8844883644941982,"{'min_latency_ms': 45.22204399108887, 'max_latency_ms': 1962.2983932495117, 'p95_latency_ms': np.float64(1647.7753758430483), 'p99_latency_ms': np.float64(1899.3937897682185), 'total_time_s': 4.448300123214722, 'initial_memory_mb': 293.5546875, 'final_memory_mb': 293.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.043434832388308614, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gpt-4o-mini,1112.8681421279907,3.587833950074127,0.0,0.0,1.0,0,20,5,1759345673.162652,0.0015359999999999996,10240,0.8563855623109009,"{'min_latency_ms': 564.1369819641113, 'max_latency_ms': 1951.472282409668, 'p95_latency_ms': np.float64(1897.4883794784546), 'p99_latency_ms': np.float64(1940.6755018234253), 'total_time_s': 5.57439398765564, 'initial_memory_mb': 293.8046875, 'final_memory_mb': 293.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.05691925404970228, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,gpt-4o,1298.2240080833435,3.3670995599405846,0.125,0.0,1.0,0,20,5,1759345683.2065425,0.0512,10240,0.9279627852934385,"{'min_latency_ms': 693.6078071594238, 'max_latency_ms': 1764.8026943206787, 'p95_latency_ms': np.float64(1681.7602753639221), 'p99_latency_ms': np.float64(1748.1942105293274), 'total_time_s': 5.939830303192139, 'initial_memory_mb': 293.8046875, 'final_memory_mb': 293.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.050879141399088765, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gpt-4o,1264.4854545593262,3.5293826102318846,0.0,0.0,1.0,0,20,5,1759345692.6439528,0.0512,10240,0.9737471278894755,"{'min_latency_ms': 175.65083503723145, 'max_latency_ms': 1990.2207851409912, 'p95_latency_ms': np.float64(1910.3824019432068), 'p99_latency_ms': np.float64(1974.2531085014343), 'total_time_s': 5.66671347618103, 'initial_memory_mb': 293.9296875, 'final_memory_mb': 293.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.038542680129780495, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gpt-4o,1212.0607376098633,3.799000004302323,0.125,0.0,1.0,0,20,5,1759345701.8719423,0.0512,10240,0.9366077507029601,"{'min_latency_ms': 542.8001880645752, 'max_latency_ms': 1973.801851272583, 'p95_latency_ms': np.float64(1969.2555904388428), 'p99_latency_ms': np.float64(1972.892599105835), 'total_time_s': 5.264543294906616, 'initial_memory_mb': 293.9296875, 'final_memory_mb': 294.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.044670864578792276, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gpt-4o,1367.1631932258606,3.1229790107314654,0.0,0.0,1.0,0,20,5,1759345711.9738443,0.0512,10240,0.9328922198254587,"{'min_latency_ms': 715.888261795044, 'max_latency_ms': 1905.6315422058105, 'p95_latency_ms': np.float64(1890.480661392212), 'p99_latency_ms': np.float64(1902.6013660430908), 'total_time_s': 6.404141664505005, 'initial_memory_mb': 294.0546875, 'final_memory_mb': 294.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.05146728864962903, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,gpt-4-turbo,1429.1370868682861,3.3141614744089267,0.125,0.0,1.0,0,20,5,1759345722.7650242,0.1024,10240,0.960928099222926,"{'min_latency_ms': 637.6686096191406, 'max_latency_ms': 1994.9300289154053, 'p95_latency_ms': np.float64(1973.6997246742249), 'p99_latency_ms': np.float64(1990.6839680671692), 'total_time_s': 6.0347089767456055, 'initial_memory_mb': 294.0546875, 'final_memory_mb': 294.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.0429193742204114, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gpt-4-turbo,1167.8012132644653,3.933946564951724,0.0,0.0,1.0,0,20,5,1759345731.809648,0.1024,10240,0.9575695597206497,"{'min_latency_ms': 521.2328433990479, 'max_latency_ms': 1973.503828048706, 'p95_latency_ms': np.float64(1931.3542008399963), 'p99_latency_ms': np.float64(1965.073902606964), 'total_time_s': 5.083953142166138, 'initial_memory_mb': 294.1796875, 'final_memory_mb': 294.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.04742414087184447, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gpt-4-turbo,1435.1954460144043,3.0793869953124613,0.0,0.0,1.0,0,20,5,1759345741.9117725,0.1024,10240,0.9564233524947511,"{'min_latency_ms': 711.4903926849365, 'max_latency_ms': 2034.2109203338623, 'p95_latency_ms': np.float64(1998.979663848877), 'p99_latency_ms': np.float64(2027.1646690368652), 'total_time_s': 6.4947991371154785, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.03428874308764032, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gpt-4-turbo,1092.1013355255127,4.057819053252887,0.0,0.0,1.0,0,20,5,1759345749.8833907,0.1024,10240,0.9521218582720758,"{'min_latency_ms': 554.4416904449463, 'max_latency_ms': 1968.658447265625, 'p95_latency_ms': np.float64(1637.098050117493), 'p99_latency_ms': np.float64(1902.346367835998), 'total_time_s': 4.92875599861145, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.043763298033728824, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,claude-3-5-sonnet,1046.9236850738525,4.047496446876068,0.0,0.0,1.0,0,20,5,1759345757.9539518,0.03071999999999999,10240,0.9511838758969231,"{'min_latency_ms': 184.94415283203125, 'max_latency_ms': 1966.0136699676514, 'p95_latency_ms': np.float64(1677.8094530105593), 'p99_latency_ms': np.float64(1908.3728265762325), 'total_time_s': 4.941326141357422, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.03727295215254124, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,claude-3-5-sonnet,1381.3772201538086,3.283979343278356,0.0,0.0,1.0,0,20,5,1759345768.7153368,0.03071999999999999,10240,0.957817098536435,"{'min_latency_ms': 543.0643558502197, 'max_latency_ms': 1937.4654293060303, 'p95_latency_ms': np.float64(1931.4598441123962), 'p99_latency_ms': np.float64(1936.2643122673035), 'total_time_s': 6.090172290802002, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.044335695599357156, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,claude-3-5-sonnet,1314.3961310386658,3.5243521468336656,0.0,0.0,1.0,0,20,5,1759345778.6269403,0.03071999999999999,10240,0.9749641888502683,"{'min_latency_ms': 535.1722240447998, 'max_latency_ms': 1983.6831092834473, 'p95_latency_ms': np.float64(1918.512487411499), 'p99_latency_ms': np.float64(1970.6489849090576), 'total_time_s': 5.674801826477051, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.03856740540886548, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,claude-3-5-sonnet,1120.720875263214,3.7028070875807546,0.0,0.0,1.0,0,20,5,1759345788.3161702,0.03071999999999999,10240,0.9344569749738585,"{'min_latency_ms': 207.9324722290039, 'max_latency_ms': 2018.561601638794, 'p95_latency_ms': np.float64(1963.4979844093323), 'p99_latency_ms': np.float64(2007.5488781929016), 'total_time_s': 5.401307582855225, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.04750434388073592, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,claude-3-haiku,1268.5401320457458,3.539921687652236,0.0,0.0,1.0,0,20,5,1759345797.6495905,0.0256,10240,0.8406194607723803,"{'min_latency_ms': 534.9514484405518, 'max_latency_ms': 1956.9103717803955, 'p95_latency_ms': np.float64(1938.3319020271301), 'p99_latency_ms': np.float64(1953.1946778297424), 'total_time_s': 5.6498425006866455, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.053962632063170944, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,claude-3-haiku,1377.644693851471,3.189212271479164,0.0,0.0,1.0,0,20,5,1759345808.2179801,0.0256,10240,0.8370154862115219,"{'min_latency_ms': 661.4456176757812, 'max_latency_ms': 2013.9634609222412, 'p95_latency_ms': np.float64(1985.2455973625183), 'p99_latency_ms': np.float64(2008.2198882102966), 'total_time_s': 6.271141052246094, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.057589803133820325, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,claude-3-haiku,1161.9974493980408,3.6778795132801156,0.0,0.0,1.0,0,20,5,1759345817.2541294,0.0256,10240,0.8421329247896683,"{'min_latency_ms': 549.6580600738525, 'max_latency_ms': 1785.23588180542, 'p95_latency_ms': np.float64(1730.9520959854126), 'p99_latency_ms': np.float64(1774.3791246414185), 'total_time_s': 5.437916040420532, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.05774508247670216, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,claude-3-haiku,1365.4750227928162,2.998821435629251,0.0,0.0,1.0,0,20,5,1759345827.8750126,0.0256,10240,0.8483772503724578,"{'min_latency_ms': 767.146110534668, 'max_latency_ms': 1936.8767738342285, 'p95_latency_ms': np.float64(1919.3583130836487), 'p99_latency_ms': np.float64(1933.3730816841125), 'total_time_s': 6.669286727905273, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.05705131022796498, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,claude-3-sonnet,1360.187566280365,3.089520735450049,0.0,0.0,1.0,0,20,5,1759345837.7737727,0.15360000000000001,10240,0.8835217044830507,"{'min_latency_ms': 550.3547191619873, 'max_latency_ms': 1977.1480560302734, 'p95_latency_ms': np.float64(1924.659264087677), 'p99_latency_ms': np.float64(1966.6502976417542), 'total_time_s': 6.473495960235596, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.058452629496046606, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,claude-3-sonnet,1256.138801574707,3.4732685564079335,0.0,0.0,1.0,0,20,5,1759345848.5701082,0.15360000000000001,10240,0.8863139635356961,"{'min_latency_ms': 641.2796974182129, 'max_latency_ms': 1980.7326793670654, 'p95_latency_ms': np.float64(1846.4025855064392), 'p99_latency_ms': np.float64(1953.86666059494), 'total_time_s': 5.758264780044556, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.05783521510861833, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,claude-3-sonnet,1306.07008934021,3.5020347317551495,0.0,0.0,1.0,0,20,5,1759345858.6472163,0.15360000000000001,10240,0.9094961422561505,"{'min_latency_ms': 591.8083190917969, 'max_latency_ms': 1971.1270332336426, 'p95_latency_ms': np.float64(1944.3620324134827), 'p99_latency_ms': np.float64(1965.7740330696106), 'total_time_s': 5.710965633392334, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.042442911768923584, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,claude-3-sonnet,1307.1481943130493,3.262938882676132,0.0,0.0,1.0,0,20,5,1759345869.905544,0.15360000000000001,10240,0.8938240662052681,"{'min_latency_ms': 646.7251777648926, 'max_latency_ms': 1990.9627437591553, 'p95_latency_ms': np.float64(1935.0676536560059), 'p99_latency_ms': np.float64(1979.7837257385254), 'total_time_s': 6.129443645477295, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.04247877605865338, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,gemini-1.5-pro,1401.3476371765137,2.943218490521141,0.0,0.0,1.0,0,20,5,1759345881.238218,0.0128,10240,0.9409363720199192,"{'min_latency_ms': 520.9827423095703, 'max_latency_ms': 1970.2589511871338, 'p95_latency_ms': np.float64(1958.1118822097778), 'p99_latency_ms': np.float64(1967.8295373916626), 'total_time_s': 6.7952821254730225, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05267230653872383, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gemini-1.5-pro,1341.485834121704,3.3982951582179024,0.0,0.0,1.0,0,20,5,1759345889.5553467,0.0128,10240,0.9355344625586725,"{'min_latency_ms': 503.9515495300293, 'max_latency_ms': 1978.0657291412354, 'p95_latency_ms': np.float64(1966.320013999939), 'p99_latency_ms': np.float64(1975.716586112976), 'total_time_s': 5.885303974151611, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.054780000845711954, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gemini-1.5-pro,1344.3536400794983,3.445457146125384,0.0,0.0,1.0,0,20,5,1759345898.4512925,0.0128,10240,0.9276983017835836,"{'min_latency_ms': 615.3252124786377, 'max_latency_ms': 1981.612205505371, 'p95_latency_ms': np.float64(1803.935217857361), 'p99_latency_ms': np.float64(1946.0768079757688), 'total_time_s': 5.8047449588775635, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05905363250623063, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gemini-1.5-pro,1202.2199511528015,3.696869831400932,0.0,0.0,1.0,0,20,5,1759345907.5707264,0.0128,10240,0.9307740387961949,"{'min_latency_ms': 589.9953842163086, 'max_latency_ms': 1967.3075675964355, 'p95_latency_ms': np.float64(1913.6008977890015), 'p99_latency_ms': np.float64(1956.5662336349487), 'total_time_s': 5.409982204437256, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.04978369465928124, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,gemini-1.5-flash,1053.9512276649475,3.823265280376166,0.0,0.0,1.0,0,20,5,1759345915.0947819,0.007679999999999998,10240,0.8813998853517441,"{'min_latency_ms': -36.76271438598633, 'max_latency_ms': 1967.0710563659668, 'p95_latency_ms': np.float64(1855.4362535476685), 'p99_latency_ms': np.float64(1944.744095802307), 'total_time_s': 5.231130599975586, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.050008698196664016, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gemini-1.5-flash,1155.3911447525024,3.615636866719992,0.0,0.0,1.0,0,20,5,1759345925.0694563,0.007679999999999998,10240,0.9025102091839412,"{'min_latency_ms': 502.6116371154785, 'max_latency_ms': 1947.0453262329102, 'p95_latency_ms': np.float64(1765.414369106293), 'p99_latency_ms': np.float64(1910.7191348075864), 'total_time_s': 5.531528949737549, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.059194105459554974, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gemini-1.5-flash,1217.6612257957458,3.756965086673101,0.0,0.0,1.0,0,20,5,1759345934.1183383,0.007679999999999998,10240,0.8709830012564668,"{'min_latency_ms': 560.8868598937988, 'max_latency_ms': 2007.932424545288, 'p95_latency_ms': np.float64(1776.0017752647402), 'p99_latency_ms': np.float64(1961.5462946891782), 'total_time_s': 5.323445796966553, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.052873446152615404, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gemini-1.5-flash,1351.5228390693665,3.367995990496259,0.0,0.0,1.0,0,20,5,1759345942.2099788,0.007679999999999998,10240,0.872315613940513,"{'min_latency_ms': 689.1014575958252, 'max_latency_ms': 1980.147361755371, 'p95_latency_ms': np.float64(1956.2964797019958), 'p99_latency_ms': np.float64(1975.377185344696), 'total_time_s': 5.938249349594116, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.05361394744479093, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,llama-3.1-8b,1306.591236591339,3.3070039261320594,0.0,0.0,1.0,0,20,5,1759345952.8692935,0.002048000000000001,10240,0.7778348786353027,"{'min_latency_ms': 555.4070472717285, 'max_latency_ms': 1988.0244731903076, 'p95_latency_ms': np.float64(1957.3988199234009), 'p99_latency_ms': np.float64(1981.8993425369263), 'total_time_s': 6.047770261764526, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.05832225784189981, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,llama-3.1-8b,1199.6222853660583,3.634358086220239,0.0,0.0,1.0,0,20,5,1759345963.5152647,0.002048000000000001,10240,0.7696592403957419,"{'min_latency_ms': 541.0621166229248, 'max_latency_ms': 1914.41011428833, 'p95_latency_ms': np.float64(1768.0468797683716), 'p99_latency_ms': np.float64(1885.1374673843382), 'total_time_s': 5.503035068511963, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.06176209698043544, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,llama-3.1-8b,1143.358552455902,4.173916297150752,0.0,0.0,1.0,0,20,5,1759345973.8406181,0.002048000000000001,10240,0.7857043630038748,"{'min_latency_ms': 631.817102432251, 'max_latency_ms': 1720.1111316680908, 'p95_latency_ms': np.float64(1547.544610500336), 'p99_latency_ms': np.float64(1685.5978274345396), 'total_time_s': 4.791662931442261, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.06142254552174686, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,llama-3.1-8b,1228.6048531532288,3.613465135130269,0.0,0.0,1.0,0,20,5,1759345982.2759545,0.002048000000000001,10240,0.7706622409066766,"{'min_latency_ms': 539.0913486480713, 'max_latency_ms': 1971.7633724212646, 'p95_latency_ms': np.float64(1819.2362308502197), 'p99_latency_ms': np.float64(1941.2579441070554), 'total_time_s': 5.534853458404541, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.05320944570994387, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,llama-3.1-70b,1424.0724563598633,2.989394263900763,0.0,0.0,1.0,0,20,5,1759345993.4949126,0.008192000000000005,10240,0.8731561293258354,"{'min_latency_ms': 700.6974220275879, 'max_latency_ms': 1959.3937397003174, 'p95_latency_ms': np.float64(1924.493396282196), 'p99_latency_ms': np.float64(1952.4136710166931), 'total_time_s': 6.690318584442139, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.0352234743129485, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,llama-3.1-70b,1090.003514289856,4.145917207566353,0.0,0.0,1.0,0,20,5,1759346002.3353932,0.008192000000000005,10240,0.8796527768140011,"{'min_latency_ms': 508.23211669921875, 'max_latency_ms': 1798.6392974853516, 'p95_latency_ms': np.float64(1785.5579257011414), 'p99_latency_ms': np.float64(1796.0230231285095), 'total_time_s': 4.824023008346558, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.06407982743031454, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,llama-3.1-70b,964.3666982650757,4.70392645090585,0.0,0.0,1.0,0,20,5,1759346010.6974216,0.008192000000000005,10240,0.8992009479579495,"{'min_latency_ms': 135.56504249572754, 'max_latency_ms': 1794.3906784057617, 'p95_latency_ms': np.float64(1775.5030393600464), 'p99_latency_ms': np.float64(1790.6131505966187), 'total_time_s': 4.251767158508301, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.050182727925105516, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,llama-3.1-70b,1258.9476823806763,3.653831604110515,0.125,0.0,1.0,0,20,5,1759346020.388094,0.008192000000000005,10240,0.8930892849911802,"{'min_latency_ms': 620.0413703918457, 'max_latency_ms': 1916.384220123291, 'p95_latency_ms': np.float64(1765.2448296546936), 'p99_latency_ms': np.float64(1886.1563420295713), 'total_time_s': 5.473706007003784, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.04969618373257882, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gpt-4o-mini,1273.702096939087,0.7851086796926611,0.0,0.0,1.0,0,10,1,1759346033.2373884,0.0007680000000000001,5120,0.8342026655690804,"{'min_latency_ms': 741.3482666015625, 'max_latency_ms': 1817.1906471252441, 'p95_latency_ms': np.float64(1794.5520520210266), 'p99_latency_ms': np.float64(1812.6629281044006), 'total_time_s': 12.737090110778809, 'initial_memory_mb': 294.5546875, 'final_memory_mb': 294.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000001e-05, 'quality_std': 0.0446055902590032, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gpt-4o-mini,1511.399483680725,2.933763102440156,0.25,0.0,1.0,0,10,6,1759346036.647214,0.0007680000000000001,5120,0.8471277213854321,"{'min_latency_ms': 800.0023365020752, 'max_latency_ms': 1982.2335243225098, 'p95_latency_ms': np.float64(1942.5656914710999), 'p99_latency_ms': np.float64(1974.2999577522278), 'total_time_s': 3.4085915088653564, 'initial_memory_mb': 294.5546875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000001e-05, 'quality_std': 0.06432848764341552, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gpt-4o,1150.0491619110107,0.8695228900132853,0.0,0.0,1.0,0,10,1,1759346048.2587333,0.0256,5120,0.9599583095352598,"{'min_latency_ms': 544.191837310791, 'max_latency_ms': 1584.9177837371826, 'p95_latency_ms': np.float64(1511.2051010131834), 'p99_latency_ms': np.float64(1570.1752471923828), 'total_time_s': 11.50055980682373, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.057087428808928614, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gpt-4o,1241.9081926345825,3.22981029743519,0.0,0.0,1.0,0,10,6,1759346051.3563757,0.0256,5120,0.9585199558650109,"{'min_latency_ms': 644.8915004730225, 'max_latency_ms': 1933.1202507019043, 'p95_latency_ms': np.float64(1865.2720570564268), 'p99_latency_ms': np.float64(1919.5506119728088), 'total_time_s': 3.0961570739746094, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.04062204558012218, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gpt-4-turbo,1581.8750381469727,0.6321581179029606,0.0,0.0,1.0,0,10,1,1759346067.3017964,0.0512,5120,0.9324427514695872,"{'min_latency_ms': 833.935022354126, 'max_latency_ms': 2019.5622444152832, 'p95_latency_ms': np.float64(1978.4671545028687), 'p99_latency_ms': np.float64(2011.3432264328003), 'total_time_s': 15.818827152252197, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.04654046504268862, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gpt-4-turbo,1153.432297706604,3.2168993240245847,0.0,0.0,1.0,0,10,6,1759346070.4116762,0.0512,5120,0.9790878168553954,"{'min_latency_ms': 635.2591514587402, 'max_latency_ms': 1833.7628841400146, 'p95_latency_ms': np.float64(1808.298635482788), 'p99_latency_ms': np.float64(1828.6700344085693), 'total_time_s': 3.108583450317383, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.038783270511690816, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,claude-3-5-sonnet,1397.6783752441406,0.7154680102707422,0.0,0.0,1.0,0,10,1,1759346084.5017824,0.015359999999999999,5120,0.9421283071854264,"{'min_latency_ms': 532.8092575073242, 'max_latency_ms': 2028.5301208496094, 'p95_latency_ms': np.float64(1968.815779685974), 'p99_latency_ms': np.float64(2016.5872526168823), 'total_time_s': 13.976865291595459, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999998, 'quality_std': 0.041911119259679885, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,claude-3-5-sonnet,1215.26198387146,3.6278421983995233,0.0,0.0,1.0,0,10,6,1759346087.2596216,0.015359999999999999,5120,0.9131170426955485,"{'min_latency_ms': 568.2053565979004, 'max_latency_ms': 1612.9648685455322, 'p95_latency_ms': np.float64(1559.6276402473447), 'p99_latency_ms': np.float64(1602.2974228858948), 'total_time_s': 2.7564594745635986, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999998, 'quality_std': 0.04319876804321411, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,claude-3-haiku,1299.2276906967163,0.7696826190331395,0.0,0.0,1.0,0,10,1,1759346100.364407,0.0128,5120,0.8252745814485088,"{'min_latency_ms': 668.3671474456787, 'max_latency_ms': 2041.351318359375, 'p95_latency_ms': np.float64(1843.0875778198238), 'p99_latency_ms': np.float64(2001.6985702514648), 'total_time_s': 12.992368221282959, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.058205855327116265, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,claude-3-haiku,1297.508192062378,3.6581654644321087,0.0,0.0,1.0,0,10,6,1759346103.0993996,0.0128,5120,0.8496515913760503,"{'min_latency_ms': 649.4293212890625, 'max_latency_ms': 1873.1675148010254, 'p95_latency_ms': np.float64(1843.8988208770752), 'p99_latency_ms': np.float64(1867.3137760162354), 'total_time_s': 2.7336106300354004, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.06872259975771335, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,claude-3-sonnet,1239.8123741149902,0.8065692205263874,0.0,0.0,1.0,0,10,1,1759346114.9650035,0.07680000000000001,5120,0.8917269647002374,"{'min_latency_ms': 559.9334239959717, 'max_latency_ms': 1828.9196491241455, 'p95_latency_ms': np.float64(1804.089903831482), 'p99_latency_ms': np.float64(1823.9537000656128), 'total_time_s': 12.398191928863525, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.06728256480558785, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,claude-3-sonnet,1325.3875255584717,3.2305613290400945,0.0,0.0,1.0,0,10,6,1759346118.062173,0.07680000000000001,5120,0.8904253939966993,"{'min_latency_ms': 598.4294414520264, 'max_latency_ms': 1956.3815593719482, 'p95_latency_ms': np.float64(1906.8223834037778), 'p99_latency_ms': np.float64(1946.4697241783142), 'total_time_s': 3.0954372882843018, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.06220445402424322, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gemini-1.5-pro,1264.2754554748535,0.7909630217832475,0.0,0.0,1.0,0,10,1,1759346130.8282964,0.0064,5120,0.8998460053229075,"{'min_latency_ms': 532.9890251159668, 'max_latency_ms': 1795.492172241211, 'p95_latency_ms': np.float64(1745.6329107284544), 'p99_latency_ms': np.float64(1785.5203199386597), 'total_time_s': 12.642816066741943, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.04050886994282564, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gemini-1.5-pro,1342.9006338119507,3.7829150181123015,0.0,0.0,1.0,0,10,6,1759346133.472956,0.0064,5120,0.9029938738274873,"{'min_latency_ms': 701.9498348236084, 'max_latency_ms': 1964.576005935669, 'p95_latency_ms': np.float64(1872.5560665130613), 'p99_latency_ms': np.float64(1946.1720180511475), 'total_time_s': 2.6434640884399414, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05723923041822323, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gemini-1.5-flash,1368.2588577270508,0.7308515907093506,0.0,0.0,1.0,0,10,1,1759346147.2717574,0.0038399999999999997,5120,0.8795901650694117,"{'min_latency_ms': 620.3913688659668, 'max_latency_ms': 2018.2685852050781, 'p95_latency_ms': np.float64(1993.7742233276367), 'p99_latency_ms': np.float64(2013.3697128295898), 'total_time_s': 13.682668447494507, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00038399999999999996, 'quality_std': 0.05927449072307118, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,gemini-1.5-flash,1207.8629732131958,3.2879592824302044,0.0,0.0,1.0,0,10,6,1759346150.314617,0.0038399999999999997,5120,0.8611774574826484,"{'min_latency_ms': 594.973087310791, 'max_latency_ms': 1811.2657070159912, 'p95_latency_ms': np.float64(1681.6352963447569), 'p99_latency_ms': np.float64(1785.3396248817444), 'total_time_s': 3.041400194168091, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00038399999999999996, 'quality_std': 0.07904328865026665, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,llama-3.1-8b,1144.2910194396973,0.8738903631276332,0.0,0.0,1.0,0,10,1,1759346161.882389,0.0010240000000000002,5120,0.7805684315735588,"{'min_latency_ms': 594.846248626709, 'max_latency_ms': 1759.0994834899902, 'p95_latency_ms': np.float64(1631.7564606666563), 'p99_latency_ms': np.float64(1733.6308789253235), 'total_time_s': 11.443083047866821, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000002, 'quality_std': 0.0613021253594286, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,llama-3.1-8b,1128.666615486145,3.527006383973853,0.0,0.0,1.0,0,10,6,1759346164.7190907,0.0010240000000000002,5120,0.7915276538063776,"{'min_latency_ms': 610.3026866912842, 'max_latency_ms': 1934.2899322509766, 'p95_latency_ms': np.float64(1909.2738270759583), 'p99_latency_ms': np.float64(1929.286711215973), 'total_time_s': 2.835265636444092, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000002, 'quality_std': 0.055242108041169316, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,llama-3.1-70b,1341.410732269287,0.7454805363345477,0.0,0.0,1.0,0,10,1,1759346178.2571824,0.004096000000000001,5120,0.8513858389112968,"{'min_latency_ms': 566.3845539093018, 'max_latency_ms': 1769.1750526428223, 'p95_latency_ms': np.float64(1743.9924359321594), 'p99_latency_ms': np.float64(1764.1385293006897), 'total_time_s': 13.414166450500488, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004096000000000001, 'quality_std': 0.06286695897481548, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,concurrent_test,llama-3.1-70b,1410.3811264038086,3.52022788340447,0.0,0.0,1.0,0,10,6,1759346181.0992308,0.004096000000000001,5120,0.8534058400920448,"{'min_latency_ms': 572.9773044586182, 'max_latency_ms': 1928.0850887298584, 'p95_latency_ms': np.float64(1903.529143333435), 'p99_latency_ms': np.float64(1923.1738996505737), 'total_time_s': 2.8407251834869385, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004096000000000001, 'quality_std': 0.059750620144052545, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4o-mini,1177.2440481185913,3.97501008701798,0.0,0.0,1.0,0,50,5,1759346193.7901201,0.0038400000000000023,25600,0.8512259391579574,"{'min_latency_ms': 537.5485420227051, 'max_latency_ms': 2001.0862350463867, 'p95_latency_ms': np.float64(1892.5400853157041), 'p99_latency_ms': np.float64(1985.4257130622864), 'total_time_s': 12.578584432601929, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.0581968026848211, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4o-mini,1229.8026752471924,3.9282369679460363,0.0,0.0,1.0,0,50,5,1759346206.6300905,0.0038400000000000023,25600,0.8537868196468017,"{'min_latency_ms': 518.6026096343994, 'max_latency_ms': 1944.331407546997, 'p95_latency_ms': np.float64(1909.6850633621214), 'p99_latency_ms': np.float64(1940.652117729187), 'total_time_s': 12.72835636138916, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.05181407518487485, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4o-mini,1274.8144483566284,3.7483119966709824,0.0,0.0,1.0,0,50,5,1759346220.0900073,0.0038400000000000023,25600,0.8487480924622282,"{'min_latency_ms': 529.292106628418, 'max_latency_ms': 1996.4158535003662, 'p95_latency_ms': np.float64(1960.6919050216675), 'p99_latency_ms': np.float64(1988.2149648666382), 'total_time_s': 13.339337825775146, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.05812899461310237, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4o,1174.5057010650635,4.0514136389986115,0.0,0.0,1.0,0,50,5,1759346232.557784,0.12800000000000017,25600,0.9484191580718665,"{'min_latency_ms': 286.58127784729004, 'max_latency_ms': 1877.345085144043, 'p95_latency_ms': np.float64(1735.1435780525208), 'p99_latency_ms': np.float64(1842.000467777252), 'total_time_s': 12.341371297836304, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.0491398572941036, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4o,1225.388593673706,3.875932429633176,0.125,0.0,1.0,0,50,5,1759346245.5669534,0.12800000000000017,25600,0.9557179217710832,"{'min_latency_ms': 514.6803855895996, 'max_latency_ms': 2034.6620082855225, 'p95_latency_ms': np.float64(1909.4360709190366), 'p99_latency_ms': np.float64(2010.34743309021), 'total_time_s': 12.900121688842773, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.04870463047338363, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4o,1244.0021991729736,3.7266446101546777,0.0,0.0,1.0,0,50,5,1759346259.1414776,0.12800000000000017,25600,0.9458944372937584,"{'min_latency_ms': 521.9912528991699, 'max_latency_ms': 1986.6855144500732, 'p95_latency_ms': np.float64(1953.3554077148438), 'p99_latency_ms': np.float64(1978.9683985710144), 'total_time_s': 13.416895151138306, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.04851286804634898, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4-turbo,1181.3615322113037,4.124998416603219,0.0,0.0,1.0,0,50,5,1759346271.374578,0.25600000000000034,25600,0.9651345363111258,"{'min_latency_ms': 353.2071113586426, 'max_latency_ms': 1966.524362564087, 'p95_latency_ms': np.float64(1945.0057744979858), 'p99_latency_ms': np.float64(1965.7717752456665), 'total_time_s': 12.121216773986816, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04338778763022959, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4-turbo,1291.4055681228638,3.77552400952112,0.0,0.0,1.0,0,50,5,1759346284.731812,0.25600000000000034,25600,0.9689389907566063,"{'min_latency_ms': 555.095911026001, 'max_latency_ms': 2027.0910263061523, 'p95_latency_ms': np.float64(1966.5393114089964), 'p99_latency_ms': np.float64(2018.9284563064575), 'total_time_s': 13.243194818496704, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04154143035607859, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gpt-4-turbo,1261.4208269119263,3.663208321130074,0.0,0.0,1.0,0,50,5,1759346298.4905493,0.25600000000000034,25600,0.9573488473081913,"{'min_latency_ms': 284.8320007324219, 'max_latency_ms': 2011.866807937622, 'p95_latency_ms': np.float64(1975.5298137664795), 'p99_latency_ms': np.float64(2000.7115292549133), 'total_time_s': 13.649237394332886, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04380501534660363, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-5-sonnet,1270.3543138504028,3.7944320989090614,0.0,0.0,1.0,0,50,5,1759346311.7936022,0.07680000000000001,25600,0.948463600922609,"{'min_latency_ms': 622.9770183563232, 'max_latency_ms': 1970.0510501861572, 'p95_latency_ms': np.float64(1868.455410003662), 'p99_latency_ms': np.float64(1957.5506472587585), 'total_time_s': 13.177202463150024, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.04872900892927657, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-5-sonnet,1154.527621269226,4.107802148818313,0.0,0.0,1.0,0,50,5,1759346324.0782034,0.07680000000000001,25600,0.9535056752128789,"{'min_latency_ms': 526.8404483795166, 'max_latency_ms': 1841.3877487182617, 'p95_latency_ms': np.float64(1815.3946280479431), 'p99_latency_ms': np.float64(1837.1384692192078), 'total_time_s': 12.171959161758423, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.04600056992617095, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-5-sonnet,1341.6658163070679,3.5050325493977805,0.0,0.0,1.0,0,50,5,1759346338.4560573,0.07680000000000001,25600,0.947231761746643,"{'min_latency_ms': 607.1841716766357, 'max_latency_ms': 1968.3496952056885, 'p95_latency_ms': np.float64(1938.420307636261), 'p99_latency_ms': np.float64(1963.8122081756592), 'total_time_s': 14.265202760696411, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.0468041040494112, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-haiku,1268.9041805267334,3.6527405734902607,0.125,0.0,1.0,0,50,5,1759346352.2760284,0.06400000000000008,25600,0.8657832919908838,"{'min_latency_ms': 576.9007205963135, 'max_latency_ms': 1978.3263206481934, 'p95_latency_ms': np.float64(1900.9657382965088), 'p99_latency_ms': np.float64(1977.4397349357605), 'total_time_s': 13.688352346420288, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.05791027367020173, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-haiku,1273.6989831924438,3.7602543777430877,0.0,0.0,1.0,0,50,5,1759346365.681829,0.06400000000000008,25600,0.8396294693060197,"{'min_latency_ms': 521.7316150665283, 'max_latency_ms': 1988.7199401855469, 'p95_latency_ms': np.float64(1945.9344744682312), 'p99_latency_ms': np.float64(1987.1683859825134), 'total_time_s': 13.296972751617432, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.06291349263235946, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-haiku,1234.9269914627075,3.9335082345318124,0.0,0.0,1.0,0,50,5,1759346378.5192664,0.06400000000000008,25600,0.8469784358915146,"{'min_latency_ms': 529.503345489502, 'max_latency_ms': 1981.7008972167969, 'p95_latency_ms': np.float64(1859.1547846794128), 'p99_latency_ms': np.float64(1963.3227896690369), 'total_time_s': 12.711299180984497, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.061722943046806616, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-sonnet,1195.9008169174194,4.06962738382444,0.0,0.0,1.0,0,50,5,1759346390.9144897,0.3840000000000003,25600,0.9026531444228556,"{'min_latency_ms': -36.6673469543457, 'max_latency_ms': 1991.610050201416, 'p95_latency_ms': np.float64(1819.4202184677124), 'p99_latency_ms': np.float64(1987.222683429718), 'total_time_s': 12.286137104034424, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.058229589360407986, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-sonnet,1372.0379829406738,3.502253345465805,0.0,0.0,1.0,0,50,5,1759346405.3043494,0.3840000000000003,25600,0.8837364473272626,"{'min_latency_ms': 543.1270599365234, 'max_latency_ms': 1992.779016494751, 'p95_latency_ms': np.float64(1931.822681427002), 'p99_latency_ms': np.float64(1987.4089169502258), 'total_time_s': 14.276522874832153, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.05634614113838598, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,claude-3-sonnet,1257.2709035873413,3.7764857062182706,0.0,0.0,1.0,0,50,5,1759346418.6521854,0.3840000000000003,25600,0.9053414058751514,"{'min_latency_ms': 529.8404693603516, 'max_latency_ms': 1990.1280403137207, 'p95_latency_ms': np.float64(1911.1806631088257), 'p99_latency_ms': np.float64(1976.6331052780151), 'total_time_s': 13.239822387695312, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.050506656009957705, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gemini-1.5-pro,1221.5951490402222,3.8372908969845323,0.0,0.0,1.0,0,50,5,1759346431.7921565,0.03200000000000004,25600,0.9365925291921394,"{'min_latency_ms': 329.1811943054199, 'max_latency_ms': 1995.384693145752, 'p95_latency_ms': np.float64(1965.0332808494568), 'p99_latency_ms': np.float64(1988.3063769340515), 'total_time_s': 13.030025959014893, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.04847128641002876, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gemini-1.5-pro,1351.8355464935303,3.6227975436552606,0.0,0.0,1.0,0,50,5,1759346445.7126448,0.03200000000000004,25600,0.9323552590826123,"{'min_latency_ms': 515.129566192627, 'max_latency_ms': 2008.0702304840088, 'p95_latency_ms': np.float64(1958.6564779281616), 'p99_latency_ms': np.float64(2004.1296029090881), 'total_time_s': 13.801488876342773, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.055840796126395656, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gemini-1.5-pro,1240.622534751892,3.8813384098374453,0.0,0.0,1.0,0,50,5,1759346458.7192729,0.03200000000000004,25600,0.9407390543744837,"{'min_latency_ms': -29.146671295166016, 'max_latency_ms': 1934.4398975372314, 'p95_latency_ms': np.float64(1849.7230291366577), 'p99_latency_ms': np.float64(1918.0084466934204), 'total_time_s': 12.8821542263031, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.050597003908357786, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gemini-1.5-flash,1237.6702642440796,3.812923495644346,0.0,0.0,1.0,0,50,5,1759346471.9588974,0.019200000000000002,25600,0.8556073429019542,"{'min_latency_ms': 536.4787578582764, 'max_latency_ms': 2010.1728439331055, 'p95_latency_ms': np.float64(1911.8669629096985), 'p99_latency_ms': np.float64(1976.080708503723), 'total_time_s': 13.113297462463379, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.06082135675952047, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gemini-1.5-flash,1180.0980806350708,4.016049090832003,0.0,0.0,1.0,0,50,5,1759346484.5327744,0.019200000000000002,25600,0.8718428063415768,"{'min_latency_ms': 109.58051681518555, 'max_latency_ms': 1993.358850479126, 'p95_latency_ms': np.float64(1872.3165988922117), 'p99_latency_ms': np.float64(1992.416422367096), 'total_time_s': 12.450047016143799, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.0613916834940056, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,gemini-1.5-flash,1194.4490098953247,4.009936119483076,0.0,0.0,1.0,0,50,5,1759346497.1201088,0.019200000000000002,25600,0.8652112059805899,"{'min_latency_ms': 520.3211307525635, 'max_latency_ms': 1942.4259662628174, 'p95_latency_ms': np.float64(1834.6370577812195), 'p99_latency_ms': np.float64(1890.3984904289243), 'total_time_s': 12.469026565551758, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.05312368368226588, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,llama-3.1-8b,1306.2016773223877,3.683763547696555,0.0,0.0,1.0,0,50,5,1759346510.812732,0.005119999999999998,25600,0.7727309350554936,"{'min_latency_ms': 527.4953842163086, 'max_latency_ms': 1997.086524963379, 'p95_latency_ms': np.float64(1942.7793741226194), 'p99_latency_ms': np.float64(1994.0643763542175), 'total_time_s': 13.573075294494629, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.05596283861854901, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,llama-3.1-8b,1304.1251468658447,3.617383744773005,0.0,0.0,1.0,0,50,5,1759346524.7711937,0.005119999999999998,25600,0.785787220179362,"{'min_latency_ms': 112.00571060180664, 'max_latency_ms': 2015.146255493164, 'p95_latency_ms': np.float64(2001.4938592910767), 'p99_latency_ms': np.float64(2012.321424484253), 'total_time_s': 13.822144269943237, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.0552285639827787, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,llama-3.1-8b,1290.5346298217773,3.671522710311051,0.0,0.0,1.0,0,50,5,1759346538.5084107,0.005119999999999998,25600,0.7771978709125356,"{'min_latency_ms': 565.7510757446289, 'max_latency_ms': 1945.1093673706055, 'p95_latency_ms': np.float64(1906.785237789154), 'p99_latency_ms': np.float64(1942.4526476860046), 'total_time_s': 13.618327856063843, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.057252814774054535, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,llama-3.1-70b,1213.9334726333618,3.947675276737486,0.0,0.0,1.0,0,50,5,1759346551.2951744,0.02047999999999999,25600,0.8683286341213061,"{'min_latency_ms': -79.86569404602051, 'max_latency_ms': 2014.9149894714355, 'p95_latency_ms': np.float64(1919.9433565139768), 'p99_latency_ms': np.float64(1992.4925136566162), 'total_time_s': 12.665682077407837, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.05862810413022958, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,llama-3.1-70b,1298.1958770751953,3.7049711897976763,0.0,0.0,1.0,0,50,5,1759346564.9280033,0.02047999999999999,25600,0.8889975698232048,"{'min_latency_ms': 503.5574436187744, 'max_latency_ms': 2020.4124450683594, 'p95_latency_ms': np.float64(1901.4497756958008), 'p99_latency_ms': np.float64(1986.3133001327512), 'total_time_s': 13.495381593704224, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.053463278827038344, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
5,memory_test,llama-3.1-70b,1187.040138244629,4.165139112812611,0.0,0.0,1.0,0,50,5,1759346577.0467978,0.02047999999999999,25600,0.8884529182459214,"{'min_latency_ms': 506.2377452850342, 'max_latency_ms': 2026.6106128692627, 'p95_latency_ms': np.float64(1958.3556652069092), 'p99_latency_ms': np.float64(2007.5032830238342), 'total_time_s': 12.004400968551636, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.05625669416735748, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
agent_count,test_name,model_name,latency_ms,throughput_rps,memory_usage_mb,cpu_usage_percent,success_rate,error_count,total_requests,concurrent_requests,timestamp,cost_usd,tokens_used,response_quality_score,additional_metrics,agent_creation_time,tool_registration_time,execution_time,total_latency,chaining_steps,chaining_success,error_scenarios_tested,recovery_rate,resource_cycles,avg_memory_delta,memory_leak_detected
1,scaling_test,gpt-4o-mini,1131.7063331604004,4.131429224630576,1.25,0.0,1.0,0,20,5,1759345643.9453266,0.0015359999999999996,10240,0.8548663728748707,"{'min_latency_ms': 562.7951622009277, 'max_latency_ms': 1780.4391384124756, 'p95_latency_ms': np.float64(1744.0685987472534), 'p99_latency_ms': np.float64(1773.1650304794312), 'total_time_s': 4.84093976020813, 'initial_memory_mb': 291.5546875, 'final_memory_mb': 292.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.0675424923987846, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gpt-4o-mini,1175.6950378417969,3.7575854004826277,0.0,0.0,1.0,0,20,5,1759345654.225195,0.0015359999999999996,10240,0.8563524483655013,"{'min_latency_ms': 535.4223251342773, 'max_latency_ms': 1985.3930473327637, 'p95_latency_ms': np.float64(1975.6355285644531), 'p99_latency_ms': np.float64(1983.4415435791016), 'total_time_s': 5.322566986083984, 'initial_memory_mb': 293.1796875, 'final_memory_mb': 293.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.05770982402152013, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gpt-4o-mini,996.9684720039368,4.496099509029146,0.0,0.0,1.0,0,20,5,1759345662.8977199,0.0015359999999999996,10240,0.8844883644941982,"{'min_latency_ms': 45.22204399108887, 'max_latency_ms': 1962.2983932495117, 'p95_latency_ms': np.float64(1647.7753758430483), 'p99_latency_ms': np.float64(1899.3937897682185), 'total_time_s': 4.448300123214722, 'initial_memory_mb': 293.5546875, 'final_memory_mb': 293.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.043434832388308614, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gpt-4o-mini,1112.8681421279907,3.587833950074127,0.0,0.0,1.0,0,20,5,1759345673.162652,0.0015359999999999996,10240,0.8563855623109009,"{'min_latency_ms': 564.1369819641113, 'max_latency_ms': 1951.472282409668, 'p95_latency_ms': np.float64(1897.4883794784546), 'p99_latency_ms': np.float64(1940.6755018234253), 'total_time_s': 5.57439398765564, 'initial_memory_mb': 293.8046875, 'final_memory_mb': 293.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.05691925404970228, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
1,scaling_test,gpt-4o,1298.2240080833435,3.3670995599405846,0.125,0.0,1.0,0,20,5,1759345683.2065425,0.0512,10240,0.9279627852934385,"{'min_latency_ms': 693.6078071594238, 'max_latency_ms': 1764.8026943206787, 'p95_latency_ms': np.float64(1681.7602753639221), 'p99_latency_ms': np.float64(1748.1942105293274), 'total_time_s': 5.939830303192139, 'initial_memory_mb': 293.8046875, 'final_memory_mb': 293.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.050879141399088765, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
6,scaling_test,gpt-4o,1264.4854545593262,3.5293826102318846,0.0,0.0,1.0,0,20,5,1759345692.6439528,0.0512,10240,0.9737471278894755,"{'min_latency_ms': 175.65083503723145, 'max_latency_ms': 1990.2207851409912, 'p95_latency_ms': np.float64(1910.3824019432068), 'p99_latency_ms': np.float64(1974.2531085014343), 'total_time_s': 5.66671347618103, 'initial_memory_mb': 293.9296875, 'final_memory_mb': 293.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.038542680129780495, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
11,scaling_test,gpt-4o,1212.0607376098633,3.799000004302323,0.125,0.0,1.0,0,20,5,1759345701.8719423,0.0512,10240,0.9366077507029601,"{'min_latency_ms': 542.8001880645752, 'max_latency_ms': 1973.801851272583, 'p95_latency_ms': np.float64(1969.2555904388428), 'p99_latency_ms': np.float64(1972.892599105835), 'total_time_s': 5.264543294906616, 'initial_memory_mb': 293.9296875, 'final_memory_mb': 294.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.044670864578792276, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
16,scaling_test,gpt-4o,1367.1631932258606,3.1229790107314654,0.0,0.0,1.0,0,20,5,1759345711.9738443,0.0512,10240,0.9328922198254587,"{'min_latency_ms': 715.888261795044, 'max_latency_ms': 1905.6315422058105, 'p95_latency_ms': np.float64(1890.480661392212), 'p99_latency_ms': np.float64(1902.6013660430908), 'total_time_s': 6.404141664505005, 'initial_memory_mb': 294.0546875, 'final_memory_mb': 294.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.05146728864962903, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
10 1 scaling_test gpt-4-turbo 1429.1370868682861 3.3141614744089267 0.125 0.0 1.0 0 20 5 1759345722.7650242 0.1024 10240 0.960928099222926 {'min_latency_ms': 637.6686096191406, 'max_latency_ms': 1994.9300289154053, 'p95_latency_ms': np.float64(1973.6997246742249), 'p99_latency_ms': np.float64(1990.6839680671692), 'total_time_s': 6.0347089767456055, 'initial_memory_mb': 294.0546875, 'final_memory_mb': 294.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.0429193742204114, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
11 6 scaling_test gpt-4-turbo 1167.8012132644653 3.933946564951724 0.0 0.0 1.0 0 20 5 1759345731.809648 0.1024 10240 0.9575695597206497 {'min_latency_ms': 521.2328433990479, 'max_latency_ms': 1973.503828048706, 'p95_latency_ms': np.float64(1931.3542008399963), 'p99_latency_ms': np.float64(1965.073902606964), 'total_time_s': 5.083953142166138, 'initial_memory_mb': 294.1796875, 'final_memory_mb': 294.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.04742414087184447, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
12 11 scaling_test gpt-4-turbo 1435.1954460144043 3.0793869953124613 0.0 0.0 1.0 0 20 5 1759345741.9117725 0.1024 10240 0.9564233524947511 {'min_latency_ms': 711.4903926849365, 'max_latency_ms': 2034.2109203338623, 'p95_latency_ms': np.float64(1998.979663848877), 'p99_latency_ms': np.float64(2027.1646690368652), 'total_time_s': 6.4947991371154785, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.03428874308764032, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
13 16 scaling_test gpt-4-turbo 1092.1013355255127 4.057819053252887 0.0 0.0 1.0 0 20 5 1759345749.8833907 0.1024 10240 0.9521218582720758 {'min_latency_ms': 554.4416904449463, 'max_latency_ms': 1968.658447265625, 'p95_latency_ms': np.float64(1637.098050117493), 'p99_latency_ms': np.float64(1902.346367835998), 'total_time_s': 4.92875599861145, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.043763298033728824, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
14 1 scaling_test claude-3-5-sonnet 1046.9236850738525 4.047496446876068 0.0 0.0 1.0 0 20 5 1759345757.9539518 0.03071999999999999 10240 0.9511838758969231 {'min_latency_ms': 184.94415283203125, 'max_latency_ms': 1966.0136699676514, 'p95_latency_ms': np.float64(1677.8094530105593), 'p99_latency_ms': np.float64(1908.3728265762325), 'total_time_s': 4.941326141357422, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.03727295215254124, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
15 6 scaling_test claude-3-5-sonnet 1381.3772201538086 3.283979343278356 0.0 0.0 1.0 0 20 5 1759345768.7153368 0.03071999999999999 10240 0.957817098536435 {'min_latency_ms': 543.0643558502197, 'max_latency_ms': 1937.4654293060303, 'p95_latency_ms': np.float64(1931.4598441123962), 'p99_latency_ms': np.float64(1936.2643122673035), 'total_time_s': 6.090172290802002, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.044335695599357156, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
16 11 scaling_test claude-3-5-sonnet 1314.3961310386658 3.5243521468336656 0.0 0.0 1.0 0 20 5 1759345778.6269403 0.03071999999999999 10240 0.9749641888502683 {'min_latency_ms': 535.1722240447998, 'max_latency_ms': 1983.6831092834473, 'p95_latency_ms': np.float64(1918.512487411499), 'p99_latency_ms': np.float64(1970.6489849090576), 'total_time_s': 5.674801826477051, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.03856740540886548, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
17 16 scaling_test claude-3-5-sonnet 1120.720875263214 3.7028070875807546 0.0 0.0 1.0 0 20 5 1759345788.3161702 0.03071999999999999 10240 0.9344569749738585 {'min_latency_ms': 207.9324722290039, 'max_latency_ms': 2018.561601638794, 'p95_latency_ms': np.float64(1963.4979844093323), 'p99_latency_ms': np.float64(2007.5488781929016), 'total_time_s': 5.401307582855225, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.04750434388073592, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
18 1 scaling_test claude-3-haiku 1268.5401320457458 3.539921687652236 0.0 0.0 1.0 0 20 5 1759345797.6495905 0.0256 10240 0.8406194607723803 {'min_latency_ms': 534.9514484405518, 'max_latency_ms': 1956.9103717803955, 'p95_latency_ms': np.float64(1938.3319020271301), 'p99_latency_ms': np.float64(1953.1946778297424), 'total_time_s': 5.6498425006866455, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.053962632063170944, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
19 6 scaling_test claude-3-haiku 1377.644693851471 3.189212271479164 0.0 0.0 1.0 0 20 5 1759345808.2179801 0.0256 10240 0.8370154862115219 {'min_latency_ms': 661.4456176757812, 'max_latency_ms': 2013.9634609222412, 'p95_latency_ms': np.float64(1985.2455973625183), 'p99_latency_ms': np.float64(2008.2198882102966), 'total_time_s': 6.271141052246094, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.057589803133820325, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
20 11 scaling_test claude-3-haiku 1161.9974493980408 3.6778795132801156 0.0 0.0 1.0 0 20 5 1759345817.2541294 0.0256 10240 0.8421329247896683 {'min_latency_ms': 549.6580600738525, 'max_latency_ms': 1785.23588180542, 'p95_latency_ms': np.float64(1730.9520959854126), 'p99_latency_ms': np.float64(1774.3791246414185), 'total_time_s': 5.437916040420532, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.05774508247670216, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
21 16 scaling_test claude-3-haiku 1365.4750227928162 2.998821435629251 0.0 0.0 1.0 0 20 5 1759345827.8750126 0.0256 10240 0.8483772503724578 {'min_latency_ms': 767.146110534668, 'max_latency_ms': 1936.8767738342285, 'p95_latency_ms': np.float64(1919.3583130836487), 'p99_latency_ms': np.float64(1933.3730816841125), 'total_time_s': 6.669286727905273, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.05705131022796498, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
22 1 scaling_test claude-3-sonnet 1360.187566280365 3.089520735450049 0.0 0.0 1.0 0 20 5 1759345837.7737727 0.15360000000000001 10240 0.8835217044830507 {'min_latency_ms': 550.3547191619873, 'max_latency_ms': 1977.1480560302734, 'p95_latency_ms': np.float64(1924.659264087677), 'p99_latency_ms': np.float64(1966.6502976417542), 'total_time_s': 6.473495960235596, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.058452629496046606, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
23 6 scaling_test claude-3-sonnet 1256.138801574707 3.4732685564079335 0.0 0.0 1.0 0 20 5 1759345848.5701082 0.15360000000000001 10240 0.8863139635356961 {'min_latency_ms': 641.2796974182129, 'max_latency_ms': 1980.7326793670654, 'p95_latency_ms': np.float64(1846.4025855064392), 'p99_latency_ms': np.float64(1953.86666059494), 'total_time_s': 5.758264780044556, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.05783521510861833, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
24 11 scaling_test claude-3-sonnet 1306.07008934021 3.5020347317551495 0.0 0.0 1.0 0 20 5 1759345858.6472163 0.15360000000000001 10240 0.9094961422561505 {'min_latency_ms': 591.8083190917969, 'max_latency_ms': 1971.1270332336426, 'p95_latency_ms': np.float64(1944.3620324134827), 'p99_latency_ms': np.float64(1965.7740330696106), 'total_time_s': 5.710965633392334, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.042442911768923584, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
25 16 scaling_test claude-3-sonnet 1307.1481943130493 3.262938882676132 0.0 0.0 1.0 0 20 5 1759345869.905544 0.15360000000000001 10240 0.8938240662052681 {'min_latency_ms': 646.7251777648926, 'max_latency_ms': 1990.9627437591553, 'p95_latency_ms': np.float64(1935.0676536560059), 'p99_latency_ms': np.float64(1979.7837257385254), 'total_time_s': 6.129443645477295, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.04247877605865338, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
26 1 scaling_test gemini-1.5-pro 1401.3476371765137 2.943218490521141 0.0 0.0 1.0 0 20 5 1759345881.238218 0.0128 10240 0.9409363720199192 {'min_latency_ms': 520.9827423095703, 'max_latency_ms': 1970.2589511871338, 'p95_latency_ms': np.float64(1958.1118822097778), 'p99_latency_ms': np.float64(1967.8295373916626), 'total_time_s': 6.7952821254730225, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05267230653872383, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
27 6 scaling_test gemini-1.5-pro 1341.485834121704 3.3982951582179024 0.0 0.0 1.0 0 20 5 1759345889.5553467 0.0128 10240 0.9355344625586725 {'min_latency_ms': 503.9515495300293, 'max_latency_ms': 1978.0657291412354, 'p95_latency_ms': np.float64(1966.320013999939), 'p99_latency_ms': np.float64(1975.716586112976), 'total_time_s': 5.885303974151611, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.054780000845711954, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
28 11 scaling_test gemini-1.5-pro 1344.3536400794983 3.445457146125384 0.0 0.0 1.0 0 20 5 1759345898.4512925 0.0128 10240 0.9276983017835836 {'min_latency_ms': 615.3252124786377, 'max_latency_ms': 1981.612205505371, 'p95_latency_ms': np.float64(1803.935217857361), 'p99_latency_ms': np.float64(1946.0768079757688), 'total_time_s': 5.8047449588775635, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05905363250623063, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
29 16 scaling_test gemini-1.5-pro 1202.2199511528015 3.696869831400932 0.0 0.0 1.0 0 20 5 1759345907.5707264 0.0128 10240 0.9307740387961949 {'min_latency_ms': 589.9953842163086, 'max_latency_ms': 1967.3075675964355, 'p95_latency_ms': np.float64(1913.6008977890015), 'p99_latency_ms': np.float64(1956.5662336349487), 'total_time_s': 5.409982204437256, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.04978369465928124, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
30 1 scaling_test gemini-1.5-flash 1053.9512276649475 3.823265280376166 0.0 0.0 1.0 0 20 5 1759345915.0947819 0.007679999999999998 10240 0.8813998853517441 {'min_latency_ms': -36.76271438598633, 'max_latency_ms': 1967.0710563659668, 'p95_latency_ms': np.float64(1855.4362535476685), 'p99_latency_ms': np.float64(1944.744095802307), 'total_time_s': 5.231130599975586, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.050008698196664016, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
31 6 scaling_test gemini-1.5-flash 1155.3911447525024 3.615636866719992 0.0 0.0 1.0 0 20 5 1759345925.0694563 0.007679999999999998 10240 0.9025102091839412 {'min_latency_ms': 502.6116371154785, 'max_latency_ms': 1947.0453262329102, 'p95_latency_ms': np.float64(1765.414369106293), 'p99_latency_ms': np.float64(1910.7191348075864), 'total_time_s': 5.531528949737549, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.059194105459554974, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
32 11 scaling_test gemini-1.5-flash 1217.6612257957458 3.756965086673101 0.0 0.0 1.0 0 20 5 1759345934.1183383 0.007679999999999998 10240 0.8709830012564668 {'min_latency_ms': 560.8868598937988, 'max_latency_ms': 2007.932424545288, 'p95_latency_ms': np.float64(1776.0017752647402), 'p99_latency_ms': np.float64(1961.5462946891782), 'total_time_s': 5.323445796966553, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.052873446152615404, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
33 16 scaling_test gemini-1.5-flash 1351.5228390693665 3.367995990496259 0.0 0.0 1.0 0 20 5 1759345942.2099788 0.007679999999999998 10240 0.872315613940513 {'min_latency_ms': 689.1014575958252, 'max_latency_ms': 1980.147361755371, 'p95_latency_ms': np.float64(1956.2964797019958), 'p99_latency_ms': np.float64(1975.377185344696), 'total_time_s': 5.938249349594116, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.05361394744479093, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
34 1 scaling_test llama-3.1-8b 1306.591236591339 3.3070039261320594 0.0 0.0 1.0 0 20 5 1759345952.8692935 0.002048000000000001 10240 0.7778348786353027 {'min_latency_ms': 555.4070472717285, 'max_latency_ms': 1988.0244731903076, 'p95_latency_ms': np.float64(1957.3988199234009), 'p99_latency_ms': np.float64(1981.8993425369263), 'total_time_s': 6.047770261764526, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.05832225784189981, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
35 6 scaling_test llama-3.1-8b 1199.6222853660583 3.634358086220239 0.0 0.0 1.0 0 20 5 1759345963.5152647 0.002048000000000001 10240 0.7696592403957419 {'min_latency_ms': 541.0621166229248, 'max_latency_ms': 1914.41011428833, 'p95_latency_ms': np.float64(1768.0468797683716), 'p99_latency_ms': np.float64(1885.1374673843382), 'total_time_s': 5.503035068511963, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.06176209698043544, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
36 11 scaling_test llama-3.1-8b 1143.358552455902 4.173916297150752 0.0 0.0 1.0 0 20 5 1759345973.8406181 0.002048000000000001 10240 0.7857043630038748 {'min_latency_ms': 631.817102432251, 'max_latency_ms': 1720.1111316680908, 'p95_latency_ms': np.float64(1547.544610500336), 'p99_latency_ms': np.float64(1685.5978274345396), 'total_time_s': 4.791662931442261, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.06142254552174686, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
37 16 scaling_test llama-3.1-8b 1228.6048531532288 3.613465135130269 0.0 0.0 1.0 0 20 5 1759345982.2759545 0.002048000000000001 10240 0.7706622409066766 {'min_latency_ms': 539.0913486480713, 'max_latency_ms': 1971.7633724212646, 'p95_latency_ms': np.float64(1819.2362308502197), 'p99_latency_ms': np.float64(1941.2579441070554), 'total_time_s': 5.534853458404541, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.05320944570994387, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
38 1 scaling_test llama-3.1-70b 1424.0724563598633 2.989394263900763 0.0 0.0 1.0 0 20 5 1759345993.4949126 0.008192000000000005 10240 0.8731561293258354 {'min_latency_ms': 700.6974220275879, 'max_latency_ms': 1959.3937397003174, 'p95_latency_ms': np.float64(1924.493396282196), 'p99_latency_ms': np.float64(1952.4136710166931), 'total_time_s': 6.690318584442139, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.0352234743129485, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
39 6 scaling_test llama-3.1-70b 1090.003514289856 4.145917207566353 0.0 0.0 1.0 0 20 5 1759346002.3353932 0.008192000000000005 10240 0.8796527768140011 {'min_latency_ms': 508.23211669921875, 'max_latency_ms': 1798.6392974853516, 'p95_latency_ms': np.float64(1785.5579257011414), 'p99_latency_ms': np.float64(1796.0230231285095), 'total_time_s': 4.824023008346558, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.06407982743031454, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
40 11 scaling_test llama-3.1-70b 964.3666982650757 4.70392645090585 0.0 0.0 1.0 0 20 5 1759346010.6974216 0.008192000000000005 10240 0.8992009479579495 {'min_latency_ms': 135.56504249572754, 'max_latency_ms': 1794.3906784057617, 'p95_latency_ms': np.float64(1775.5030393600464), 'p99_latency_ms': np.float64(1790.6131505966187), 'total_time_s': 4.251767158508301, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.050182727925105516, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
41 16 scaling_test llama-3.1-70b 1258.9476823806763 3.653831604110515 0.125 0.0 1.0 0 20 5 1759346020.388094 0.008192000000000005 10240 0.8930892849911802 {'min_latency_ms': 620.0413703918457, 'max_latency_ms': 1916.384220123291, 'p95_latency_ms': np.float64(1765.2448296546936), 'p99_latency_ms': np.float64(1886.1563420295713), 'total_time_s': 5.473706007003784, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.04969618373257882, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
42 5 concurrent_test gpt-4o-mini 1273.702096939087 0.7851086796926611 0.0 0.0 1.0 0 10 1 1759346033.2373884 0.0007680000000000001 5120 0.8342026655690804 {'min_latency_ms': 741.3482666015625, 'max_latency_ms': 1817.1906471252441, 'p95_latency_ms': np.float64(1794.5520520210266), 'p99_latency_ms': np.float64(1812.6629281044006), 'total_time_s': 12.737090110778809, 'initial_memory_mb': 294.5546875, 'final_memory_mb': 294.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000001e-05, 'quality_std': 0.0446055902590032, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
43 5 concurrent_test gpt-4o-mini 1511.399483680725 2.933763102440156 0.25 0.0 1.0 0 10 6 1759346036.647214 0.0007680000000000001 5120 0.8471277213854321 {'min_latency_ms': 800.0023365020752, 'max_latency_ms': 1982.2335243225098, 'p95_latency_ms': np.float64(1942.5656914710999), 'p99_latency_ms': np.float64(1974.2999577522278), 'total_time_s': 3.4085915088653564, 'initial_memory_mb': 294.5546875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000001e-05, 'quality_std': 0.06432848764341552, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
44 5 concurrent_test gpt-4o 1150.0491619110107 0.8695228900132853 0.0 0.0 1.0 0 10 1 1759346048.2587333 0.0256 5120 0.9599583095352598 {'min_latency_ms': 544.191837310791, 'max_latency_ms': 1584.9177837371826, 'p95_latency_ms': np.float64(1511.2051010131834), 'p99_latency_ms': np.float64(1570.1752471923828), 'total_time_s': 11.50055980682373, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.057087428808928614, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
45 5 concurrent_test gpt-4o 1241.9081926345825 3.22981029743519 0.0 0.0 1.0 0 10 6 1759346051.3563757 0.0256 5120 0.9585199558650109 {'min_latency_ms': 644.8915004730225, 'max_latency_ms': 1933.1202507019043, 'p95_latency_ms': np.float64(1865.2720570564268), 'p99_latency_ms': np.float64(1919.5506119728088), 'total_time_s': 3.0961570739746094, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.04062204558012218, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
46 5 concurrent_test gpt-4-turbo 1581.8750381469727 0.6321581179029606 0.0 0.0 1.0 0 10 1 1759346067.3017964 0.0512 5120 0.9324427514695872 {'min_latency_ms': 833.935022354126, 'max_latency_ms': 2019.5622444152832, 'p95_latency_ms': np.float64(1978.4671545028687), 'p99_latency_ms': np.float64(2011.3432264328003), 'total_time_s': 15.818827152252197, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.04654046504268862, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
47 5 concurrent_test gpt-4-turbo 1153.432297706604 3.2168993240245847 0.0 0.0 1.0 0 10 6 1759346070.4116762 0.0512 5120 0.9790878168553954 {'min_latency_ms': 635.2591514587402, 'max_latency_ms': 1833.7628841400146, 'p95_latency_ms': np.float64(1808.298635482788), 'p99_latency_ms': np.float64(1828.6700344085693), 'total_time_s': 3.108583450317383, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.038783270511690816, 'data_size_processed': 1000, 'model_provider': 'gpt'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
48 5 concurrent_test claude-3-5-sonnet 1397.6783752441406 0.7154680102707422 0.0 0.0 1.0 0 10 1 1759346084.5017824 0.015359999999999999 5120 0.9421283071854264 {'min_latency_ms': 532.8092575073242, 'max_latency_ms': 2028.5301208496094, 'p95_latency_ms': np.float64(1968.815779685974), 'p99_latency_ms': np.float64(2016.5872526168823), 'total_time_s': 13.976865291595459, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999998, 'quality_std': 0.041911119259679885, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
49 5 concurrent_test claude-3-5-sonnet 1215.26198387146 3.6278421983995233 0.0 0.0 1.0 0 10 6 1759346087.2596216 0.015359999999999999 5120 0.9131170426955485 {'min_latency_ms': 568.2053565979004, 'max_latency_ms': 1612.9648685455322, 'p95_latency_ms': np.float64(1559.6276402473447), 'p99_latency_ms': np.float64(1602.2974228858948), 'total_time_s': 2.7564594745635986, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999998, 'quality_std': 0.04319876804321411, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
50 5 concurrent_test claude-3-haiku 1299.2276906967163 0.7696826190331395 0.0 0.0 1.0 0 10 1 1759346100.364407 0.0128 5120 0.8252745814485088 {'min_latency_ms': 668.3671474456787, 'max_latency_ms': 2041.351318359375, 'p95_latency_ms': np.float64(1843.0875778198238), 'p99_latency_ms': np.float64(2001.6985702514648), 'total_time_s': 12.992368221282959, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.058205855327116265, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
51 5 concurrent_test claude-3-haiku 1297.508192062378 3.6581654644321087 0.0 0.0 1.0 0 10 6 1759346103.0993996 0.0128 5120 0.8496515913760503 {'min_latency_ms': 649.4293212890625, 'max_latency_ms': 1873.1675148010254, 'p95_latency_ms': np.float64(1843.8988208770752), 'p99_latency_ms': np.float64(1867.3137760162354), 'total_time_s': 2.7336106300354004, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.06872259975771335, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
52 5 concurrent_test claude-3-sonnet 1239.8123741149902 0.8065692205263874 0.0 0.0 1.0 0 10 1 1759346114.9650035 0.07680000000000001 5120 0.8917269647002374 {'min_latency_ms': 559.9334239959717, 'max_latency_ms': 1828.9196491241455, 'p95_latency_ms': np.float64(1804.089903831482), 'p99_latency_ms': np.float64(1823.9537000656128), 'total_time_s': 12.398191928863525, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.06728256480558785, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
53 5 concurrent_test claude-3-sonnet 1325.3875255584717 3.2305613290400945 0.0 0.0 1.0 0 10 6 1759346118.062173 0.07680000000000001 5120 0.8904253939966993 {'min_latency_ms': 598.4294414520264, 'max_latency_ms': 1956.3815593719482, 'p95_latency_ms': np.float64(1906.8223834037778), 'p99_latency_ms': np.float64(1946.4697241783142), 'total_time_s': 3.0954372882843018, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.06220445402424322, 'data_size_processed': 1000, 'model_provider': 'claude'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
54 5 concurrent_test gemini-1.5-pro 1264.2754554748535 0.7909630217832475 0.0 0.0 1.0 0 10 1 1759346130.8282964 0.0064 5120 0.8998460053229075 {'min_latency_ms': 532.9890251159668, 'max_latency_ms': 1795.492172241211, 'p95_latency_ms': np.float64(1745.6329107284544), 'p99_latency_ms': np.float64(1785.5203199386597), 'total_time_s': 12.642816066741943, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.04050886994282564, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
55 5 concurrent_test gemini-1.5-pro 1342.9006338119507 3.7829150181123015 0.0 0.0 1.0 0 10 6 1759346133.472956 0.0064 5120 0.9029938738274873 {'min_latency_ms': 701.9498348236084, 'max_latency_ms': 1964.576005935669, 'p95_latency_ms': np.float64(1872.5560665130613), 'p99_latency_ms': np.float64(1946.1720180511475), 'total_time_s': 2.6434640884399414, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05723923041822323, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
56 5 concurrent_test gemini-1.5-flash 1368.2588577270508 0.7308515907093506 0.0 0.0 1.0 0 10 1 1759346147.2717574 0.0038399999999999997 5120 0.8795901650694117 {'min_latency_ms': 620.3913688659668, 'max_latency_ms': 2018.2685852050781, 'p95_latency_ms': np.float64(1993.7742233276367), 'p99_latency_ms': np.float64(2013.3697128295898), 'total_time_s': 13.682668447494507, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00038399999999999996, 'quality_std': 0.05927449072307118, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
57 5 concurrent_test gemini-1.5-flash 1207.8629732131958 3.2879592824302044 0.0 0.0 1.0 0 10 6 1759346150.314617 0.0038399999999999997 5120 0.8611774574826484 {'min_latency_ms': 594.973087310791, 'max_latency_ms': 1811.2657070159912, 'p95_latency_ms': np.float64(1681.6352963447569), 'p99_latency_ms': np.float64(1785.3396248817444), 'total_time_s': 3.041400194168091, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00038399999999999996, 'quality_std': 0.07904328865026665, 'data_size_processed': 1000, 'model_provider': 'gemini'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
58 5 concurrent_test llama-3.1-8b 1144.2910194396973 0.8738903631276332 0.0 0.0 1.0 0 10 1 1759346161.882389 0.0010240000000000002 5120 0.7805684315735588 {'min_latency_ms': 594.846248626709, 'max_latency_ms': 1759.0994834899902, 'p95_latency_ms': np.float64(1631.7564606666563), 'p99_latency_ms': np.float64(1733.6308789253235), 'total_time_s': 11.443083047866821, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000002, 'quality_std': 0.0613021253594286, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
59 5 concurrent_test llama-3.1-8b 1128.666615486145 3.527006383973853 0.0 0.0 1.0 0 10 6 1759346164.7190907 0.0010240000000000002 5120 0.7915276538063776 {'min_latency_ms': 610.3026866912842, 'max_latency_ms': 1934.2899322509766, 'p95_latency_ms': np.float64(1909.2738270759583), 'p99_latency_ms': np.float64(1929.286711215973), 'total_time_s': 2.835265636444092, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000002, 'quality_std': 0.055242108041169316, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
60 5 concurrent_test llama-3.1-70b 1341.410732269287 0.7454805363345477 0.0 0.0 1.0 0 10 1 1759346178.2571824 0.004096000000000001 5120 0.8513858389112968 {'min_latency_ms': 566.3845539093018, 'max_latency_ms': 1769.1750526428223, 'p95_latency_ms': np.float64(1743.9924359321594), 'p99_latency_ms': np.float64(1764.1385293006897), 'total_time_s': 13.414166450500488, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004096000000000001, 'quality_std': 0.06286695897481548, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
61 5 concurrent_test llama-3.1-70b 1410.3811264038086 3.52022788340447 0.0 0.0 1.0 0 10 6 1759346181.0992308 0.004096000000000001 5120 0.8534058400920448 {'min_latency_ms': 572.9773044586182, 'max_latency_ms': 1928.0850887298584, 'p95_latency_ms': np.float64(1903.529143333435), 'p99_latency_ms': np.float64(1923.1738996505737), 'total_time_s': 2.8407251834869385, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004096000000000001, 'quality_std': 0.059750620144052545, 'data_size_processed': 1000, 'model_provider': 'llama'} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
62 5 memory_test gpt-4o-mini 1177.2440481185913 3.97501008701798 0.0 0.0 1.0 0 50 5 1759346193.7901201 0.0038400000000000023 25600 0.8512259391579574 {'min_latency_ms': 537.5485420227051, 'max_latency_ms': 2001.0862350463867, 'p95_latency_ms': np.float64(1892.5400853157041), 'p99_latency_ms': np.float64(1985.4257130622864), 'total_time_s': 12.578584432601929, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.0581968026848211, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
63 5 memory_test gpt-4o-mini 1229.8026752471924 3.9282369679460363 0.0 0.0 1.0 0 50 5 1759346206.6300905 0.0038400000000000023 25600 0.8537868196468017 {'min_latency_ms': 518.6026096343994, 'max_latency_ms': 1944.331407546997, 'p95_latency_ms': np.float64(1909.6850633621214), 'p99_latency_ms': np.float64(1940.652117729187), 'total_time_s': 12.72835636138916, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.05181407518487485, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
64 5 memory_test gpt-4o-mini 1274.8144483566284 3.7483119966709824 0.0 0.0 1.0 0 50 5 1759346220.0900073 0.0038400000000000023 25600 0.8487480924622282 {'min_latency_ms': 529.292106628418, 'max_latency_ms': 1996.4158535003662, 'p95_latency_ms': np.float64(1960.6919050216675), 'p99_latency_ms': np.float64(1988.2149648666382), 'total_time_s': 13.339337825775146, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.05812899461310237, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
65 5 memory_test gpt-4o 1174.5057010650635 4.0514136389986115 0.0 0.0 1.0 0 50 5 1759346232.557784 0.12800000000000017 25600 0.9484191580718665 {'min_latency_ms': 286.58127784729004, 'max_latency_ms': 1877.345085144043, 'p95_latency_ms': np.float64(1735.1435780525208), 'p99_latency_ms': np.float64(1842.000467777252), 'total_time_s': 12.341371297836304, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.0491398572941036, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
66 5 memory_test gpt-4o 1225.388593673706 3.875932429633176 0.125 0.0 1.0 0 50 5 1759346245.5669534 0.12800000000000017 25600 0.9557179217710832 {'min_latency_ms': 514.6803855895996, 'max_latency_ms': 2034.6620082855225, 'p95_latency_ms': np.float64(1909.4360709190366), 'p99_latency_ms': np.float64(2010.34743309021), 'total_time_s': 12.900121688842773, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.04870463047338363, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
67 5 memory_test gpt-4o 1244.0021991729736 3.7266446101546777 0.0 0.0 1.0 0 50 5 1759346259.1414776 0.12800000000000017 25600 0.9458944372937584 {'min_latency_ms': 521.9912528991699, 'max_latency_ms': 1986.6855144500732, 'p95_latency_ms': np.float64(1953.3554077148438), 'p99_latency_ms': np.float64(1978.9683985710144), 'total_time_s': 13.416895151138306, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.04851286804634898, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
68 5 memory_test gpt-4-turbo 1181.3615322113037 4.124998416603219 0.0 0.0 1.0 0 50 5 1759346271.374578 0.25600000000000034 25600 0.9651345363111258 {'min_latency_ms': 353.2071113586426, 'max_latency_ms': 1966.524362564087, 'p95_latency_ms': np.float64(1945.0057744979858), 'p99_latency_ms': np.float64(1965.7717752456665), 'total_time_s': 12.121216773986816, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04338778763022959, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
69 5 memory_test gpt-4-turbo 1291.4055681228638 3.77552400952112 0.0 0.0 1.0 0 50 5 1759346284.731812 0.25600000000000034 25600 0.9689389907566063 {'min_latency_ms': 555.095911026001, 'max_latency_ms': 2027.0910263061523, 'p95_latency_ms': np.float64(1966.5393114089964), 'p99_latency_ms': np.float64(2018.9284563064575), 'total_time_s': 13.243194818496704, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04154143035607859, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
70 5 memory_test gpt-4-turbo 1261.4208269119263 3.663208321130074 0.0 0.0 1.0 0 50 5 1759346298.4905493 0.25600000000000034 25600 0.9573488473081913 {'min_latency_ms': 284.8320007324219, 'max_latency_ms': 2011.866807937622, 'p95_latency_ms': np.float64(1975.5298137664795), 'p99_latency_ms': np.float64(2000.7115292549133), 'total_time_s': 13.649237394332886, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04380501534660363, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
71 5 memory_test claude-3-5-sonnet 1270.3543138504028 3.7944320989090614 0.0 0.0 1.0 0 50 5 1759346311.7936022 0.07680000000000001 25600 0.948463600922609 {'min_latency_ms': 622.9770183563232, 'max_latency_ms': 1970.0510501861572, 'p95_latency_ms': np.float64(1868.455410003662), 'p99_latency_ms': np.float64(1957.5506472587585), 'total_time_s': 13.177202463150024, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.04872900892927657, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
72 5 memory_test claude-3-5-sonnet 1154.527621269226 4.107802148818313 0.0 0.0 1.0 0 50 5 1759346324.0782034 0.07680000000000001 25600 0.9535056752128789 {'min_latency_ms': 526.8404483795166, 'max_latency_ms': 1841.3877487182617, 'p95_latency_ms': np.float64(1815.3946280479431), 'p99_latency_ms': np.float64(1837.1384692192078), 'total_time_s': 12.171959161758423, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.04600056992617095, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
73 5 memory_test claude-3-5-sonnet 1341.6658163070679 3.5050325493977805 0.0 0.0 1.0 0 50 5 1759346338.4560573 0.07680000000000001 25600 0.947231761746643 {'min_latency_ms': 607.1841716766357, 'max_latency_ms': 1968.3496952056885, 'p95_latency_ms': np.float64(1938.420307636261), 'p99_latency_ms': np.float64(1963.8122081756592), 'total_time_s': 14.265202760696411, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.0468041040494112, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
74 5 memory_test claude-3-haiku 1268.9041805267334 3.6527405734902607 0.125 0.0 1.0 0 50 5 1759346352.2760284 0.06400000000000008 25600 0.8657832919908838 {'min_latency_ms': 576.9007205963135, 'max_latency_ms': 1978.3263206481934, 'p95_latency_ms': np.float64(1900.9657382965088), 'p99_latency_ms': np.float64(1977.4397349357605), 'total_time_s': 13.688352346420288, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.05791027367020173, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
75 5 memory_test claude-3-haiku 1273.6989831924438 3.7602543777430877 0.0 0.0 1.0 0 50 5 1759346365.681829 0.06400000000000008 25600 0.8396294693060197 {'min_latency_ms': 521.7316150665283, 'max_latency_ms': 1988.7199401855469, 'p95_latency_ms': np.float64(1945.9344744682312), 'p99_latency_ms': np.float64(1987.1683859825134), 'total_time_s': 13.296972751617432, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.06291349263235946, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
76 5 memory_test claude-3-haiku 1234.9269914627075 3.9335082345318124 0.0 0.0 1.0 0 50 5 1759346378.5192664 0.06400000000000008 25600 0.8469784358915146 {'min_latency_ms': 529.503345489502, 'max_latency_ms': 1981.7008972167969, 'p95_latency_ms': np.float64(1859.1547846794128), 'p99_latency_ms': np.float64(1963.3227896690369), 'total_time_s': 12.711299180984497, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.061722943046806616, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
77 5 memory_test claude-3-sonnet 1195.9008169174194 4.06962738382444 0.0 0.0 1.0 0 50 5 1759346390.9144897 0.3840000000000003 25600 0.9026531444228556 {'min_latency_ms': -36.6673469543457, 'max_latency_ms': 1991.610050201416, 'p95_latency_ms': np.float64(1819.4202184677124), 'p99_latency_ms': np.float64(1987.222683429718), 'total_time_s': 12.286137104034424, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.058229589360407986, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
78 5 memory_test claude-3-sonnet 1372.0379829406738 3.502253345465805 0.0 0.0 1.0 0 50 5 1759346405.3043494 0.3840000000000003 25600 0.8837364473272626 {'min_latency_ms': 543.1270599365234, 'max_latency_ms': 1992.779016494751, 'p95_latency_ms': np.float64(1931.822681427002), 'p99_latency_ms': np.float64(1987.4089169502258), 'total_time_s': 14.276522874832153, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.05634614113838598, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
79 5 memory_test claude-3-sonnet 1257.2709035873413 3.7764857062182706 0.0 0.0 1.0 0 50 5 1759346418.6521854 0.3840000000000003 25600 0.9053414058751514 {'min_latency_ms': 529.8404693603516, 'max_latency_ms': 1990.1280403137207, 'p95_latency_ms': np.float64(1911.1806631088257), 'p99_latency_ms': np.float64(1976.6331052780151), 'total_time_s': 13.239822387695312, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.050506656009957705, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
80 5 memory_test gemini-1.5-pro 1221.5951490402222 3.8372908969845323 0.0 0.0 1.0 0 50 5 1759346431.7921565 0.03200000000000004 25600 0.9365925291921394 {'min_latency_ms': 329.1811943054199, 'max_latency_ms': 1995.384693145752, 'p95_latency_ms': np.float64(1965.0332808494568), 'p99_latency_ms': np.float64(1988.3063769340515), 'total_time_s': 13.030025959014893, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.04847128641002876, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
81 5 memory_test gemini-1.5-pro 1351.8355464935303 3.6227975436552606 0.0 0.0 1.0 0 50 5 1759346445.7126448 0.03200000000000004 25600 0.9323552590826123 {'min_latency_ms': 515.129566192627, 'max_latency_ms': 2008.0702304840088, 'p95_latency_ms': np.float64(1958.6564779281616), 'p99_latency_ms': np.float64(2004.1296029090881), 'total_time_s': 13.801488876342773, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.055840796126395656, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
82 5 memory_test gemini-1.5-pro 1240.622534751892 3.8813384098374453 0.0 0.0 1.0 0 50 5 1759346458.7192729 0.03200000000000004 25600 0.9407390543744837 {'min_latency_ms': -29.146671295166016, 'max_latency_ms': 1934.4398975372314, 'p95_latency_ms': np.float64(1849.7230291366577), 'p99_latency_ms': np.float64(1918.0084466934204), 'total_time_s': 12.8821542263031, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.050597003908357786, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
83 5 memory_test gemini-1.5-flash 1237.6702642440796 3.812923495644346 0.0 0.0 1.0 0 50 5 1759346471.9588974 0.019200000000000002 25600 0.8556073429019542 {'min_latency_ms': 536.4787578582764, 'max_latency_ms': 2010.1728439331055, 'p95_latency_ms': np.float64(1911.8669629096985), 'p99_latency_ms': np.float64(1976.080708503723), 'total_time_s': 13.113297462463379, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.06082135675952047, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
84 5 memory_test gemini-1.5-flash 1180.0980806350708 4.016049090832003 0.0 0.0 1.0 0 50 5 1759346484.5327744 0.019200000000000002 25600 0.8718428063415768 {'min_latency_ms': 109.58051681518555, 'max_latency_ms': 1993.358850479126, 'p95_latency_ms': np.float64(1872.3165988922117), 'p99_latency_ms': np.float64(1992.416422367096), 'total_time_s': 12.450047016143799, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.0613916834940056, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
85 5 memory_test gemini-1.5-flash 1194.4490098953247 4.009936119483076 0.0 0.0 1.0 0 50 5 1759346497.1201088 0.019200000000000002 25600 0.8652112059805899 {'min_latency_ms': 520.3211307525635, 'max_latency_ms': 1942.4259662628174, 'p95_latency_ms': np.float64(1834.6370577812195), 'p99_latency_ms': np.float64(1890.3984904289243), 'total_time_s': 12.469026565551758, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.05312368368226588, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
86 5 memory_test llama-3.1-8b 1306.2016773223877 3.683763547696555 0.0 0.0 1.0 0 50 5 1759346510.812732 0.005119999999999998 25600 0.7727309350554936 {'min_latency_ms': 527.4953842163086, 'max_latency_ms': 1997.086524963379, 'p95_latency_ms': np.float64(1942.7793741226194), 'p99_latency_ms': np.float64(1994.0643763542175), 'total_time_s': 13.573075294494629, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.05596283861854901, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
87 5 memory_test llama-3.1-8b 1304.1251468658447 3.617383744773005 0.0 0.0 1.0 0 50 5 1759346524.7711937 0.005119999999999998 25600 0.785787220179362 {'min_latency_ms': 112.00571060180664, 'max_latency_ms': 2015.146255493164, 'p95_latency_ms': np.float64(2001.4938592910767), 'p99_latency_ms': np.float64(2012.321424484253), 'total_time_s': 13.822144269943237, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.0552285639827787, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
88 5 memory_test llama-3.1-8b 1290.5346298217773 3.671522710311051 0.0 0.0 1.0 0 50 5 1759346538.5084107 0.005119999999999998 25600 0.7771978709125356 {'min_latency_ms': 565.7510757446289, 'max_latency_ms': 1945.1093673706055, 'p95_latency_ms': np.float64(1906.785237789154), 'p99_latency_ms': np.float64(1942.4526476860046), 'total_time_s': 13.618327856063843, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.057252814774054535, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
89 5 memory_test llama-3.1-70b 1213.9334726333618 3.947675276737486 0.0 0.0 1.0 0 50 5 1759346551.2951744 0.02047999999999999 25600 0.8683286341213061 {'min_latency_ms': -79.86569404602051, 'max_latency_ms': 2014.9149894714355, 'p95_latency_ms': np.float64(1919.9433565139768), 'p99_latency_ms': np.float64(1992.4925136566162), 'total_time_s': 12.665682077407837, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.05862810413022958, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 0} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
90 5 memory_test llama-3.1-70b 1298.1958770751953 3.7049711897976763 0.0 0.0 1.0 0 50 5 1759346564.9280033 0.02047999999999999 25600 0.8889975698232048 {'min_latency_ms': 503.5574436187744, 'max_latency_ms': 2020.4124450683594, 'p95_latency_ms': np.float64(1901.4497756958008), 'p99_latency_ms': np.float64(1986.3133001327512), 'total_time_s': 13.495381593704224, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.053463278827038344, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 1} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
91 5 memory_test llama-3.1-70b 1187.040138244629 4.165139112812611 0.0 0.0 1.0 0 50 5 1759346577.0467978 0.02047999999999999 25600 0.8884529182459214 {'min_latency_ms': 506.2377452850342, 'max_latency_ms': 2026.6106128692627, 'p95_latency_ms': np.float64(1958.3556652069092), 'p99_latency_ms': np.float64(2007.5032830238342), 'total_time_s': 12.004400968551636, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.05625669416735748, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 2} 0.0 0.0 0.0 0.0 0 False 0 0.0 0 0.0 False
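The `p95_latency_ms` / `p99_latency_ms` fields above are stored as `np.float64` reprs, which suggests they come from a numpy-style percentile over the ten per-request latencies. A stdlib-only sketch of that calculation (an assumption about how the fields were produced; the sample latencies below are illustrative, not the benchmark's raw samples):

```python
# Percentile with linear interpolation -- numpy.percentile's default method,
# which appears to be what produced the p95/p99 fields above (assumption).
def percentile(values, q):
    s = sorted(values)
    k = (len(s) - 1) * q / 100.0  # fractional rank into the sorted samples
    f = int(k)                    # index of the lower neighbor
    if f + 1 >= len(s):
        return float(s[-1])
    return s[f] + (s[f + 1] - s[f]) * (k - f)


# Illustrative per-request latencies (ms) for one 10-request run:
latencies_ms = [598.4, 701.9, 812.0, 955.3, 1100.5,
                1325.4, 1502.2, 1688.7, 1820.1, 1956.4]
print(percentile(latencies_ms, 95))  # interpolates between the two slowest samples
print(percentile(latencies_ms, 99))
```

With 10 samples, p95 sits at fractional rank 8.55, so it interpolates between the 9th and 10th slowest requests; that is consistent with the rows above, where p95/p99 land just below `max_latency_ms`.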

Binary file not shown.


@@ -0,0 +1,293 @@
"""
Tests for bug #1115 fix in AutoSwarmBuilder.

This test module verifies the fix for AttributeError when creating agents
from AgentSpec Pydantic models in AutoSwarmBuilder.

Bug: https://github.com/kyegomez/swarms/issues/1115
"""

import pytest

from swarms.structs.agent import Agent
from swarms.structs.auto_swarm_builder import (
    AgentSpec,
    AutoSwarmBuilder,
)
from swarms.structs.ma_utils import set_random_models_for_agents


class TestAutoSwarmBuilderFix:
    """Tests for bug #1115 fix in AutoSwarmBuilder."""

    def test_create_agents_from_specs_with_dict(self):
        """Test that create_agents_from_specs handles dict input correctly."""
        builder = AutoSwarmBuilder()
        # Create specs as a dictionary
        specs = {
            "agents": [
                {
                    "agent_name": "test_agent_1",
                    "description": "Test agent 1 description",
                    "system_prompt": "You are a helpful assistant",
                    "model_name": "gpt-4o-mini",
                    "max_loops": 1,
                }
            ]
        }
        agents = builder.create_agents_from_specs(specs)

        # Verify agents were created correctly
        assert len(agents) == 1
        assert isinstance(agents[0], Agent)
        assert agents[0].agent_name == "test_agent_1"
        # Verify description was mapped to agent_description
        assert hasattr(agents[0], "agent_description")
        assert (
            agents[0].agent_description == "Test agent 1 description"
        )

    def test_create_agents_from_specs_with_pydantic(self):
        """Test that create_agents_from_specs handles Pydantic model input correctly.

        This is the main test for bug #1115 - it verifies that AgentSpec
        Pydantic models can be unpacked correctly.
        """
        builder = AutoSwarmBuilder()
        # Create specs as Pydantic AgentSpec objects
        agent_spec = AgentSpec(
            agent_name="test_agent_pydantic",
            description="Pydantic test agent",
            system_prompt="You are a helpful assistant",
            model_name="gpt-4o-mini",
            max_loops=1,
        )
        specs = {"agents": [agent_spec]}
        agents = builder.create_agents_from_specs(specs)

        # Verify agents were created correctly
        assert len(agents) == 1
        assert isinstance(agents[0], Agent)
        assert agents[0].agent_name == "test_agent_pydantic"
        # Verify description was mapped to agent_description
        assert hasattr(agents[0], "agent_description")
        assert agents[0].agent_description == "Pydantic test agent"

    def test_parameter_name_mapping(self):
        """Test that 'description' field maps to 'agent_description' correctly."""
        builder = AutoSwarmBuilder()
        # Test with dict that has 'description'
        specs = {
            "agents": [
                {
                    "agent_name": "mapping_test",
                    "description": "This should map to agent_description",
                    "system_prompt": "You are helpful",
                }
            ]
        }
        agents = builder.create_agents_from_specs(specs)

        assert len(agents) == 1
        agent = agents[0]
        # Verify description was mapped
        assert hasattr(agent, "agent_description")
        assert (
            agent.agent_description
            == "This should map to agent_description"
        )

    def test_create_agents_from_specs_mixed_input(self):
        """Test that create_agents_from_specs handles mixed dict and Pydantic input."""
        builder = AutoSwarmBuilder()
        # Mix of dict and Pydantic objects
        dict_spec = {
            "agent_name": "dict_agent",
            "description": "Dict agent description",
            "system_prompt": "You are helpful",
        }
        pydantic_spec = AgentSpec(
            agent_name="pydantic_agent",
            description="Pydantic agent description",
            system_prompt="You are smart",
        )
        specs = {"agents": [dict_spec, pydantic_spec]}
        agents = builder.create_agents_from_specs(specs)

        # Verify both agents were created
        assert len(agents) == 2
        assert all(isinstance(agent, Agent) for agent in agents)
        # Verify both have correct descriptions
        dict_agent = next(
            a for a in agents if a.agent_name == "dict_agent"
        )
        pydantic_agent = next(
            a for a in agents if a.agent_name == "pydantic_agent"
        )
        assert (
            dict_agent.agent_description == "Dict agent description"
        )
        assert (
            pydantic_agent.agent_description
            == "Pydantic agent description"
        )

    def test_set_random_models_for_agents_with_valid_agents(
        self,
    ):
        """Test set_random_models_for_agents with proper Agent objects."""
        # Create proper Agent objects
        agents = [
            Agent(
                agent_name="agent1",
                system_prompt="You are agent 1",
                max_loops=1,
            ),
            Agent(
                agent_name="agent2",
                system_prompt="You are agent 2",
                max_loops=1,
            ),
        ]
        # Set random models
        model_names = ["gpt-4o-mini", "gpt-4o", "claude-3-5-sonnet"]
        result = set_random_models_for_agents(
            agents=agents, model_names=model_names
        )

        # Verify results
        assert len(result) == 2
        assert all(isinstance(agent, Agent) for agent in result)
        assert all(hasattr(agent, "model_name") for agent in result)
        assert all(
            agent.model_name in model_names for agent in result
        )

    def test_set_random_models_for_agents_with_single_agent(
        self,
    ):
        """Test set_random_models_for_agents with a single agent."""
        agent = Agent(
            agent_name="single_agent",
            system_prompt="You are helpful",
            max_loops=1,
        )
        model_names = ["gpt-4o-mini", "gpt-4o"]
        result = set_random_models_for_agents(
            agents=agent, model_names=model_names
        )

        assert isinstance(result, Agent)
        assert hasattr(result, "model_name")
        assert result.model_name in model_names

    def test_set_random_models_for_agents_with_none(self):
        """Test set_random_models_for_agents with None returns random model name."""
        model_names = ["gpt-4o-mini", "gpt-4o", "claude-3-5-sonnet"]
        result = set_random_models_for_agents(
            agents=None, model_names=model_names
        )

        assert isinstance(result, str)
        assert result in model_names

    @pytest.mark.skip(
        reason="This test requires API key and makes LLM calls"
    )
    def test_auto_swarm_builder_return_agents_objects_integration(
        self,
    ):
        """Integration test for AutoSwarmBuilder with execution_type='return-agents-objects'.

        This test requires OPENAI_API_KEY and makes actual LLM calls.
        Run manually with: pytest -k test_auto_swarm_builder_return_agents_objects_integration -v
        """
        builder = AutoSwarmBuilder(
            execution_type="return-agents-objects",
            model_name="gpt-4o-mini",
            max_loops=1,
            verbose=False,
        )
        agents = builder.run(
            "Create a team of 2 data analysis agents with specific roles"
        )

        # Verify agents were created
        assert isinstance(agents, list)
        assert len(agents) >= 1
        assert all(isinstance(agent, Agent) for agent in agents)
        assert all(hasattr(agent, "agent_name") for agent in agents)
        assert all(
            hasattr(agent, "agent_description") for agent in agents
        )

    def test_agent_spec_to_agent_all_fields(self):
        """Test that all AgentSpec fields are properly passed to Agent."""
        builder = AutoSwarmBuilder()
        agent_spec = AgentSpec(
            agent_name="full_test_agent",
            description="Full test description",
            system_prompt="You are a comprehensive test agent",
            model_name="gpt-4o-mini",
            auto_generate_prompt=False,
            max_tokens=4096,
            temperature=0.7,
            role="worker",
            max_loops=3,
            goal="Test all parameters",
        )
        agents = builder.create_agents_from_specs(
            {"agents": [agent_spec]}
        )

        assert len(agents) == 1
        agent = agents[0]
        # Verify all fields were set
        assert agent.agent_name == "full_test_agent"
        assert agent.agent_description == "Full test description"
        # Agent may modify system_prompt by adding additional instructions
        assert (
            "You are a comprehensive test agent"
            in agent.system_prompt
        )
        assert agent.max_loops == 3
        assert agent.max_tokens == 4096
        assert agent.temperature == 0.7

    def test_create_agents_from_specs_empty_list(self):
        """Test that create_agents_from_specs handles empty agent list."""
        builder = AutoSwarmBuilder()
        specs = {"agents": []}
        agents = builder.create_agents_from_specs(specs)

        assert isinstance(agents, list)
        assert len(agents) == 0


if __name__ == "__main__":
    # Run tests with pytest
    pytest.main([__file__, "-v", "--tb=short"])
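A minimal sketch of the spec-normalization pattern these tests exercise, assuming the fix converts Pydantic specs to kwargs via `model_dump()` and remaps `description` to `agent_description`. `MinimalSpec` and `normalize_spec` below are hypothetical stand-ins (not swarms APIs) so the sketch runs without third-party packages:

```python
class MinimalSpec:
    """Stand-in for a Pydantic v2 model exposing model_dump()."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def model_dump(self):
        return dict(self._fields)


def normalize_spec(spec):
    """Turn a dict or Pydantic-style spec into Agent kwargs (illustrative)."""
    # Dicts pass through; model objects are dumped to a plain dict first,
    # which avoids the unpacking failure reported in bug #1115.
    kwargs = spec.model_dump() if hasattr(spec, "model_dump") else dict(spec)
    # AgentSpec calls the field 'description'; Agent expects 'agent_description'.
    if "description" in kwargs:
        kwargs["agent_description"] = kwargs.pop("description")
    return kwargs


print(normalize_spec({"agent_name": "a1", "description": "d1"}))
# {'agent_name': 'a1', 'agent_description': 'd1'}
print(normalize_spec(MinimalSpec(agent_name="a2", description="d2")))
# {'agent_name': 'a2', 'agent_description': 'd2'}
```

Handling both shapes through one conversion step is what lets `test_create_agents_from_specs_mixed_input` above pass dicts and `AgentSpec` objects in the same list.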

@ -1,10 +1,8 @@
import uuid import uuid
from swarms.telemetry.main import ( from swarms.telemetry.main import (
generate_unique_identifier,
generate_user_id, generate_user_id,
get_machine_id, get_machine_id,
get_system_info,
) )
@ -24,33 +22,6 @@ def test_get_machine_id():
assert all(char in "0123456789abcdef" for char in machine_id) assert all(char in "0123456789abcdef" for char in machine_id)
def test_get_system_info():
# Get system information and ensure it's a dictionary with expected keys
system_info = get_system_info()
assert isinstance(system_info, dict)
expected_keys = [
"platform",
"platform_release",
"platform_version",
"architecture",
"hostname",
"ip_address",
"mac_address",
"processor",
"python_version",
]
assert all(key in system_info for key in expected_keys)
def test_generate_unique_identifier():
# Generate unique identifiers and ensure they are valid UUID strings
unique_id = generate_unique_identifier()
assert isinstance(unique_id, str)
assert uuid.UUID(
unique_id, version=5, namespace=uuid.NAMESPACE_DNS
)
def test_generate_user_id_edge_case(): def test_generate_user_id_edge_case():
# Test generate_user_id with multiple calls # Test generate_user_id with multiple calls
user_ids = set() user_ids = set()
@ -69,33 +40,13 @@ def test_get_machine_id_edge_case():
assert len(machine_ids) == 100 # Ensure generated IDs are unique assert len(machine_ids) == 100 # Ensure generated IDs are unique
-def test_get_system_info_edge_case():
-    # Test get_system_info for consistency
-    system_info1 = get_system_info()
-    system_info2 = get_system_info()
-    assert (
-        system_info1 == system_info2
-    )  # Ensure system info remains the same
-
-
-def test_generate_unique_identifier_edge_case():
-    # Test generate_unique_identifier for uniqueness
-    unique_ids = set()
-    for _ in range(100):
-        unique_id = generate_unique_identifier()
-        unique_ids.add(unique_id)
-    assert len(unique_ids) == 100  # Ensure generated IDs are unique
-
-
 def test_all():
     test_generate_user_id()
     test_get_machine_id()
-    test_get_system_info()
-    test_generate_unique_identifier()
     test_generate_user_id_edge_case()
     test_get_machine_id_edge_case()
-    test_get_system_info_edge_case()
-    test_generate_unique_identifier_edge_case()


 test_all()
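The telemetry tests above assert that `get_machine_id` returns a lowercase hex digest. A minimal standalone sketch of how such an anonymised identifier can be derived (the exact inputs hashed by `swarms.telemetry.main.get_machine_id` are an assumption here):

```python
import hashlib
import platform
import socket


def machine_id() -> str:
    """Derive a stable, anonymised machine identifier.

    Sketch only: hash host-specific data so the raw hostname never leaves
    the machine. The real function may hash different inputs.
    """
    raw = f"{socket.gethostname()}-{platform.machine()}"
    return hashlib.sha256(raw.encode()).hexdigest()


mid = machine_id()
# Matches the property asserted in the tests: lowercase hex digits only.
assert all(char in "0123456789abcdef" for char in mid)
```

A SHA-256 digest also gives the stability the edge-case test relies on: repeated calls on the same host return the same identifier.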

@@ -1,37 +1,29 @@
-import os
 import json
+import os
 from datetime import datetime
-from typing import List, Dict, Any, Callable
+from typing import Any, Callable, Dict, List

 from dotenv import load_dotenv
+from loguru import logger

 # Basic Imports for Swarms
 from swarms.structs import (
     Agent,
-    SequentialWorkflow,
-    ConcurrentWorkflow,
     AgentRearrange,
-    MixtureOfAgents,
-    SpreadSheetSwarm,
+    ConcurrentWorkflow,
     GroupChat,
-    MultiAgentRouter,
+    InteractiveGroupChat,
     MajorityVoting,
-    SwarmRouter,
+    MixtureOfAgents,
+    MultiAgentRouter,
     RoundRobinSwarm,
-    InteractiveGroupChat,
+    SequentialWorkflow,
+    SpreadSheetSwarm,
+    SwarmRouter,
 )

-# Import swarms not in __init__.py directly
 from swarms.structs.hiearchical_swarm import HierarchicalSwarm
 from swarms.structs.tree_swarm import ForestSwarm, Tree, TreeAgent

-# Setup Logging
-from loguru import logger
-
-logger.add(
-    "test_runs/test_failures.log", rotation="10 MB", level="ERROR"
-)
-
 # Load environment variables
 load_dotenv()
@@ -463,8 +455,8 @@ def test_spreadsheet_swarm():
 def test_hierarchical_swarm():
     """Test HierarchicalSwarm structure"""
     try:
-        from swarms.utils.litellm_wrapper import LiteLLM
         from swarms.structs.hiearchical_swarm import SwarmSpec
+        from swarms.utils.litellm_wrapper import LiteLLM

         # Create worker agents
         workers = [

@@ -0,0 +1,150 @@
from pydantic import BaseModel
from swarms.tools.pydantic_to_json import (
base_model_to_openai_function,
multi_base_model_to_openai_function,
)
from swarms.tools.base_tool import BaseTool
# Test Pydantic model
class TestModel(BaseModel):
"""A test model for validation."""
name: str
age: int
email: str = "test@example.com"
def test_base_model_to_openai_function():
"""Test that base_model_to_openai_function accepts output_str parameter."""
print(
"Testing base_model_to_openai_function with output_str=False..."
)
result_dict = base_model_to_openai_function(
TestModel, output_str=False
)
print(f"✓ Dict result type: {type(result_dict)}")
print(f"✓ Dict result keys: {list(result_dict.keys())}")
print(
"\nTesting base_model_to_openai_function with output_str=True..."
)
result_str = base_model_to_openai_function(
TestModel, output_str=True
)
print(f"✓ String result type: {type(result_str)}")
print(f"✓ String result preview: {result_str[:100]}...")
def test_multi_base_model_to_openai_function():
"""Test that multi_base_model_to_openai_function handles output_str correctly."""
print(
"\nTesting multi_base_model_to_openai_function with output_str=False..."
)
result_dict = multi_base_model_to_openai_function(
[TestModel], output_str=False
)
print(f"✓ Dict result type: {type(result_dict)}")
print(f"✓ Dict result keys: {list(result_dict.keys())}")
print(
"\nTesting multi_base_model_to_openai_function with output_str=True..."
)
result_str = multi_base_model_to_openai_function(
[TestModel], output_str=True
)
print(f"✓ String result type: {type(result_str)}")
print(f"✓ String result preview: {result_str[:100]}...")
def test_base_tool_methods():
"""Test that BaseTool methods handle output_str parameter correctly."""
print(
"\nTesting BaseTool.base_model_to_dict with output_str=False..."
)
tool = BaseTool()
result_dict = tool.base_model_to_dict(TestModel, output_str=False)
print(f"✓ Dict result type: {type(result_dict)}")
print(f"✓ Dict result keys: {list(result_dict.keys())}")
print(
"\nTesting BaseTool.base_model_to_dict with output_str=True..."
)
result_str = tool.base_model_to_dict(TestModel, output_str=True)
print(f"✓ String result type: {type(result_str)}")
print(f"✓ String result preview: {result_str[:100]}...")
print(
"\nTesting BaseTool.multi_base_models_to_dict with output_str=False..."
)
result_dict = tool.multi_base_models_to_dict(
[TestModel], output_str=False
)
print(f"✓ Dict result type: {type(result_dict)}")
print(f"✓ Dict result length: {len(result_dict)}")
print(
"\nTesting BaseTool.multi_base_models_to_dict with output_str=True..."
)
result_str = tool.multi_base_models_to_dict(
[TestModel], output_str=True
)
print(f"✓ String result type: {type(result_str)}")
print(f"✓ String result preview: {result_str[:100]}...")
def test_agent_integration():
"""Test that the Agent class can use the fixed methods without errors."""
print("\nTesting Agent integration...")
try:
from swarms import Agent
# Create a simple agent with a tool schema
agent = Agent(
model_name="gpt-4o-mini",
tool_schema=TestModel,
max_loops=1,
verbose=True,
)
# This should not raise an error anymore
agent.handle_tool_schema_ops()
print(
"✓ Agent.handle_tool_schema_ops() completed successfully"
)
except Exception as e:
print(f"✗ Agent integration failed: {e}")
return False
return True
if __name__ == "__main__":
print("=" * 60)
print("Testing output_str parameter fix")
print("=" * 60)
try:
test_base_model_to_openai_function()
test_multi_base_model_to_openai_function()
test_base_tool_methods()
if test_agent_integration():
print("\n" + "=" * 60)
print(
"✅ All tests passed! The output_str parameter fix is working correctly."
)
print("=" * 60)
else:
print("\n" + "=" * 60)
print(
"❌ Some tests failed. Please check the implementation."
)
print("=" * 60)
except Exception as e:
print(f"\n❌ Test failed with error: {e}")
import traceback
traceback.print_exc()
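The tests above check that the schema converters return a dict by default and a JSON string when `output_str=True`. A hand-rolled standalone sketch of that toggle, using a plausible OpenAI-function-schema shape (the exact keys emitted by `base_model_to_openai_function` are an assumption here):

```python
import json
from typing import Any, Dict, List, Union


def model_schema_to_openai_function(
    name: str,
    description: str,
    properties: Dict[str, Any],
    required: List[str],
    output_str: bool = False,
) -> Union[Dict[str, Any], str]:
    """Assemble an OpenAI-style function schema, optionally serialised.

    Illustrative sketch only; the real helper derives this from a Pydantic
    model's JSON schema rather than explicit arguments.
    """
    schema = {
        "function_call": {"name": name},
        "functions": [
            {
                "name": name,
                "description": description,
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            }
        ],
    }
    # output_str=True returns a JSON string instead of a dict.
    return json.dumps(schema, indent=2) if output_str else schema


as_dict = model_schema_to_openai_function(
    "TestModel", "A test model.", {"name": {"type": "string"}}, ["name"]
)
as_str = model_schema_to_openai_function(
    "TestModel", "A test model.", {"name": {"type": "string"}}, ["name"], output_str=True
)
```

The string form is just the serialised dict, so both variants carry identical information; callers pick whichever their downstream API expects.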

@@ -3,14 +3,6 @@ from dotenv import load_dotenv
 load_dotenv()

-## [OPTIONAL] REGISTER MODEL - not all ollama models support function calling, litellm defaults to json mode tool calls if native tool calling not supported.
-# litellm.register_model(model_cost={
-#     "ollama_chat/llama3.1": {
-#         "supports_function_calling": true
-#     },
-# })
-
 tools = [
     {
         "type": "function",
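The `tools` list opened above follows the OpenAI function-calling schema. A complete minimal entry looks like this; the weather function and its parameters are purely illustrative, not taken from the file:

```python
# One fully-formed tool entry in the OpenAI function-calling shape.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]
```

As the removed comment noted, litellm falls back to JSON-mode tool calls for Ollama models without native function-calling support, so the same `tools` shape works either way.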

@@ -0,0 +1,431 @@
"""
Test suite for the custom docstring parser implementation.
This module contains comprehensive tests to ensure the docstring parser
works correctly with various docstring formats and edge cases.
"""
import pytest
from swarms.utils.docstring_parser import (
parse,
DocstringParam,
)
class TestDocstringParser:
"""Test cases for the docstring parser functionality."""
def test_empty_docstring(self):
"""Test parsing of empty docstring."""
result = parse("")
assert result.short_description is None
assert result.params == []
def test_none_docstring(self):
"""Test parsing of None docstring."""
result = parse(None)
assert result.short_description is None
assert result.params == []
def test_whitespace_only_docstring(self):
"""Test parsing of whitespace-only docstring."""
result = parse(" \n \t \n ")
assert result.short_description is None
assert result.params == []
def test_simple_docstring_no_args(self):
"""Test parsing of simple docstring without Args section."""
docstring = """
This is a simple function.
Returns:
str: A simple string
"""
result = parse(docstring)
assert (
result.short_description == "This is a simple function."
)
assert result.params == []
def test_docstring_with_args(self):
"""Test parsing of docstring with Args section."""
docstring = """
This is a test function.
Args:
param1 (str): First parameter
param2 (int): Second parameter
param3 (bool, optional): Third parameter with default
Returns:
str: Return value description
"""
result = parse(docstring)
assert result.short_description == "This is a test function."
assert len(result.params) == 3
assert result.params[0] == DocstringParam(
"param1", "First parameter"
)
assert result.params[1] == DocstringParam(
"param2", "Second parameter"
)
assert result.params[2] == DocstringParam(
"param3", "Third parameter with default"
)
def test_docstring_with_parameters_section(self):
"""Test parsing of docstring with Parameters section."""
docstring = """
Another test function.
Parameters:
name (str): The name parameter
age (int): The age parameter
Returns:
None: Nothing is returned
"""
result = parse(docstring)
assert result.short_description == "Another test function."
assert len(result.params) == 2
assert result.params[0] == DocstringParam(
"name", "The name parameter"
)
assert result.params[1] == DocstringParam(
"age", "The age parameter"
)
def test_docstring_with_multiline_param_description(self):
"""Test parsing of docstring with multiline parameter descriptions."""
docstring = """
Function with multiline descriptions.
Args:
param1 (str): This is a very long description
that spans multiple lines and should be
properly concatenated.
param2 (int): Short description
Returns:
str: Result
"""
result = parse(docstring)
assert (
result.short_description
== "Function with multiline descriptions."
)
assert len(result.params) == 2
expected_desc = "This is a very long description that spans multiple lines and should be properly concatenated."
assert result.params[0] == DocstringParam(
"param1", expected_desc
)
assert result.params[1] == DocstringParam(
"param2", "Short description"
)
def test_docstring_without_type_annotations(self):
"""Test parsing of docstring without type annotations."""
docstring = """
Function without type annotations.
Args:
param1: First parameter without type
param2: Second parameter without type
Returns:
str: Result
"""
result = parse(docstring)
assert (
result.short_description
== "Function without type annotations."
)
assert len(result.params) == 2
assert result.params[0] == DocstringParam(
"param1", "First parameter without type"
)
assert result.params[1] == DocstringParam(
"param2", "Second parameter without type"
)
def test_pydantic_style_docstring(self):
"""Test parsing of Pydantic-style docstring."""
docstring = """
Convert a Pydantic model to a dictionary representation of functions.
Args:
pydantic_type (type[BaseModel]): The Pydantic model type to convert.
Returns:
dict[str, Any]: A dictionary representation of the functions.
"""
result = parse(docstring)
assert (
result.short_description
== "Convert a Pydantic model to a dictionary representation of functions."
)
assert len(result.params) == 1
assert result.params[0] == DocstringParam(
"pydantic_type", "The Pydantic model type to convert."
)
def test_docstring_with_various_sections(self):
"""Test parsing of docstring with multiple sections."""
docstring = """
Complex function with multiple sections.
Args:
input_data (dict): Input data dictionary
validate (bool): Whether to validate input
Returns:
dict: Processed data
Raises:
ValueError: If input is invalid
Note:
This is a note section
Example:
>>> result = complex_function({"key": "value"})
"""
result = parse(docstring)
assert (
result.short_description
== "Complex function with multiple sections."
)
assert len(result.params) == 2
assert result.params[0] == DocstringParam(
"input_data", "Input data dictionary"
)
assert result.params[1] == DocstringParam(
"validate", "Whether to validate input"
)
def test_docstring_with_see_also_section(self):
"""Test parsing of docstring with See Also section."""
docstring = """
Function with See Also section.
Args:
param1 (str): First parameter
See Also:
related_function: For related functionality
"""
result = parse(docstring)
assert (
result.short_description
== "Function with See Also section."
)
assert len(result.params) == 1
assert result.params[0] == DocstringParam(
"param1", "First parameter"
)
def test_docstring_with_see_also_underscore_section(self):
"""Test parsing of docstring with See_Also section (underscore variant)."""
docstring = """
Function with See_Also section.
Args:
param1 (str): First parameter
See_Also:
related_function: For related functionality
"""
result = parse(docstring)
assert (
result.short_description
== "Function with See_Also section."
)
assert len(result.params) == 1
assert result.params[0] == DocstringParam(
"param1", "First parameter"
)
def test_docstring_with_yields_section(self):
"""Test parsing of docstring with Yields section."""
docstring = """
Generator function.
Args:
items (list): List of items to process
Yields:
str: Processed item
"""
result = parse(docstring)
assert result.short_description == "Generator function."
assert len(result.params) == 1
assert result.params[0] == DocstringParam(
"items", "List of items to process"
)
def test_docstring_with_raises_section(self):
"""Test parsing of docstring with Raises section."""
docstring = """
Function that can raise exceptions.
Args:
value (int): Value to process
Raises:
ValueError: If value is negative
"""
result = parse(docstring)
assert (
result.short_description
== "Function that can raise exceptions."
)
assert len(result.params) == 1
assert result.params[0] == DocstringParam(
"value", "Value to process"
)
def test_docstring_with_examples_section(self):
"""Test parsing of docstring with Examples section."""
docstring = """
Function with examples.
Args:
x (int): Input value
Examples:
>>> result = example_function(5)
>>> print(result)
"""
result = parse(docstring)
assert result.short_description == "Function with examples."
assert len(result.params) == 1
assert result.params[0] == DocstringParam("x", "Input value")
def test_docstring_with_note_section(self):
"""Test parsing of docstring with Note section."""
docstring = """
Function with a note.
Args:
data (str): Input data
Note:
This function is deprecated
"""
result = parse(docstring)
assert result.short_description == "Function with a note."
assert len(result.params) == 1
assert result.params[0] == DocstringParam(
"data", "Input data"
)
def test_docstring_with_complex_type_annotations(self):
"""Test parsing of docstring with complex type annotations."""
docstring = """
Function with complex types.
Args:
data (List[Dict[str, Any]]): Complex data structure
callback (Callable[[str], int]): Callback function
optional (Optional[str], optional): Optional parameter
Returns:
Union[str, None]: Result or None
"""
result = parse(docstring)
assert (
result.short_description == "Function with complex types."
)
assert len(result.params) == 3
assert result.params[0] == DocstringParam(
"data", "Complex data structure"
)
assert result.params[1] == DocstringParam(
"callback", "Callback function"
)
assert result.params[2] == DocstringParam(
"optional", "Optional parameter"
)
def test_docstring_with_no_description(self):
"""Test parsing of docstring with no description, only Args."""
docstring = """
Args:
param1 (str): First parameter
param2 (int): Second parameter
"""
result = parse(docstring)
assert result.short_description is None
assert len(result.params) == 2
assert result.params[0] == DocstringParam(
"param1", "First parameter"
)
assert result.params[1] == DocstringParam(
"param2", "Second parameter"
)
def test_docstring_with_empty_args_section(self):
"""Test parsing of docstring with empty Args section."""
docstring = """
Function with empty Args section.
Args:
Returns:
str: Result
"""
result = parse(docstring)
assert (
result.short_description
== "Function with empty Args section."
)
assert result.params == []
def test_docstring_with_mixed_indentation(self):
"""Test parsing of docstring with mixed indentation."""
docstring = """
Function with mixed indentation.
Args:
param1 (str): First parameter
with continuation
param2 (int): Second parameter
"""
result = parse(docstring)
assert (
result.short_description
== "Function with mixed indentation."
)
assert len(result.params) == 2
assert result.params[0] == DocstringParam(
"param1", "First parameter with continuation"
)
assert result.params[1] == DocstringParam(
"param2", "Second parameter"
)
def test_docstring_with_tab_indentation(self):
"""Test parsing of docstring with tab indentation."""
docstring = """
Function with tab indentation.
Args:
param1 (str): First parameter
param2 (int): Second parameter
"""
result = parse(docstring)
assert (
result.short_description
== "Function with tab indentation."
)
assert len(result.params) == 2
assert result.params[0] == DocstringParam(
"param1", "First parameter"
)
assert result.params[1] == DocstringParam(
"param2", "Second parameter"
)
if __name__ == "__main__":
pytest.main([__file__])
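To make the behaviour under test concrete, here is a deliberately simplified, standalone sketch of what `swarms.utils.docstring_parser.parse` extracts from a Google-style docstring. The real implementation also handles multiline descriptions, tab indentation, and the section variants exercised above; this sketch is not the shipped code:

```python
import re
from typing import List, Optional, Tuple


def parse_args_section(
    docstring: Optional[str],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Return (short_description, [(param_name, description), ...])."""
    if not docstring or not docstring.strip():
        return None, []
    lines = [line.strip() for line in docstring.strip().splitlines()]
    # First line is the short description unless it is itself a section header.
    short = lines[0] if not lines[0].endswith(":") else None
    params: List[Tuple[str, str]] = []
    in_args = False
    for line in lines:
        if line in ("Args:", "Parameters:"):
            in_args = True
        elif line.endswith(":") and " " not in line:
            # Any other single-word "Header:" line ends the Args section.
            in_args = False
        elif in_args and ":" in line:
            name, desc = line.split(":", 1)
            # Drop a "(type)" annotation if present, e.g. "param1 (str)".
            name = re.sub(r"\s*\(.*\)$", "", name).strip()
            params.append((name, desc.strip()))
    return short, params
```

Empty, `None`, and whitespace-only docstrings fall through the first guard, matching the edge cases at the top of the test class.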

@@ -1,6 +1,3 @@
-#!/usr/bin/env python3
-"""Test script to verify the improved formatter markdown rendering."""
-
 from swarms.utils.formatter import Formatter

@@ -1,21 +1,9 @@
 import asyncio
-import sys

 from loguru import logger
 from swarms.utils.litellm_wrapper import LiteLLM

-# Configure loguru logger
-logger.remove()  # Remove default handler
-logger.add(
-    "test_litellm.log",
-    rotation="1 MB",
-    format="{time} | {level} | {message}",
-    level="DEBUG",
-)
-logger.add(sys.stdout, level="INFO")
-
 tools = [
     {
         "type": "function",

@@ -1,4 +1,4 @@
-from swarms.utils import math_eval
+from swarms.utils.math_eval import math_eval


 def func1_no_exception(x):

@@ -1,21 +1,17 @@
-#!/usr/bin/env python3
-"""
-Test script demonstrating markdown output functionality with a real swarm
-Uses the current state of formatter.py to show agent markdown output capabilities
-"""
-
 import os

 from dotenv import load_dotenv

 # Load environment variables
 load_dotenv()

-from swarms import Agent
-from swarms.structs import (
-    SequentialWorkflow,
+from swarms import (
+    Agent,
     ConcurrentWorkflow,
     GroupChat,
+    SequentialWorkflow,
 )
 from swarms.utils.formatter import Formatter

@@ -1,120 +0,0 @@
import pytest
from swarms.utils import print_class_parameters
class TestObject:
def __init__(self, value1, value2: int):
pass
class TestObject2:
def __init__(self: "TestObject2", value1, value2: int = 5):
pass
def test_class_with_complex_parameters():
class ComplexArgs:
def __init__(self, value1: list, value2: dict = {}):
pass
output = {"value1": "<class 'list'>", "value2": "<class 'dict'>"}
assert (
print_class_parameters(ComplexArgs, api_format=True) == output
)
def test_empty_class():
class Empty:
pass
with pytest.raises(Exception):
print_class_parameters(Empty)
def test_class_with_no_annotations():
class NoAnnotations:
def __init__(self, value1, value2):
pass
output = {
"value1": "<class 'inspect._empty'>",
"value2": "<class 'inspect._empty'>",
}
assert (
print_class_parameters(NoAnnotations, api_format=True)
== output
)
def test_class_with_partial_annotations():
class PartialAnnotations:
def __init__(self, value1, value2: int):
pass
output = {
"value1": "<class 'inspect._empty'>",
"value2": "<class 'int'>",
}
assert (
print_class_parameters(PartialAnnotations, api_format=True)
== output
)
@pytest.mark.parametrize(
"obj, expected",
[
(
TestObject,
{
"value1": "<class 'inspect._empty'>",
"value2": "<class 'int'>",
},
),
(
TestObject2,
{
"value1": "<class 'inspect._empty'>",
"value2": "<class 'int'>",
},
),
],
)
def test_parametrized_class_parameters(obj, expected):
assert print_class_parameters(obj, api_format=True) == expected
@pytest.mark.parametrize(
"value",
[
int,
float,
str,
list,
set,
dict,
bool,
tuple,
complex,
bytes,
bytearray,
memoryview,
range,
frozenset,
slice,
object,
],
)
def test_not_class_exception(value):
with pytest.raises(Exception):
print_class_parameters(value)
def test_api_format_flag():
assert print_class_parameters(TestObject2, api_format=True) == {
"value1": "<class 'inspect._empty'>",
"value2": "<class 'int'>",
}
print_class_parameters(TestObject)
# TODO: Capture printed output and assert correctness.
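The deleted tests above pin down the expected output of `print_class_parameters`: a mapping from constructor parameter names to stringified annotations, with `inspect._empty` for unannotated parameters. The core of such a utility can be sketched with `inspect.signature` (a simplified stand-in, not the removed implementation):

```python
import inspect


def class_parameters(cls) -> dict:
    """Map a class's constructor parameter names to their annotation strings.

    Unannotated parameters surface as "<class 'inspect._empty'>", matching
    the expectations in the deleted test suite.
    """
    params = inspect.signature(cls.__init__).parameters
    return {
        name: str(p.annotation)
        for name, p in params.items()
        if name != "self"  # skip the bound instance parameter
    }


class Example:
    def __init__(self, value1, value2: int = 5):
        pass
```

For `Example`, this yields `{"value1": "<class 'inspect._empty'>", "value2": "<class 'int'>"}`, the same shape the parametrized tests asserted via `api_format=True`.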