Merge branch 'kyegomez:master' into master3

3 months ago · 161fee9368
parent 2a0adac5ce 24b4288943
commit 161fee9368
62 changed files with 7251 additions and 800 deletions
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -18,12 +18,12 @@ If you're adding a new integration, please include:
 Maintainer responsibilities:
-  - General / Misc / if you don't know who to tag: kye@apac.ai
+  - General / Misc / if you don't know who to tag: kye@swarms.world
-  - DataLoaders / VectorStores / Retrievers: kye@apac.ai
+  - DataLoaders / VectorStores / Retrievers: kye@swarms.world
-  - swarms.models: kye@apac.ai
+  - swarms.models: kye@swarms.world
-  - swarms.memory: kye@apac.ai
+  - swarms.memory: kye@swarms.world
-  - swarms.structures: kye@apac.ai
+  - swarms.structures: kye@swarms.world
-If no one reviews your PR within a few days, feel free to email Kye at kye@apac.ai
+If no one reviews your PR within a few days, feel free to email Kye at kye@swarms.world
 See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/kyegomez/swarms
--- a/.github/workflows/welcome.yml
+++ b/.github/workflows/welcome.yml
@@ -11,7 +11,7 @@ jobs:
    permissions: write-all
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/first-interaction@v3.0.0
+      - uses: actions/first-interaction@v3.1.0
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          issue-message:
--- a/CODE_OF_CONDUCT.md
+++ b/CODE_OF_CONDUCT.md
@@ -60,7 +60,7 @@ representative at an online or offline event.
 Instances of abusive, harassing, or otherwise unacceptable behavior may be
 reported to the community leaders responsible for enforcement at
-kye@apac.ai.
+kye@swarms.world.
 All complaints will be reviewed and investigated promptly and fairly.
 All community leaders are obligated to respect the privacy and security of the
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -27,7 +27,7 @@
 * * * * *
-If you discover a security vulnerability in any of the above versions, please report it immediately to our security team by sending an email to kye@apac.ai. We take security vulnerabilities seriously and appreciate your efforts in disclosing them responsibly.
+If you discover a security vulnerability in any of the above versions, please report it immediately to our security team by sending an email to kye@swarms.world. We take security vulnerabilities seriously and appreciate your efforts in disclosing them responsibly.
 Please provide detailed information on the vulnerability, including steps to reproduce, potential impact, and any known mitigations. Our security team will acknowledge receipt of your report within 24 hours and will provide regular updates on the progress of the investigation.
--- a/examples/multi_agent/asb/asb_research.py
+++ b/examples/multi_agent/asb/asb_research.py
@@ -1,20 +1,18 @@
-import orjson
+import json
 from dotenv import load_dotenv
-from swarms.structs.auto_swarm_builder import AutoSwarmBuilder
+from swarms import AutoSwarmBuilder
 load_dotenv()
 swarm = AutoSwarmBuilder(
    name="My Swarm",
    description="My Swarm Description",
    verbose=True,
    max_loops=1,
-    return_agents=True,
+    execution_type="return-agents",
    model_name="gpt-4.1",
 )
 result = swarm.run(
    task="Build a swarm to write a research paper on the topic of AI"
 )
-print(orjson.dumps(result, option=orjson.OPT_INDENT_2).decode())
+print(json.dumps(result, indent=2))
--- a/docs/examples/aop_medical.md
+++ b/docs/examples/aop_medical.md
@@ -0,0 +1,171 @@
 # Medical AOP Example
 A real-world demonstration of the Agent Orchestration Protocol (AOP) using medical agents deployed as MCP tools.
 ## Overview
 This example showcases how to:
 - Deploy multiple medical agents as MCP tools via AOP
 - Use discovery tools for dynamic agent collaboration
 - Execute real tool calls with structured schemas
 - Integrate with keyless APIs for enhanced context
 ## Architecture
 ```mermaid
 graph LR
    A[Medical Agents] --> B[AOP MCP Server<br/>Port 8000]
    B --> C[Client<br/>Cursor/Python]
    B --> D[Discovery Tools]
    B --> E[Tool Execution]
    subgraph "Medical Agents"
        F[Chief Medical Officer]
        G[Virologist]
        H[Internist]
        I[Medical Coder]
        J[Diagnostic Synthesizer]
    end
    A --> F
    A --> G
    A --> H
    A --> I
    A --> J
 ```
 ### Medical Agents
 - **Chief Medical Officer**: Coordination, diagnosis, triage
 - **Virologist**: Viral disease analysis and ICD-10 coding
 - **Internist**: Internal medicine evaluation and HCC tagging
 - **Medical Coder**: ICD-10 code assignment and compliance
 - **Diagnostic Synthesizer**: Final report synthesis with confidence levels
 ## Files
 | File | Description |
 |------|-------------|
 | `medical_aop/server.py` | AOP server exposing medical agents as MCP tools |
 | `medical_aop/client.py` | Discovery client with real tool execution |
 | `README.md` | This documentation |
 ## Usage
 ### 1. Start the AOP Server
 ```bash
 python -m examples.aop_examples.medical_aop.server
 ```
 ### 2. Configure Cursor MCP Integration
 Add to `~/.cursor/mcp.json`:
 ```json
 {
  "mcpServers": {
    "Medical AOP": {
      "type": "http",
      "url": "http://localhost:8000/mcp"
    }
  }
 }
 ```
 ### 3. Use in Cursor
 Enable "Medical AOP" in Cursor's MCP settings, then:
 #### Discover agents:
 ```
 Call tool discover_agents with: {}
 ```
 #### Execute medical coding:
 ```
 Call tool Medical Coder with: {"task":"Patient: 45M, eGFR 59 mL/min/1.73 m²; non-African American. Provide ICD-10 suggestions and coding notes.","priority":"normal","include_images":false}
 ```
 #### Review infection control:
 ```
 Call tool Chief Medical Officer with: {"task":"Review current hospital infection control protocols in light of recent MRSA outbreak in ICU. Provide executive summary, policy adjustment recommendations, and estimated implementation costs.","priority":"high"}
 ```
 ### 4. Run Python Client
 ```bash
 python -m examples.aop_examples.medical_aop.client
 ```
 ## Features
 ### Structured Schemas
 - Custom input/output schemas with validation
 - Priority levels (low/normal/high)
 - Image processing support
 - Confidence scoring
 ### Discovery Tools
 | Tool | Description |
 |------|-------------|
 | `discover_agents` | List all available agents |
 | `get_agent_details` | Detailed agent information |
 | `search_agents` | Keyword-based agent search |
 | `list_agents` | Simple agent name list |
 ### Real-world Integration
 - Keyless API integration (disease.sh for epidemiology data)
 - Structured medical coding workflows
 - Executive-level policy recommendations
 - Cost estimation and implementation timelines
 ## Response Format
 All tools return consistent JSON:
 ```json
 {
  "result": "Agent response text",
  "success": true,
  "error": null,
  "confidence": 0.95,
  "codes": ["N18.3", "Z51.11"]
 }
 ```
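A minimal sketch of how a client might consume this response shape, assuming the payload arrives as a JSON string with the fields shown above. The helper name `parse_tool_response` is hypothetical and is not part of Swarms or the MCP SDK.

```python
import json


def parse_tool_response(raw: str) -> dict:
    """Parse a tool response and raise if the call did not succeed.

    Hypothetical helper: assumes the JSON shape documented above
    (result, success, error, confidence, codes).
    """
    data = json.loads(raw)
    if not data.get("success", False):
        # Surface the server-side error message when one is provided
        raise RuntimeError(data.get("error") or "tool call failed")
    return data


# Example payload copied from the documented response format
example = (
    '{"result": "Agent response text", "success": true, "error": null, '
    '"confidence": 0.95, "codes": ["N18.3", "Z51.11"]}'
)
parsed = parse_tool_response(example)
print(parsed["codes"])
```

Raising on `success: false` keeps error handling in one place, so downstream code only ever sees successful responses.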
 ## Configuration
 ### Server Settings
 | Setting | Value |
 |---------|-------|
 | Port | 8000 |
 | Transport | streamable-http |
 | Timeouts | 40-50 seconds per agent |
 | Logging | INFO level with traceback enabled |
 ### Agent Metadata
 Each agent includes:
 - Tags for categorization
 - Capabilities for matching
 - Role classification
 - Model configuration
 ## Best Practices
 1. **Use structured inputs**: Leverage the custom schemas for better results
 2. **Chain agents**: Pass results between agents for comprehensive analysis
 3. **Monitor timeouts**: Adjust based on task complexity
 4. **Validate responses**: Check the `success` field in all responses
 5. **Use discovery**: Query available agents before hardcoding tool names
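The chaining and validation practices above could be combined roughly as follows. This is a sketch only: `call_tool` is a hypothetical stand-in for whatever client function sends arguments to a named tool and returns the parsed JSON response, and `fake_call_tool` exists purely so the example runs without a server.

```python
from typing import Callable

# Hypothetical signature: (tool_name, arguments) -> parsed JSON response
ToolCaller = Callable[[str, dict], dict]


def chain_agents(call_tool: ToolCaller, tool_names: list[str], task: str) -> str:
    """Run tools in order, feeding each result into the next tool's task."""
    current = task
    for name in tool_names:
        response = call_tool(name, {"task": current, "priority": "normal"})
        # Check the `success` field before trusting the result
        if not response.get("success"):
            raise RuntimeError(f"{name} failed: {response.get('error')}")
        current = response["result"]
    return current


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Offline stand-in that echoes the task, tagged with the tool name."""
    return {
        "result": f"[{name}] {arguments['task']}",
        "success": True,
        "error": None,
    }


final = chain_agents(
    fake_call_tool, ["Internist", "Medical Coder"], "Patient summary"
)
print(final)
```

In real use, `call_tool` would wrap an MCP client call; the chain stops at the first failed step rather than passing an error message on as the next task.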
 ## Troubleshooting
 | Issue | Solution |
 |-------|----------|
 | Connection refused | Ensure server is running on port 8000 |
 | Tool not found | Use `discover_agents` to verify available tools |
 | Timeout errors | Increase timeout values for complex tasks |
 | Schema validation | Ensure input matches the defined JSON schema |
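For the "Connection refused" case, one quick way to confirm that something is listening on the server port is a plain TCP probe. The sketch below uses only the standard library; the helper name `is_port_open` is hypothetical, and the default host and port are taken from the server settings above.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False


print("AOP server reachable:", is_port_open())
```

A successful probe only shows that the port accepts connections. It does not confirm that the process behind it is the AOP server or that the `/mcp` path is served.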
 ## References
 - [AOP Reference](https://docs.swarms.world/en/latest/swarms/structs/aop/)
 - [MCP Integration](https://docs.swarms.ai/examples/mcp-integration)
 - [Protocol Overview](https://docs.swarms.world/en/latest/protocol/overview/)
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -428,6 +428,9 @@ nav:
      - Web Scraper Agents: "developer_guides/web_scraper.md"
      - Smart Database: "examples/smart_database.md"
    - AOP:
      - Medical AOP Example: "examples/aop_medical.md"
  - Swarms Cloud API:
    - Overview: "swarms_cloud/migration.md"
--- a/docs/swarms/examples/aop_server_example.md
+++ b/docs/swarms/examples/aop_server_example.md
@@ -0,0 +1,164 @@
 # AOP Server Setup Example
 This example demonstrates how to set up an Agent Orchestration Protocol (AOP) server with multiple specialized agents.
 ## Overview
 The AOP server allows you to deploy multiple agents that can be discovered and called by other agents or clients in the network. This example shows how to create a server with specialized agents for different tasks.
 ## Code Example
 ```python
 from swarms import Agent
 from swarms.structs.aop import (
    AOP,
 )
 # Create specialized agents
 research_agent = Agent(
    agent_name="Research-Agent",
    agent_description="Expert in research, data collection, and information gathering",
    model_name="anthropic/claude-sonnet-4-5",
    max_loops=1,
    top_p=None,
    dynamic_temperature_enabled=True,
    system_prompt="""You are a research specialist. Your role is to:
    1. Gather comprehensive information on any given topic
    2. Analyze data from multiple sources
    3. Provide well-structured research findings
    4. Cite sources and maintain accuracy
    5. Present findings in a clear, organized manner
    Always provide detailed, factual information with proper context.""",
 )
 analysis_agent = Agent(
    agent_name="Analysis-Agent",
    agent_description="Expert in data analysis, pattern recognition, and generating insights",
    model_name="anthropic/claude-sonnet-4-5",
    max_loops=1,
    top_p=None,
    dynamic_temperature_enabled=True,
    system_prompt="""You are an analysis specialist. Your role is to:
    1. Analyze data and identify patterns
    2. Generate actionable insights
    3. Create visualizations and summaries
    4. Provide statistical analysis
    5. Make data-driven recommendations
    Focus on extracting meaningful insights from information.""",
 )
 writing_agent = Agent(
    agent_name="Writing-Agent",
    agent_description="Expert in content creation, editing, and communication",
    model_name="anthropic/claude-sonnet-4-5",
    max_loops=1,
    top_p=None,
    dynamic_temperature_enabled=True,
    system_prompt="""You are a writing specialist. Your role is to:
    1. Create engaging, well-structured content
    2. Edit and improve existing text
    3. Adapt tone and style for different audiences
    4. Ensure clarity and coherence
    5. Follow best practices in writing
    Always produce high-quality, professional content.""",
 )
 code_agent = Agent(
    agent_name="Code-Agent",
    agent_description="Expert in programming, code review, and software development",
    model_name="anthropic/claude-sonnet-4-5",
    max_loops=1,
    top_p=None,
    dynamic_temperature_enabled=True,
    system_prompt="""You are a coding specialist. Your role is to:
    1. Write clean, efficient code
    2. Debug and fix issues
    3. Review and optimize code
    4. Explain programming concepts
    5. Follow best practices and standards
    Always provide working, well-documented code.""",
 )
 financial_agent = Agent(
    agent_name="Financial-Agent",
    agent_description="Expert in financial analysis, market research, and investment insights",
    model_name="anthropic/claude-sonnet-4-5",
    max_loops=1,
    top_p=None,
    dynamic_temperature_enabled=True,
    system_prompt="""You are a financial specialist. Your role is to:
    1. Analyze financial data and markets
    2. Provide investment insights
    3. Assess risk and opportunities
    4. Create financial reports
    5. Explain complex financial concepts
    Always provide accurate, well-reasoned financial analysis.""",
 )
 # Basic usage: register all agents in one batch
 deployer = AOP("MyAgentServer", verbose=True, port=5932)
 agents = [
    research_agent,
    analysis_agent,
    writing_agent,
    code_agent,
    financial_agent,
 ]
 deployer.add_agents_batch(agents)
 deployer.run()
 ```
 ## Key Components
 ### 1. Agent Creation
 Each agent is created with:
 - **agent_name**: Unique identifier for the agent
 - **agent_description**: Brief description of the agent's capabilities
 - **model_name**: The language model to use
 - **system_prompt**: Detailed instructions defining the agent's role and behavior
 ### 2. AOP Server Setup
 - **Server Name**: "MyAgentServer" - identifies your server
 - **Port**: 5932 - the port where the server will run
 - **Verbose**: True - enables detailed logging
 ### 3. Agent Registration
 - **add_agents_batch()**: Registers multiple agents at once
 - Agents become available for discovery and remote calls
 ## Usage
 1. **Start the Server**: Run the script to start the AOP server
 2. **Agent Discovery**: Other agents or clients can discover available agents
 3. **Remote Calls**: Agents can be called remotely by their names
 ## Server Features
 - **Agent Discovery**: Automatically registers agents for network discovery
 - **Remote Execution**: Agents can be called from other network nodes
 - **Load Balancing**: Distributes requests across available agents
 - **Health Monitoring**: Tracks agent status and availability
 ## Configuration Options
 - **Port**: Change the port number as needed
 - **Verbose**: Set to False for reduced logging
 - **Server Name**: Use a descriptive name for your server
 ## Next Steps
 - See [AOP Cluster Example](aop_cluster_example.md) for multi-server setups
 - Check [AOP Reference](../structs/aop.md) for advanced configuration options
 - Explore agent communication patterns in the examples directory
--- a/examples/aop_examples/README.md
+++ b/examples/aop_examples/README.md
@@ -0,0 +1,66 @@
 # AOP Examples
 This directory contains runnable examples that demonstrate AOP (Agent Orchestration Protocol) patterns in Swarms: spinning up a simple MCP server, discovering available agents/tools, and invoking agent tools from client scripts.
 ## What’s inside
 - **Top-level demos**
  - [`example_new_agent_tools.py`](./example_new_agent_tools.py): End‑to‑end demo of agent discovery utilities (list/search agents, get details for one or many). Targets an MCP server at `http://localhost:5932/mcp`.
  - [`list_agents_and_call_them.py`](./list_agents_and_call_them.py): Utility helpers to fetch tools from an MCP server and call an agent‑style tool with a task prompt. Defaults to `http://localhost:8000/mcp`.
  - [`get_all_agents.py`](./get_all_agents.py): Minimal snippet to print all tools exposed by an MCP server as JSON. Defaults to `http://0.0.0.0:8000/mcp`.
 - **Server**
  - [`server/server.py`](./server/server.py): Simple MCP server entrypoint you can run locally to expose tools/agents for the client examples.
 - **Client**
  - [`client/aop_cluster_example.py`](./client/aop_cluster_example.py): Connect to an AOP cluster and interact with agents.
  - [`client/aop_queue_example.py`](./client/aop_queue_example.py): Example of queue‑style task submission to agents.
  - [`client/aop_raw_task_example.py`](./client/aop_raw_task_example.py): Shows how to send a raw task payload without additional wrappers.
  - [`client/aop_raw_client_code.py`](./client/aop_raw_client_code.py): Minimal, low‑level client calls against the MCP endpoint.
 - **Discovery**
  - [`discovery/example_agent_communication.py`](./discovery/example_agent_communication.py): Illustrates simple agent‑to‑agent or agent‑to‑service communication patterns.
  - [`discovery/example_aop_discovery.py`](./discovery/example_aop_discovery.py): Demonstrates discovering available agents/tools via AOP.
  - [`discovery/simple_discovery_example.py`](./discovery/simple_discovery_example.py): A pared‑down discovery walkthrough.
  - [`discovery/test_aop_discovery.py`](./discovery/test_aop_discovery.py): Test‑style script validating discovery functionality.
 ## Prerequisites
 - Python environment with project dependencies installed.
 - An MCP server running locally (you can use the provided server example).
 ## Quick start
 1. Start a local MCP server (in a separate terminal):
 ```bash
 python examples/aop_examples/server/server.py
 ```
 2. Try discovery utilities (adjust the URL if your server uses a different port):
 ```bash
 # List exposed tools (defaults to http://0.0.0.0:8000/mcp)
 python examples/aop_examples/get_all_agents.py
 # Fetch tools and call the first agent-like tool (defaults to http://localhost:8000/mcp)
 python examples/aop_examples/list_agents_and_call_them.py
 # Rich demo of agent info utilities (expects http://localhost:5932/mcp by default)
 python examples/aop_examples/example_new_agent_tools.py
 ```
 3. Explore client variants:
 ```bash
 python examples/aop_examples/client/aop_cluster_example.py
 python examples/aop_examples/client/aop_queue_example.py
 python examples/aop_examples/client/aop_raw_task_example.py
 python examples/aop_examples/client/aop_raw_client_code.py
 ```
 ## Tips
 - **Server URL/port**: Several examples assume `http://localhost:8000/mcp` or `http://localhost:5932/mcp`. If your server runs elsewhere, update the `server_path`/URL variables at the top of the scripts.
 - **Troubleshooting**: If a script reports “No tools available”, ensure the MCP server is running and that the endpoint path (`/mcp`) and port match the script.
 - **Next steps**: Use these scripts as templates—swap in your own tools/agents, change the search queries, or extend the client calls to fit your workflow.
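Because the examples default to different ports, it may help to resolve the endpoint in one place instead of editing each script. The sketch below is one possible approach, not something the scripts currently do: the environment variable `AOP_MCP_URL` and the helper `resolve_mcp_url` are hypothetical names introduced for illustration.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def resolve_mcp_url(
    default: str = "http://localhost:8000/mcp",
    env_var: str = "AOP_MCP_URL",  # hypothetical variable name
) -> str:
    """Pick the MCP endpoint from the environment and ensure it ends in /mcp."""
    url = os.environ.get(env_var, default)
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith("/mcp"):
        # Append the endpoint path the examples expect
        path += "/mcp"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


print(resolve_mcp_url())
```

A script could then pass the returned value wherever it currently hardcodes the server URL.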
--- a/examples/aop_examples/aop_cluster_example.py
+++ b/examples/aop_examples/aop_cluster_example.py
@@ -1,68 +0,0 @@
 import json
 import asyncio
 from swarms.structs.aop import AOPCluster
 from swarms.tools.mcp_client_tools import execute_tool_call_simple
 async def discover_agents_example():
    """Example of how to call the discover_agents tool."""
    # Create AOP cluster connection
    aop_cluster = AOPCluster(
        urls=["http://localhost:5932/mcp"],
        transport="streamable-http",
    )
    # Check if discover_agents tool is available
    discover_tool = aop_cluster.find_tool_by_server_name(
        "discover_agents"
    )
    if discover_tool:
        try:
            # Create the tool call request
            tool_call_request = {
                "type": "function",
                "function": {
                    "name": "discover_agents",
                    "arguments": json.dumps(
                        {}
                    ),  # No specific agent name = get all
                },
            }
            # Execute the tool call
            result = await execute_tool_call_simple(
                response=tool_call_request,
                server_path="http://localhost:5932/mcp",
                output_type="dict",
                verbose=False,
            )
            print(json.dumps(result, indent=2))
            # Parse the result
            if isinstance(result, list) and len(result) > 0:
                discovery_data = result[0]
                if discovery_data.get("success"):
                    agents = discovery_data.get("agents", [])
                    return agents
                else:
                    return None
            else:
                return None
        except Exception:
            return None
    else:
        return None
 def main():
    """Main function to run the discovery example."""
    # Run the async function
    return asyncio.run(discover_agents_example())
 if __name__ == "__main__":
    main()
--- a/examples/aop_examples/client/aop_cluster_example.py
+++ b/examples/aop_examples/client/aop_cluster_example.py
@@ -0,0 +1,47 @@
 import json
 import asyncio
 from swarms.structs.aop import AOPCluster
 from swarms.tools.mcp_client_tools import execute_tool_call_simple
 async def discover_agents_example():
    """
    Discover all agents using the AOPCluster and print the result.
    """
    aop_cluster = AOPCluster(
        urls=["http://localhost:5932/mcp"],
        transport="streamable-http",
    )
    tool = aop_cluster.find_tool_by_server_name("discover_agents")
    if not tool:
        print("discover_agents tool not found.")
        return None
    tool_call_request = {
        "type": "function",
        "function": {
            "name": "discover_agents",
            "arguments": "{}",
        },
    }
    result = await execute_tool_call_simple(
        response=tool_call_request,
        server_path="http://localhost:5932/mcp",
        output_type="dict",
        verbose=False,
    )
    print(json.dumps(result, indent=2))
    return result
 def main():
    """
    Run the discover_agents_example coroutine.
    """
    asyncio.run(discover_agents_example())
 if __name__ == "__main__":
    main()
--- a/examples/aop_examples/client/aop_queue_example.py
+++ b/examples/aop_examples/client/aop_queue_example.py
@@ -0,0 +1,149 @@
 #!/usr/bin/env python3
 """
 Example demonstrating the AOP queue system for agent execution.
 This example shows how to use the new queue-based execution system
 in the AOP framework for improved performance and reliability.
 """
 import time
 from swarms import Agent
 from swarms.structs.aop import AOP
 def main():
    """Demonstrate AOP queue functionality."""
    # Create some sample agents
    agent1 = Agent(
        agent_name="Research Agent",
        agent_description="Specialized in research tasks",
        model_name="gpt-4",
        max_loops=1,
    )
    agent2 = Agent(
        agent_name="Writing Agent",
        agent_description="Specialized in writing tasks",
        model_name="gpt-4",
        max_loops=1,
    )
    # Create AOP with queue enabled
    aop = AOP(
        server_name="Queue Demo Cluster",
        description="A demonstration of queue-based agent execution",
        queue_enabled=True,
        max_workers_per_agent=2,  # 2 workers per agent
        max_queue_size_per_agent=100,  # Max 100 tasks per queue
        processing_timeout=60,  # 60 second timeout
        retry_delay=2.0,  # 2 second delay between retries
        verbose=True,
    )
    # Add agents to the cluster
    print("Adding agents to cluster...")
    aop.add_agent(agent1, tool_name="researcher")
    aop.add_agent(agent2, tool_name="writer")
    # Get initial queue stats
    print("\nInitial queue stats:")
    stats = aop.get_queue_stats()
    print(f"Stats: {stats}")
    # Add some tasks to the queues
    print("\nAdding tasks to queues...")
    # Add high priority research task
    research_task_id = aop.task_queues["researcher"].add_task(
        task="Research the latest developments in quantum computing",
        priority=10,  # High priority
        max_retries=2,
    )
    print(f"Added research task: {research_task_id}")
    # Add medium priority writing task
    writing_task_id = aop.task_queues["writer"].add_task(
        task="Write a summary of AI trends in 2024",
        priority=5,  # Medium priority
        max_retries=3,
    )
    print(f"Added writing task: {writing_task_id}")
    # Add multiple low priority tasks
    for i in range(3):
        task_id = aop.task_queues["researcher"].add_task(
            task=f"Research task {i+1}: Analyze market trends",
            priority=1,  # Low priority
            max_retries=1,
        )
        print(f"Added research task {i+1}: {task_id}")
    # Get updated queue stats
    print("\nUpdated queue stats:")
    stats = aop.get_queue_stats()
    print(f"Stats: {stats}")
    # Monitor task progress
    print("\nMonitoring task progress...")
    for _ in range(10):  # Monitor for 10 iterations
        time.sleep(1)
        # Check research task status
        research_status = aop.get_task_status(
            "researcher", research_task_id
        )
        print(
            f"Research task status: {research_status['task']['status'] if research_status['success'] else 'Error'}"
        )
        # Check writing task status
        writing_status = aop.get_task_status(
            "writer", writing_task_id
        )
        print(
            f"Writing task status: {writing_status['task']['status'] if writing_status['success'] else 'Error'}"
        )
        # Get current queue stats
        current_stats = aop.get_queue_stats()
        if current_stats["success"]:
            for agent_name, agent_stats in current_stats[
                "stats"
            ].items():
                print(
                    f"{agent_name}: {agent_stats['pending_tasks']} pending, {agent_stats['processing_tasks']} processing, {agent_stats['completed_tasks']} completed"
                )
        print("---")
    # Demonstrate queue management
    print("\nDemonstrating queue management...")
    # Pause the research agent queue
    print("Pausing research agent queue...")
    aop.pause_agent_queue("researcher")
    # Get queue status
    research_queue_status = aop.task_queues["researcher"].get_status()
    print(f"Research queue status: {research_queue_status.value}")
    # Resume the research agent queue
    print("Resuming research agent queue...")
    aop.resume_agent_queue("researcher")
    # Clear all queues
    print("Clearing all queues...")
    cleared = aop.clear_all_queues()
    print(f"Cleared tasks: {cleared}")
    # Final stats
    print("\nFinal queue stats:")
    final_stats = aop.get_queue_stats()
    print(f"Final stats: {final_stats}")
    print("\nQueue demonstration completed!")
 if __name__ == "__main__":
    main()
--- a/examples/aop_examples/client/aop_raw_client_code.py
+++ b/examples/aop_examples/client/aop_raw_client_code.py
@@ -0,0 +1,88 @@
 import json
 import asyncio
 from swarms.structs.aop import AOPCluster
 from swarms.tools.mcp_client_tools import execute_tool_call_simple
 from mcp import ClientSession
 from mcp.client.streamable_http import streamablehttp_client
 async def discover_agents_example():
    """
    Discover all agents using the AOPCluster and print the result.
    """
    aop_cluster = AOPCluster(
        urls=["http://localhost:5932/mcp"],
        transport="streamable-http",
    )
    tool = aop_cluster.find_tool_by_server_name("discover_agents")
    if not tool:
        print("discover_agents tool not found.")
        return None
    tool_call_request = {
        "type": "function",
        "function": {
            "name": "discover_agents",
            "arguments": "{}",
        },
    }
    result = await execute_tool_call_simple(
        response=tool_call_request,
        server_path="http://localhost:5932/mcp",
        output_type="dict",
        verbose=False,
    )
    print(json.dumps(result, indent=2))
    return result
 async def raw_mcp_discover_agents_example():
    """
    Call the MCP server directly using the raw MCP client to execute the
    built-in "discover_agents" tool and print the JSON result.
    This demonstrates how to:
    - Initialize an MCP client over streamable HTTP
    - List available tools (optional)
    - Call a specific tool by name with arguments
    """
    url = "http://localhost:5932/mcp"
    # Open a raw MCP client connection
    async with streamablehttp_client(url, timeout=10) as ctx:
        if len(ctx) == 2:
            read, write = ctx
        else:
            read, write, *_ = ctx
        async with ClientSession(read, write) as session:
            # Initialize the MCP session and optionally inspect tools
            await session.initialize()
            # Optional: list tools (uncomment to print)
            # tools = await session.list_tools()
            # print(json.dumps(tools.model_dump(), indent=2))
            # Call the built-in discovery tool with empty arguments
            result = await session.call_tool(
                name="discover_agents",
                arguments={},
            )
            # Convert to dict for pretty printing
            print(json.dumps(result.model_dump(), indent=2))
            return result.model_dump()
 def main():
    """
    Run the helper-based and raw MCP client discovery examples.
    """
    asyncio.run(discover_agents_example())
    asyncio.run(raw_mcp_discover_agents_example())
 if __name__ == "__main__":
    main()
--- a/examples/aop_examples/client/aop_raw_task_example.py
+++ b/examples/aop_examples/client/aop_raw_task_example.py
@@ -0,0 +1,107 @@
 import json
 import asyncio
 from mcp import ClientSession
 from mcp.client.streamable_http import streamablehttp_client
 async def call_agent_tool_raw(
    url: str,
    tool_name: str,
    task: str,
    img: str | None = None,
    imgs: list[str] | None = None,
    correct_answer: str | None = None,
 ) -> dict:
    """
    Call a specific agent tool on an MCP server using the raw MCP client.
    Args:
        url: MCP server URL (e.g., "http://localhost:5932/mcp").
        tool_name: Name of the tool/agent to invoke.
        task: Task prompt to execute.
        img: Optional single image path/URL.
        imgs: Optional list of image paths/URLs.
        correct_answer: Optional expected answer for validation.
    Returns:
        A dict containing the tool's JSON response.
    """
    # Open a raw MCP client connection over streamable HTTP
    async with streamablehttp_client(url, timeout=30) as ctx:
        if len(ctx) == 2:
            read, write = ctx
        else:
            read, write, *_ = ctx
        async with ClientSession(read, write) as session:
            # Initialize the MCP session
            await session.initialize()
            # Prepare arguments in the canonical AOP tool format
            arguments: dict = {"task": task}
            if img is not None:
                arguments["img"] = img
            if imgs is not None:
                arguments["imgs"] = imgs
            if correct_answer is not None:
                arguments["correct_answer"] = correct_answer
            # Invoke the tool by name
            result = await session.call_tool(
                name=tool_name, arguments=arguments
            )
            # Convert to dict for return/printing
            return result.model_dump()
 async def list_available_tools(url: str) -> dict:
    """
    List tools from an MCP server using the raw client.
    Args:
        url: MCP server URL (e.g., "http://localhost:5932/mcp").
    Returns:
        A dict representation of the tools listing.
    """
    async with streamablehttp_client(url, timeout=30) as ctx:
        if len(ctx) == 2:
            read, write = ctx
        else:
            read, write, *_ = ctx
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            return tools.model_dump()
 def main() -> None:
    """
    Demonstration entrypoint: list tools, then call a specified tool with a task.
    """
    url = "http://localhost:5932/mcp"
    tool_name = "Research-Agent"  # Change to your agent tool name
    task = "Summarize the latest advances in agent orchestration protocols."
    # List tools
    tools_info = asyncio.run(list_available_tools(url))
    print("Available tools:")
    print(json.dumps(tools_info, indent=2))
    # Call the tool
    print(f"\nCalling tool '{tool_name}' with task...\n")
    result = asyncio.run(
        call_agent_tool_raw(
            url=url,
            tool_name=tool_name,
            task=task,
        )
    )
    print(json.dumps(result, indent=2))
 if __name__ == "__main__":
    main()
--- a/examples/aop_examples/discovery/example_agent_communication.py
+++ b/examples/aop_examples/discovery/example_agent_communication.py
@@ -12,7 +12,7 @@ def simulate_agent_discovery():
    """Simulate how an agent would use the discovery tool."""
    # Create a sample agent that will use the discovery tool
-    coordinator_agent = Agent(
+    Agent(
        agent_name="ProjectCoordinator",
        agent_description="Coordinates projects and assigns tasks to other agents",
        system_prompt="You are a project coordinator who helps organize work and delegate tasks to the most appropriate team members. You can discover information about other agents to make better decisions.",
@ -118,34 +118,6 @@ def simulate_agent_discovery():
    # Show what the MCP tool response would look like
    print("📡 Sample MCP tool response structure:")
    sample_response = {
        "success": True,
        "agents": [
            {
                "tool_name": "data_specialist",
                "agent_name": "DataSpecialist",
                "description": "Handles all data-related tasks and analysis",
                "short_system_prompt": "You are a data specialist with expertise in data processing, analysis, and visualization...",
                "tags": [
                    "data",
                    "analysis",
                    "python",
                    "sql",
                    "statistics",
                ],
                "capabilities": [
                    "data_processing",
                    "statistical_analysis",
                    "visualization",
                ],
                "role": "specialist",
                "model_name": "gpt-4o-mini",
                "max_loops": 1,
                "temperature": 0.5,
                "max_tokens": 4096,
            }
        ],
    }
    print("   discover_agents() -> {")
    print("     'success': True,")
--- a/examples/aop_examples/example_new_agent_tools.py
+++ b/examples/aop_examples/example_new_agent_tools.py
@ -15,7 +15,7 @@ async def demonstrate_new_agent_tools():
    """Demonstrate the new agent information tools."""
    # Create AOP cluster connection
-    aop_cluster = AOPCluster(
+    AOPCluster(
        urls=["http://localhost:5932/mcp"],
        transport="streamable-http",
    )
@ -77,7 +77,7 @@ async def demonstrate_new_agent_tools():
        if isinstance(result, list) and len(result) > 0:
            data = result[0]
            if data.get("success"):
-                agent_info = data.get("agent_info", {})
+                data.get("agent_info", {})
                discovery_info = data.get("discovery_info", {})
                print(
                    f"   Agent: {discovery_info.get('agent_name', 'Unknown')}"
--- a/examples/aop_examples/medical_aop/client.py
+++ b/examples/aop_examples/medical_aop/client.py
@ -0,0 +1,113 @@
 import asyncio
 import json
 from typing import Dict
 import requests
 from swarms.structs.aop import AOPCluster
 from swarms.tools.mcp_client_tools import execute_tool_call_simple
 def _select_tools_by_keyword(tools: list, keyword: str) -> list:
    """
    Return tools whose name or description contains the keyword
    (case-insensitive).
    """
    kw = keyword.lower()
    selected = []
    for t in tools:
        name = t.get("function", {}).get("name", "")
        desc = t.get("function", {}).get("description", "")
        if kw in name.lower() or kw in desc.lower():
            selected.append(t)
    return selected
 def _example_payload_from_schema(tools: list, tool_name: str) -> dict:
    """
    Construct a minimal example payload for a given tool using its JSON schema.
    Falls back to a generic 'task' if schema not present.
    """
    for t in tools:
        fn = t.get("function", {})
        if fn.get("name") == tool_name:
            schema = fn.get("parameters", {})
            required = schema.get("required", [])
            props = schema.get("properties", {})
            payload = {}
            for r in required:
                if r in props:
                    if props[r].get("type") == "string":
                        payload[r] = (
                            "Example patient case: 45M, eGFR 59 mL/min/1.73 m²"
                        )
                    elif props[r].get("type") == "boolean":
                        payload[r] = False
                    else:
                        payload[r] = None
            if not payload:
                payload = {
                    "task": "Provide ICD-10 suggestions for the case above"
                }
            return payload
    return {"task": "Provide ICD-10 suggestions for the case above"}
 def main() -> None:
    cluster = AOPCluster(
        urls=["http://localhost:8000/mcp"],
        transport="streamable-http",
    )
    tools = cluster.get_tools(output_type="dict")
    print(f"Tools: {len(tools)}")
    coding_tools = _select_tools_by_keyword(tools, "coder")
    names = [t.get("function", {}).get("name") for t in coding_tools]
    print(f"Coding-related tools: {names}")
    # Build a real payload for "Medical Coder" and execute the tool call
    tool_name = "Medical Coder"
    payload: Dict[str, object] = _example_payload_from_schema(tools, tool_name)
    # Enrich with public keyless data (epidemiology context via disease.sh)
    try:
        epi = requests.get(
            "https://disease.sh/v3/covid-19/countries/USA?strict=true",
            timeout=5,
        )
        if epi.ok:
            data = epi.json()
            epi_summary = (
                f"US COVID-19 context: cases={data.get('cases')}, "
                f"todayCases={data.get('todayCases')}, deaths={data.get('deaths')}"
            )
            base_task = payload.get("task") or ""
            payload["task"] = (
                f"{base_task}\n\nEpidemiology context (no key API): {epi_summary}"
            )
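    except (requests.RequestException, ValueError) as exc:
        # Enrichment is optional: report the failure and continue without it
        print(f"Skipping epidemiology context: {exc}")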
    print("Calling tool:", tool_name)
    request = {
        "function": {
            "name": tool_name,
            "arguments": payload,
        }
    }
    result = asyncio.run(
        execute_tool_call_simple(
            response=request,
            server_path="http://localhost:8000/mcp",
            output_type="json",
            transport="streamable-http",
            verbose=False,
        )
    )
    print("Response:")
    print(result)
 if __name__ == "__main__":
    main()
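Review note: `_example_payload_from_schema` fills each required property according to its declared JSON type. The same idea can be sketched against an in-memory tool listing, without a running server. `SAMPLE_TOOLS`, `PLACEHOLDERS`, and `minimal_payload` are hypothetical names for illustration, and the tool entry is made up in the `function`/`parameters` shape the client reads:

```python
from typing import Any, Dict, List

# Hypothetical tool listing in the shape the client above reads
SAMPLE_TOOLS: List[Dict[str, Any]] = [
    {
        "function": {
            "name": "Medical Coder",
            "description": "Assigns ICD-10 codes",
            "parameters": {
                "type": "object",
                "properties": {
                    "task": {"type": "string"},
                    "include_images": {"type": "boolean"},
                },
                "required": ["task", "include_images"],
            },
        }
    }
]

# Placeholder value per JSON type; unknown types map to None
PLACEHOLDERS: Dict[str, Any] = {"string": "example text", "boolean": False}


def minimal_payload(tools: List[Dict[str, Any]], tool_name: str) -> Dict[str, Any]:
    """Fill every required property of the named tool with a placeholder."""
    for tool in tools:
        fn = tool.get("function", {})
        if fn.get("name") != tool_name:
            continue
        schema = fn.get("parameters", {})
        props = schema.get("properties", {})
        return {
            key: PLACEHOLDERS.get(props[key].get("type"))
            for key in schema.get("required", [])
            if key in props
        }
    return {}  # tool not found


payload = minimal_payload(SAMPLE_TOOLS, "Medical Coder")
print(payload)  # {'task': 'example text', 'include_images': False}
```

Unlike the client, this sketch returns an empty dict when the tool is missing instead of falling back to a generic `task`, which makes a lookup miss visible to the caller.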
--- a/examples/aop_examples/medical_aop/server.py
+++ b/examples/aop_examples/medical_aop/server.py
@ -0,0 +1,166 @@
 # Import medical agents defined in the demo module
 from examples.demos.medical.medical_coder_agent import (chief_medical_officer,
                                                        internist,
                                                        medical_coder,
                                                        synthesizer,
                                                        virologist)
 from swarms.structs.aop import AOP
 def _enrich_agents_metadata() -> None:
    """
    Add lightweight tags/capabilities/roles to imported agents for
    better discovery results.
    """
    chief_medical_officer.tags = [
        "coordination",
        "diagnosis",
        "triage",
    ]
    chief_medical_officer.capabilities = [
        "case-intake",
        "differential",
        "planning",
    ]
    chief_medical_officer.role = "coordinator"
    virologist.tags = ["virology", "infectious-disease"]
    virologist.capabilities = ["viral-analysis", "icd10-suggestion"]
    virologist.role = "specialist"
    internist.tags = ["internal-medicine", "evaluation"]
    internist.capabilities = [
        "system-review",
        "hcc-tagging",
        "risk-stratification",
    ]
    internist.role = "specialist"
    medical_coder.tags = ["coding", "icd10", "compliance"]
    medical_coder.capabilities = [
        "code-assignment",
        "documentation-review",
    ]
    medical_coder.role = "coder"
    synthesizer.tags = ["synthesis", "reporting"]
    synthesizer.capabilities = [
        "evidence-reconciliation",
        "final-report",
    ]
    synthesizer.role = "synthesizer"
 def _medical_input_schema() -> dict:
    return {
        "type": "object",
        "properties": {
            "task": {
                "type": "string",
                "description": "Patient case or instruction for the agent",
            },
            "priority": {
                "type": "string",
                "enum": ["low", "normal", "high"],
                "description": "Processing priority",
            },
            "include_images": {
                "type": "boolean",
                "description": "Whether to consider linked images if provided",
                "default": False,
            },
            "img": {
                "type": "string",
                "description": "Optional image path/URL",
            },
            "imgs": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional list of images",
            },
        },
        "required": ["task"],
        "additionalProperties": False,
    }
 def _medical_output_schema() -> dict:
    return {
        "type": "object",
        "properties": {
            "result": {"type": "string"},
            "success": {"type": "boolean"},
            "error": {"type": "string"},
            "confidence": {
                "type": "number",
                "minimum": 0,
                "maximum": 1,
                "description": "Optional confidence in the assessment",
            },
            "codes": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional list of suggested ICD-10 codes",
            },
        },
        "required": ["result", "success"],
        "additionalProperties": True,
    }
 def main() -> None:
    """
    Start an AOP MCP server that exposes the medical agents as tools with
    structured schemas and per-agent settings.
    """
    _enrich_agents_metadata()
    deployer = AOP(
        server_name="Medical-AOP-Server",
        port=8000,
        verbose=False,
        traceback_enabled=True,
        log_level="INFO",
        transport="streamable-http",
    )
    input_schema = _medical_input_schema()
    output_schema = _medical_output_schema()
    # Register each agent with a modest, role-appropriate timeout
    deployer.add_agent(
        chief_medical_officer,
        timeout=45,
        input_schema=input_schema,
        output_schema=output_schema,
    )
    deployer.add_agent(
        virologist,
        timeout=40,
        input_schema=input_schema,
        output_schema=output_schema,
    )
    deployer.add_agent(
        internist,
        timeout=40,
        input_schema=input_schema,
        output_schema=output_schema,
    )
    deployer.add_agent(
        medical_coder,
        timeout=50,
        input_schema=input_schema,
        output_schema=output_schema,
    )
    deployer.add_agent(
        synthesizer,
        timeout=45,
        input_schema=input_schema,
        output_schema=output_schema,
    )
    deployer.run()
 if __name__ == "__main__":
    main()
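Review note: the input schema marks `task` as required, restricts `priority` to an enum, and sets `additionalProperties` to `False`. This diff does not show whether AOP enforces the schema, so a client may want a cheap pre-flight check. A minimal sketch covering only those three rules; it is not a full JSON Schema validator, and `check_payload` is a hypothetical helper:

```python
from typing import Any, Dict, List

# Trimmed copy of the input schema from server.py
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "task": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "normal", "high"]},
        "include_images": {"type": "boolean"},
    },
    "required": ["task"],
    "additionalProperties": False,
}


def check_payload(payload: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload passed."""
    problems: List[str] = []
    props = schema.get("properties", {})
    # Rule 1: every required field must be present
    for key in schema.get("required", []):
        if key not in payload:
            problems.append(f"missing required field: {key}")
    # Rule 2: reject unknown fields when additionalProperties is False
    if schema.get("additionalProperties") is False:
        for key in payload:
            if key not in props:
                problems.append(f"unexpected field: {key}")
    # Rule 3: enum-constrained fields must use an allowed value
    for key, value in payload.items():
        allowed = props.get(key, {}).get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"{key} must be one of {allowed}")
    return problems


print(check_payload({"task": "45M with CKD stage 3a"}, INPUT_SCHEMA))  # []
```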
--- a/examples/aop_examples/server/server.py
+++ b/examples/aop_examples/server/server.py
--- a/examples/multi_agent/heavy_swarm_examples/heavy_swarm.py
+++ b/examples/multi_agent/heavy_swarm_examples/heavy_swarm.py
--- a/examples/multi_agent/simulations/agent_map/v0/demo_simulation.py
+++ b/examples/multi_agent/simulations/agent_map/v0/demo_simulation.py
@ -1,45 +1,11 @@
 #!/usr/bin/env python3
 """
 Demo script for the Agent Map Simulation.
 This script demonstrates how to set up and run a simulation where multiple AI agents
 move around a 2D map and automatically engage in conversations when they come into
 proximity with each other.
 NEW: Task-based simulation support! You can now specify what the agents should discuss:
    # Create simulation
    simulation = AgentMapSimulation(map_width=50, map_height=50)
    # Add your agents
    simulation.add_agent(my_agent1)
    simulation.add_agent(my_agent2)
    # Run with a specific task
    results = simulation.run(
        task="Discuss the impact of AI on financial markets",
        duration=300,  # 5 minutes
        with_visualization=True
    )
 Features demonstrated:
 - Creating agents with different specializations
 - Setting up the simulation environment
 - Running task-focused conversations
 - Live visualization
 - Monitoring conversation activity
 - Saving conversation summaries
 Run this script to see agents moving around and discussing specific topics!
 """
 import time
 from typing import List
 from swarms import Agent
-# Remove the formal collaboration prompt import
-from simulations.agent_map_simulation import AgentMapSimulation
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
+    AgentMapSimulation,
 )
 # Create a natural conversation prompt for the simulation
 NATURAL_CONVERSATION_PROMPT = """
--- a/examples/multi_agent/simulations/agent_map/v0/example_usage.py
+++ b/examples/multi_agent/simulations/agent_map/v0/example_usage.py
@ -7,8 +7,12 @@ what topic the agents should discuss when they meet.
 """
 from swarms import Agent
-from simulations.agent_map_simulation import AgentMapSimulation
-from simulations.v0.demo_simulation import NATURAL_CONVERSATION_PROMPT
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
+    AgentMapSimulation,
 )
 from examples.multi_agent.simulations.agent_map.v0.demo_simulation import (
    NATURAL_CONVERSATION_PROMPT,
 )
 def create_simple_agent(name: str, expertise: str) -> Agent:
--- a/examples/multi_agent/simulations/agent_map/v0/simple_hospital_demo.py
+++ b/examples/multi_agent/simulations/agent_map/v0/simple_hospital_demo.py
@ -19,7 +19,9 @@ CASE: 34-year-old female with sudden severe headache
 from typing import List
 from swarms import Agent
-from simulations.agent_map_simulation import AgentMapSimulation
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
    AgentMapSimulation,
 )
 def create_medical_agent(
--- a/examples/multi_agent/simulations/agent_map/v0/test_group_conversations.py
+++ b/examples/multi_agent/simulations/agent_map/v0/test_group_conversations.py
@ -13,7 +13,7 @@ Run this to see agents naturally forming groups and having multi-party conversat
 from swarms import Agent
-from simulations.agent_map_simulation import (
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
    AgentMapSimulation,
    Position,
 )
--- a/examples/multi_agent/simulations/agent_map/v0/test_simulation.py
+++ b/examples/multi_agent/simulations/agent_map/v0/test_simulation.py
@ -8,7 +8,7 @@ that all components work correctly without requiring a GUI.
 import time
 from swarms import Agent
-from simulations.agent_map_simulation import (
+from examples.multi_agent.simulations.agent_map.agent_map_simulation import (
    AgentMapSimulation,
    Position,
 )
--- a/examples/multi_agent/utils/uvloop_example.py
+++ b/examples/multi_agent/utils/uvloop_example.py
@ -1,122 +0,0 @@
 """
 Example demonstrating the use of uvloop for running multiple agents concurrently.
 This example shows how to use the new uvloop-based functions:
 - run_agents_concurrently_uvloop: For running multiple agents with the same task
 - run_agents_with_tasks_uvloop: For running agents with different tasks
 uvloop provides significant performance improvements over standard asyncio,
 especially for I/O-bound operations and concurrent task execution.
 """
 import os
 from swarms.structs.multi_agent_exec import (
    run_agents_concurrently_uvloop,
    run_agents_with_tasks_uvloop,
 )
 from swarms.structs.agent import Agent
 def create_example_agents(num_agents: int = 3):
    """Create example agents for demonstration."""
    agents = []
    for i in range(num_agents):
        agent = Agent(
            agent_name=f"Agent_{i+1}",
            system_prompt=f"You are Agent {i+1}, a helpful AI assistant.",
            model_name="gpt-4o-mini",  # Using a lightweight model for examples
            max_loops=1,
            autosave=False,
            verbose=False,
        )
        agents.append(agent)
    return agents
 def example_same_task():
    """Example: Running multiple agents with the same task using uvloop."""
    print("=== Example 1: Same Task for All Agents (uvloop) ===")
    agents = create_example_agents(3)
    task = (
        "Write a one-sentence summary about artificial intelligence."
    )
    print(f"Running {len(agents)} agents with the same task...")
    print(f"Task: {task}")
    try:
        results = run_agents_concurrently_uvloop(agents, task)
        print("\nResults:")
        for i, result in enumerate(results, 1):
            print(f"Agent {i}: {result}")
    except Exception as e:
        print(f"Error: {e}")
 def example_different_tasks():
    """Example: Running agents with different tasks using uvloop."""
    print(
        "\n=== Example 2: Different Tasks for Each Agent (uvloop) ==="
    )
    agents = create_example_agents(3)
    tasks = [
        "Explain what machine learning is in simple terms.",
        "Describe the benefits of cloud computing.",
        "What are the main challenges in natural language processing?",
    ]
    print(f"Running {len(agents)} agents with different tasks...")
    try:
        results = run_agents_with_tasks_uvloop(agents, tasks)
        print("\nResults:")
        for i, (result, task) in enumerate(zip(results, tasks), 1):
            print(f"Agent {i} (Task: {task[:50]}...):")
            print(f"  Response: {result}")
            print()
    except Exception as e:
        print(f"Error: {e}")
 def performance_comparison():
    """Demonstrate the performance benefit of uvloop vs standard asyncio."""
    print("\n=== Performance Comparison ===")
    # Note: This is a conceptual example. In practice, you'd need to measure actual performance
    print("uvloop vs Standard asyncio:")
    print("• uvloop: Cython-based event loop, ~2-4x faster")
    print("• Better for I/O-bound operations")
    print("• Lower latency and higher throughput")
    print("• Especially beneficial for concurrent agent execution")
    print("• Automatic fallback to asyncio if uvloop unavailable")
 if __name__ == "__main__":
    # Check if API key is available
    if not os.getenv("OPENAI_API_KEY"):
        print(
            "Please set your OPENAI_API_KEY environment variable to run this example."
        )
        print("Example: export OPENAI_API_KEY='your-api-key-here'")
        exit(1)
    print("🚀 uvloop Multi-Agent Execution Examples")
    print("=" * 50)
    # Run examples
    example_same_task()
    example_different_tasks()
    performance_comparison()
    print("\n✅ Examples completed!")
    print("\nTo use uvloop functions in your code:")
    print(
        "from swarms.structs.multi_agent_exec import run_agents_concurrently_uvloop"
    )
    print("results = run_agents_concurrently_uvloop(agents, task)")
--- a/examples/multi_agent/uvloop_example.py
+++ b/examples/multi_agent/uvloop_example.py
@ -0,0 +1,30 @@
 from swarms.structs.agent import Agent
 from swarms.structs.multi_agent_exec import (
    run_agents_concurrently_uvloop,
 )
 def create_example_agents(num_agents: int = 3):
    """Create example agents for demonstration."""
    agents = []
    for i in range(num_agents):
        agent = Agent(
            agent_name=f"Agent_{i+1}",
            system_prompt=f"You are Agent {i+1}, a helpful AI assistant.",
            model_name="gpt-4o-mini",  # Using a lightweight model for examples
            max_loops=1,
            autosave=False,
            verbose=False,
        )
        agents.append(agent)
    return agents
 agents = create_example_agents(3)
 task = "Write a one-sentence summary about artificial intelligence."
 results = run_agents_concurrently_uvloop(agents, task)
 print(results)
--- a/pyproject.toml
+++ b/pyproject.toml
@ -5,10 +5,10 @@ build-backend = "poetry.core.masonry.api"
 [tool.poetry]
 name = "swarms"
-version = "8.4.0"
+version = "8.4.1"
 description = "Swarms - TGSC"
 license = "MIT"
-authors = ["Kye Gomez <kye@apac.ai>"]
+authors = ["Kye Gomez <kye@swarms.world>"]
 homepage = "https://github.com/kyegomez/swarms"
 documentation = "https://docs.swarms.world"
 readme = "README.md"
@ -67,7 +67,6 @@ tenacity = "*"
 psutil = "*"
 python-dotenv = "*"
 PyYAML = "*"
 docstring_parser = "0.16" # TODO:
 networkx = "*"
 aiofiles = "*"
 rich = "*"
@ -78,7 +77,8 @@ mcp = "*"
 aiohttp = "*"
 orjson = "*"
 schedule = "*"
-uvloop = {version = "*", markers = "sys_platform != 'win32'"}
+uvloop = {version = "*", markers = "sys_platform == 'linux' or sys_platform == 'darwin'"}
 winloop = {version = "*", markers = "sys_platform == 'win32'"}
 [tool.poetry.scripts]
 swarms = "swarms.cli.main:main"
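Review note: the tightened markers install `uvloop` only on Linux and macOS and `winloop` only on Windows, so any other platform gets neither package. Runtime code therefore needs a fallback to the standard event loop. A minimal sketch of that selection, assuming `winloop` exposes `new_event_loop()` the way `uvloop` does; both function names below are hypothetical:

```python
import asyncio
import importlib
import sys
from typing import Optional


def loop_module_name(platform: str = sys.platform) -> Optional[str]:
    """Mirror the dependency markers: which accelerated loop package applies."""
    if platform == "win32":
        return "winloop"
    if platform in ("linux", "darwin"):
        return "uvloop"
    return None  # e.g. FreeBSD: neither package is installed


def new_event_loop() -> asyncio.AbstractEventLoop:
    """Create an accelerated loop when available, else a standard asyncio loop."""
    name = loop_module_name()
    if name is not None:
        try:
            # Assumption: both packages expose new_event_loop()
            return importlib.import_module(name).new_event_loop()
        except (ImportError, AttributeError):
            pass  # package missing or API differs: fall through
    return asyncio.new_event_loop()


loop = new_event_loop()
print(type(loop).__module__)
loop.close()
```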
--- a/requirements.txt
+++ b/requirements.txt
@ -9,7 +9,6 @@ rich
 psutil
 python-dotenv
 PyYAML
 docstring_parser==0.16
 black
 ruff
 types-toml>=0.10.8.1
@ -26,4 +25,5 @@ mcp
 numpy
 orjson
 schedule
-uvloop
+uvloop; sys_platform == 'linux' or sys_platform == 'darwin' # linux or macos only
 winloop; sys_platform == 'win32' # windows only
--- a/swarms/structs/__init__.py
+++ b/swarms/structs/__init__.py
@ -1,6 +1,7 @@
 from swarms.structs.agent import Agent
 from swarms.structs.agent_loader import AgentLoader
 from swarms.structs.agent_rearrange import AgentRearrange, rearrange
 from swarms.structs.aop import AOP
 from swarms.structs.auto_swarm_builder import AutoSwarmBuilder
 from swarms.structs.base_structure import BaseStructure
 from swarms.structs.base_swarm import BaseSwarm
@ -184,4 +185,5 @@ __all__ = [
    "check_end",
    "AgentLoader",
    "BatchedGridWorkflow",
    "AOP",
 ]
--- a/swarms/structs/agent.py
+++ b/swarms/structs/agent.py
@ -2406,12 +2406,14 @@ class Agent:
            Dict[str, Any]: A dictionary representation of the class attributes.
        """
-        # Remove the llm object from the dictionary
+        # Create a copy of the dict to avoid mutating the original object
-        self.__dict__.pop("llm", None)
+        # Remove the llm object from the copy since it's not serializable
        dict_copy = self.__dict__.copy()
        dict_copy.pop("llm", None)
        return {
            attr_name: self._serialize_attr(attr_name, attr_value)
-            for attr_name, attr_value in self.__dict__.items()
+            for attr_name, attr_value in dict_copy.items()
        }
    def to_json(self, indent: int = 4, *args, **kwargs):
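Review note: the old `to_dict` popped `llm` from the live `self.__dict__`, so serializing an agent silently removed its model object. The fix pops from a copy instead. A minimal sketch of the difference using a stand-in class; `Record` and both method names are hypothetical and unrelated to the real `Agent`:

```python
class Record:
    """Stand-in object with one non-serializable attribute."""

    def __init__(self) -> None:
        self.name = "demo"
        self.llm = object()  # placeholder for a non-serializable client

    def to_dict_mutating(self) -> dict:
        # Old behavior: pops from the live __dict__, deleting the attribute
        self.__dict__.pop("llm", None)
        return dict(self.__dict__)

    def to_dict_safe(self) -> dict:
        # New behavior: pop from a shallow copy, leaving the object intact
        data = self.__dict__.copy()
        data.pop("llm", None)
        return data


safe = Record()
safe.to_dict_safe()
print(hasattr(safe, "llm"))  # True

mutated = Record()
mutated.to_dict_mutating()
print(hasattr(mutated, "llm"))  # False
```

The copy is shallow, so nested mutable attributes are still shared with the original object. That is sufficient here because only a top-level key is removed.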
--- a/swarms/structs/aop.py
+++ b/swarms/structs/aop.py
--- a/swarms/structs/auto_swarm_builder.py
+++ b/swarms/structs/auto_swarm_builder.py
@ -489,6 +489,10 @@ class AutoSwarmBuilder:
        Returns:
            List[Agent]: List of created agents
        Notes:
            - Handles both dict and Pydantic AgentSpec inputs
            - Maps 'description' field to 'agent_description' for Agent compatibility
        """
        # Create agents from config
        agents = []
@ -504,7 +508,23 @@ class AutoSwarmBuilder:
            if isinstance(agent_config, dict):
                agent_config = AgentSpec(**agent_config)
-            agent = Agent(**agent_config)
+            # Convert Pydantic model to dict for Agent initialization
            if isinstance(agent_config, BaseModel):
                agent_data = agent_config.model_dump()
            else:
                agent_data = agent_config
            # Handle parameter name mapping: description -> agent_description
            if (
                "description" in agent_data
                and "agent_description" not in agent_data
            ):
                agent_data["agent_description"] = agent_data.pop(
                    "description"
                )
            # Create agent from processed data
            agent = Agent(**agent_data)
            agents.append(agent)
        return agents
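Review note: the new block renames `description` to `agent_description` only when the target key is absent. A minimal sketch of that mapping on plain dicts; `map_description` is a hypothetical helper that copies its input, whereas the diff mutates `agent_data` in place:

```python
from typing import Any, Dict


def map_description(agent_data: Dict[str, Any]) -> Dict[str, Any]:
    """Rename 'description' to 'agent_description' unless the latter exists."""
    data = dict(agent_data)  # shallow copy so the caller's dict is untouched
    if "description" in data and "agent_description" not in data:
        data["agent_description"] = data.pop("description")
    return data


print(map_description({"agent_name": "A", "description": "Does research"}))
# {'agent_name': 'A', 'agent_description': 'Does research'}
```

When both keys are present, `description` is left in place, as in the diff. Whether `Agent` accepts a stray `description` keyword is not shown here, so that case may deserve a test.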
--- a/swarms/structs/heavy_swarm.py
+++ b/swarms/structs/heavy_swarm.py
@ -17,6 +17,7 @@ from rich.progress import (
    TimeElapsedColumn,
 )
 from rich.table import Table
 from swarms.structs.agent import Agent
 from swarms.structs.conversation import Conversation
 from swarms.tools.tool_type import tool_type
@ -27,105 +28,198 @@ from swarms.utils.history_output_formatter import (
 from swarms.utils.litellm_wrapper import LiteLLM
 RESEARCH_AGENT_PROMPT = """
-Role: Research Agent. Systematic evidence collection and verification.
-
-Instructions:
- Apply systematic methodology: identify primary/secondary sources, verify credibility, cross-reference claims.
- Use evidence hierarchy: peer-reviewed > industry reports > news > social media. Weight by recency and authority.
- For each claim, assess: source reliability, data quality, potential bias, methodology validity.
- If insufficient evidence, quantify gaps: "Missing: [specific data type] from [timeframe] for [scope]."
-
-Output (≤400 tokens):
-1. Findings (≤8 bullets, 1 sentence each, [Ref N])
-2. Evidence Quality Matrix (Source | Reliability | Recency | Bias Risk | Weight)
-3. Confidence (High/Medium/Low + statistical rationale)
-4. Data Gaps (≤3 bullets, specific and actionable)
-5. References (numbered, titles + URLs + access date)
-
-Constraints: Systematic verification only. No speculation or analysis.
+You are a senior research agent. Your mission is to deliver fast, trustworthy, and reproducible research that supports decision-making.
+
+Objective:
+- Produce well-sourced, reproducible, and actionable research that directly answers the task.
+
+Core responsibilities:
+- Frame the research scope and assumptions
+- Design and execute a systematic search strategy
+- Extract and evaluate evidence
+- Triangulate across sources and assess reliability
+- Present findings with limitations and next steps
+
+Process:
+1. Clarify scope; state assumptions if details are missing
+2. Define search strategy (keywords, databases, time range)
+3. Collect sources, prioritizing primary and high-credibility ones
 4. Extract key claims, methods, and figures with provenance
 5. Score source credibility and reconcile conflicting claims
 6. Synthesize into actionable insights
 Scoring rubric (0–5 scale for each):
 - Credibility
 - Recency
 - Methodological transparency
 - Relevance
 - Consistency with other sources
 Deliverables:
 1. Concise summary (1–2 sentences)
 2. Key findings (bullet points)
 3. Evidence table (source id, claim, support level, credibility, link)
 4. Search log and methods
 5. Assumptions and unknowns
 6. Limitations and biases
 7. Recommendations and next steps
 8. Confidence score with justification
 9. Raw citations and extracts
 Citation rules:
 - Number citations inline [1], [2], and provide metadata in the evidence table
 - Explicitly label assumptions
 - Include provenance for paraphrased content
 Style and guardrails:
 - Objective, precise language
 - Present conflicting evidence fairly
 - Redact sensitive details unless explicitly authorized
 - If evidence is insufficient, state what is missing and suggest how to obtain it
 """
 ANALYSIS_AGENT_PROMPT = """
-Role: Analysis Agent. Statistical analysis and pattern recognition.
-
-Instructions:
- Apply analytical frameworks: correlation analysis, trend identification, causal inference, statistical significance testing.
- Use quantitative methods: regression analysis, time series analysis, variance analysis, confidence intervals.
- For each insight, calculate: correlation coefficient, statistical significance (p-value), confidence interval, effect size.
- State assumptions explicitly and test for validity. Identify confounding variables and control for bias.
-
-Output (≤400 tokens):
-1. Analytical Methods (statistical approach + assumptions + limitations)
-2. Quantitative Insights (≤6 items: finding + statistical measure + confidence interval)
-3. Statistical Assumptions (≤3 bullets: assumption + validity test + impact if violated)
-4. Uncertainty Analysis (≤3 bullets: uncertainty type + magnitude + mitigation)
-5. Confidence (High/Medium/Low + statistical rationale + sample size)
-
-Constraints: Statistical rigor only. No alternatives or implementation.
+You are an expert analysis agent. Your mission is to transform raw data or research into validated, decision-grade insights.
+
+Objective:
+- Deliver statistically sound analyses and models with quantified uncertainty.
+
+Core responsibilities:
+- Assess data quality
+- Choose appropriate methods and justify them
+- Run diagnostics and quantify uncertainty
+- Interpret results in context and provide recommendations
+
+Process:
+1. Validate dataset (structure, missingness, ranges)
+2. Clean and document transformations
+3. Explore (distributions, outliers, correlations)
+4. Select methods (justify choice)
 5. Fit models or perform tests; report parameters and uncertainty
 6. Run sensitivity and robustness checks
 7. Interpret results and link to decisions
 Deliverables:
 1. Concise summary (key implication in 1–2 sentences)
 2. Dataset overview
 3. Methods and assumptions
 4. Results (tables, coefficients, metrics, units)
 5. Diagnostics and robustness
 6. Quantified uncertainty
 7. Practical interpretation and recommendations
 8. Limitations and biases
 9. Optional reproducible code/pseudocode
 Style and guardrails:
 - Rigorous but stakeholder-friendly explanations
 - Clearly distinguish correlation from causation
 - Present conservative results when evidence is weak
 """
 ALTERNATIVES_AGENT_PROMPT = """
-Role: Alternatives Agent. Strategic option generation and multi-criteria analysis.
-
-Instructions:
- Apply decision theory: generate 3–4 mutually exclusive options using systematic decomposition.
- Use multi-criteria decision analysis (MCDA): weighted scoring, pairwise comparison, sensitivity analysis.
- For each option, calculate: NPV/ROI, implementation complexity, resource requirements, timeline, success probability.
- Apply scenario analysis: best-case, most-likely, worst-case outcomes with probability distributions.
-
-Output (≤500 tokens):
- Options:
-  - Option Name
-    - Summary (1 sentence)
-    - Quantitative Scores: Impact X/5, Effort Y/5, Risk Z/5, ROI %, Timeline (months)
-    - Pros (≤2), Cons (≤2), Preconditions (≤2)
-    - Scenario Analysis: Best (probability), Most-likely (probability), Worst (probability)
- Decision Matrix: Option | Impact | Effort | Risk | ROI | Timeline | Weighted Score
- Selection Criteria (≤3 bullets: decision rule + threshold + tie-breaking)
-
-Constraints: Systematic analysis only. No feasibility verification.
+You are an alternatives agent. Your mission is to generate a diverse portfolio of solutions and evaluate trade-offs consistently.
+
+Objective:
+- Present multiple credible strategies, evaluate them against defined criteria, and recommend a primary and fallback path.
+
+Core responsibilities:
+- Generate a balanced set of alternatives
+- Evaluate each using a consistent set of criteria
+- Provide implementation outlines and risk mitigation
+
+Process:
+1. Define evaluation criteria and weights
+2. Generate at least four distinct alternatives
+3. For each option, describe scope, cost, timeline, resources, risks, and success metrics
+4. Score options in a trade-off matrix
+5. Rank and recommend primary and fallback strategies
+6. Provide phased implementation roadmap
+
+Deliverables:
 1. Concise recommendation with rationale
 2. List of alternatives with short descriptions
 3. Trade-off matrix with scores and justifications
 4. Recommendation with risk plan
 5. Implementation roadmap with milestones
 6. Success criteria and KPIs
 7. Contingency plans with switch triggers
 Style and guardrails:
 - Creative but realistic options
 - Transparent about hidden costs or dependencies
 - Highlight flexibility-preserving options
 - Use ranges and confidence where estimates are uncertain
 """
 VERIFICATION_AGENT_PROMPT = """
-Role: Verification Agent. Systematic validation and risk assessment.
-
-Instructions:
- Apply verification methodology: source triangulation, fact-checking protocols, evidence validation.
- Use risk assessment frameworks: probability × impact matrix, failure mode analysis, sensitivity analysis.
- For each claim, assess: evidence quality, source credibility, logical consistency, empirical validity.
- Identify logical fallacies, cognitive biases, and methodological errors. Flag contradictions with statistical confidence.
-
-Output (≤400 tokens):
-1. Verification Matrix (Claim | Status | Evidence Quality | Source Credibility | Confidence | P-value)
-2. Risk Assessment (Risk | Probability | Impact | Mitigation | Residual Risk)
-3. Logical Consistency Check (Contradiction | Severity | Resolution | Confidence)
-4. Feasibility Analysis (Constraint | Impact | Workaround | Probability of Success)
-
-Constraints: Systematic validation only. Objective and evidence-based.
+You are a verification agent. Your mission is to rigorously validate claims, methods, and feasibility.
+
+Objective:
+- Provide a transparent, evidence-backed verification of claims and quantify remaining uncertainty.
+
+Core responsibilities:
+- Fact-check against primary sources
+- Validate methodology and internal consistency
+- Assess feasibility and compliance
+- Deliver verdicts with supporting evidence
+
+Process:
+1. Identify claims or deliverables to verify
+2. Define requirements for verification
+3. Triangulate independent sources
 4. Re-run calculations or sanity checks
 5. Stress-test assumptions
 6. Produce verification scorecard and remediation steps
 Deliverables:
 1. Claim summary
 2. Verification status (verified, partial, not verified)
 3. Evidence matrix (source, finding, support, confidence)
 4. Reproduction of critical calculations
 5. Key risks and failure modes
 6. Corrective steps
 7. Confidence score with reasons
 Style and guardrails:
 - Transparent chain-of-evidence
 - Highlight uncertainty explicitly
 - If data is missing, state what’s needed and propose next steps
 """
 SYNTHESIS_AGENT_PROMPT = """
-Role: Synthesis Agent. Multi-criteria decision synthesis and optimization.
+You are a synthesis agent. Your mission is to integrate multiple inputs into a coherent narrative and executable plan.
-
+
-Instructions:
+Objective:
- Apply synthesis methodology: weighted factor analysis, conflict resolution algorithms, optimization modeling.
+- Deliver an integrated synthesis that reconciles evidence, clarifies trade-offs, and yields a prioritized plan.
- Use decision frameworks: multi-criteria decision analysis (MCDA), analytic hierarchy process (AHP), Pareto optimization.
+
- For each recommendation, calculate: expected value, risk-adjusted return, implementation probability, resource efficiency.
+Core responsibilities:
- Reconcile conflicts using evidence hierarchy: statistical significance > source credibility > recency > sample size.
+- Combine outputs from research, analysis, alternatives, and verification
-
+- Highlight consensus and conflicts
-Output (≤600 tokens):
+- Provide a prioritized roadmap and communication plan
-1. Executive Summary (≤6 bullets: key findings + confidence + action items)
+
-2. Integrated Analysis (≤8 bullets: insight + statistical measure + agent attribution + confidence)
+Process:
-3. Conflict Resolution Matrix (Contradiction | Evidence Weight | Resolution | Confidence)
+1. Map inputs and provenance
-4. Optimized Recommendations (table: Recommendation | Expected Value | Risk Score | Implementation Probability | Resource Efficiency | Priority)
+2. Identify convergence and conflicts
-5. Risk-Optimized Portfolio (Risk | Probability | Impact | Mitigation | Residual Risk | Cost)
+3. Prioritize actions by impact and feasibility
-6. Implementation Roadmap (Step | Owner | Timeline | Dependencies | Success Metrics | Probability)
+4. Develop integrated roadmap with owners, milestones, KPIs
-
+5. Create stakeholder-specific summaries
-Constraints: Systematic optimization only. Evidence-based decision support.
+
 Deliverables:
 1. Executive summary (≤150 words)
 2. Consensus findings and open questions
 3. Priority action list
 4. Integrated roadmap
 5. Measurement and evaluation plan
 6. Communication plan per stakeholder group
 7. Evidence map and assumptions
 Style and guardrails:
 - Executive-focused summary, technical appendix for implementers
 - Transparent about uncertainty
 - Include “what could break this plan” with mitigation steps
 """
 schema = {
    "type": "function",
    "function": {
@ -189,64 +283,62 @@ schema = [schema]
 class HeavySwarm:
    """
-    HeavySwarm is a sophisticated multi-agent orchestration system that
+        HeavySwarm is a sophisticated multi-agent orchestration system that
-    decomposes complex tasks into specialized questions and executes them
+        decomposes complex tasks into specialized questions and executes them
-    using four specialized agents: Research, Analysis, Alternatives, and
+        using four specialized agents: Research, Analysis, Alternatives, and
-    Verification. The results are then synthesized into a comprehensive
+        Verification. The results are then synthesized into a comprehensive
-    response.
+        response.
-
+
-    This swarm architecture provides robust task analysis through:
+        This swarm architecture provides robust task analysis through:
-    - Intelligent question generation for specialized agent roles
+        - Intelligent question generation for specialized agent roles
-    - Parallel execution of specialized agents for efficiency
+        - Parallel execution of specialized agents for efficiency
-    - Comprehensive synthesis of multi-perspective results
+        - Comprehensive synthesis of multi-perspective results
-    - Real-time progress monitoring with rich dashboard displays
+        - Real-time progress monitoring with rich dashboard displays
-    - Reliability checks and validation systems
+        - Reliability checks and validation systems
-    - Multi-loop iterative refinement with context preservation
+        - Multi-loop iterative refinement with context preservation
-
+
-    The HeavySwarm follows a structured workflow:
+        The HeavySwarm follows a structured workflow:
-    1. Task decomposition into specialized questions
+        1. Task decomposition into specialized questions
-    2. Parallel execution by specialized agents
+        2. Parallel execution by specialized agents
-    3. Result synthesis and integration
+        3. Result synthesis and integration
-    4. Comprehensive final report generation
+        4. Comprehensive final report generation
-    5. Optional iterative refinement through multiple loops
+        5. Optional iterative refinement through multiple loops
-
+
-    Key Features:
+        Key Features:
-    - **Multi-loop Execution**: The max_loops parameter enables iterative
+        - **Multi-loop Execution**: The max_loops parameter enables iterative
-      refinement where each subsequent loop builds upon the context and
+          refinement where each subsequent loop builds upon the context and
-      results from previous loops
+          results from previous loops
-    - **Context Preservation**: Conversation history is maintained across
+        - **Iterative Refinement**: Each loop can refine, improve, or complete
-      all loops, allowing for deeper analysis and refinement
+          aspects of the analysis based on previous results
-    - **Iterative Refinement**: Each loop can refine, improve, or complete
+
-      aspects of the analysis based on previous results
+        Attributes:
-
+            name (str): Name identifier for the swarm instance
-    Attributes:
+            description (str): Description of the swarm's purpose
-        name (str): Name identifier for the swarm instance
+            agents (Dict[str, Agent]): Dictionary of specialized agent instances (created internally)
-        description (str): Description of the swarm's purpose
+            timeout (int): Maximum execution time per agent in seconds
-        agents (Dict[str, Agent]): Dictionary of specialized agent instances (created internally)
+            aggregation_strategy (str): Strategy for result aggregation (currently 'synthesis')
-        timeout (int): Maximum execution time per agent in seconds
+            loops_per_agent (int): Number of execution loops per agent
-        aggregation_strategy (str): Strategy for result aggregation (currently 'synthesis')
+            question_agent_model_name (str): Model name for question generation
-        loops_per_agent (int): Number of execution loops per agent
+            worker_model_name (str): Model name for specialized worker agents
-        question_agent_model_name (str): Model name for question generation
+            verbose (bool): Enable detailed logging output
-        worker_model_name (str): Model name for specialized worker agents
+            max_workers (int): Maximum number of concurrent worker threads
-        verbose (bool): Enable detailed logging output
+            show_dashboard (bool): Enable rich dashboard with progress visualization
-        max_workers (int): Maximum number of concurrent worker threads
+            agent_prints_on (bool): Enable individual agent output printing
-        show_dashboard (bool): Enable rich dashboard with progress visualization
+            max_loops (int): Maximum number of execution loops for iterative refinement
-        agent_prints_on (bool): Enable individual agent output printing
+            conversation (Conversation): Conversation history tracker
-        max_loops (int): Maximum number of execution loops for iterative refinement
+            console (Console): Rich console for dashboard output
-        conversation (Conversation): Conversation history tracker
+
-        console (Console): Rich console for dashboard output
+        Example:
-
+            >>> swarm = HeavySwarm(
-    Example:
+            ...     name="AnalysisSwarm",
-        >>> swarm = HeavySwarm(
+            ...     description="Market analysis swarm",
-        ...     name="AnalysisSwarm",
+            ...     question_agent_model_name="gpt-4o-mini",
-        ...     description="Market analysis swarm",
+            ...     worker_model_name="gpt-4o-mini",
-        ...     question_agent_model_name="gpt-4o-mini",
+            ...     show_dashboard=True,
-        ...     worker_model_name="gpt-4o-mini",
+            ...     max_loops=3
-        ...     show_dashboard=True,
+            ... )
-        ...     max_loops=3
+            >>> result = swarm.run("Analyze the current cryptocurrency market trends")
-        ... )
+            >>> # The swarm will run 3 iterations, each building upon the previous results
-        >>> result = swarm.run("Analyze the current cryptocurrency market trends")
-        >>> # The swarm will run 3 iterations, each building upon the previous results
    """
    def __init__(
--- a/swarms/structs/ma_utils.py
+++ b/swarms/structs/ma_utils.py
@ -1,12 +1,13 @@
-from typing import Dict, List, Any, Optional, Union, Callable
 import random
-from swarms.prompts.collaborative_prompts import (
-    get_multi_agent_collaboration_prompt_one,
-)
 from functools import lru_cache
+from typing import Any, Callable, Dict, List, Optional, Union
 from loguru import logger
+from swarms.prompts.collaborative_prompts import (
+    get_multi_agent_collaboration_prompt_one,
+)
 def list_all_agents(
    agents: List[Union[Callable, Any]],
@ -131,11 +132,9 @@ def set_random_models_for_agents(
        return random.choice(model_names)
    if isinstance(agents, list):
-        return [
+        for agent in agents:
            setattr(agent, "model_name", random.choice(model_names))
-            or agent
+        return agents
-            for agent in agents
-        ]
    else:
        setattr(agents, "model_name", random.choice(model_names))
        return agents
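Review note on the hunk above: the removed comprehension worked only because `setattr(...)` returns `None`, so `None or agent` evaluated to the agent. The explicit loop is easier to read, and it returns the same list object rather than a new list. A minimal sketch of the new shape, assuming a hypothetical `StubAgent` class and a hypothetical `assign_random_models` helper (neither is part of the library):

```python
import random
from typing import Any, List, Union


class StubAgent:
    """Hypothetical stand-in for an agent; only `model_name` matters here."""

    def __init__(self, model_name: str = "unset") -> None:
        self.model_name = model_name


def assign_random_models(
    agents: Union[List[Any], Any], model_names: List[str]
) -> Union[List[Any], Any]:
    """Give each agent a randomly chosen model name and return the same objects."""
    if isinstance(agents, list):
        for agent in agents:
            agent.model_name = random.choice(model_names)
        return agents
    agents.model_name = random.choice(model_names)
    return agents


agents = [StubAgent(), StubAgent(), StubAgent()]
result = assign_random_models(agents, ["model-a", "model-b"])
```

Callers that relied on receiving a fresh list from the old comprehension would now get the input list back, mutated in place.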
--- a/swarms/structs/multi_agent_exec.py
+++ b/swarms/structs/multi_agent_exec.py
@ -1,12 +1,12 @@
 import asyncio
 import concurrent.futures
 import os
 import sys
 from concurrent.futures import (
    ThreadPoolExecutor,
 )
 from typing import Any, Callable, List, Optional, Union
 import uvloop
 from loguru import logger
 from swarms.structs.agent import Agent
@ -16,20 +16,50 @@ from swarms.structs.omni_agent_types import AgentType
 def run_single_agent(
    agent: AgentType, task: str, *args, **kwargs
 ) -> Any:
-    """Run a single agent synchronously"""
+    """
    Run a single agent synchronously with the given task.
    This function provides a synchronous wrapper for executing a single agent
    with a specific task. It passes through any additional arguments and
    keyword arguments to the agent's run method.
    Args:
        agent (AgentType): The agent instance to execute
        task (str): The task string to be executed by the agent
        *args: Variable length argument list passed to agent.run()
        **kwargs: Arbitrary keyword arguments passed to agent.run()
    Returns:
        Any: The result returned by the agent's run method
    Example:
        >>> agent = SomeAgent()
        >>> result = run_single_agent(agent, "Analyze this data")
        >>> print(result)
    """
    return agent.run(task=task, *args, **kwargs)
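Review note on the call shape `agent.run(task=task, *args, **kwargs)`: Python binds unpacked positional arguments before keyword arguments, whatever order they are written in. If `run()` takes `task` as its first parameter, any extra positional argument therefore collides with `task=`. The sketch below illustrates this with a hypothetical `EchoAgent`; whether the real `Agent.run` signature behaves the same way is an assumption.

```python
from typing import Any


class EchoAgent:
    """Hypothetical agent whose run() takes the task as its first parameter."""

    def run(self, task: str, *args: Any, **kwargs: Any) -> str:
        return f"{task} | args={args} | kwargs={kwargs}"


def run_single(agent: Any, task: str, *args: Any, **kwargs: Any) -> Any:
    # Same call shape as the diff: keyword `task`, then unpacked extras.
    return agent.run(task=task, *args, **kwargs)


agent = EchoAgent()

# Extra keyword arguments pass through without trouble.
ok = run_single(agent, "Analyze this data", temperature=0.2)

# An extra positional argument is bound to `task` first and clashes with task=.
try:
    run_single(agent, "Analyze this data", "extra positional")
    clash = None
except TypeError as exc:
    clash = str(exc)
```

Passing the task positionally (`agent.run(task, *args, **kwargs)`) would avoid the clash, if that matches the intended contract.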
 async def run_agent_async(agent: AgentType, task: str) -> Any:
    """
-    Run an agent asynchronously.
+    Run an agent asynchronously using asyncio event loop.
    This function executes a single agent asynchronously by running it in a
    thread executor to avoid blocking the event loop. It's designed to be
    used within async contexts for concurrent execution.
    Args:
-        agent: Agent instance to run
+        agent (AgentType): The agent instance to execute asynchronously
-        task: Task string to execute
+        task (str): The task string to be executed by the agent
    Returns:
-        Agent execution result
+        Any: The result returned by the agent's run method
    Example:
        >>> async def main():
        ...     agent = SomeAgent()
        ...     result = await run_agent_async(agent, "Process data")
        ...     return result
    """
    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(
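Review note: the docstring describes running a blocking agent in a thread executor so the event loop stays free. A minimal sketch of that pattern, assuming a hypothetical `SlowAgent` with a blocking `run()` method. It uses `asyncio.get_running_loop()` inside the coroutine, which is a substitution on my part; the diff itself calls `asyncio.get_event_loop()`.

```python
import asyncio
import time
from typing import Any


class SlowAgent:
    """Hypothetical agent with a blocking run() method."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, task: str) -> str:
        time.sleep(0.05)  # stands in for a blocking model call
        return f"{self.name}: {task}"


async def run_in_thread(agent: Any, task: str) -> Any:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, agent.run, task)


result = asyncio.run(run_in_thread(SlowAgent("a1"), "Process data"))
```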
@ -41,14 +71,25 @@ async def run_agents_concurrently_async(
    agents: List[AgentType], task: str
 ) -> List[Any]:
    """
-    Run multiple agents concurrently using asyncio.
+    Run multiple agents concurrently using asyncio gather.
    This function executes multiple agents concurrently using asyncio.gather(),
    which runs all agents in parallel and waits for all to complete. Each agent
    runs the same task asynchronously.
    Args:
-        agents: List of Agent instances to run concurrently
+        agents (List[AgentType]): List of agent instances to run concurrently
-        task: Task string to execute
+        task (str): The task string to be executed by all agents
    Returns:
-        List of outputs from each agent
+        List[Any]: List of results from each agent in the same order as input
    Example:
        >>> async def main():
        ...     agents = [Agent1(), Agent2(), Agent3()]
        ...     results = await run_agents_concurrently_async(agents, "Analyze data")
        ...     for i, result in enumerate(results):
        ...         print(f"Agent {i+1} result: {result}")
    """
    results = await asyncio.gather(
        *(run_agent_async(agent, task) for agent in agents)
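Review note: the updated Returns line says results come back "in the same order as input". That matches how `asyncio.gather()` behaves: results follow the order of the awaitables passed in, not the order in which they finish. A small sketch with a hypothetical `fake_agent` coroutine, where the slowest call is listed first:

```python
import asyncio
from typing import List


async def fake_agent(name: str, delay: float, task: str) -> str:
    """Hypothetical agent call that finishes after `delay` seconds."""
    await asyncio.sleep(delay)
    return f"{name} finished {task}"


async def main() -> List[str]:
    # The slowest coroutine is listed first on purpose.
    return await asyncio.gather(
        fake_agent("agent-1", 0.03, "Analyze data"),
        fake_agent("agent-2", 0.01, "Analyze data"),
        fake_agent("agent-3", 0.02, "Analyze data"),
    )


results = asyncio.run(main())
```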
@ -62,15 +103,35 @@ def run_agents_concurrently(
    max_workers: Optional[int] = None,
 ) -> List[Any]:
    """
-    Optimized concurrent agent runner using ThreadPoolExecutor.
+    Run multiple agents concurrently using ThreadPoolExecutor for optimal performance.
    This function executes multiple agents concurrently using a thread pool executor,
     which lets blocking agent calls (such as model API requests) overlap. It automatically
    determines the optimal number of worker threads based on available CPU cores.
    Args:
-        agents: List of Agent instances to run concurrently
+        agents (List[AgentType]): List of agent instances to run concurrently
-        task: Task string to execute
+        task (str): The task string to be executed by all agents
-        max_workers: Maximum number of threads in the executor (defaults to 95% of CPU cores)
+        max_workers (Optional[int]): Maximum number of threads in the executor.
                                   Defaults to 95% of available CPU cores for optimal performance
    Returns:
-        List of outputs from each agent
+        List[Any]: List of results from each agent. If an agent fails, the exception
                  is included in the results list instead of the result.
    Note:
        - Uses 95% of CPU cores by default for optimal resource utilization
        - Handles exceptions gracefully by including them in the results
        - Results may not be in the same order as input agents due to concurrent execution
    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> results = run_agents_concurrently(agents, "Process data")
        >>> for i, result in enumerate(results):
        ...     if isinstance(result, Exception):
        ...         print(f"Agent {i+1} failed: {result}")
        ...     else:
        ...         print(f"Agent {i+1} result: {result}")
    """
    if max_workers is None:
        # 95% of the available CPU cores
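Review note: the docstring warns that results may not match input order and that failures are returned as exception objects. A hedged sketch of one way to keep both properties manageable is to map each future back to its input index. The names `run_all`, `good`, and `bad` are hypothetical, and the index mapping is not taken from the diff. The `max(1, ...)` clamp is also my addition: `int(1 * 0.95)` is `0`, and `ThreadPoolExecutor` rejects `max_workers=0`.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, List, Optional


def run_all(
    workers: List[Callable[[str], Any]],
    task: str,
    max_workers: Optional[int] = None,
) -> List[Any]:
    if max_workers is None:
        cores = os.cpu_count()
        # Roughly 95% of the cores, but never fewer than one thread.
        max_workers = max(1, int(cores * 0.95)) if cores else 1

    results: List[Any] = [None] * len(workers)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_index = {
            pool.submit(worker, task): i for i, worker in enumerate(workers)
        }
        for future in as_completed(future_to_index):
            index = future_to_index[future]
            try:
                results[index] = future.result()
            except Exception as exc:  # keep the failure in the result list
                results[index] = exc
    return results


def good(task: str) -> str:
    return f"done: {task}"


def bad(task: str) -> str:
    raise ValueError("model call failed")


results = run_all([good, bad, good], "Process data")
```

With the index mapping, the `isinstance(result, Exception)` check from the docstring example points at the right agent even when completion order differs.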
@ -103,16 +164,30 @@ def run_agents_concurrently_multiprocess(
    agents: List[Agent], task: str, batch_size: int = os.cpu_count()
 ) -> List[Any]:
    """
-    Manage and run multiple agents concurrently in batches, with optimized performance.
+    Run multiple agents concurrently in batches using asyncio for optimized performance.
    This function processes agents in batches to avoid overwhelming system resources
    while still achieving high concurrency. It uses asyncio internally to manage
    the concurrent execution of agent batches.
    Args:
-        agents (List[Agent]): List of Agent instances to run concurrently.
+        agents (List[Agent]): List of Agent instances to run concurrently
-        task (str): The task string to execute by all agents.
+        task (str): The task string to be executed by all agents
        batch_size (int, optional): Number of agents to run in parallel in each batch.
-                                    Defaults to the number of CPU cores.
+                                   Defaults to the number of CPU cores for optimal resource usage
    Returns:
-        List[Any]: A list of outputs from each agent.
+        List[Any]: List of results from each agent, maintaining the order of input agents
    Note:
        - Processes agents in batches to prevent resource exhaustion
        - Uses asyncio for efficient concurrent execution within batches
        - Results are returned in the same order as input agents
    Example:
        >>> agents = [Agent1(), Agent2(), Agent3(), Agent4(), Agent5()]
        >>> results = run_agents_concurrently_multiprocess(agents, "Analyze data", batch_size=2)
        >>> print(f"Processed {len(results)} agents")
    """
    results = []
    loop = asyncio.get_event_loop()
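Review note: the batching described in the docstring amounts to slicing the agent list into chunks of `batch_size` and running one chunk at a time. A minimal sketch of the slicing step only, using a hypothetical `batched` helper. The `batch_size < 1` guard is my addition; it is also worth remembering that `os.cpu_count()` can return `None`, which a default of `batch_size=os.cpu_count()` would pass straight through.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items, in input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start : start + batch_size])


agent_names = ["a1", "a2", "a3", "a4", "a5"]
batches = list(batched(agent_names, 2))
```

Because chunks are produced in input order, concatenating per-batch results keeps the overall order stable as long as each batch preserves it.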
@ -134,15 +209,36 @@ def batched_grid_agent_execution(
    max_workers: int = None,
 ) -> List[Any]:
    """
-    Run multiple agents with different tasks concurrently.
+    Run multiple agents with different tasks concurrently using ThreadPoolExecutor.
    This function pairs each agent with a specific task and executes them concurrently.
    It's designed for scenarios where different agents need to work on different tasks
    simultaneously, creating a grid-like execution pattern.
    Args:
-        agents (List[AgentType]): List of agent instances.
+        agents (List[AgentType]): List of agent instances to execute
-        tasks (List[str]): List of tasks, one for each agent.
+        tasks (List[str]): List of task strings, one for each agent. Must match the number of agents
-        max_workers (int, optional): Maximum number of threads to use. Defaults to 90% of CPU cores.
+        max_workers (int, optional): Maximum number of threads to use.
                                   Defaults to 90% of available CPU cores for optimal performance
    Returns:
-        List[Any]: List of results from each agent.
+        List[Any]: List of results from each agent in the same order as input agents.
                  If an agent fails, the exception is included in the results.
    Raises:
        ValueError: If the number of agents doesn't match the number of tasks
    Note:
        - Uses 90% of CPU cores by default for optimal resource utilization
        - Results maintain the same order as input agents
        - Handles exceptions gracefully by including them in results
    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> tasks = ["Task A", "Task B", "Task C"]
        >>> results = batched_grid_agent_execution(agents, tasks)
        >>> for i, result in enumerate(results):
        ...     print(f"Agent {i+1} with {tasks[i]}: {result}")
    """
    logger.info(
        f"Batch Grid Execution with {len(agents)} agents and number of tasks: {len(tasks)}"
@ -184,16 +280,34 @@ def run_agents_with_different_tasks(
    """
    Run multiple agents with different tasks concurrently, processing them in batches.
-    This function executes each agent on its corresponding task, processing the agent-task pairs in batches
+    This function executes each agent on its corresponding task, processing the agent-task pairs
-    of size `batch_size` for efficient resource utilization.
+    in batches for efficient resource utilization. It's designed for scenarios where you have
    a large number of agent-task pairs that need to be processed efficiently.
    Args:
-        agent_task_pairs: List of (agent, task) tuples.
+        agent_task_pairs (List[tuple[AgentType, str]]): List of (agent, task) tuples to execute.
-        batch_size: Number of agents to run in parallel in each batch.
+                                                        Each tuple contains an agent instance and its task
-        max_workers: Maximum number of threads.
+        batch_size (int, optional): Number of agent-task pairs to process in parallel in each batch.
                                   Defaults to 10 for balanced resource usage
        max_workers (int, optional): Maximum number of threads to use for each batch.
                                   If None, uses the default from batched_grid_agent_execution
    Returns:
-        List of outputs from each agent, in the same order as the input pairs.
+        List[Any]: List of outputs from each agent-task pair, maintaining the same order as input pairs.
                  If an agent fails, the exception is included in the results.
    Note:
        - Processes agent-task pairs in batches to prevent resource exhaustion
        - Results maintain the same order as input pairs
        - Handles exceptions gracefully by including them in results
        - Uses batched_grid_agent_execution internally for each batch
    Example:
        >>> pairs = [(agent1, "Task A"), (agent2, "Task B"), (agent3, "Task C")]
        >>> results = run_agents_with_different_tasks(pairs, batch_size=5)
        >>> for i, result in enumerate(results):
        ...     agent, task = pairs[i]
        ...     print(f"Agent {agent.agent_name} with {task}: {result}")
    """
    if not agent_task_pairs:
        return []
@ -216,36 +330,77 @@ def run_agents_concurrently_uvloop(
    max_workers: Optional[int] = None,
 ) -> List[Any]:
    """
-    Run multiple agents concurrently using uvloop for optimized async performance.
+    Run multiple agents concurrently using optimized async performance with uvloop/winloop.
-    uvloop is a fast, drop-in replacement for asyncio's event loop, implemented in Cython.
+    This function provides high-performance concurrent execution of multiple agents using
-    It's designed to be significantly faster than the standard asyncio event loop,
+    optimized event loop implementations. It automatically selects the best available
-    especially beneficial for I/O-bound tasks and concurrent operations.
+    event loop for the platform (uvloop on Unix systems, winloop on Windows).
    Args:
-        agents: List of Agent instances to run concurrently
+        agents (List[AgentType]): List of agent instances to run concurrently
-        task: Task string to execute by all agents
+        task (str): The task string to be executed by all agents
-        max_workers: Maximum number of threads in the executor (defaults to 95% of CPU cores)
+        max_workers (Optional[int]): Maximum number of threads in the executor.
                                   Defaults to 95% of available CPU cores for optimal performance
    Returns:
-        List of outputs from each agent
+        List[Any]: List of results from each agent. If an agent fails, the exception
                  is included in the results list instead of the result.
    Raises:
-        ImportError: If uvloop is not installed
+        ImportError: If neither uvloop nor winloop is available (falls back to standard asyncio)
-        RuntimeError: If uvloop cannot be set as the event loop policy
+        RuntimeError: If event loop policy cannot be set (falls back to standard asyncio)
    Note:
        - Automatically uses uvloop on Linux/macOS and winloop on Windows
        - Falls back gracefully to standard asyncio if optimized loops are unavailable
        - Uses 95% of CPU cores by default for optimal resource utilization
        - Handles exceptions gracefully by including them in results
        - Results may not be in the same order as input agents due to concurrent execution
    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> results = run_agents_concurrently_uvloop(agents, "Process data")
        >>> for i, result in enumerate(results):
        ...     if isinstance(result, Exception):
        ...         print(f"Agent {i+1} failed: {result}")
        ...     else:
        ...         print(f"Agent {i+1} result: {result}")
    """
-    try:
+    # Platform-specific event loop policy setup
-        # Set uvloop as the default event loop policy for better performance
+    if sys.platform in ("win32", "cygwin"):
-        asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
+        # Windows: Try to use winloop
-    except ImportError:
+        try:
-        logger.warning(
+            import winloop
-            "uvloop not available, falling back to standard asyncio. "
+
-            "Install uvloop with: pip install uvloop"
+            asyncio.set_event_loop_policy(winloop.EventLoopPolicy())
-        )
+            logger.info(
-    except RuntimeError as e:
+                "Using winloop for enhanced Windows performance"
-        logger.warning(
+            )
-            f"Could not set uvloop policy: {e}. Using default asyncio."
+        except ImportError:
-        )
+            logger.warning(
+                "winloop not available, falling back to standard asyncio. "
+                "Install winloop with: pip install winloop"
+            )
+        except RuntimeError as e:
+            logger.warning(
+                f"Could not set winloop policy: {e}. Using default asyncio."
+            )
+    else:
+        # Linux/macOS: Try to use uvloop
+        try:
+            import uvloop
+            asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
+            logger.info("Using uvloop for enhanced Unix performance")
+        except ImportError:
+            logger.warning(
+                "uvloop not available, falling back to standard asyncio. "
+                "Install uvloop with: pip install uvloop"
+            )
+        except RuntimeError as e:
+            logger.warning(
+                f"Could not set uvloop policy: {e}. Using default asyncio."
+            )
    if max_workers is None:
        # Use 95% of available CPU cores for optimal performance
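Review note: the platform branch above is repeated verbatim in `run_agents_with_tasks_uvloop`, so a shared helper might be worth considering. The sketch below is one possible shape, not code from the diff: `pick_loop_policy` is a hypothetical name, and it assumes both `uvloop` and `winloop` expose `EventLoopPolicy`, as the diff's own calls suggest.

```python
import asyncio
import importlib
import sys
from typing import Optional, Tuple


def pick_loop_policy() -> Tuple[str, Optional[object]]:
    """Return (name, policy); policy is None when only stock asyncio is usable."""
    module_name = "winloop" if sys.platform in ("win32", "cygwin") else "uvloop"
    try:
        module = importlib.import_module(module_name)
        policy = module.EventLoopPolicy()
    except (ImportError, AttributeError):
        # Optimized loop missing or incompatible: fall back to stock asyncio.
        return "asyncio", None
    return module_name, policy


name, policy = pick_loop_policy()
if policy is not None:
    asyncio.set_event_loop_policy(policy)


async def ping() -> str:
    return "pong"


reply = asyncio.run(ping())
```

Both public functions could then call the helper once and log `name`, which would keep the fallback behavior in a single place.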
@ -311,46 +466,90 @@ def run_agents_with_tasks_uvloop(
    max_workers: Optional[int] = None,
 ) -> List[Any]:
    """
-    Run multiple agents with different tasks concurrently using uvloop.
+    Run multiple agents with different tasks concurrently using optimized async performance.
-    This function pairs each agent with a specific task and runs them concurrently
+    This function pairs each agent with a specific task and runs them concurrently using
-    using uvloop for optimized performance.
+    optimized event loop implementations (uvloop on Unix systems, winloop on Windows).
    It's designed for high-performance scenarios where different agents need to work
    on different tasks simultaneously.
    Args:
-        agents: List of Agent instances to run
+        agents (List[AgentType]): List of agent instances to run
-        tasks: List of task strings (must match number of agents)
+        tasks (List[str]): List of task strings, one for each agent. Must match the number of agents
-        max_workers: Maximum number of threads (defaults to 95% of CPU cores)
+        max_workers (Optional[int]): Maximum number of threads in the executor.
                                   Defaults to 95% of available CPU cores for optimal performance
    Returns:
-        List of outputs from each agent
+        List[Any]: List of results from each agent in the same order as input agents.
                  If an agent fails, the exception is included in the results.
    Raises:
-        ValueError: If number of agents doesn't match number of tasks
+        ValueError: If the number of agents doesn't match the number of tasks
    Note:
        - Automatically uses uvloop on Linux/macOS and winloop on Windows
        - Falls back gracefully to standard asyncio if optimized loops are unavailable
        - Uses 95% of CPU cores by default for optimal resource utilization
        - Results maintain the same order as input agents
        - Handles exceptions gracefully by including them in results
    Example:
        >>> agents = [Agent1(), Agent2(), Agent3()]
        >>> tasks = ["Task A", "Task B", "Task C"]
        >>> results = run_agents_with_tasks_uvloop(agents, tasks)
        >>> for i, result in enumerate(results):
        ...     if isinstance(result, Exception):
        ...         print(f"Agent {i+1} with {tasks[i]} failed: {result}")
        ...     else:
        ...         print(f"Agent {i+1} with {tasks[i]}: {result}")
    """
    if len(agents) != len(tasks):
        raise ValueError(
            f"Number of agents ({len(agents)}) must match number of tasks ({len(tasks)})"
        )
-    try:
+    # Platform-specific event loop policy setup
-        # Set uvloop as the default event loop policy
+    if sys.platform in ("win32", "cygwin"):
-        asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
+        # Windows: Try to use winloop
-    except ImportError:
+        try:
-        logger.warning(
+            import winloop
-            "uvloop not available, falling back to standard asyncio. "
+
-            "Install uvloop with: pip install uvloop"
+            asyncio.set_event_loop_policy(winloop.EventLoopPolicy())
-        )
+            logger.info(
-    except RuntimeError as e:
+                "Using winloop for enhanced Windows performance"
-        logger.warning(
+            )
-            f"Could not set uvloop policy: {e}. Using default asyncio."
+        except ImportError:
-        )
+            logger.warning(
+                "winloop not available, falling back to standard asyncio. "
+                "Install winloop with: pip install winloop"
+            )
+        except RuntimeError as e:
+            logger.warning(
+                f"Could not set winloop policy: {e}. Using default asyncio."
+            )
+    else:
+        # Linux/macOS: Try to use uvloop
+        try:
+            import uvloop
+            asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
+            logger.info("Using uvloop for enhanced Unix performance")
+        except ImportError:
+            logger.warning(
+                "uvloop not available, falling back to standard asyncio. "
+                "Install uvloop with: pip install uvloop"
+            )
+        except RuntimeError as e:
+            logger.warning(
+                f"Could not set uvloop policy: {e}. Using default asyncio."
+            )
    if max_workers is None:
        num_cores = os.cpu_count()
        max_workers = int(num_cores * 0.95) if num_cores else 1
-    logger.inufo(
-        f"Running {len(agents)} agents with {len(tasks)} tasks using uvloop (max_workers: {max_workers})"
+    logger.info(
+        f"Running {len(agents)} agents with {len(tasks)} tasks using optimized event loop (max_workers: {max_workers})"
    )
    async def run_agents_with_tasks_async():
@ -407,10 +606,40 @@ def run_agents_with_tasks_uvloop(
 def get_swarms_info(swarms: List[Callable]) -> str:
    """
-    Fetches and formats information about all available swarms in the system.
+    Fetch and format information about all available swarms in the system.
    This function provides a comprehensive overview of all swarms currently
    available in the system, including their names, descriptions, agent counts,
    and swarm types. It's useful for debugging, monitoring, and system introspection.
    Args:
        swarms (List[Callable]): List of swarm instances to get information about.
                               Each swarm should have name, description, agents, and swarm_type attributes
    Returns:
-        str: A formatted string containing names and descriptions of all swarms.
+        str: A formatted string containing detailed information about all swarms.
             Returns "No swarms currently available in the system." if the list is empty.
    Note:
        - Each swarm is expected to have the following attributes:
          - name: The name of the swarm
          - description: A description of the swarm's purpose
          - agents: A list of agents in the swarm
          - swarm_type: The type/category of the swarm
        - The output is formatted for human readability with clear section headers
    Example:
        >>> swarms = [swarm1, swarm2, swarm3]
        >>> info = get_swarms_info(swarms)
        >>> print(info)
        Available Swarms:
        [Swarm 1]
        Name: Data Processing Swarm
        Description: Handles data analysis tasks
        Length of Agents: 5
        Swarm Type: Analysis
        ...
    """
    if not swarms:
        return "No swarms currently available in the system."
@ -439,10 +668,47 @@ def get_agents_info(
    agents: List[Union[Agent, Callable]], team_name: str = None
 ) -> str:
    """
-    Fetches and formats information about all available agents in the system.
+    Fetch and format information about all available agents in the system.
    This function provides a comprehensive overview of all agents currently
    available in the system, including their names, descriptions, roles,
    models, and configuration details. It's useful for debugging, monitoring,
    and system introspection.
    Args:
        agents (List[Union[Agent, Callable]]): List of agent instances to get information about.
                                             Each agent should have agent_name, agent_description,
                                             role, model_name, and max_loops attributes
        team_name (str, optional): Optional team name to include in the output header.
                                 If None, uses a generic header
    Returns:
-        str: A formatted string containing names and descriptions of all swarms.
+        str: A formatted string containing detailed information about all agents.
             Returns "No agents currently available in the system." if the list is empty.
    Note:
        - Each agent is expected to have the following attributes:
          - agent_name: The name of the agent
          - agent_description: A description of the agent's purpose
          - role: The role or function of the agent
          - model_name: The AI model used by the agent
          - max_loops: The maximum number of loops the agent can execute
        - The output is formatted for human readability with clear section headers
        - Team name is included in the header if provided
    Example:
        >>> agents = [agent1, agent2, agent3]
        >>> info = get_agents_info(agents, team_name="Data Team")
        >>> print(info)
        Available Agents for Team: Data Team
        [Agent 1]
        Name: Data Analyzer
        Description: Analyzes data patterns
        Role: Analyst
        Model: gpt-4
        Max Loops: 10
        ...
    """
    if not agents:
        return "No agents currently available in the system."
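Reviewer note on the `run_agents_with_tasks_uvloop` hunk above: the platform check selects winloop on Windows and uvloop elsewhere, and falls back to standard asyncio when the package is missing. The sketch below illustrates the same selection, but as a loop factory rather than the global event loop policy the diff sets; that is a swapped-in approach, not the code in this commit. `pick_loop_factory`, `default_max_workers`, and `run_demo` are hypothetical names. The sketch also clamps the worker count, because `int(num_cores * 0.95)` as written in the context lines truncates to 0 on a single-core machine.

```python
import asyncio
import importlib
import sys
from typing import Callable, List, Optional


def pick_loop_factory(
    platform: str = sys.platform,
) -> Callable[[], asyncio.AbstractEventLoop]:
    """Return a callable that creates an event loop for the given platform."""
    module_name = "winloop" if platform in ("win32", "cygwin") else "uvloop"
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Accelerated loop not installed: fall back to standard asyncio.
        return asyncio.new_event_loop
    # Assumption: the package exposes new_event_loop(), as uvloop does.
    return getattr(module, "new_event_loop", asyncio.new_event_loop)


def default_max_workers(num_cores: Optional[int]) -> int:
    """Use roughly 95% of the cores, but never fewer than one worker."""
    if not num_cores:
        return 1
    return max(1, int(num_cores * 0.95))


async def _double(value: int) -> int:
    # Placeholder workload standing in for an agent run.
    await asyncio.sleep(0)
    return 2 * value


async def _main() -> List[int]:
    return await asyncio.gather(_double(1), _double(2))


def run_demo() -> List[int]:
    """Run the placeholder workload on whichever loop was selected."""
    loop = pick_loop_factory()()
    try:
        return loop.run_until_complete(_main())
    finally:
        loop.close()
```

Returning a factory keeps the choice local to the caller and leaves the process-wide policy untouched; whether that trade-off suits this function is a design decision for the maintainers.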
--- a/swarms/tools/base_tool.py
+++ b/swarms/tools/base_tool.py
@ -231,9 +231,10 @@ class BaseTool(BaseModel):
    def base_model_to_dict(
        self,
        pydantic_type: type[BaseModel],
        output_str: bool = False,
        *args: Any,
        **kwargs: Any,
-    ) -> dict[str, Any]:
+    ) -> Union[dict[str, Any], str]:
        """
        Convert a Pydantic BaseModel to OpenAI function calling schema dictionary.
@ -247,7 +248,7 @@ class BaseTool(BaseModel):
            **kwargs: Additional keyword arguments
        Returns:
-            dict[str, Any]: OpenAI function calling schema dictionary
+            Union[dict[str, Any], str]: OpenAI function calling schema dictionary or JSON string
        Raises:
            ToolValidationError: If pydantic_type validation fails
@ -278,9 +279,13 @@ class BaseTool(BaseModel):
            # Get the base function schema
            base_result = base_model_to_openai_function(
-                pydantic_type, *args, **kwargs
+                pydantic_type, output_str=output_str, *args, **kwargs
            )
            # If output_str is True, return the string directly
            if output_str and isinstance(base_result, str):
                return base_result
            # Extract the function definition from the functions array
            if (
                "functions" in base_result
@ -314,8 +319,8 @@ class BaseTool(BaseModel):
            ) from e
    def multi_base_models_to_dict(
-        self, base_models: List[BaseModel]
-    ) -> dict[str, Any]:
+        self, base_models: List[BaseModel], output_str: bool = False
+    ) -> Union[dict[str, Any], str]:
        """
        Convert multiple Pydantic BaseModels to OpenAI function calling schema.
@ -323,12 +328,11 @@ class BaseTool(BaseModel):
        a unified OpenAI function calling schema format.
        Args:
-            return_str (bool): Whether to return string format
-            *args: Additional positional arguments
+            base_models (List[BaseModel]): List of Pydantic models to convert
+            output_str (bool): Whether to return string format. Defaults to False.
            **kwargs: Additional keyword arguments
        Returns:
-            dict[str, Any]: Combined OpenAI function calling schema
+            dict[str, Any] or str: Combined OpenAI function calling schema or JSON string
        Raises:
            ToolValidationError: If base_models validation fails
@ -344,10 +348,18 @@ class BaseTool(BaseModel):
            )
        try:
-            return [
-                self.base_model_to_dict(model)
+            results = [
+                self.base_model_to_dict(model, output_str=output_str)
                for model in base_models
            ]
            # If output_str is True, return the string directly
            if output_str:
                import json
                return json.dumps(results, indent=2)
            return results
        except Exception as e:
            self._log_if_verbose(
                "error", f"Failed to convert multiple models: {e}"
--- a/swarms/tools/pydantic_to_json.py
+++ b/swarms/tools/pydantic_to_json.py
@ -1,6 +1,6 @@
 from typing import Any, List
-from docstring_parser import parse
+from swarms.utils.docstring_parser import parse
 from pydantic import BaseModel
 from swarms.utils.loguru_logger import initialize_logger
@ -39,12 +39,14 @@ def check_pydantic_name(pydantic_type: type[BaseModel]) -> str:
 def base_model_to_openai_function(
    pydantic_type: type[BaseModel],
    output_str: bool = False,
 ) -> dict[str, Any]:
    """
    Convert a Pydantic model to a dictionary representation of functions.
    Args:
        pydantic_type (type[BaseModel]): The Pydantic model type to convert.
        output_str (bool): Whether to return string output format. Defaults to False.
    Returns:
        dict[str, Any]: A dictionary representation of the functions.
@ -85,7 +87,7 @@ def base_model_to_openai_function(
    _remove_a_key(parameters, "title")
    _remove_a_key(parameters, "additionalProperties")
-    return {
+    result = {
        "function_call": {
            "name": name,
        },
@ -98,6 +100,14 @@ def base_model_to_openai_function(
        ],
    }
    # Handle output_str parameter
    if output_str:
        import json
        return json.dumps(result, indent=2)
    return result
 def multi_base_model_to_openai_function(
    pydantic_types: List[BaseModel] = None,
@ -114,13 +124,21 @@ def multi_base_model_to_openai_function(
    """
    functions: list[dict[str, Any]] = [
-        base_model_to_openai_function(pydantic_type, output_str)[
-            "functions"
-        ][0]
+        base_model_to_openai_function(
+            pydantic_type, output_str=False
+        )["functions"][0]
        for pydantic_type in pydantic_types
    ]
-    return {
+    result = {
        "function_call": "auto",
        "functions": functions,
    }
    # Handle output_str parameter
    if output_str:
        import json
        return json.dumps(result, indent=2)
    return result
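Reviewer note on the two `output_str` changes above: in `BaseTool.multi_base_models_to_dict`, `output_str` is forwarded to `base_model_to_dict` and the collected results are then passed to `json.dumps`. If each element is already a JSON string at that point, the outer dump encodes a list of strings rather than a list of objects. The `multi_base_model_to_openai_function` hunk avoids this by calling the inner converter with `output_str=False` and serialising once. A minimal sketch of the difference, using hypothetical stand-ins (`to_schema`, `to_schemas`) rather than the swarms functions:

```python
import json
from typing import Any, Dict, List, Union


def to_schema(name: str, output_str: bool = False) -> Union[Dict[str, Any], str]:
    """Hypothetical stand-in for a per-model converter (not the swarms API)."""
    schema = {"name": name, "parameters": {"type": "object"}}
    return json.dumps(schema, indent=2) if output_str else schema


def to_schemas(
    names: List[str], output_str: bool = False
) -> Union[List[Dict[str, Any]], str]:
    """Collect plain dicts first, then serialise the whole list once."""
    schemas = [to_schema(name, output_str=False) for name in names]
    return json.dumps(schemas, indent=2) if output_str else schemas


# Serialising elements that are already JSON strings nests the encoding.
double_encoded = json.dumps([to_schema("search", output_str=True)])
single_encoded = to_schemas(["search"], output_str=True)

print(type(json.loads(double_encoded)[0]).__name__)  # str
print(type(json.loads(single_encoded)[0]).__name__)  # dict
```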
--- a/swarms/utils/docstring_parser.py
+++ b/swarms/utils/docstring_parser.py
@ -0,0 +1,140 @@
 """
 Custom docstring parser implementation to replace the docstring_parser package.
 This module provides a simple docstring parser that extracts parameter information
 and descriptions from Python docstrings in Google/NumPy style format.
 """
 import re
 from typing import List, Optional, NamedTuple
 class DocstringParam(NamedTuple):
    """Represents a parameter in a docstring."""
    arg_name: str
    description: str
 class DocstringInfo(NamedTuple):
    """Represents parsed docstring information."""
    short_description: Optional[str]
    params: List[DocstringParam]
 def parse(docstring: str) -> DocstringInfo:
    """
    Parse a docstring and extract parameter information and description.
    Args:
        docstring (str): The docstring to parse.
    Returns:
        DocstringInfo: Parsed docstring information containing short description and parameters.
    """
    if not docstring or not docstring.strip():
        return DocstringInfo(short_description=None, params=[])
    # Clean up the docstring
    lines = [line.strip() for line in docstring.strip().split("\n")]
    # Extract short description (first non-empty line that's not a section header)
    short_description = None
    for line in lines:
        if line and not line.startswith(
            (
                "Args:",
                "Parameters:",
                "Returns:",
                "Yields:",
                "Raises:",
                "Note:",
                "Example:",
                "Examples:",
            )
        ):
            short_description = line
            break
    # Extract parameters
    params = []
    # Look for Args: or Parameters: section
    in_args_section = False
    current_param = None
    for line in lines:
        # Check if we're entering the Args/Parameters section
        if line.lower().startswith(("args:", "parameters:")):
            in_args_section = True
            continue
        # Check if we're leaving the Args/Parameters section
        if (
            in_args_section
            and line
            and not line.startswith(" ")
            and not line.startswith("\t")
        ):
            # Check if this is a new section header
            if line.lower().startswith(
                (
                    "returns:",
                    "yields:",
                    "raises:",
                    "note:",
                    "example:",
                    "examples:",
                    "see also:",
                    "see_also:",
                )
            ):
                in_args_section = False
                if current_param:
                    params.append(current_param)
                    current_param = None
                continue
        if in_args_section and line:
            # Check if this line starts a new parameter (starts with parameter name)
            # Pattern: param_name (type): description
            param_match = re.match(
                r"^(\w+)\s*(?:\([^)]*\))?\s*:\s*(.+)$", line
            )
            if param_match:
                # Save previous parameter if exists
                if current_param:
                    params.append(current_param)
                param_name = param_match.group(1)
                param_desc = param_match.group(2).strip()
                current_param = DocstringParam(
                    arg_name=param_name, description=param_desc
                )
            elif current_param and (
                line.startswith(" ") or line.startswith("\t")
            ):
                # This is a continuation of the current parameter description
                current_param = DocstringParam(
                    arg_name=current_param.arg_name,
                    description=current_param.description
                    + " "
                    + line.strip(),
                )
            elif not line.startswith(" ") and not line.startswith(
                "\t"
            ):
                # This might be a new section, stop processing args
                in_args_section = False
                if current_param:
                    params.append(current_param)
                    current_param = None
    # Add the last parameter if it exists
    if current_param:
        params.append(current_param)
    return DocstringInfo(
        short_description=short_description, params=params
    )
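Reviewer note on `parse` above: every line is stripped before the loops run, so the `line.startswith(" ")` and `line.startswith("\t")` checks cannot be true. As written, an indented continuation line that does not match the parameter pattern falls through to the final branch, which closes the Args section and drops any parameters that follow it. The sketch below shows one way to keep indentation so continuation lines are folded into the preceding parameter. It is an illustration under assumptions, not a drop-in replacement: `Param`, `parse_params`, and `DOC` are hypothetical names, only the Args/Parameters section is handled, and `inspect.cleandoc` is used to normalise indentation.

```python
import inspect
import re
from typing import List, NamedTuple, Optional


class Param(NamedTuple):
    arg_name: str
    description: str


ARGS_HEADER = re.compile(r"^(args|parameters):", re.IGNORECASE)
# Same shape as the pattern in the diff: name, optional (type), colon, text.
PARAM = re.compile(r"^(\w+)\s*(?:\([^)]*\))?\s*:\s*(.+)$")


def parse_params(docstring: str) -> List[Param]:
    """Collect Google-style parameters, folding indented continuation lines."""
    params: List[Param] = []
    in_args = False
    param_indent: Optional[int] = None
    for raw in inspect.cleandoc(docstring).splitlines():
        text = raw.strip()
        if not text:
            continue
        indent = len(raw) - len(raw.lstrip())
        if indent == 0:
            # A flush-left line is a section header or prose outside Args.
            in_args = bool(ARGS_HEADER.match(text))
            param_indent = None
            continue
        if not in_args:
            continue
        if param_indent is None:
            param_indent = indent  # indentation of the first parameter line
        match = PARAM.match(text)
        if indent <= param_indent and match:
            params.append(Param(match.group(1), match.group(2).strip()))
        elif indent > param_indent and params:
            # Deeper indentation continues the previous description.
            last = params[-1]
            params[-1] = Param(last.arg_name, last.description + " " + text)
    return params


DOC = """
    Fetch and format agent information.

    Args:
        agents (List[Agent]): List of agent instances.
            Each agent should have a name.
        team_name (str, optional): Optional team name.

    Returns:
        str: A formatted string.
    """

for param in parse_params(DOC):
    print(param.arg_name, "->", param.description)
```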
--- a/tests/aop/aop_benchmark.py
+++ b/tests/aop/aop_benchmark.py
--- a/tests/aop/test_data/aop_benchmark_data/Detailed_Bench.xlsx
+++ b/tests/aop/test_data/aop_benchmark_data/Detailed_Bench.xlsx
--- a/tests/aop/test_data/aop_benchmark_data/bench1.png
+++ b/tests/aop/test_data/aop_benchmark_data/bench1.png
--- a/tests/aop/test_data/aop_benchmark_data/bench2.png
+++ b/tests/aop/test_data/aop_benchmark_data/bench2.png
--- a/tests/aop/test_data/aop_benchmark_data/bench3.png
+++ b/tests/aop/test_data/aop_benchmark_data/bench3.png
--- a/tests/aop/test_data/aop_benchmark_data/bench4.png
+++ b/tests/aop/test_data/aop_benchmark_data/bench4.png
--- a/tests/aop/test_data/aop_benchmark_data/bench5.png
+++ b/tests/aop/test_data/aop_benchmark_data/bench5.png
--- a/tests/aop/test_data/aop_benchmark_data/benchmark_results.csv
+++ b/tests/aop/test_data/aop_benchmark_data/benchmark_results.csv
@ -0,0 +1,91 @@
 agent_count,test_name,model_name,latency_ms,throughput_rps,memory_usage_mb,cpu_usage_percent,success_rate,error_count,total_requests,concurrent_requests,timestamp,cost_usd,tokens_used,response_quality_score,additional_metrics,agent_creation_time,tool_registration_time,execution_time,total_latency,chaining_steps,chaining_success,error_scenarios_tested,recovery_rate,resource_cycles,avg_memory_delta,memory_leak_detected
 1,scaling_test,gpt-4o-mini,1131.7063331604004,4.131429224630576,1.25,0.0,1.0,0,20,5,1759345643.9453266,0.0015359999999999996,10240,0.8548663728748707,"{'min_latency_ms': 562.7951622009277, 'max_latency_ms': 1780.4391384124756, 'p95_latency_ms': np.float64(1744.0685987472534), 'p99_latency_ms': np.float64(1773.1650304794312), 'total_time_s': 4.84093976020813, 'initial_memory_mb': 291.5546875, 'final_memory_mb': 292.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.0675424923987846, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,gpt-4o-mini,1175.6950378417969,3.7575854004826277,0.0,0.0,1.0,0,20,5,1759345654.225195,0.0015359999999999996,10240,0.8563524483655013,"{'min_latency_ms': 535.4223251342773, 'max_latency_ms': 1985.3930473327637, 'p95_latency_ms': np.float64(1975.6355285644531), 'p99_latency_ms': np.float64(1983.4415435791016), 'total_time_s': 5.322566986083984, 'initial_memory_mb': 293.1796875, 'final_memory_mb': 293.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.05770982402152013, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,gpt-4o-mini,996.9684720039368,4.496099509029146,0.0,0.0,1.0,0,20,5,1759345662.8977199,0.0015359999999999996,10240,0.8844883644941982,"{'min_latency_ms': 45.22204399108887, 'max_latency_ms': 1962.2983932495117, 'p95_latency_ms': np.float64(1647.7753758430483), 'p99_latency_ms': np.float64(1899.3937897682185), 'total_time_s': 4.448300123214722, 'initial_memory_mb': 293.5546875, 'final_memory_mb': 293.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.043434832388308614, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,gpt-4o-mini,1112.8681421279907,3.587833950074127,0.0,0.0,1.0,0,20,5,1759345673.162652,0.0015359999999999996,10240,0.8563855623109009,"{'min_latency_ms': 564.1369819641113, 'max_latency_ms': 1951.472282409668, 'p95_latency_ms': np.float64(1897.4883794784546), 'p99_latency_ms': np.float64(1940.6755018234253), 'total_time_s': 5.57439398765564, 'initial_memory_mb': 293.8046875, 'final_memory_mb': 293.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.679999999999998e-05, 'quality_std': 0.05691925404970228, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,gpt-4o,1298.2240080833435,3.3670995599405846,0.125,0.0,1.0,0,20,5,1759345683.2065425,0.0512,10240,0.9279627852934385,"{'min_latency_ms': 693.6078071594238, 'max_latency_ms': 1764.8026943206787, 'p95_latency_ms': np.float64(1681.7602753639221), 'p99_latency_ms': np.float64(1748.1942105293274), 'total_time_s': 5.939830303192139, 'initial_memory_mb': 293.8046875, 'final_memory_mb': 293.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.050879141399088765, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,gpt-4o,1264.4854545593262,3.5293826102318846,0.0,0.0,1.0,0,20,5,1759345692.6439528,0.0512,10240,0.9737471278894755,"{'min_latency_ms': 175.65083503723145, 'max_latency_ms': 1990.2207851409912, 'p95_latency_ms': np.float64(1910.3824019432068), 'p99_latency_ms': np.float64(1974.2531085014343), 'total_time_s': 5.66671347618103, 'initial_memory_mb': 293.9296875, 'final_memory_mb': 293.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.038542680129780495, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,gpt-4o,1212.0607376098633,3.799000004302323,0.125,0.0,1.0,0,20,5,1759345701.8719423,0.0512,10240,0.9366077507029601,"{'min_latency_ms': 542.8001880645752, 'max_latency_ms': 1973.801851272583, 'p95_latency_ms': np.float64(1969.2555904388428), 'p99_latency_ms': np.float64(1972.892599105835), 'total_time_s': 5.264543294906616, 'initial_memory_mb': 293.9296875, 'final_memory_mb': 294.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.044670864578792276, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,gpt-4o,1367.1631932258606,3.1229790107314654,0.0,0.0,1.0,0,20,5,1759345711.9738443,0.0512,10240,0.9328922198254587,"{'min_latency_ms': 715.888261795044, 'max_latency_ms': 1905.6315422058105, 'p95_latency_ms': np.float64(1890.480661392212), 'p99_latency_ms': np.float64(1902.6013660430908), 'total_time_s': 6.404141664505005, 'initial_memory_mb': 294.0546875, 'final_memory_mb': 294.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.05146728864962903, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,gpt-4-turbo,1429.1370868682861,3.3141614744089267,0.125,0.0,1.0,0,20,5,1759345722.7650242,0.1024,10240,0.960928099222926,"{'min_latency_ms': 637.6686096191406, 'max_latency_ms': 1994.9300289154053, 'p95_latency_ms': np.float64(1973.6997246742249), 'p99_latency_ms': np.float64(1990.6839680671692), 'total_time_s': 6.0347089767456055, 'initial_memory_mb': 294.0546875, 'final_memory_mb': 294.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.0429193742204114, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,gpt-4-turbo,1167.8012132644653,3.933946564951724,0.0,0.0,1.0,0,20,5,1759345731.809648,0.1024,10240,0.9575695597206497,"{'min_latency_ms': 521.2328433990479, 'max_latency_ms': 1973.503828048706, 'p95_latency_ms': np.float64(1931.3542008399963), 'p99_latency_ms': np.float64(1965.073902606964), 'total_time_s': 5.083953142166138, 'initial_memory_mb': 294.1796875, 'final_memory_mb': 294.1796875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.04742414087184447, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,gpt-4-turbo,1435.1954460144043,3.0793869953124613,0.0,0.0,1.0,0,20,5,1759345741.9117725,0.1024,10240,0.9564233524947511,"{'min_latency_ms': 711.4903926849365, 'max_latency_ms': 2034.2109203338623, 'p95_latency_ms': np.float64(1998.979663848877), 'p99_latency_ms': np.float64(2027.1646690368652), 'total_time_s': 6.4947991371154785, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.03428874308764032, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,gpt-4-turbo,1092.1013355255127,4.057819053252887,0.0,0.0,1.0,0,20,5,1759345749.8833907,0.1024,10240,0.9521218582720758,"{'min_latency_ms': 554.4416904449463, 'max_latency_ms': 1968.658447265625, 'p95_latency_ms': np.float64(1637.098050117493), 'p99_latency_ms': np.float64(1902.346367835998), 'total_time_s': 4.92875599861145, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.043763298033728824, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,claude-3-5-sonnet,1046.9236850738525,4.047496446876068,0.0,0.0,1.0,0,20,5,1759345757.9539518,0.03071999999999999,10240,0.9511838758969231,"{'min_latency_ms': 184.94415283203125, 'max_latency_ms': 1966.0136699676514, 'p95_latency_ms': np.float64(1677.8094530105593), 'p99_latency_ms': np.float64(1908.3728265762325), 'total_time_s': 4.941326141357422, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.03727295215254124, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,claude-3-5-sonnet,1381.3772201538086,3.283979343278356,0.0,0.0,1.0,0,20,5,1759345768.7153368,0.03071999999999999,10240,0.957817098536435,"{'min_latency_ms': 543.0643558502197, 'max_latency_ms': 1937.4654293060303, 'p95_latency_ms': np.float64(1931.4598441123962), 'p99_latency_ms': np.float64(1936.2643122673035), 'total_time_s': 6.090172290802002, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.044335695599357156, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,claude-3-5-sonnet,1314.3961310386658,3.5243521468336656,0.0,0.0,1.0,0,20,5,1759345778.6269403,0.03071999999999999,10240,0.9749641888502683,"{'min_latency_ms': 535.1722240447998, 'max_latency_ms': 1983.6831092834473, 'p95_latency_ms': np.float64(1918.512487411499), 'p99_latency_ms': np.float64(1970.6489849090576), 'total_time_s': 5.674801826477051, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.03856740540886548, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,claude-3-5-sonnet,1120.720875263214,3.7028070875807546,0.0,0.0,1.0,0,20,5,1759345788.3161702,0.03071999999999999,10240,0.9344569749738585,"{'min_latency_ms': 207.9324722290039, 'max_latency_ms': 2018.561601638794, 'p95_latency_ms': np.float64(1963.4979844093323), 'p99_latency_ms': np.float64(2007.5488781929016), 'total_time_s': 5.401307582855225, 'initial_memory_mb': 294.3046875, 'final_memory_mb': 294.3046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999996, 'quality_std': 0.04750434388073592, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,claude-3-haiku,1268.5401320457458,3.539921687652236,0.0,0.0,1.0,0,20,5,1759345797.6495905,0.0256,10240,0.8406194607723803,"{'min_latency_ms': 534.9514484405518, 'max_latency_ms': 1956.9103717803955, 'p95_latency_ms': np.float64(1938.3319020271301), 'p99_latency_ms': np.float64(1953.1946778297424), 'total_time_s': 5.6498425006866455, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.053962632063170944, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,claude-3-haiku,1377.644693851471,3.189212271479164,0.0,0.0,1.0,0,20,5,1759345808.2179801,0.0256,10240,0.8370154862115219,"{'min_latency_ms': 661.4456176757812, 'max_latency_ms': 2013.9634609222412, 'p95_latency_ms': np.float64(1985.2455973625183), 'p99_latency_ms': np.float64(2008.2198882102966), 'total_time_s': 6.271141052246094, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.057589803133820325, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,claude-3-haiku,1161.9974493980408,3.6778795132801156,0.0,0.0,1.0,0,20,5,1759345817.2541294,0.0256,10240,0.8421329247896683,"{'min_latency_ms': 549.6580600738525, 'max_latency_ms': 1785.23588180542, 'p95_latency_ms': np.float64(1730.9520959854126), 'p99_latency_ms': np.float64(1774.3791246414185), 'total_time_s': 5.437916040420532, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.05774508247670216, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,claude-3-haiku,1365.4750227928162,2.998821435629251,0.0,0.0,1.0,0,20,5,1759345827.8750126,0.0256,10240,0.8483772503724578,"{'min_latency_ms': 767.146110534668, 'max_latency_ms': 1936.8767738342285, 'p95_latency_ms': np.float64(1919.3583130836487), 'p99_latency_ms': np.float64(1933.3730816841125), 'total_time_s': 6.669286727905273, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.05705131022796498, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,claude-3-sonnet,1360.187566280365,3.089520735450049,0.0,0.0,1.0,0,20,5,1759345837.7737727,0.15360000000000001,10240,0.8835217044830507,"{'min_latency_ms': 550.3547191619873, 'max_latency_ms': 1977.1480560302734, 'p95_latency_ms': np.float64(1924.659264087677), 'p99_latency_ms': np.float64(1966.6502976417542), 'total_time_s': 6.473495960235596, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.058452629496046606, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,claude-3-sonnet,1256.138801574707,3.4732685564079335,0.0,0.0,1.0,0,20,5,1759345848.5701082,0.15360000000000001,10240,0.8863139635356961,"{'min_latency_ms': 641.2796974182129, 'max_latency_ms': 1980.7326793670654, 'p95_latency_ms': np.float64(1846.4025855064392), 'p99_latency_ms': np.float64(1953.86666059494), 'total_time_s': 5.758264780044556, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.05783521510861833, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,claude-3-sonnet,1306.07008934021,3.5020347317551495,0.0,0.0,1.0,0,20,5,1759345858.6472163,0.15360000000000001,10240,0.9094961422561505,"{'min_latency_ms': 591.8083190917969, 'max_latency_ms': 1971.1270332336426, 'p95_latency_ms': np.float64(1944.3620324134827), 'p99_latency_ms': np.float64(1965.7740330696106), 'total_time_s': 5.710965633392334, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.042442911768923584, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,claude-3-sonnet,1307.1481943130493,3.262938882676132,0.0,0.0,1.0,0,20,5,1759345869.905544,0.15360000000000001,10240,0.8938240662052681,"{'min_latency_ms': 646.7251777648926, 'max_latency_ms': 1990.9627437591553, 'p95_latency_ms': np.float64(1935.0676536560059), 'p99_latency_ms': np.float64(1979.7837257385254), 'total_time_s': 6.129443645477295, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.04247877605865338, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,gemini-1.5-pro,1401.3476371765137,2.943218490521141,0.0,0.0,1.0,0,20,5,1759345881.238218,0.0128,10240,0.9409363720199192,"{'min_latency_ms': 520.9827423095703, 'max_latency_ms': 1970.2589511871338, 'p95_latency_ms': np.float64(1958.1118822097778), 'p99_latency_ms': np.float64(1967.8295373916626), 'total_time_s': 6.7952821254730225, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05267230653872383, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,gemini-1.5-pro,1341.485834121704,3.3982951582179024,0.0,0.0,1.0,0,20,5,1759345889.5553467,0.0128,10240,0.9355344625586725,"{'min_latency_ms': 503.9515495300293, 'max_latency_ms': 1978.0657291412354, 'p95_latency_ms': np.float64(1966.320013999939), 'p99_latency_ms': np.float64(1975.716586112976), 'total_time_s': 5.885303974151611, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.054780000845711954, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,gemini-1.5-pro,1344.3536400794983,3.445457146125384,0.0,0.0,1.0,0,20,5,1759345898.4512925,0.0128,10240,0.9276983017835836,"{'min_latency_ms': 615.3252124786377, 'max_latency_ms': 1981.612205505371, 'p95_latency_ms': np.float64(1803.935217857361), 'p99_latency_ms': np.float64(1946.0768079757688), 'total_time_s': 5.8047449588775635, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05905363250623063, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,gemini-1.5-pro,1202.2199511528015,3.696869831400932,0.0,0.0,1.0,0,20,5,1759345907.5707264,0.0128,10240,0.9307740387961949,"{'min_latency_ms': 589.9953842163086, 'max_latency_ms': 1967.3075675964355, 'p95_latency_ms': np.float64(1913.6008977890015), 'p99_latency_ms': np.float64(1956.5662336349487), 'total_time_s': 5.409982204437256, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.04978369465928124, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,gemini-1.5-flash,1053.9512276649475,3.823265280376166,0.0,0.0,1.0,0,20,5,1759345915.0947819,0.007679999999999998,10240,0.8813998853517441,"{'min_latency_ms': -36.76271438598633, 'max_latency_ms': 1967.0710563659668, 'p95_latency_ms': np.float64(1855.4362535476685), 'p99_latency_ms': np.float64(1944.744095802307), 'total_time_s': 5.231130599975586, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.050008698196664016, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,gemini-1.5-flash,1155.3911447525024,3.615636866719992,0.0,0.0,1.0,0,20,5,1759345925.0694563,0.007679999999999998,10240,0.9025102091839412,"{'min_latency_ms': 502.6116371154785, 'max_latency_ms': 1947.0453262329102, 'p95_latency_ms': np.float64(1765.414369106293), 'p99_latency_ms': np.float64(1910.7191348075864), 'total_time_s': 5.531528949737549, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.059194105459554974, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,gemini-1.5-flash,1217.6612257957458,3.756965086673101,0.0,0.0,1.0,0,20,5,1759345934.1183383,0.007679999999999998,10240,0.8709830012564668,"{'min_latency_ms': 560.8868598937988, 'max_latency_ms': 2007.932424545288, 'p95_latency_ms': np.float64(1776.0017752647402), 'p99_latency_ms': np.float64(1961.5462946891782), 'total_time_s': 5.323445796966553, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.052873446152615404, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,gemini-1.5-flash,1351.5228390693665,3.367995990496259,0.0,0.0,1.0,0,20,5,1759345942.2099788,0.007679999999999998,10240,0.872315613940513,"{'min_latency_ms': 689.1014575958252, 'max_latency_ms': 1980.147361755371, 'p95_latency_ms': np.float64(1956.2964797019958), 'p99_latency_ms': np.float64(1975.377185344696), 'total_time_s': 5.938249349594116, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0003839999999999999, 'quality_std': 0.05361394744479093, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,llama-3.1-8b,1306.591236591339,3.3070039261320594,0.0,0.0,1.0,0,20,5,1759345952.8692935,0.002048000000000001,10240,0.7778348786353027,"{'min_latency_ms': 555.4070472717285, 'max_latency_ms': 1988.0244731903076, 'p95_latency_ms': np.float64(1957.3988199234009), 'p99_latency_ms': np.float64(1981.8993425369263), 'total_time_s': 6.047770261764526, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.05832225784189981, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,llama-3.1-8b,1199.6222853660583,3.634358086220239,0.0,0.0,1.0,0,20,5,1759345963.5152647,0.002048000000000001,10240,0.7696592403957419,"{'min_latency_ms': 541.0621166229248, 'max_latency_ms': 1914.41011428833, 'p95_latency_ms': np.float64(1768.0468797683716), 'p99_latency_ms': np.float64(1885.1374673843382), 'total_time_s': 5.503035068511963, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.06176209698043544, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,llama-3.1-8b,1143.358552455902,4.173916297150752,0.0,0.0,1.0,0,20,5,1759345973.8406181,0.002048000000000001,10240,0.7857043630038748,"{'min_latency_ms': 631.817102432251, 'max_latency_ms': 1720.1111316680908, 'p95_latency_ms': np.float64(1547.544610500336), 'p99_latency_ms': np.float64(1685.5978274345396), 'total_time_s': 4.791662931442261, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.06142254552174686, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,llama-3.1-8b,1228.6048531532288,3.613465135130269,0.0,0.0,1.0,0,20,5,1759345982.2759545,0.002048000000000001,10240,0.7706622409066766,"{'min_latency_ms': 539.0913486480713, 'max_latency_ms': 1971.7633724212646, 'p95_latency_ms': np.float64(1819.2362308502197), 'p99_latency_ms': np.float64(1941.2579441070554), 'total_time_s': 5.534853458404541, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000006, 'quality_std': 0.05320944570994387, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 1,scaling_test,llama-3.1-70b,1424.0724563598633,2.989394263900763,0.0,0.0,1.0,0,20,5,1759345993.4949126,0.008192000000000005,10240,0.8731561293258354,"{'min_latency_ms': 700.6974220275879, 'max_latency_ms': 1959.3937397003174, 'p95_latency_ms': np.float64(1924.493396282196), 'p99_latency_ms': np.float64(1952.4136710166931), 'total_time_s': 6.690318584442139, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.0352234743129485, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 6,scaling_test,llama-3.1-70b,1090.003514289856,4.145917207566353,0.0,0.0,1.0,0,20,5,1759346002.3353932,0.008192000000000005,10240,0.8796527768140011,"{'min_latency_ms': 508.23211669921875, 'max_latency_ms': 1798.6392974853516, 'p95_latency_ms': np.float64(1785.5579257011414), 'p99_latency_ms': np.float64(1796.0230231285095), 'total_time_s': 4.824023008346558, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.06407982743031454, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 11,scaling_test,llama-3.1-70b,964.3666982650757,4.70392645090585,0.0,0.0,1.0,0,20,5,1759346010.6974216,0.008192000000000005,10240,0.8992009479579495,"{'min_latency_ms': 135.56504249572754, 'max_latency_ms': 1794.3906784057617, 'p95_latency_ms': np.float64(1775.5030393600464), 'p99_latency_ms': np.float64(1790.6131505966187), 'total_time_s': 4.251767158508301, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.4296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.050182727925105516, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 16,scaling_test,llama-3.1-70b,1258.9476823806763,3.653831604110515,0.125,0.0,1.0,0,20,5,1759346020.388094,0.008192000000000005,10240,0.8930892849911802,"{'min_latency_ms': 620.0413703918457, 'max_latency_ms': 1916.384220123291, 'p95_latency_ms': np.float64(1765.2448296546936), 'p99_latency_ms': np.float64(1886.1563420295713), 'total_time_s': 5.473706007003784, 'initial_memory_mb': 294.4296875, 'final_memory_mb': 294.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00040960000000000025, 'quality_std': 0.04969618373257882, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gpt-4o-mini,1273.702096939087,0.7851086796926611,0.0,0.0,1.0,0,10,1,1759346033.2373884,0.0007680000000000001,5120,0.8342026655690804,"{'min_latency_ms': 741.3482666015625, 'max_latency_ms': 1817.1906471252441, 'p95_latency_ms': np.float64(1794.5520520210266), 'p99_latency_ms': np.float64(1812.6629281044006), 'total_time_s': 12.737090110778809, 'initial_memory_mb': 294.5546875, 'final_memory_mb': 294.5546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000001e-05, 'quality_std': 0.0446055902590032, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gpt-4o-mini,1511.399483680725,2.933763102440156,0.25,0.0,1.0,0,10,6,1759346036.647214,0.0007680000000000001,5120,0.8471277213854321,"{'min_latency_ms': 800.0023365020752, 'max_latency_ms': 1982.2335243225098, 'p95_latency_ms': np.float64(1942.5656914710999), 'p99_latency_ms': np.float64(1974.2999577522278), 'total_time_s': 3.4085915088653564, 'initial_memory_mb': 294.5546875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000001e-05, 'quality_std': 0.06432848764341552, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gpt-4o,1150.0491619110107,0.8695228900132853,0.0,0.0,1.0,0,10,1,1759346048.2587333,0.0256,5120,0.9599583095352598,"{'min_latency_ms': 544.191837310791, 'max_latency_ms': 1584.9177837371826, 'p95_latency_ms': np.float64(1511.2051010131834), 'p99_latency_ms': np.float64(1570.1752471923828), 'total_time_s': 11.50055980682373, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.057087428808928614, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gpt-4o,1241.9081926345825,3.22981029743519,0.0,0.0,1.0,0,10,6,1759346051.3563757,0.0256,5120,0.9585199558650109,"{'min_latency_ms': 644.8915004730225, 'max_latency_ms': 1933.1202507019043, 'p95_latency_ms': np.float64(1865.2720570564268), 'p99_latency_ms': np.float64(1919.5506119728088), 'total_time_s': 3.0961570739746094, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00256, 'quality_std': 0.04062204558012218, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gpt-4-turbo,1581.8750381469727,0.6321581179029606,0.0,0.0,1.0,0,10,1,1759346067.3017964,0.0512,5120,0.9324427514695872,"{'min_latency_ms': 833.935022354126, 'max_latency_ms': 2019.5622444152832, 'p95_latency_ms': np.float64(1978.4671545028687), 'p99_latency_ms': np.float64(2011.3432264328003), 'total_time_s': 15.818827152252197, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.04654046504268862, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gpt-4-turbo,1153.432297706604,3.2168993240245847,0.0,0.0,1.0,0,10,6,1759346070.4116762,0.0512,5120,0.9790878168553954,"{'min_latency_ms': 635.2591514587402, 'max_latency_ms': 1833.7628841400146, 'p95_latency_ms': np.float64(1808.298635482788), 'p99_latency_ms': np.float64(1828.6700344085693), 'total_time_s': 3.108583450317383, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00512, 'quality_std': 0.038783270511690816, 'data_size_processed': 1000, 'model_provider': 'gpt'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,claude-3-5-sonnet,1397.6783752441406,0.7154680102707422,0.0,0.0,1.0,0,10,1,1759346084.5017824,0.015359999999999999,5120,0.9421283071854264,"{'min_latency_ms': 532.8092575073242, 'max_latency_ms': 2028.5301208496094, 'p95_latency_ms': np.float64(1968.815779685974), 'p99_latency_ms': np.float64(2016.5872526168823), 'total_time_s': 13.976865291595459, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999998, 'quality_std': 0.041911119259679885, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,claude-3-5-sonnet,1215.26198387146,3.6278421983995233,0.0,0.0,1.0,0,10,6,1759346087.2596216,0.015359999999999999,5120,0.9131170426955485,"{'min_latency_ms': 568.2053565979004, 'max_latency_ms': 1612.9648685455322, 'p95_latency_ms': np.float64(1559.6276402473447), 'p99_latency_ms': np.float64(1602.2974228858948), 'total_time_s': 2.7564594745635986, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0015359999999999998, 'quality_std': 0.04319876804321411, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,claude-3-haiku,1299.2276906967163,0.7696826190331395,0.0,0.0,1.0,0,10,1,1759346100.364407,0.0128,5120,0.8252745814485088,"{'min_latency_ms': 668.3671474456787, 'max_latency_ms': 2041.351318359375, 'p95_latency_ms': np.float64(1843.0875778198238), 'p99_latency_ms': np.float64(2001.6985702514648), 'total_time_s': 12.992368221282959, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.058205855327116265, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,claude-3-haiku,1297.508192062378,3.6581654644321087,0.0,0.0,1.0,0,10,6,1759346103.0993996,0.0128,5120,0.8496515913760503,"{'min_latency_ms': 649.4293212890625, 'max_latency_ms': 1873.1675148010254, 'p95_latency_ms': np.float64(1843.8988208770752), 'p99_latency_ms': np.float64(1867.3137760162354), 'total_time_s': 2.7336106300354004, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00128, 'quality_std': 0.06872259975771335, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,claude-3-sonnet,1239.8123741149902,0.8065692205263874,0.0,0.0,1.0,0,10,1,1759346114.9650035,0.07680000000000001,5120,0.8917269647002374,"{'min_latency_ms': 559.9334239959717, 'max_latency_ms': 1828.9196491241455, 'p95_latency_ms': np.float64(1804.089903831482), 'p99_latency_ms': np.float64(1823.9537000656128), 'total_time_s': 12.398191928863525, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.06728256480558785, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,claude-3-sonnet,1325.3875255584717,3.2305613290400945,0.0,0.0,1.0,0,10,6,1759346118.062173,0.07680000000000001,5120,0.8904253939966993,"{'min_latency_ms': 598.4294414520264, 'max_latency_ms': 1956.3815593719482, 'p95_latency_ms': np.float64(1906.8223834037778), 'p99_latency_ms': np.float64(1946.4697241783142), 'total_time_s': 3.0954372882843018, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000001, 'quality_std': 0.06220445402424322, 'data_size_processed': 1000, 'model_provider': 'claude'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gemini-1.5-pro,1264.2754554748535,0.7909630217832475,0.0,0.0,1.0,0,10,1,1759346130.8282964,0.0064,5120,0.8998460053229075,"{'min_latency_ms': 532.9890251159668, 'max_latency_ms': 1795.492172241211, 'p95_latency_ms': np.float64(1745.6329107284544), 'p99_latency_ms': np.float64(1785.5203199386597), 'total_time_s': 12.642816066741943, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.04050886994282564, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gemini-1.5-pro,1342.9006338119507,3.7829150181123015,0.0,0.0,1.0,0,10,6,1759346133.472956,0.0064,5120,0.9029938738274873,"{'min_latency_ms': 701.9498348236084, 'max_latency_ms': 1964.576005935669, 'p95_latency_ms': np.float64(1872.5560665130613), 'p99_latency_ms': np.float64(1946.1720180511475), 'total_time_s': 2.6434640884399414, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00064, 'quality_std': 0.05723923041822323, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gemini-1.5-flash,1368.2588577270508,0.7308515907093506,0.0,0.0,1.0,0,10,1,1759346147.2717574,0.0038399999999999997,5120,0.8795901650694117,"{'min_latency_ms': 620.3913688659668, 'max_latency_ms': 2018.2685852050781, 'p95_latency_ms': np.float64(1993.7742233276367), 'p99_latency_ms': np.float64(2013.3697128295898), 'total_time_s': 13.682668447494507, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00038399999999999996, 'quality_std': 0.05927449072307118, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,gemini-1.5-flash,1207.8629732131958,3.2879592824302044,0.0,0.0,1.0,0,10,6,1759346150.314617,0.0038399999999999997,5120,0.8611774574826484,"{'min_latency_ms': 594.973087310791, 'max_latency_ms': 1811.2657070159912, 'p95_latency_ms': np.float64(1681.6352963447569), 'p99_latency_ms': np.float64(1785.3396248817444), 'total_time_s': 3.041400194168091, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00038399999999999996, 'quality_std': 0.07904328865026665, 'data_size_processed': 1000, 'model_provider': 'gemini'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,llama-3.1-8b,1144.2910194396973,0.8738903631276332,0.0,0.0,1.0,0,10,1,1759346161.882389,0.0010240000000000002,5120,0.7805684315735588,"{'min_latency_ms': 594.846248626709, 'max_latency_ms': 1759.0994834899902, 'p95_latency_ms': np.float64(1631.7564606666563), 'p99_latency_ms': np.float64(1733.6308789253235), 'total_time_s': 11.443083047866821, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000002, 'quality_std': 0.0613021253594286, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,llama-3.1-8b,1128.666615486145,3.527006383973853,0.0,0.0,1.0,0,10,6,1759346164.7190907,0.0010240000000000002,5120,0.7915276538063776,"{'min_latency_ms': 610.3026866912842, 'max_latency_ms': 1934.2899322509766, 'p95_latency_ms': np.float64(1909.2738270759583), 'p99_latency_ms': np.float64(1929.286711215973), 'total_time_s': 2.835265636444092, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010240000000000002, 'quality_std': 0.055242108041169316, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,llama-3.1-70b,1341.410732269287,0.7454805363345477,0.0,0.0,1.0,0,10,1,1759346178.2571824,0.004096000000000001,5120,0.8513858389112968,"{'min_latency_ms': 566.3845539093018, 'max_latency_ms': 1769.1750526428223, 'p95_latency_ms': np.float64(1743.9924359321594), 'p99_latency_ms': np.float64(1764.1385293006897), 'total_time_s': 13.414166450500488, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004096000000000001, 'quality_std': 0.06286695897481548, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,concurrent_test,llama-3.1-70b,1410.3811264038086,3.52022788340447,0.0,0.0,1.0,0,10,6,1759346181.0992308,0.004096000000000001,5120,0.8534058400920448,"{'min_latency_ms': 572.9773044586182, 'max_latency_ms': 1928.0850887298584, 'p95_latency_ms': np.float64(1903.529143333435), 'p99_latency_ms': np.float64(1923.1738996505737), 'total_time_s': 2.8407251834869385, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004096000000000001, 'quality_std': 0.059750620144052545, 'data_size_processed': 1000, 'model_provider': 'llama'}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4o-mini,1177.2440481185913,3.97501008701798,0.0,0.0,1.0,0,50,5,1759346193.7901201,0.0038400000000000023,25600,0.8512259391579574,"{'min_latency_ms': 537.5485420227051, 'max_latency_ms': 2001.0862350463867, 'p95_latency_ms': np.float64(1892.5400853157041), 'p99_latency_ms': np.float64(1985.4257130622864), 'total_time_s': 12.578584432601929, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.0581968026848211, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4o-mini,1229.8026752471924,3.9282369679460363,0.0,0.0,1.0,0,50,5,1759346206.6300905,0.0038400000000000023,25600,0.8537868196468017,"{'min_latency_ms': 518.6026096343994, 'max_latency_ms': 1944.331407546997, 'p95_latency_ms': np.float64(1909.6850633621214), 'p99_latency_ms': np.float64(1940.652117729187), 'total_time_s': 12.72835636138916, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.05181407518487485, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4o-mini,1274.8144483566284,3.7483119966709824,0.0,0.0,1.0,0,50,5,1759346220.0900073,0.0038400000000000023,25600,0.8487480924622282,"{'min_latency_ms': 529.292106628418, 'max_latency_ms': 1996.4158535003662, 'p95_latency_ms': np.float64(1960.6919050216675), 'p99_latency_ms': np.float64(1988.2149648666382), 'total_time_s': 13.339337825775146, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 7.680000000000005e-05, 'quality_std': 0.05812899461310237, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4o,1174.5057010650635,4.0514136389986115,0.0,0.0,1.0,0,50,5,1759346232.557784,0.12800000000000017,25600,0.9484191580718665,"{'min_latency_ms': 286.58127784729004, 'max_latency_ms': 1877.345085144043, 'p95_latency_ms': np.float64(1735.1435780525208), 'p99_latency_ms': np.float64(1842.000467777252), 'total_time_s': 12.341371297836304, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.8046875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.0491398572941036, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4o,1225.388593673706,3.875932429633176,0.125,0.0,1.0,0,50,5,1759346245.5669534,0.12800000000000017,25600,0.9557179217710832,"{'min_latency_ms': 514.6803855895996, 'max_latency_ms': 2034.6620082855225, 'p95_latency_ms': np.float64(1909.4360709190366), 'p99_latency_ms': np.float64(2010.34743309021), 'total_time_s': 12.900121688842773, 'initial_memory_mb': 294.8046875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.04870463047338363, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4o,1244.0021991729736,3.7266446101546777,0.0,0.0,1.0,0,50,5,1759346259.1414776,0.12800000000000017,25600,0.9458944372937584,"{'min_latency_ms': 521.9912528991699, 'max_latency_ms': 1986.6855144500732, 'p95_latency_ms': np.float64(1953.3554077148438), 'p99_latency_ms': np.float64(1978.9683985710144), 'total_time_s': 13.416895151138306, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0025600000000000032, 'quality_std': 0.04851286804634898, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4-turbo,1181.3615322113037,4.124998416603219,0.0,0.0,1.0,0,50,5,1759346271.374578,0.25600000000000034,25600,0.9651345363111258,"{'min_latency_ms': 353.2071113586426, 'max_latency_ms': 1966.524362564087, 'p95_latency_ms': np.float64(1945.0057744979858), 'p99_latency_ms': np.float64(1965.7717752456665), 'total_time_s': 12.121216773986816, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04338778763022959, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4-turbo,1291.4055681228638,3.77552400952112,0.0,0.0,1.0,0,50,5,1759346284.731812,0.25600000000000034,25600,0.9689389907566063,"{'min_latency_ms': 555.095911026001, 'max_latency_ms': 2027.0910263061523, 'p95_latency_ms': np.float64(1966.5393114089964), 'p99_latency_ms': np.float64(2018.9284563064575), 'total_time_s': 13.243194818496704, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04154143035607859, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gpt-4-turbo,1261.4208269119263,3.663208321130074,0.0,0.0,1.0,0,50,5,1759346298.4905493,0.25600000000000034,25600,0.9573488473081913,"{'min_latency_ms': 284.8320007324219, 'max_latency_ms': 2011.866807937622, 'p95_latency_ms': np.float64(1975.5298137664795), 'p99_latency_ms': np.float64(2000.7115292549133), 'total_time_s': 13.649237394332886, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0051200000000000065, 'quality_std': 0.04380501534660363, 'data_size_processed': 1000, 'model_provider': 'gpt', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-5-sonnet,1270.3543138504028,3.7944320989090614,0.0,0.0,1.0,0,50,5,1759346311.7936022,0.07680000000000001,25600,0.948463600922609,"{'min_latency_ms': 622.9770183563232, 'max_latency_ms': 1970.0510501861572, 'p95_latency_ms': np.float64(1868.455410003662), 'p99_latency_ms': np.float64(1957.5506472587585), 'total_time_s': 13.177202463150024, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.04872900892927657, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-5-sonnet,1154.527621269226,4.107802148818313,0.0,0.0,1.0,0,50,5,1759346324.0782034,0.07680000000000001,25600,0.9535056752128789,"{'min_latency_ms': 526.8404483795166, 'max_latency_ms': 1841.3877487182617, 'p95_latency_ms': np.float64(1815.3946280479431), 'p99_latency_ms': np.float64(1837.1384692192078), 'total_time_s': 12.171959161758423, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.04600056992617095, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-5-sonnet,1341.6658163070679,3.5050325493977805,0.0,0.0,1.0,0,50,5,1759346338.4560573,0.07680000000000001,25600,0.947231761746643,"{'min_latency_ms': 607.1841716766357, 'max_latency_ms': 1968.3496952056885, 'p95_latency_ms': np.float64(1938.420307636261), 'p99_latency_ms': np.float64(1963.8122081756592), 'total_time_s': 14.265202760696411, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 294.9296875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.001536, 'quality_std': 0.0468041040494112, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-haiku,1268.9041805267334,3.6527405734902607,0.125,0.0,1.0,0,50,5,1759346352.2760284,0.06400000000000008,25600,0.8657832919908838,"{'min_latency_ms': 576.9007205963135, 'max_latency_ms': 1978.3263206481934, 'p95_latency_ms': np.float64(1900.9657382965088), 'p99_latency_ms': np.float64(1977.4397349357605), 'total_time_s': 13.688352346420288, 'initial_memory_mb': 294.9296875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.05791027367020173, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-haiku,1273.6989831924438,3.7602543777430877,0.0,0.0,1.0,0,50,5,1759346365.681829,0.06400000000000008,25600,0.8396294693060197,"{'min_latency_ms': 521.7316150665283, 'max_latency_ms': 1988.7199401855469, 'p95_latency_ms': np.float64(1945.9344744682312), 'p99_latency_ms': np.float64(1987.1683859825134), 'total_time_s': 13.296972751617432, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.06291349263235946, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-haiku,1234.9269914627075,3.9335082345318124,0.0,0.0,1.0,0,50,5,1759346378.5192664,0.06400000000000008,25600,0.8469784358915146,"{'min_latency_ms': 529.503345489502, 'max_latency_ms': 1981.7008972167969, 'p95_latency_ms': np.float64(1859.1547846794128), 'p99_latency_ms': np.float64(1963.3227896690369), 'total_time_s': 12.711299180984497, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0012800000000000016, 'quality_std': 0.061722943046806616, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-sonnet,1195.9008169174194,4.06962738382444,0.0,0.0,1.0,0,50,5,1759346390.9144897,0.3840000000000003,25600,0.9026531444228556,"{'min_latency_ms': -36.6673469543457, 'max_latency_ms': 1991.610050201416, 'p95_latency_ms': np.float64(1819.4202184677124), 'p99_latency_ms': np.float64(1987.222683429718), 'total_time_s': 12.286137104034424, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.058229589360407986, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-sonnet,1372.0379829406738,3.502253345465805,0.0,0.0,1.0,0,50,5,1759346405.3043494,0.3840000000000003,25600,0.8837364473272626,"{'min_latency_ms': 543.1270599365234, 'max_latency_ms': 1992.779016494751, 'p95_latency_ms': np.float64(1931.822681427002), 'p99_latency_ms': np.float64(1987.4089169502258), 'total_time_s': 14.276522874832153, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.05634614113838598, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,claude-3-sonnet,1257.2709035873413,3.7764857062182706,0.0,0.0,1.0,0,50,5,1759346418.6521854,0.3840000000000003,25600,0.9053414058751514,"{'min_latency_ms': 529.8404693603516, 'max_latency_ms': 1990.1280403137207, 'p95_latency_ms': np.float64(1911.1806631088257), 'p99_latency_ms': np.float64(1976.6331052780151), 'total_time_s': 13.239822387695312, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.007680000000000005, 'quality_std': 0.050506656009957705, 'data_size_processed': 1000, 'model_provider': 'claude', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gemini-1.5-pro,1221.5951490402222,3.8372908969845323,0.0,0.0,1.0,0,50,5,1759346431.7921565,0.03200000000000004,25600,0.9365925291921394,"{'min_latency_ms': 329.1811943054199, 'max_latency_ms': 1995.384693145752, 'p95_latency_ms': np.float64(1965.0332808494568), 'p99_latency_ms': np.float64(1988.3063769340515), 'total_time_s': 13.030025959014893, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.04847128641002876, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gemini-1.5-pro,1351.8355464935303,3.6227975436552606,0.0,0.0,1.0,0,50,5,1759346445.7126448,0.03200000000000004,25600,0.9323552590826123,"{'min_latency_ms': 515.129566192627, 'max_latency_ms': 2008.0702304840088, 'p95_latency_ms': np.float64(1958.6564779281616), 'p99_latency_ms': np.float64(2004.1296029090881), 'total_time_s': 13.801488876342773, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.055840796126395656, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gemini-1.5-pro,1240.622534751892,3.8813384098374453,0.0,0.0,1.0,0,50,5,1759346458.7192729,0.03200000000000004,25600,0.9407390543744837,"{'min_latency_ms': -29.146671295166016, 'max_latency_ms': 1934.4398975372314, 'p95_latency_ms': np.float64(1849.7230291366577), 'p99_latency_ms': np.float64(1918.0084466934204), 'total_time_s': 12.8821542263031, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0006400000000000008, 'quality_std': 0.050597003908357786, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gemini-1.5-flash,1237.6702642440796,3.812923495644346,0.0,0.0,1.0,0,50,5,1759346471.9588974,0.019200000000000002,25600,0.8556073429019542,"{'min_latency_ms': 536.4787578582764, 'max_latency_ms': 2010.1728439331055, 'p95_latency_ms': np.float64(1911.8669629096985), 'p99_latency_ms': np.float64(1976.080708503723), 'total_time_s': 13.113297462463379, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.06082135675952047, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gemini-1.5-flash,1180.0980806350708,4.016049090832003,0.0,0.0,1.0,0,50,5,1759346484.5327744,0.019200000000000002,25600,0.8718428063415768,"{'min_latency_ms': 109.58051681518555, 'max_latency_ms': 1993.358850479126, 'p95_latency_ms': np.float64(1872.3165988922117), 'p99_latency_ms': np.float64(1992.416422367096), 'total_time_s': 12.450047016143799, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.0613916834940056, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,gemini-1.5-flash,1194.4490098953247,4.009936119483076,0.0,0.0,1.0,0,50,5,1759346497.1201088,0.019200000000000002,25600,0.8652112059805899,"{'min_latency_ms': 520.3211307525635, 'max_latency_ms': 1942.4259662628174, 'p95_latency_ms': np.float64(1834.6370577812195), 'p99_latency_ms': np.float64(1890.3984904289243), 'total_time_s': 12.469026565551758, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.000384, 'quality_std': 0.05312368368226588, 'data_size_processed': 1000, 'model_provider': 'gemini', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,llama-3.1-8b,1306.2016773223877,3.683763547696555,0.0,0.0,1.0,0,50,5,1759346510.812732,0.005119999999999998,25600,0.7727309350554936,"{'min_latency_ms': 527.4953842163086, 'max_latency_ms': 1997.086524963379, 'p95_latency_ms': np.float64(1942.7793741226194), 'p99_latency_ms': np.float64(1994.0643763542175), 'total_time_s': 13.573075294494629, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.05596283861854901, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,llama-3.1-8b,1304.1251468658447,3.617383744773005,0.0,0.0,1.0,0,50,5,1759346524.7711937,0.005119999999999998,25600,0.785787220179362,"{'min_latency_ms': 112.00571060180664, 'max_latency_ms': 2015.146255493164, 'p95_latency_ms': np.float64(2001.4938592910767), 'p99_latency_ms': np.float64(2012.321424484253), 'total_time_s': 13.822144269943237, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.0552285639827787, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,llama-3.1-8b,1290.5346298217773,3.671522710311051,0.0,0.0,1.0,0,50,5,1759346538.5084107,0.005119999999999998,25600,0.7771978709125356,"{'min_latency_ms': 565.7510757446289, 'max_latency_ms': 1945.1093673706055, 'p95_latency_ms': np.float64(1906.785237789154), 'p99_latency_ms': np.float64(1942.4526476860046), 'total_time_s': 13.618327856063843, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.00010239999999999995, 'quality_std': 0.057252814774054535, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,llama-3.1-70b,1213.9334726333618,3.947675276737486,0.0,0.0,1.0,0,50,5,1759346551.2951744,0.02047999999999999,25600,0.8683286341213061,"{'min_latency_ms': -79.86569404602051, 'max_latency_ms': 2014.9149894714355, 'p95_latency_ms': np.float64(1919.9433565139768), 'p99_latency_ms': np.float64(1992.4925136566162), 'total_time_s': 12.665682077407837, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.05862810413022958, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 0}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,llama-3.1-70b,1298.1958770751953,3.7049711897976763,0.0,0.0,1.0,0,50,5,1759346564.9280033,0.02047999999999999,25600,0.8889975698232048,"{'min_latency_ms': 503.5574436187744, 'max_latency_ms': 2020.4124450683594, 'p95_latency_ms': np.float64(1901.4497756958008), 'p99_latency_ms': np.float64(1986.3133001327512), 'total_time_s': 13.495381593704224, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.053463278827038344, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 1}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
 5,memory_test,llama-3.1-70b,1187.040138244629,4.165139112812611,0.0,0.0,1.0,0,50,5,1759346577.0467978,0.02047999999999999,25600,0.8884529182459214,"{'min_latency_ms': 506.2377452850342, 'max_latency_ms': 2026.6106128692627, 'p95_latency_ms': np.float64(1958.3556652069092), 'p99_latency_ms': np.float64(2007.5032830238342), 'total_time_s': 12.004400968551636, 'initial_memory_mb': 295.0546875, 'final_memory_mb': 295.0546875, 'avg_tokens_per_request': 512.0, 'cost_per_request': 0.0004095999999999998, 'quality_std': 0.05625669416735748, 'data_size_processed': 1000, 'model_provider': 'llama', 'iteration': 2}",0.0,0.0,0.0,0.0,0,False,0,0.0,0,0.0,False
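Review note: the metadata column in the rows above is a Python dict repr that embeds `np.float64(...)` wrappers, so `ast.literal_eval` cannot read it directly. Several rows also report a negative `min_latency_ms`. Below is a minimal sketch of one way to load such a cell and flag those rows, assuming the wrappers only ever contain plain numeric literals. The helper name `parse_metadata` is hypothetical and not part of the benchmark code.

```python
import ast
import re

# Matches NumPy scalar reprs such as "np.float64(1859.15)" and captures the number.
_NP_SCALAR = re.compile(r"np\.float64\(([^()]*)\)")


def parse_metadata(cell: str) -> dict:
    """Parse a dict repr that may contain np.float64(...) wrappers (hypothetical helper)."""
    # Unwrap the scalars first so that only plain literals remain for literal_eval.
    return ast.literal_eval(_NP_SCALAR.sub(r"\1", cell))


# Abbreviated copy of one metadata cell from the rows above.
cell = (
    "{'min_latency_ms': -36.6673469543457, "
    "'p95_latency_ms': np.float64(1819.4202184677124), "
    "'model_provider': 'claude', 'iteration': 0}"
)
meta = parse_metadata(cell)

# A negative latency cannot be a real measurement, so flag the row for inspection.
suspicious = meta["min_latency_ms"] < 0
```

Writing the metrics as plain floats (or as JSON) when the CSV is generated would make this unwrapping step unnecessary.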
--- a/tests/aop/test_data/aop_benchmark_data/totalbench.png
+++ b/tests/aop/test_data/aop_benchmark_data/totalbench.png
--- a/tests/aop/test_data/image1.jpg
+++ b/tests/aop/test_data/image1.jpg
--- a/tests/aop/test_data/image2.png
+++ b/tests/aop/test_data/image2.png
--- a/tests/prompts/test_prompt.py
+++ b/tests/prompts/test_prompt.py
--- a/tests/structs/test_auto_swarm_builder_fix.py
+++ b/tests/structs/test_auto_swarm_builder_fix.py
@ -0,0 +1,293 @@
 """
 Tests for bug #1115 fix in AutoSwarmBuilder.
 This test module verifies the fix for AttributeError when creating agents
 from AgentSpec Pydantic models in AutoSwarmBuilder.
 Bug: https://github.com/kyegomez/swarms/issues/1115
 """
 import pytest
 from swarms.structs.agent import Agent
 from swarms.structs.auto_swarm_builder import (
    AgentSpec,
    AutoSwarmBuilder,
 )
 from swarms.structs.ma_utils import set_random_models_for_agents
 class TestAutoSwarmBuilderFix:
    """Tests for bug #1115 fix in AutoSwarmBuilder."""
    def test_create_agents_from_specs_with_dict(self):
        """Test that create_agents_from_specs handles dict input correctly."""
        builder = AutoSwarmBuilder()
        # Create specs as a dictionary
        specs = {
            "agents": [
                {
                    "agent_name": "test_agent_1",
                    "description": "Test agent 1 description",
                    "system_prompt": "You are a helpful assistant",
                    "model_name": "gpt-4o-mini",
                    "max_loops": 1,
                }
            ]
        }
        agents = builder.create_agents_from_specs(specs)
        # Verify agents were created correctly
        assert len(agents) == 1
        assert isinstance(agents[0], Agent)
        assert agents[0].agent_name == "test_agent_1"
        # Verify description was mapped to agent_description
        assert hasattr(agents[0], "agent_description")
        assert (
            agents[0].agent_description == "Test agent 1 description"
        )
    def test_create_agents_from_specs_with_pydantic(self):
        """Test that create_agents_from_specs handles Pydantic model input correctly.
        This is the main test for bug #1115 - it verifies that AgentSpec
        Pydantic models can be unpacked correctly.
        """
        builder = AutoSwarmBuilder()
        # Create specs as Pydantic AgentSpec objects
        agent_spec = AgentSpec(
            agent_name="test_agent_pydantic",
            description="Pydantic test agent",
            system_prompt="You are a helpful assistant",
            model_name="gpt-4o-mini",
            max_loops=1,
        )
        specs = {"agents": [agent_spec]}
        agents = builder.create_agents_from_specs(specs)
        # Verify agents were created correctly
        assert len(agents) == 1
        assert isinstance(agents[0], Agent)
        assert agents[0].agent_name == "test_agent_pydantic"
        # Verify description was mapped to agent_description
        assert hasattr(agents[0], "agent_description")
        assert agents[0].agent_description == "Pydantic test agent"
    def test_parameter_name_mapping(self):
        """Test that 'description' field maps to 'agent_description' correctly."""
        builder = AutoSwarmBuilder()
        # Test with dict that has 'description'
        specs = {
            "agents": [
                {
                    "agent_name": "mapping_test",
                    "description": "This should map to agent_description",
                    "system_prompt": "You are helpful",
                }
            ]
        }
        agents = builder.create_agents_from_specs(specs)
        assert len(agents) == 1
        agent = agents[0]
        # Verify description was mapped
        assert hasattr(agent, "agent_description")
        assert (
            agent.agent_description
            == "This should map to agent_description"
        )
    def test_create_agents_from_specs_mixed_input(self):
        """Test that create_agents_from_specs handles mixed dict and Pydantic input."""
        builder = AutoSwarmBuilder()
        # Mix of dict and Pydantic objects
        dict_spec = {
            "agent_name": "dict_agent",
            "description": "Dict agent description",
            "system_prompt": "You are helpful",
        }
        pydantic_spec = AgentSpec(
            agent_name="pydantic_agent",
            description="Pydantic agent description",
            system_prompt="You are smart",
        )
        specs = {"agents": [dict_spec, pydantic_spec]}
        agents = builder.create_agents_from_specs(specs)
        # Verify both agents were created
        assert len(agents) == 2
        assert all(isinstance(agent, Agent) for agent in agents)
        # Verify both have correct descriptions
        dict_agent = next(
            a for a in agents if a.agent_name == "dict_agent"
        )
        pydantic_agent = next(
            a for a in agents if a.agent_name == "pydantic_agent"
        )
        assert (
            dict_agent.agent_description == "Dict agent description"
        )
        assert (
            pydantic_agent.agent_description
            == "Pydantic agent description"
        )
    def test_set_random_models_for_agents_with_valid_agents(
        self,
    ):
        """Test set_random_models_for_agents with proper Agent objects."""
        # Create proper Agent objects
        agents = [
            Agent(
                agent_name="agent1",
                system_prompt="You are agent 1",
                max_loops=1,
            ),
            Agent(
                agent_name="agent2",
                system_prompt="You are agent 2",
                max_loops=1,
            ),
        ]
        # Set random models
        model_names = ["gpt-4o-mini", "gpt-4o", "claude-3-5-sonnet"]
        result = set_random_models_for_agents(
            agents=agents, model_names=model_names
        )
        # Verify results
        assert len(result) == 2
        assert all(isinstance(agent, Agent) for agent in result)
        assert all(hasattr(agent, "model_name") for agent in result)
        assert all(
            agent.model_name in model_names for agent in result
        )
    def test_set_random_models_for_agents_with_single_agent(
        self,
    ):
        """Test set_random_models_for_agents with a single agent."""
        agent = Agent(
            agent_name="single_agent",
            system_prompt="You are helpful",
            max_loops=1,
        )
        model_names = ["gpt-4o-mini", "gpt-4o"]
        result = set_random_models_for_agents(
            agents=agent, model_names=model_names
        )
        assert isinstance(result, Agent)
        assert hasattr(result, "model_name")
        assert result.model_name in model_names
    def test_set_random_models_for_agents_with_none(self):
        """Test set_random_models_for_agents with None returns random model name."""
        model_names = ["gpt-4o-mini", "gpt-4o", "claude-3-5-sonnet"]
        result = set_random_models_for_agents(
            agents=None, model_names=model_names
        )
        assert isinstance(result, str)
        assert result in model_names
    @pytest.mark.skip(
        reason="This test requires API key and makes LLM calls"
    )
    def test_auto_swarm_builder_return_agents_objects_integration(
        self,
    ):
        """Integration test for AutoSwarmBuilder with execution_type='return-agents-objects'.
        This test requires OPENAI_API_KEY and makes actual LLM calls.
        Run manually with: pytest -k test_auto_swarm_builder_return_agents_objects_integration -v
        """
        builder = AutoSwarmBuilder(
            execution_type="return-agents-objects",
            model_name="gpt-4o-mini",
            max_loops=1,
            verbose=False,
        )
        agents = builder.run(
            "Create a team of 2 data analysis agents with specific roles"
        )
        # Verify agents were created
        assert isinstance(agents, list)
        assert len(agents) >= 1
        assert all(isinstance(agent, Agent) for agent in agents)
        assert all(hasattr(agent, "agent_name") for agent in agents)
        assert all(
            hasattr(agent, "agent_description") for agent in agents
        )
    def test_agent_spec_to_agent_all_fields(self):
        """Test that all AgentSpec fields are properly passed to Agent."""
        builder = AutoSwarmBuilder()
        agent_spec = AgentSpec(
            agent_name="full_test_agent",
            description="Full test description",
            system_prompt="You are a comprehensive test agent",
            model_name="gpt-4o-mini",
            auto_generate_prompt=False,
            max_tokens=4096,
            temperature=0.7,
            role="worker",
            max_loops=3,
            goal="Test all parameters",
        )
        agents = builder.create_agents_from_specs(
            {"agents": [agent_spec]}
        )
        assert len(agents) == 1
        agent = agents[0]
        # Verify all fields were set
        assert agent.agent_name == "full_test_agent"
        assert agent.agent_description == "Full test description"
        # Agent may modify system_prompt by adding additional instructions
        assert (
            "You are a comprehensive test agent"
            in agent.system_prompt
        )
        assert agent.max_loops == 3
        assert agent.max_tokens == 4096
        assert agent.temperature == 0.7
    def test_create_agents_from_specs_empty_list(self):
        """Test that create_agents_from_specs handles empty agent list."""
        builder = AutoSwarmBuilder()
        specs = {"agents": []}
        agents = builder.create_agents_from_specs(specs)
        assert isinstance(agents, list)
        assert len(agents) == 0
 if __name__ == "__main__":
    # Run tests with pytest
    pytest.main([__file__, "-v", "--tb=short"])
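Review note: the behaviour these tests pin down (accept either a dict or a Pydantic model, and map `description` to `agent_description`) can be sketched as below. This is an illustration under stated assumptions, not the actual `AutoSwarmBuilder.create_agents_from_specs` implementation. `spec_to_agent_kwargs` and `FakeSpec` are hypothetical names, and `FakeSpec` only imitates the `model_dump()` method that Pydantic v2 models expose, so the sketch needs no third-party imports.

```python
from typing import Any, Dict


def spec_to_agent_kwargs(spec: Any) -> Dict[str, Any]:
    """Normalize a dict or Pydantic-like spec into Agent-style keyword arguments."""
    if isinstance(spec, dict):
        data = dict(spec)  # copy so the caller's dict is not mutated
    elif hasattr(spec, "model_dump"):
        data = spec.model_dump()  # Pydantic v2 models expose model_dump()
    else:
        raise TypeError(f"Unsupported spec type: {type(spec).__name__}")
    # Rename 'description' to 'agent_description', the attribute the tests assert on.
    if "description" in data:
        data["agent_description"] = data.pop("description")
    return data


class FakeSpec:
    """Hypothetical stand-in for a Pydantic model."""

    def __init__(self, **fields: Any) -> None:
        self._fields = fields

    def model_dump(self) -> Dict[str, Any]:
        return dict(self._fields)


from_dict = spec_to_agent_kwargs(
    {"agent_name": "dict_agent", "description": "Dict agent description"}
)
from_model = spec_to_agent_kwargs(
    FakeSpec(agent_name="pydantic_agent", description="Pydantic agent description")
)
```

Normalizing every spec to a plain dict before unpacking means mixed lists of dicts and models go through a single code path.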
--- a/tests/telemetry/test_user_utils.py
+++ b/tests/telemetry/test_user_utils.py
@ -1,10 +1,8 @@
 import uuid
 from swarms.telemetry.main import (
    generate_unique_identifier,
    generate_user_id,
    get_machine_id,
    get_system_info,
 )
@ -24,33 +22,6 @@ def test_get_machine_id():
    assert all(char in "0123456789abcdef" for char in machine_id)
 def test_get_system_info():
    # Get system information and ensure it's a dictionary with expected keys
    system_info = get_system_info()
    assert isinstance(system_info, dict)
    expected_keys = [
        "platform",
        "platform_release",
        "platform_version",
        "architecture",
        "hostname",
        "ip_address",
        "mac_address",
        "processor",
        "python_version",
    ]
    assert all(key in system_info for key in expected_keys)
 def test_generate_unique_identifier():
    # Generate unique identifiers and ensure they are valid UUID strings
    unique_id = generate_unique_identifier()
    assert isinstance(unique_id, str)
    assert uuid.UUID(
        unique_id, version=5, namespace=uuid.NAMESPACE_DNS
    )
 def test_generate_user_id_edge_case():
    # Test generate_user_id with multiple calls
    user_ids = set()
@ -69,33 +40,13 @@ def test_get_machine_id_edge_case():
    assert len(machine_ids) == 100  # Ensure generated IDs are unique
 def test_get_system_info_edge_case():
    # Test get_system_info for consistency
    system_info1 = get_system_info()
    system_info2 = get_system_info()
    assert (
        system_info1 == system_info2
    )  # Ensure system info remains the same
 def test_generate_unique_identifier_edge_case():
    # Test generate_unique_identifier for uniqueness
    unique_ids = set()
    for _ in range(100):
        unique_id = generate_unique_identifier()
        unique_ids.add(unique_id)
    assert len(unique_ids) == 100  # Ensure generated IDs are unique
 def test_all():
    test_generate_user_id()
    test_get_machine_id()
    test_get_system_info()
    test_generate_unique_identifier()
    test_generate_user_id_edge_case()
    test_get_machine_id_edge_case()
-    test_get_system_info_edge_case()
+
    test_generate_unique_identifier_edge_case()
 test_all()
--- a/tests/test_comprehensive_test.py
+++ b/tests/test_comprehensive_test.py
@ -1,37 +1,29 @@
 import os
 import json
 import os
 from datetime import datetime
-from typing import List, Dict, Any, Callable
+from typing import Any, Callable, Dict, List
 from dotenv import load_dotenv
 from loguru import logger
 # Basic Imports for Swarms
 from swarms.structs import (
    Agent,
    SequentialWorkflow,
    ConcurrentWorkflow,
    AgentRearrange,
-    MixtureOfAgents,
+    ConcurrentWorkflow,
    SpreadSheetSwarm,
    GroupChat,
-    MultiAgentRouter,
+    InteractiveGroupChat,
    MajorityVoting,
-    SwarmRouter,
+    MixtureOfAgents,
    MultiAgentRouter,
    RoundRobinSwarm,
-    InteractiveGroupChat,
+    SequentialWorkflow,
    SpreadSheetSwarm,
    SwarmRouter,
 )
 # Import swarms not in __init__.py directly
 from swarms.structs.hiearchical_swarm import HierarchicalSwarm
 from swarms.structs.tree_swarm import ForestSwarm, Tree, TreeAgent
 # Setup Logging
 from loguru import logger
 logger.add(
    "test_runs/test_failures.log", rotation="10 MB", level="ERROR"
 )
 # Load environment variables
 load_dotenv()
@ -463,8 +455,8 @@ def test_spreadsheet_swarm():
 def test_hierarchical_swarm():
    """Test HierarchicalSwarm structure"""
    try:
        from swarms.utils.litellm_wrapper import LiteLLM
        from swarms.structs.hiearchical_swarm import SwarmSpec
        from swarms.utils.litellm_wrapper import LiteLLM
        # Create worker agents
        workers = [
--- a/tests/tools/test_output_str_fix.py
+++ b/tests/tools/test_output_str_fix.py
@ -0,0 +1,150 @@
 from pydantic import BaseModel
 from swarms.tools.pydantic_to_json import (
    base_model_to_openai_function,
    multi_base_model_to_openai_function,
 )
 from swarms.tools.base_tool import BaseTool
 # Test Pydantic model
 class TestModel(BaseModel):
    """A test model for validation."""
    name: str
    age: int
    email: str = "test@example.com"
 def test_base_model_to_openai_function():
    """Test that base_model_to_openai_function accepts output_str parameter."""
    print(
        "Testing base_model_to_openai_function with output_str=False..."
    )
    result_dict = base_model_to_openai_function(
        TestModel, output_str=False
    )
    print(f"✓ Dict result type: {type(result_dict)}")
    print(f"✓ Dict result keys: {list(result_dict.keys())}")
    print(
        "\nTesting base_model_to_openai_function with output_str=True..."
    )
    result_str = base_model_to_openai_function(
        TestModel, output_str=True
    )
    print(f"✓ String result type: {type(result_str)}")
    print(f"✓ String result preview: {result_str[:100]}...")
 def test_multi_base_model_to_openai_function():
    """Test that multi_base_model_to_openai_function handles output_str correctly."""
    print(
        "\nTesting multi_base_model_to_openai_function with output_str=False..."
    )
    result_dict = multi_base_model_to_openai_function(
        [TestModel], output_str=False
    )
    print(f"✓ Dict result type: {type(result_dict)}")
    print(f"✓ Dict result keys: {list(result_dict.keys())}")
    print(
        "\nTesting multi_base_model_to_openai_function with output_str=True..."
    )
    result_str = multi_base_model_to_openai_function(
        [TestModel], output_str=True
    )
    print(f"✓ String result type: {type(result_str)}")
    print(f"✓ String result preview: {result_str[:100]}...")
 def test_base_tool_methods():
    """Test that BaseTool methods handle output_str parameter correctly."""
    print(
        "\nTesting BaseTool.base_model_to_dict with output_str=False..."
    )
    tool = BaseTool()
    result_dict = tool.base_model_to_dict(TestModel, output_str=False)
    print(f"✓ Dict result type: {type(result_dict)}")
    print(f"✓ Dict result keys: {list(result_dict.keys())}")
    print(
        "\nTesting BaseTool.base_model_to_dict with output_str=True..."
    )
    result_str = tool.base_model_to_dict(TestModel, output_str=True)
    print(f"✓ String result type: {type(result_str)}")
    print(f"✓ String result preview: {result_str[:100]}...")
    print(
        "\nTesting BaseTool.multi_base_models_to_dict with output_str=False..."
    )
    result_dict = tool.multi_base_models_to_dict(
        [TestModel], output_str=False
    )
    print(f"✓ Dict result type: {type(result_dict)}")
    print(f"✓ Dict result length: {len(result_dict)}")
    print(
        "\nTesting BaseTool.multi_base_models_to_dict with output_str=True..."
    )
    result_str = tool.multi_base_models_to_dict(
        [TestModel], output_str=True
    )
    print(f"✓ String result type: {type(result_str)}")
    print(f"✓ String result preview: {result_str[:100]}...")
 def test_agent_integration():
    """Test that the Agent class can use the fixed methods without errors."""
    print("\nTesting Agent integration...")
    try:
        from swarms import Agent
        # Create a simple agent with a tool schema
        agent = Agent(
            model_name="gpt-4o-mini",
            tool_schema=TestModel,
            max_loops=1,
            verbose=True,
        )
        # This should not raise an error anymore
        agent.handle_tool_schema_ops()
        print(
            "✓ Agent.handle_tool_schema_ops() completed successfully"
        )
    except Exception as e:
        print(f"✗ Agent integration failed: {e}")
        return False
    return True
 if __name__ == "__main__":
    print("=" * 60)
    print("Testing output_str parameter fix")
    print("=" * 60)
    try:
        test_base_model_to_openai_function()
        test_multi_base_model_to_openai_function()
        test_base_tool_methods()
        if test_agent_integration():
            print("\n" + "=" * 60)
            print(
                "✅ All tests passed! The output_str parameter fix is working correctly."
            )
            print("=" * 60)
        else:
            print("\n" + "=" * 60)
            print(
                "❌ Some tests failed. Please check the implementation."
            )
            print("=" * 60)
    except Exception as e:
        print(f"\n❌ Test failed with error: {e}")
        import traceback
        traceback.print_exc()
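Review note: the functions in this file print their results rather than asserting on them, so under pytest they pass whenever no exception is raised. A minimal sketch of an assertion-based check for an `output_str` switch follows. `to_schema` is a hypothetical stand-in, not the swarms converter, and the assumption that the string form is JSON that round-trips to the dict form is mine, not something this diff confirms.

```python
import json
from typing import Any, Dict, Union


def to_schema(
    name: str, fields: Dict[str, str], output_str: bool = False
) -> Union[Dict[str, Any], str]:
    """Hypothetical converter with an output_str switch (stand-in for illustration)."""
    schema = {
        "name": name,
        "parameters": {
            "type": "object",
            "properties": {key: {"type": kind} for key, kind in fields.items()},
        },
    }
    # Assumption: the string form is the JSON serialization of the dict form.
    return json.dumps(schema) if output_str else schema


def check_output_str_contract() -> None:
    """Assert on the return types instead of printing them."""
    fields = {"name": "string", "age": "integer"}
    as_dict = to_schema("TestModel", fields, output_str=False)
    as_str = to_schema("TestModel", fields, output_str=True)
    assert isinstance(as_dict, dict)
    assert isinstance(as_str, str)
    # Both forms should describe the same schema.
    assert json.loads(as_str) == as_dict


check_output_str_contract()
```

The same pattern would apply to the real converters once their string format is confirmed.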
--- a/tests/utils/test_acompletions.py
+++ b/tests/utils/test_acompletions.py
@ -3,14 +3,6 @@ from dotenv import load_dotenv
 load_dotenv()
 ## [OPTIONAL] REGISTER MODEL - not all ollama models support function calling, litellm defaults to json mode tool calls if native tool calling not supported.
 # litellm.register_model(model_cost={
 #                 "ollama_chat/llama3.1": {
 #                   "supports_function_calling": true
 #                 },
 #             })
 tools = [
    {
        "type": "function",
--- a/tests/utils/test_docstring_parser.py
+++ b/tests/utils/test_docstring_parser.py
@ -0,0 +1,431 @@
 """
 Test suite for the custom docstring parser implementation.
 This module contains comprehensive tests to ensure the docstring parser
 works correctly with various docstring formats and edge cases.
 """
 import pytest
 from swarms.utils.docstring_parser import (
    parse,
    DocstringParam,
 )
 class TestDocstringParser:
    """Test cases for the docstring parser functionality."""
    def test_empty_docstring(self):
        """Test parsing of empty docstring."""
        result = parse("")
        assert result.short_description is None
        assert result.params == []
    def test_none_docstring(self):
        """Test parsing of None docstring."""
        result = parse(None)
        assert result.short_description is None
        assert result.params == []
    def test_whitespace_only_docstring(self):
        """Test parsing of whitespace-only docstring."""
        result = parse("   \n  \t  \n  ")
        assert result.short_description is None
        assert result.params == []
    def test_simple_docstring_no_args(self):
        """Test parsing of simple docstring without Args section."""
        docstring = """
        This is a simple function.
        Returns:
            str: A simple string
        """
        result = parse(docstring)
        assert (
            result.short_description == "This is a simple function."
        )
        assert result.params == []
    def test_docstring_with_args(self):
        """Test parsing of docstring with Args section."""
        docstring = """
        This is a test function.
        Args:
            param1 (str): First parameter
            param2 (int): Second parameter
            param3 (bool, optional): Third parameter with default
        Returns:
            str: Return value description
        """
        result = parse(docstring)
        assert result.short_description == "This is a test function."
        assert len(result.params) == 3
        assert result.params[0] == DocstringParam(
            "param1", "First parameter"
        )
        assert result.params[1] == DocstringParam(
            "param2", "Second parameter"
        )
        assert result.params[2] == DocstringParam(
            "param3", "Third parameter with default"
        )
    def test_docstring_with_parameters_section(self):
        """Test parsing of docstring with Parameters section."""
        docstring = """
        Another test function.
        Parameters:
            name (str): The name parameter
            age (int): The age parameter
        Returns:
            None: Nothing is returned
        """
        result = parse(docstring)
        assert result.short_description == "Another test function."
        assert len(result.params) == 2
        assert result.params[0] == DocstringParam(
            "name", "The name parameter"
        )
        assert result.params[1] == DocstringParam(
            "age", "The age parameter"
        )
    def test_docstring_with_multiline_param_description(self):
        """Test parsing of docstring with multiline parameter descriptions."""
        docstring = """
        Function with multiline descriptions.
        Args:
            param1 (str): This is a very long description
                that spans multiple lines and should be
                properly concatenated.
            param2 (int): Short description
        Returns:
            str: Result
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function with multiline descriptions."
        )
        assert len(result.params) == 2
        expected_desc = "This is a very long description that spans multiple lines and should be properly concatenated."
        assert result.params[0] == DocstringParam(
            "param1", expected_desc
        )
        assert result.params[1] == DocstringParam(
            "param2", "Short description"
        )
    def test_docstring_without_type_annotations(self):
        """Test parsing of docstring without type annotations."""
        docstring = """
        Function without type annotations.
        Args:
            param1: First parameter without type
            param2: Second parameter without type
        Returns:
            str: Result
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function without type annotations."
        )
        assert len(result.params) == 2
        assert result.params[0] == DocstringParam(
            "param1", "First parameter without type"
        )
        assert result.params[1] == DocstringParam(
            "param2", "Second parameter without type"
        )
    def test_pydantic_style_docstring(self):
        """Test parsing of Pydantic-style docstring."""
        docstring = """
        Convert a Pydantic model to a dictionary representation of functions.
        Args:
            pydantic_type (type[BaseModel]): The Pydantic model type to convert.
        Returns:
            dict[str, Any]: A dictionary representation of the functions.
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Convert a Pydantic model to a dictionary representation of functions."
        )
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam(
            "pydantic_type", "The Pydantic model type to convert."
        )
    def test_docstring_with_various_sections(self):
        """Test parsing of docstring with multiple sections."""
        docstring = """
        Complex function with multiple sections.
        Args:
            input_data (dict): Input data dictionary
            validate (bool): Whether to validate input
        Returns:
            dict: Processed data
        Raises:
            ValueError: If input is invalid
        Note:
            This is a note section
        Example:
            >>> result = complex_function({"key": "value"})
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Complex function with multiple sections."
        )
        assert len(result.params) == 2
        assert result.params[0] == DocstringParam(
            "input_data", "Input data dictionary"
        )
        assert result.params[1] == DocstringParam(
            "validate", "Whether to validate input"
        )
    def test_docstring_with_see_also_section(self):
        """Test parsing of docstring with See Also section."""
        docstring = """
        Function with See Also section.
        Args:
            param1 (str): First parameter
        See Also:
            related_function: For related functionality
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function with See Also section."
        )
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam(
            "param1", "First parameter"
        )
    def test_docstring_with_see_also_underscore_section(self):
        """Test parsing of docstring with See_Also section (underscore variant)."""
        docstring = """
        Function with See_Also section.
        Args:
            param1 (str): First parameter
        See_Also:
            related_function: For related functionality
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function with See_Also section."
        )
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam(
            "param1", "First parameter"
        )
    def test_docstring_with_yields_section(self):
        """Test parsing of docstring with Yields section."""
        docstring = """
        Generator function.
        Args:
            items (list): List of items to process
        Yields:
            str: Processed item
        """
        result = parse(docstring)
        assert result.short_description == "Generator function."
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam(
            "items", "List of items to process"
        )
    def test_docstring_with_raises_section(self):
        """Test parsing of docstring with Raises section."""
        docstring = """
        Function that can raise exceptions.
        Args:
            value (int): Value to process
        Raises:
            ValueError: If value is negative
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function that can raise exceptions."
        )
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam(
            "value", "Value to process"
        )
    def test_docstring_with_examples_section(self):
        """Test parsing of docstring with Examples section."""
        docstring = """
        Function with examples.
        Args:
            x (int): Input value
        Examples:
            >>> result = example_function(5)
            >>> print(result)
        """
        result = parse(docstring)
        assert result.short_description == "Function with examples."
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam("x", "Input value")
    def test_docstring_with_note_section(self):
        """Test parsing of docstring with Note section."""
        docstring = """
        Function with a note.
        Args:
            data (str): Input data
        Note:
            This function is deprecated
        """
        result = parse(docstring)
        assert result.short_description == "Function with a note."
        assert len(result.params) == 1
        assert result.params[0] == DocstringParam(
            "data", "Input data"
        )
    def test_docstring_with_complex_type_annotations(self):
        """Test parsing of docstring with complex type annotations."""
        docstring = """
        Function with complex types.
        Args:
            data (List[Dict[str, Any]]): Complex data structure
            callback (Callable[[str], int]): Callback function
            optional (Optional[str], optional): Optional parameter
        Returns:
            Union[str, None]: Result or None
        """
        result = parse(docstring)
        assert (
            result.short_description == "Function with complex types."
        )
        assert len(result.params) == 3
        assert result.params[0] == DocstringParam(
            "data", "Complex data structure"
        )
        assert result.params[1] == DocstringParam(
            "callback", "Callback function"
        )
        assert result.params[2] == DocstringParam(
            "optional", "Optional parameter"
        )
    def test_docstring_with_no_description(self):
        """Test parsing of docstring with no description, only Args."""
        docstring = """
        Args:
            param1 (str): First parameter
            param2 (int): Second parameter
        """
        result = parse(docstring)
        assert result.short_description is None
        assert len(result.params) == 2
        assert result.params[0] == DocstringParam(
            "param1", "First parameter"
        )
        assert result.params[1] == DocstringParam(
            "param2", "Second parameter"
        )
    def test_docstring_with_empty_args_section(self):
        """Test parsing of docstring with empty Args section."""
        docstring = """
        Function with empty Args section.
        Args:
        Returns:
            str: Result
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function with empty Args section."
        )
        assert result.params == []
    def test_docstring_with_mixed_indentation(self):
        """Test parsing of docstring with mixed indentation."""
        docstring = """
        Function with mixed indentation.
        Args:
            param1 (str): First parameter
                with continuation
            param2 (int): Second parameter
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function with mixed indentation."
        )
        assert len(result.params) == 2
        assert result.params[0] == DocstringParam(
            "param1", "First parameter with continuation"
        )
        assert result.params[1] == DocstringParam(
            "param2", "Second parameter"
        )
    def test_docstring_with_tab_indentation(self):
        """Test parsing of docstring with tab indentation."""
        docstring = """
        Function with tab indentation.
        Args:
        	param1 (str): First parameter
        	param2 (int): Second parameter
        """
        result = parse(docstring)
        assert (
            result.short_description
            == "Function with tab indentation."
        )
        assert len(result.params) == 2
        assert result.params[0] == DocstringParam(
            "param1", "First parameter"
        )
        assert result.params[1] == DocstringParam(
            "param2", "Second parameter"
        )
 if __name__ == "__main__":
    pytest.main([__file__])
--- a/tests/utils/test_formatter.py
+++ b/tests/utils/test_formatter.py
@ -1,6 +1,3 @@
 #!/usr/bin/env python3
 """Test script to verify the improved formatter markdown rendering."""
 from swarms.utils.formatter import Formatter
--- a/tests/utils/test_litellm_wrapper.py
+++ b/tests/utils/test_litellm_wrapper.py
@ -1,21 +1,9 @@
 import asyncio
 import sys
 from loguru import logger
 from swarms.utils.litellm_wrapper import LiteLLM
 # Configure loguru logger
 logger.remove()  # Remove default handler
 logger.add(
    "test_litellm.log",
    rotation="1 MB",
    format="{time} | {level} | {message}",
    level="DEBUG",
 )
 logger.add(sys.stdout, level="INFO")
 tools = [
    {
        "type": "function",
--- a/tests/utils/test_math_eval.py
+++ b/tests/utils/test_math_eval.py
@ -1,4 +1,4 @@
-from swarms.utils import math_eval
+from swarms.utils.math_eval import math_eval
 def func1_no_exception(x):
--- a/tests/utils/test_md_output.py
+++ b/tests/utils/test_md_output.py
@ -1,21 +1,17 @@
 #!/usr/bin/env python3
 """
 Test script demonstrating markdown output functionality with a real swarm
 Uses the current state of formatter.py to show agent markdown output capabilities
 """
 import os
 from dotenv import load_dotenv
 # Load environment variables
 load_dotenv()
-from swarms import Agent
+from swarms import (
-from swarms.structs import (
+    Agent,
    SequentialWorkflow,
    ConcurrentWorkflow,
    GroupChat,
    SequentialWorkflow,
 )
 from swarms.utils.formatter import Formatter
--- a/tests/utils/test_print_class_parameters.py
+++ b/tests/utils/test_print_class_parameters.py
@ -1,120 +0,0 @@
 import pytest
 from swarms.utils import print_class_parameters
 class TestObject:
    def __init__(self, value1, value2: int):
        pass
 class TestObject2:
    def __init__(self: "TestObject2", value1, value2: int = 5):
        pass
 def test_class_with_complex_parameters():
    class ComplexArgs:
        def __init__(self, value1: list, value2: dict = {}):
            pass
    output = {"value1": "<class 'list'>", "value2": "<class 'dict'>"}
    assert (
        print_class_parameters(ComplexArgs, api_format=True) == output
    )
 def test_empty_class():
    class Empty:
        pass
    with pytest.raises(Exception):
        print_class_parameters(Empty)
 def test_class_with_no_annotations():
    class NoAnnotations:
        def __init__(self, value1, value2):
            pass
    output = {
        "value1": "<class 'inspect._empty'>",
        "value2": "<class 'inspect._empty'>",
    }
    assert (
        print_class_parameters(NoAnnotations, api_format=True)
        == output
    )
 def test_class_with_partial_annotations():
    class PartialAnnotations:
        def __init__(self, value1, value2: int):
            pass
    output = {
        "value1": "<class 'inspect._empty'>",
        "value2": "<class 'int'>",
    }
    assert (
        print_class_parameters(PartialAnnotations, api_format=True)
        == output
    )
@pytest.mark.parametrize(
    "obj, expected",
    [
        (
            TestObject,
            {
                "value1": "<class 'inspect._empty'>",
                "value2": "<class 'int'>",
            },
        ),
        (
            TestObject2,
            {
                "value1": "<class 'inspect._empty'>",
                "value2": "<class 'int'>",
            },
        ),
    ],
 )
 def test_parametrized_class_parameters(obj, expected):
    assert print_class_parameters(obj, api_format=True) == expected
@pytest.mark.parametrize(
    "value",
    [
        int,
        float,
        str,
        list,
        set,
        dict,
        bool,
        tuple,
        complex,
        bytes,
        bytearray,
        memoryview,
        range,
        frozenset,
        slice,
        object,
    ],
 )
 def test_not_class_exception(value):
    with pytest.raises(Exception):
        print_class_parameters(value)
 def test_api_format_flag():
    assert print_class_parameters(TestObject2, api_format=True) == {
        "value1": "<class 'inspect._empty'>",
        "value2": "<class 'int'>",
    }
    print_class_parameters(TestObject)
    # TODO: Capture printed output and assert correctness.