You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
swarms/CHANGES_SUMMARY.md

6.4 KiB

Fix for Issue #936: Agent Tool Usage with Streaming Enabled

Problem Summary

  • Agent tool usage callable doesn't work with streaming enabled
  • Works without streaming well
  • Tool execution logging disappeared

Root Cause Analysis

When streaming is enabled, the LLM response chunks are collected as plain text strings, losing the structured API response format that contains tool call metadata. The tool_struct.execute_function_calls_from_api_response() method expects structured responses with tool_calls attributes, but streaming responses only contained concatenated text content.

Solution Implementation

1. Created StreamingToolResponse Class

File: swarms/structs/agent.py (lines 91-104)

class StreamingToolResponse:
    """
    Response wrapper that preserves both content and tool calls from streaming responses.
    This enables tool execution when streaming is enabled.
    """
    def __init__(self, content: str, tool_calls: List[Any] = None):
        self.content = content
        self.tool_calls = tool_calls or []
    
    def __str__(self):
        return self.content
    
    def __repr__(self):
        return f"StreamingToolResponse(content='{self.content[:50]}...', tool_calls={len(self.tool_calls)} calls)"

Purpose: Replace dynamic type creation with a proper class that preserves both content and tool calls from streaming responses.

2. Enhanced Streaming Chunk Processing

File: swarms/structs/agent.py (lines 2600-2690)

Modified all three streaming paths in call_llm method:

A. Streaming with Callback (lines 2615-2625)

# Preserve tool calls from streaming chunks
try:
    if (hasattr(chunk, "choices") and 
        len(chunk.choices) > 0 and
        hasattr(chunk.choices[0], "delta") and
        hasattr(chunk.choices[0].delta, "tool_calls") and 
        chunk.choices[0].delta.tool_calls):
        tool_calls.extend(chunk.choices[0].delta.tool_calls)
except (AttributeError, IndexError) as e:
    logger.debug(f"Could not extract tool calls from chunk: {e}")

B. Silent Streaming (lines 2636-2646)

  • Same tool call preservation logic as above
  • Maintains streaming behavior while capturing tool calls

C. Streaming Panel (lines 2658-2688)

# Create a tool-aware streaming processor to preserve tool calls
def tool_aware_streaming(stream_response):
    for chunk in stream_response:
        # Preserve tool calls from streaming chunks with error handling
        try:
            if (hasattr(chunk, "choices") and 
                len(chunk.choices) > 0 and
                hasattr(chunk.choices[0], "delta") and
                hasattr(chunk.choices[0].delta, "tool_calls") and 
                chunk.choices[0].delta.tool_calls):
                tool_calls.extend(chunk.choices[0].delta.tool_calls)
        except (AttributeError, IndexError) as e:
            logger.debug(f"Could not extract tool calls from chunk: {e}")
        yield chunk

Purpose: Prevents iterator consumption bug while preserving tool calls.

3. Enhanced Tool Execution Logging

File: swarms/structs/agent.py (lines 3109-3133)

# Add tool execution logging
logger.info(f"Starting tool execution for agent '{self.agent_name}' in loop {loop_count}")

# Enhanced retry logic with logging
try:
    output = self.tool_struct.execute_function_calls_from_api_response(response)
except Exception as e:
    logger.warning(f"First attempt at tool execution failed: {e}. Retrying...")
    try:
        output = self.tool_struct.execute_function_calls_from_api_response(response)
    except Exception as retry_error:
        logger.error(f"Tool execution failed after retry: {retry_error}")
        if output is None:
            raise retry_error

# Log successful tool execution
if output is not None:
    logger.info(f"Tool execution successful for agent '{self.agent_name}' in loop {loop_count}. Output length: {len(str(output)) if output else 0}")
else:
    logger.warning(f"Tool execution completed but returned None output for agent '{self.agent_name}' in loop {loop_count}")

Purpose: Restore missing tool execution logging with comprehensive status reporting.

Key Improvements

1. Robust Error Handling

  • Added try-catch blocks around tool call extraction
  • Graceful handling of malformed chunks
  • Protection against AttributeError and IndexError

2. Iterator Safety

  • Fixed streaming iterator consumption bug
  • Proper generator pattern to avoid iterator exhaustion

3. Comprehensive Logging

  • Tool execution start/success/failure logging
  • Retry attempt logging
  • Debug-level logging for chunk processing errors

4. Backward Compatibility

  • No changes to existing non-streaming behavior
  • Maintains all existing API contracts
  • Falls back gracefully when no tool calls present

Testing

Created two test files:

1. test_streaming_tools.py

  • Tests streaming behavior with and without tools
  • Validates tool execution occurs with streaming enabled
  • Checks memory history for tool execution evidence

2. test_original_issue.py

  • Reproduces exact code from GitHub issue #936
  • Uses original function signatures and agent configuration
  • Validates the specific use case reported in the issue

Files Modified

  1. swarms/structs/agent.py

    • Added StreamingToolResponse class
    • Enhanced streaming chunk processing in call_llm method
    • Improved tool execution logging in execute_tools method
  2. Created Test Files

    • test_streaming_tools.py - Comprehensive streaming + tool tests
    • test_original_issue.py - Reproduction of original issue scenario

Verification

The solution addresses both reported issues:

Tool Usage with Streaming: Tool calls are now preserved and executed when streaming is enabled

Tool Execution Logging: Comprehensive logging is now present throughout the tool execution process

Edge Cases Handled

  1. Malformed Chunks: Graceful error handling prevents crashes
  2. Empty Tool Calls: Proper validation before processing
  3. Iterator Consumption: Safe streaming processing without iterator exhaustion
  4. Mixed Content: Handles chunks with both content and tool calls
  5. Multiple Tool Calls: Supports multiple tool calls in single or multiple chunks

Performance Impact

  • Minimal: Only additional memory for tool call arrays during streaming
  • Efficient: Tool call extraction only occurs when chunks contain them
  • Scalable: Handles multiple concurrent streaming agents safely