### What happened?

1. **The LLM is returning a streaming wrapper instead of text**
   Because `streaming_on=True`, LiteLLM gives you a `CustomStreamWrapper` object. Your helper `parse_llm_output()` doesn't know how to turn that object into a string, so the agent stores the wrapper itself.

2. **`mcp_execution_flow()` received an empty string**
   When the agent tried to forward the tool call to your Fast-MCP server, it called `json.loads()`, which raised *"Expecting value: line 1 column 1 (char 0)"*. That means the HTTP call to `http://0.0.0.0:8000/mcp` returned **no body** – usually because the path is wrong or the server crashed.

---

### Fast MCP mock-server: correct URL

`FastMCP.run(transport="sse", port=8000)` serves SSE on `http://<host>:8000/stream` **and** accepts JSON-RPC over POST on the root `http://<host>:8000/`. So point the client at the root (no `/mcp` suffix):

```python
math_server = MCPServerSseParams(
    url="http://0.0.0.0:8000",  # <── just the origin
    headers={"Content-Type": "application/json"},
    timeout=5.0,
    sse_read_timeout=30.0,
)
```

---

### Stop streaming while you debug

```python
math_agent = Agent(
    agent_name="Math Agent",
    agent_description="Specialized agent for mathematical computations",
    system_prompt=MATH_AGENT_PROMPT,
    max_loops=1,
    mcp_servers=[math_server],
    streaming_on=False,  # <── turn off for now
)
```

(You can turn it back on when everything works and you extend `parse_llm_output()` to unwrap streams.)

---

### Make sure the LLM actually *produces* a tool call

LLMs won't invent a JSON-RPC envelope unless you ask them to. Add a very small system hint that mirrors the MCP schema, e.g.

```python
TOOL_CALL_INSTRUCTION = """
When you want to use a math tool, reply with a JSON object only:
{"tool_name": "<tool name>", "a": <number>, "b": <number>}
"""

math_agent.short_memory.add(role="system", content=TOOL_CALL_INSTRUCTION)
```

Now "**add 2 and 4**" should yield

```json
{"tool_name":"add","a":2,"b":4}
```

which your `mcp_execution_flow()` can forward.

---

### Fix the empty-body crash guard (optional)

Inside `mcp_execution_flow()` (or `mcp_flow`) you can protect yourself:

```python
resp = client.post(...)
if not resp.text:
    raise RuntimeError("MCP server returned empty body")
```

That turns an unhelpful JSON decode error into a clear MCP-connection error.

---

### Quick test commands

```bash
# 1. Start the mock server in one shell
python mock_math_server.py
# “Starting Mock Math Server on port 8000…”

# 2. In another shell, make sure it answers
curl -X POST http://0.0.0.0:8000 \
     -H "Content-Type: application/json" \
     -d '{"tool_name":"add","a":2,"b":4}'
# → {"result":6}

# 3. Run your client
python mcp_client.py
```

If you now type `add 2 and 4`, the agent should respond **6** (with your explanation string) instead of the stream-object dump.

---

#### Next steps

* Re-enable `streaming_on=True` and add handling of LiteLLM's stream wrapper in `parse_llm_output()` (first sketch below).
* Return the math result to the user in natural language, e.g. "2 + 4 = 6" (second sketch below).
* Add error handling in the mock server – divide-by-zero is already covered; a reference sketch follows below.

That should give you a clean smoke-test of **Fast MCP + Swarms Agent** without the function-calling "craziness".
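For the first next step, here is a minimal sketch of how `parse_llm_output()` could flatten the stream wrapper. It assumes LiteLLM's `CustomStreamWrapper` is iterable and yields OpenAI-style chunks whose text sits in `chunk.choices[0].delta.content`; whatever other parsing your real helper does is reduced to a plain `str()` fallback here.

```python
from litellm import CustomStreamWrapper

def parse_llm_output(response) -> str:
    """Collapse an LLM response into plain text (sketch)."""
    if isinstance(response, CustomStreamWrapper):
        pieces = []
        for chunk in response:        # iterating drains the stream
            delta = chunk.choices[0].delta
            if delta.content:         # content is None on role/finish chunks
                pieces.append(delta.content)
        return "".join(pieces)
    return str(response)              # fallback; your real parser goes here
```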
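For the second next step, a tiny hypothetical formatter. The operator map and the argument names are assumptions mirroring the `{"tool_name":"add","a":2,"b":4}` / `{"result":6}` examples above, not a confirmed server schema:

```python
# Hypothetical helper: map tool names to symbols for a readable answer.
OPERATORS = {"add": "+", "subtract": "-", "multiply": "*", "divide": "/"}

def format_result(tool_name: str, a: float, b: float, result: float) -> str:
    symbol = OPERATORS.get(tool_name, tool_name)
    return f"{a} {symbol} {b} = {result}"

print(format_result("add", 2, 4, 6))  # -> "2 + 4 = 6"
```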
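And for reference, a minimal mock-server sketch with the divide-by-zero guard mentioned in the last bullet. The import path and the `run()` signature are assumptions about your FastMCP version (the call mirrors the `FastMCP.run(transport="sse", port=8000)` form quoted earlier); adjust to whatever your `mock_math_server.py` actually uses.

```python
from fastmcp import FastMCP  # assumption: the standalone fastmcp package

mcp = FastMCP("Mock Math Server")

@mcp.tool()
def add(a: float, b: float) -> float:
    """Return a + b."""
    return a + b

@mcp.tool()
def divide(a: float, b: float) -> float:
    """Return a / b, rejecting b == 0."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

if __name__ == "__main__":
    print("Starting Mock Math Server on port 8000…")
    mcp.run(transport="sse", port=8000)
```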