### What happened?
1. **The LLM is returning a streaming wrapper instead of text**

   Because `streaming_on=True`, LiteLLM gives you a `CustomStreamWrapper` object. Your helper `parse_llm_output()` doesn’t know how to turn that object into a string, so the agent stores the wrapper itself.
|
2. **`mcp_execution_flow()` received an empty string**

   When the agent tried to forward the tool‑call to your Fast‑MCP server it called `json.loads(<empty‑string>)`, which raised *“Expecting value: line 1 column 1 (char 0)”*. That means the HTTP call to `http://0.0.0.0:8000/mcp` returned **no body**, usually because the path is wrong or the server crashed.
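You can reproduce that failure in isolation: `json.loads` on an empty string raises exactly the message from the traceback. A tiny illustrative helper (the name `safe_json_loads` is hypothetical) turns it into a readable error instead:

```python
import json

def safe_json_loads(text: str):
    """Parse JSON, returning an error string instead of raising on bad input."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        return f"JSON error: {exc}"

print(safe_json_loads(""))              # JSON error: Expecting value: line 1 column 1 (char 0)
print(safe_json_loads('{"result":6}'))  # {'result': 6}
```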
---
### Fast MCP mock‑server: correct URL

`FastMCP.run(transport="sse", port=8000)` serves SSE on `http://<host>:8000/stream` **and** accepts JSON‑RPC over POST on the root `http://<host>:8000/`.

So point the client at the root (no `/mcp` suffix):
```python
math_server = MCPServerSseParams(
    url="http://0.0.0.0:8000",  # <── just the origin
    headers={"Content-Type": "application/json"},
    timeout=5.0,
    sse_read_timeout=30.0,
)
```
---
### Stop streaming while you debug
```python
math_agent = Agent(
    agent_name="Math Agent",
    agent_description="Specialized agent for mathematical computations",
    system_prompt=MATH_AGENT_PROMPT,
    max_loops=1,
    mcp_servers=[math_server],
    streaming_on=False,  # <── turn off for now
)
```
(You can turn it back on when everything works and you extend `parse_llm_output()` to unwrap streams.)
---
### Make sure the LLM actually *produces* a tool call

LLMs won’t invent a JSON‑RPC envelope unless you ask them. Add a very small system hint that mirrors the MCP schema, e.g.
```python
TOOL_CALL_INSTRUCTION = """
When you want to use a math tool, reply with a JSON object only:

{"tool_name": "<add|multiply|divide>", "a": <int>, "b": <int>}
"""

math_agent.short_memory.add(role="system", content=TOOL_CALL_INSTRUCTION)
```
Now “**add 2 and 4**” should yield

```json
{"tool_name":"add","a":2,"b":4}
```

which your `mcp_execution_flow()` can forward.
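Before forwarding, it is worth validating the model’s reply against that schema so a malformed tool call fails with a clear message instead of a cryptic server error. A sketch (the helper name `extract_tool_call` is hypothetical):

```python
import json

# Tool names taken from the schema hint above.
VALID_TOOLS = {"add", "multiply", "divide"}

def extract_tool_call(raw: str) -> dict:
    """Parse and validate the model's JSON reply before forwarding it to MCP."""
    call = json.loads(raw)
    if call.get("tool_name") not in VALID_TOOLS:
        raise ValueError(f"unknown tool: {call.get('tool_name')!r}")
    if not isinstance(call.get("a"), int) or not isinstance(call.get("b"), int):
        raise ValueError("'a' and 'b' must be integers")
    return call

print(extract_tool_call('{"tool_name":"add","a":2,"b":4}'))
# {'tool_name': 'add', 'a': 2, 'b': 4}
```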
---
### Fix the empty‑body crash guard (optional)

Inside `mcp_execution_flow()` (or `mcp_flow`) you can protect yourself:
```python
resp = client.post(...)
if not resp.text:
    raise RuntimeError("MCP server returned empty body")
```
That turns an unhelpful JSON decode error into a clear MCP‑connection error.
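The guard can be exercised without a live server by using a stand‑in response object; the names here (`FakeResponse`, `check_body`) are hypothetical and exist only for illustration:

```python
class FakeResponse:
    """Minimal stand-in for an HTTP response: only the .text attribute."""
    def __init__(self, text: str):
        self.text = text

def check_body(resp) -> str:
    # Same guard as in mcp_execution_flow(): fail loudly on an empty body
    # instead of letting json.loads() raise "Expecting value" later.
    if not resp.text:
        raise RuntimeError("MCP server returned empty body")
    return resp.text

print(check_body(FakeResponse('{"result":6}')))  # {"result":6}
```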
---
### Quick test commands
```bash
# 1. Start mock server in one shell
python mock_math_server.py   # “Starting Mock Math Server on port 8000…”

# 2. In another shell, make sure it answers
curl -X POST http://0.0.0.0:8000 \
     -H "Content-Type: application/json" \
     -d '{"tool_name":"add","a":2,"b":4}'
# → {"result":6}

# 3. Run your client
python mcp_client.py
```
If you now type `add 2 and 4`, the agent should respond **6** (with your explanation string) instead of the stream‑object dump.
---
#### Next steps
* Re‑enable `streaming_on=True` and add handling of LiteLLM’s stream wrapper in `parse_llm_output()`.
* Return the math result to the user in natural language (e.g. “2 + 4 = 6”).
* Add further error handling in the mock server (divide‑by‑zero is already covered).
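For the first bullet, a minimal sketch of stream unwrapping, assuming LiteLLM’s wrapper is iterable and yields OpenAI‑style chunks with `choices[0].delta.content`; the demo below fakes the chunks so it runs without LiteLLM:

```python
from types import SimpleNamespace  # only used to fake chunks in the demo

def unwrap_stream(stream) -> str:
    """Concatenate the text deltas of an OpenAI-style chunk stream."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta
        if getattr(delta, "content", None):
            parts.append(delta.content)
    return "".join(parts)

# Fake chunks standing in for what the stream wrapper would yield:
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=t))])
    for t in ('{"tool_name"', ':"add","a":2', ',"b":4}')
]
print(unwrap_stream(fake_chunks))  # {"tool_name":"add","a":2,"b":4}
```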
That should give you a clean smoke‑test of **Fast MCP + Swarms Agent** without the function‑calling “craziness”.