Merge pull request #1084 from IlumCI/fallback

[FEAT-AGENT][Added fallback models function to agent, if primary model were to fail]
Kye Gomez 3 weeks ago committed by GitHub
commit 0813594074
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@@ -323,6 +323,7 @@ nav:
       - CLI Reference: "swarms/cli/cli_reference.md"
       - Agent Loader: "swarms/utils/agent_loader.md"
       - AgentRegistry: "swarms/structs/agent_registry.md"
+      - Fallback Models: "swarms/utils/fallback_models.md"
   - Tools:
       - Overview: "swarms_tools/overview.md"

@@ -0,0 +1,279 @@
# Fallback Models in Swarms Agent
The Swarms Agent now supports automatic fallback to alternative models when the primary model fails. This feature enhances reliability and ensures your agents can continue operating even when specific models are unavailable or experiencing issues.
## Overview
The fallback model system allows you to specify one or more alternative models that will be automatically tried if the primary model encounters an error. This is particularly useful for:
- **High availability**: Ensure your agents continue working even if a specific model is down
- **Cost optimization**: Use cheaper models as fallbacks for non-critical tasks
- **Rate limiting**: Switch to alternative models when hitting rate limits
- **Model-specific issues**: Handle temporary model-specific problems
## Configuration
### Single Fallback Model
```python
from swarms import Agent

# Configure a single fallback model
agent = Agent(
    model_name="gpt-4o",  # Primary model
    fallback_model_name="gpt-4o-mini",  # Fallback model
    max_loops=1
)
```
### Multiple Fallback Models
```python
from swarms import Agent

# Configure multiple fallback models using a unified list
agent = Agent(
    fallback_models=["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo", "claude-3-haiku"],  # First is primary, rest are fallbacks
    max_loops=1
)
```
### Combined Configuration
```python
from swarms import Agent

# You can use both a single fallback and a fallback list
agent = Agent(
    model_name="gpt-4o",  # Primary model
    fallback_model_name="gpt-4o-mini",  # Single fallback
    fallback_models=["gpt-3.5-turbo", "claude-3-haiku"],  # Additional fallbacks
    max_loops=1
)

# Or use the unified list approach (recommended)
agent = Agent(
    fallback_models=["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo", "claude-3-haiku"],
    max_loops=1
)
# Final order: gpt-4o -> gpt-4o-mini -> gpt-3.5-turbo -> claude-3-haiku
```
## How It Works
1. **Primary Model**: The agent starts with the specified primary model
2. **Error Detection**: When an LLM call fails, the system catches the error
3. **Automatic Switching**: The agent automatically switches to the next available model
4. **Retry**: The failed operation is retried with the new model
5. **Exhaustion**: If all models fail, the original error is raised
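The five steps above can be sketched in plain Python. This is an illustrative stand-in, not the Agent's actual implementation; the helper names (`call_with_fallback`, `flaky_call`) are hypothetical.

```python
# Minimal sketch of the fallback flow: try each model in order,
# and if every model fails, re-raise the primary model's error.
def call_with_fallback(models, call):
    primary_error = None
    for model in models:
        try:
            return call(model)  # retry the operation with this model
        except Exception as e:
            if primary_error is None:
                primary_error = e  # remember the primary model's error
    raise primary_error  # exhaustion: all models failed


# Example: the first two models fail, the third succeeds.
def flaky_call(model):
    if model in ("gpt-4o", "gpt-4o-mini"):
        raise RuntimeError(f"{model} unavailable")
    return f"response from {model}"


result = call_with_fallback(
    ["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"], flaky_call
)
print(result)  # response from gpt-3.5-turbo
```

The key design point mirrored here is step 5: when every model fails, the caller sees the *original* error, not the last fallback's, so the root cause stays visible in logs.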
## API Reference
### Constructor Parameters
- `fallback_model_name` (str, optional): Single fallback model name
- `fallback_models` (List[str], optional): List of fallback model names
### Methods
#### `get_available_models() -> List[str]`
Returns the complete list of available models in order of preference.
```python
models = agent.get_available_models()
print(models) # ['gpt-4o', 'gpt-4o-mini', 'gpt-3.5-turbo']
```
#### `get_current_model() -> str`
Returns the currently active model name.
```python
current = agent.get_current_model()
print(current) # 'gpt-4o'
```
#### `is_fallback_available() -> bool`
Checks if fallback models are configured.
```python
has_fallback = agent.is_fallback_available()
print(has_fallback) # True
```
#### `switch_to_next_model() -> bool`
Manually switch to the next available model. Returns `True` if successful, `False` if no more models are available.
```python
success = agent.switch_to_next_model()
if success:
    print(f"Switched to: {agent.get_current_model()}")
else:
    print("No more models available")
```
#### `reset_model_index()`
Reset to the primary model.
```python
agent.reset_model_index()
print(agent.get_current_model()) # Back to primary model
```
## Examples
### Basic Usage
```python
from swarms import Agent

# Create agent with fallback models
agent = Agent(
    model_name="gpt-4o",
    fallback_models=["gpt-4o-mini", "gpt-3.5-turbo"],
    max_loops=1
)

# Run a task - will automatically use fallbacks if needed
response = agent.run("Write a short story about AI")
print(response)
```
### Monitoring Model Usage
```python
from swarms import Agent

agent = Agent(
    model_name="gpt-4o",
    fallback_models=["gpt-4o-mini", "gpt-3.5-turbo"],
    max_loops=1
)

print(f"Available models: {agent.get_available_models()}")
print(f"Current model: {agent.get_current_model()}")

# Run task
response = agent.run("Analyze this data")

# Check if a fallback was used
if agent.get_current_model() != "gpt-4o":
    print(f"Used fallback model: {agent.get_current_model()}")
```
### Manual Model Switching
```python
from swarms import Agent

agent = Agent(
    model_name="gpt-4o",
    fallback_models=["gpt-4o-mini", "gpt-3.5-turbo"],
    max_loops=1
)

# Manually switch models
print(f"Starting with: {agent.get_current_model()}")
agent.switch_to_next_model()
print(f"Switched to: {agent.get_current_model()}")
agent.switch_to_next_model()
print(f"Switched to: {agent.get_current_model()}")

# Reset to primary
agent.reset_model_index()
print(f"Reset to: {agent.get_current_model()}")
```
## Error Handling
The fallback system handles various types of errors:
- **API Errors**: Rate limits, authentication issues
- **Model Errors**: Model-specific failures
- **Network Errors**: Connection timeouts, network issues
- **Configuration Errors**: Invalid model names, unsupported features
### Error Logging
The system provides detailed logging when fallbacks are used:
```
WARNING: Agent 'my-agent' switching to fallback model: gpt-4o-mini (attempt 2/3)
INFO: Retrying with fallback model 'gpt-4o-mini' for agent 'my-agent'
```
## Best Practices
### 1. Model Selection
- Choose fallback models that are compatible with your use case
- Consider cost differences between models
- Ensure fallback models support the same features (e.g., function calling, vision)
### 2. Order Matters
- Place most reliable models first
- Consider cost-performance trade-offs
- Test fallback models to ensure they work for your tasks
### 3. Monitoring
- Monitor which models are being used
- Track fallback usage patterns
- Set up alerts for excessive fallback usage
### 4. Error Handling
- Implement proper error handling in your application
- Consider graceful degradation when all models fail
- Log fallback usage for analysis
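Graceful degradation, as suggested above, can be sketched as a thin wrapper around the agent call. This is a hypothetical helper (`run_with_degradation` and `FALLBACK_ANSWER` are not part of the Agent API); the failing function stands in for a real `agent.run` call whose models are all exhausted.

```python
# Hedged sketch of graceful degradation when every model fails.
FALLBACK_ANSWER = "Service temporarily unavailable. Please retry later."

def run_with_degradation(run_task, task):
    """Return the task result, or a safe default if every model failed."""
    try:
        return run_task(task)
    except Exception as e:
        # All models exhausted: record the failure for later analysis
        # and degrade gracefully instead of crashing the application.
        print(f"All models failed for task {task!r}: {e}")
        return FALLBACK_ANSWER


# Usage with a call that always fails:
def always_fails(task):
    raise RuntimeError("all fallback models exhausted")

answer = run_with_degradation(always_fails, "Summarize the report")
print(answer)  # Service temporarily unavailable. Please retry later.
```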
## Limitations
1. **Model Compatibility**: Fallback models must be compatible with your use case
2. **Feature Support**: Not all models support the same features (e.g., function calling, vision)
3. **Cost Implications**: Different models have different pricing
4. **Performance**: Fallback models may have different performance characteristics
## Troubleshooting
### Common Issues
1. **All Models Failing**: Check API keys and network connectivity
2. **Feature Incompatibility**: Ensure fallback models support required features
3. **Rate Limiting**: Consider adding delays between model switches
4. **Configuration Errors**: Verify model names are correct
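For item 3 above, one way to add delays between model switches is exponential backoff with jitter. This helper is a hypothetical add-on, not part of the Agent API; you would sleep for the returned duration before retrying with the next model.

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait before trying the next model.

    Grows as base * 2^attempt, capped at `cap`, with random jitter
    (50-100% of the nominal delay) to avoid synchronized retries.
    """
    delay = min(cap, base * (2 ** attempt))
    return delay * random.uniform(0.5, 1.0)

# First few nominal delays grow roughly as 1s, 2s, 4s, ... up to the cap.
```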
### Debug Mode
Enable verbose logging to see detailed fallback information:
```python
agent = Agent(
    model_name="gpt-4o",
    fallback_models=["gpt-4o-mini"],
    verbose=True  # Enable detailed logging
)
```
## Migration Guide
### From No Fallback to Fallback
If you're upgrading from an agent without fallback support:
```python
# Before
agent = Agent(model_name="gpt-4o")

# After
agent = Agent(
    model_name="gpt-4o",
    fallback_models=["gpt-4o-mini", "gpt-3.5-turbo"]
)
```
### Backward Compatibility
The fallback system is fully backward compatible. Existing agents will continue to work without any changes.
## Conclusion
The fallback model system provides a robust way to ensure your Swarms agents remain operational even when individual models fail. By configuring appropriate fallback models, you can improve reliability, handle rate limits, and optimize costs while maintaining the same simple API.

@@ -28,6 +28,7 @@ from litellm.utils import (
     supports_parallel_function_calling,
     supports_vision,
 )
+from litellm.exceptions import BadRequestError, InternalServerError, AuthenticationError
 from loguru import logger
 from pydantic import BaseModel
@@ -202,6 +203,8 @@ class Agent:
         list_of_pdf (str): The list of pdf
         tokenizer (Any): The tokenizer
         long_term_memory (BaseVectorDatabase): The long term memory
+        fallback_model_name (str): The fallback model name to use if primary model fails
+        fallback_models (List[str]): List of model names to try in order. First model is primary, rest are fallbacks
         preset_stopping_token (bool): Enable preset stopping token
         traceback (Any): The traceback
         traceback_handlers (Any): The traceback handlers
@@ -302,6 +305,14 @@ class Agent:
         >>> response = agent.run("Tell me a long story.") # Will stream in real-time
         >>> print(response) # Final complete response
+
+        >>> # Fallback model example
+        >>> agent = Agent(
+        ...     fallback_models=["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"],
+        ...     max_loops=1
+        ... )
+        >>> response = agent.run("Generate a report on the financials.")
+        >>> # Will try gpt-4o first, then gpt-4o-mini, then gpt-3.5-turbo if each fails
         """

     def __init__(
@@ -340,6 +351,7 @@ class Agent:
         tokenizer: Optional[Any] = None,
         long_term_memory: Optional[Union[Callable, Any]] = None,
         fallback_model_name: Optional[str] = None,
+        fallback_models: Optional[List[str]] = None,
         preset_stopping_token: Optional[bool] = False,
         traceback: Optional[Any] = None,
         traceback_handlers: Optional[Any] = None,
@@ -605,6 +617,13 @@ class Agent:
         self.thinking_tokens = thinking_tokens
         self.reasoning_enabled = reasoning_enabled
         self.fallback_model_name = fallback_model_name
+        self.fallback_models = fallback_models or []
+        self.current_model_index = 0
+        self.model_attempts = {}
+
+        # If fallback_models is provided, use the first model as the primary model
+        if self.fallback_models and not self.model_name:
+            self.model_name = self.fallback_models[0]

         # self.init_handling()
         self.setup_config()
@@ -729,6 +748,12 @@ class Agent:
         if self.model_name is None:
             self.model_name = "gpt-4o-mini"

+        # Use current model (which may be a fallback) only if fallbacks are configured
+        if self.fallback_models:
+            current_model = self.get_current_model()
+        else:
+            current_model = self.model_name
+
         # Determine if parallel tool calls should be enabled
         if exists(self.tools) and len(self.tools) >= 2:
             parallel_tool_calls = True
@ -742,7 +767,7 @@ class Agent:
try:
# Base configuration that's always included
common_args = {
"model_name": self.model_name,
"model_name": current_model,
"temperature": self.temperature,
"max_tokens": self.max_tokens,
"system_prompt": self.system_prompt,
@@ -1278,7 +1303,7 @@ class Agent:
                     success = True  # Mark as successful to exit the retry loop

-                except Exception as e:
+                except (BadRequestError, InternalServerError, AuthenticationError, Exception) as e:
                     if self.autosave is True:
                         log_agent_data(self.to_dict())
@ -2551,9 +2576,10 @@ class Agent:
return out
except AgentLLMError as e:
except (AgentLLMError, BadRequestError, InternalServerError, AuthenticationError, Exception) as e:
logger.error(
f"Error calling LLM: {e}. Task: {task}, Args: {args}, Kwargs: {kwargs} Traceback: {traceback.format_exc()}"
f"Error calling LLM with model '{self.get_current_model()}': {e}. "
f"Task: {task}, Args: {args}, Kwargs: {kwargs} Traceback: {traceback.format_exc()}"
)
raise e
@@ -2632,7 +2658,43 @@ class Agent:
             return output

-        except AgentRunError as e:
+        except (AgentRunError, AgentLLMError, BadRequestError, InternalServerError, AuthenticationError, Exception) as e:
+            # Try fallback models if available
+            if self.is_fallback_available() and self.switch_to_next_model():
+                # switch_to_next_model() has already advanced the index,
+                # so get_current_model() now returns the fallback model
+                logger.info(
+                    f"Agent '{self.agent_name}' failed with error: {e}. "
+                    f"Retrying with fallback model '{self.get_current_model()}'"
+                )
+                try:
+                    # Recursive call to run() with the new model
+                    return self.run(
+                        task=task,
+                        img=img,
+                        imgs=imgs,
+                        correct_answer=correct_answer,
+                        streaming_callback=streaming_callback,
+                        *args,
+                        **kwargs,
+                    )
+                except Exception as fallback_error:
+                    logger.error(
+                        f"Fallback model '{self.get_current_model()}' also failed: {fallback_error}"
+                    )
+                    # Continue to the next fallback or raise if no more models
+                    if self.is_fallback_available() and self.switch_to_next_model():
+                        return self.run(
+                            task=task,
+                            img=img,
+                            imgs=imgs,
+                            correct_answer=correct_answer,
+                            streaming_callback=streaming_callback,
+                            *args,
+                            **kwargs,
+                        )
+                    else:
+                        self._handle_run_error(e)
+            else:
+                # No fallback available or all fallbacks exhausted
+                self._handle_run_error(e)
         except KeyboardInterrupt:
@@ -2976,6 +3038,87 @@ class Agent:
             api_key=self.llm_api_key,
         )

+    def get_available_models(self) -> List[str]:
+        """
+        Get the list of available models including primary and fallback models.
+
+        Returns:
+            List[str]: List of model names in order of preference
+        """
+        models = []
+        # If fallback_models is specified, use it as the primary list
+        if self.fallback_models:
+            models = self.fallback_models.copy()
+        else:
+            # Otherwise, build the list from individual parameters
+            if self.model_name:
+                models.append(self.model_name)
+            if self.fallback_model_name and self.fallback_model_name not in models:
+                models.append(self.fallback_model_name)
+        return models
+
+    def get_current_model(self) -> str:
+        """
+        Get the current model being used.
+
+        Returns:
+            str: Current model name
+        """
+        available_models = self.get_available_models()
+        if self.current_model_index < len(available_models):
+            return available_models[self.current_model_index]
+        return available_models[0] if available_models else "gpt-4o-mini"
+
+    def switch_to_next_model(self) -> bool:
+        """
+        Switch to the next available model in the fallback list.
+
+        Returns:
+            bool: True if successfully switched to next model, False if no more models available
+        """
+        available_models = self.get_available_models()
+        if self.current_model_index + 1 < len(available_models):
+            self.current_model_index += 1
+            new_model = available_models[self.current_model_index]
+            logger.warning(
+                f"Agent '{self.agent_name}' switching to fallback model: {new_model} "
+                f"(attempt {self.current_model_index + 1}/{len(available_models)})"
+            )
+            # Update the model name and reinitialize the LLM
+            self.model_name = new_model
+            self.llm = self.llm_handling()
+            return True
+        else:
+            logger.error(
+                f"Agent '{self.agent_name}' has exhausted all available models. "
+                f"Tried {len(available_models)} models: {available_models}"
+            )
+            return False
+
+    def reset_model_index(self) -> None:
+        """Reset the model index to use the primary model."""
+        self.current_model_index = 0
+        available_models = self.get_available_models()
+        if available_models:
+            self.model_name = available_models[0]
+            self.llm = self.llm_handling()
+
+    def is_fallback_available(self) -> bool:
+        """
+        Check if fallback models are available.
+
+        Returns:
+            bool: True if fallback models are configured
+        """
+        available_models = self.get_available_models()
+        return len(available_models) > 1
+
     def execute_tools(self, response: any, loop_count: int):
         # Handle None response gracefully
         if response is None:
