DebateWithJudge Module Documentation
The DebateWithJudge module provides a debate architecture with self-refinement through a judge agent. Two agents (Pro and Con) debate a topic while a Judge agent evaluates their arguments and produces a refined synthesis; the process repeats for up to `max_rounds` rounds to progressively refine the answer.
Architecture
graph TD
A[DebateWithJudge System] --> B[Initialize Pro, Con, and Judge Agents]
B --> C[Start with Initial Topic]
C --> D[Round Loop: max_rounds]
D --> E[Pro Agent Presents Argument]
E --> F[Con Agent Presents Counter-Argument]
F --> G[Judge Agent Evaluates Both]
G --> H[Judge Provides Synthesis]
H --> I{More Rounds?}
I -->|Yes| D
I -->|No| J[Format Final Output]
J --> K[Return Result]
Key Concepts
| Concept | Description |
|---|---|
| Debate Architecture | A structured process where two agents present opposing arguments on a topic |
| Pro Agent | The agent arguing in favor of a position |
| Con Agent | The agent arguing against a position |
| Judge Agent | An impartial evaluator that analyzes both arguments and provides synthesis |
| Iterative Refinement | The process repeats for multiple rounds, each round building upon the judge's previous synthesis |
| Progressive Improvement | Each round refines the answer by incorporating feedback and addressing weaknesses |
Class Definition: DebateWithJudge
class DebateWithJudge:
    def __init__(
        self,
        pro_agent: Agent,
        con_agent: Agent,
        judge_agent: Agent,
        max_rounds: int = 3,
        output_type: str = "str-all-except-first",
        verbose: bool = True,
    ):
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `pro_agent` | `Agent` | Required | The agent arguing in favor (Pro position) |
| `con_agent` | `Agent` | Required | The agent arguing against (Con position) |
| `judge_agent` | `Agent` | Required | The judge agent that evaluates arguments and provides synthesis |
| `max_rounds` | `int` | `3` | Maximum number of debate rounds to execute |
| `output_type` | `str` | `"str-all-except-first"` | Format for the output conversation history |
| `verbose` | `bool` | `True` | Whether to enable verbose logging |
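Only the three agents are required; `max_rounds`, `output_type`, and `verbose` fall back to the defaults listed above. A minimal construction, assuming the agents are defined as in the examples below, might look like this:

```python
# Minimal construction: max_rounds, output_type, and verbose keep their defaults
debate_system = DebateWithJudge(
    pro_agent=pro_agent,
    con_agent=con_agent,
    judge_agent=judge_agent,
)
```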
API Reference
Core Methods
run(task: str) -> Union[str, List, dict]
Executes the debate-with-judge refinement process for a single task and returns the refined result.
Signature:
def run(self, task: str) -> Union[str, List, dict]
Parameters:
`task` (`str`): The initial topic or question to debate
Returns:
`Union[str, List, dict]`: The formatted conversation history or final refined answer, depending on `output_type`
Process Flow:
- Task Validation: Validates that the task is a non-empty string
- Agent Initialization: Initializes all three agents with their respective roles and the initial task context
- Multi-Round Execution: For each round (up to `max_rounds`), the following sub-steps run (see the sketch after this list):
    - Pro Argument: Pro agent presents an argument in favor of the current topic
    - Con Counter-Argument: Con agent presents a counter-argument, addressing the Pro's points
    - Judge Evaluation: Judge agent evaluates both arguments, identifies strengths and weaknesses
    - Synthesis Generation: Judge provides a refined synthesis that incorporates the best elements from both sides
    - Topic Refinement: Judge's synthesis becomes the topic for the next round
- Result Formatting: Returns the final result formatted according to `output_type`
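The round loop can be summarized with a short illustrative sketch. This is simplified pseudocode of the flow described above, not the library's implementation; only `Agent.run(task)` is assumed from the swarms API, and the `debate_rounds` helper and prompt strings are hypothetical.

```python
# Illustrative sketch of the multi-round flow (not the library's internal code).
def debate_rounds(pro_agent, con_agent, judge_agent, task, max_rounds=3):
    topic = task
    for _ in range(max_rounds):
        pro_argument = pro_agent.run(f"Argue in favor of: {topic}")
        con_argument = con_agent.run(
            f"Argue against: {topic}\n\nOpposing argument to address:\n{pro_argument}"
        )
        # The judge evaluates both sides and produces a refined synthesis
        synthesis = judge_agent.run(
            "Evaluate the following arguments, identify strengths and weaknesses, "
            "and provide a refined synthesis.\n\n"
            f"Pro:\n{pro_argument}\n\nCon:\n{con_argument}"
        )
        topic = synthesis  # The synthesis becomes the topic for the next round
    return topic
```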
Example:
from swarms import Agent, DebateWithJudge
# Create the Pro agent (arguing in favor)
pro_agent = Agent(
agent_name="Pro-Agent",
system_prompt=(
"You are a skilled debater who argues in favor of positions. "
"You present well-reasoned arguments with evidence, examples, "
"and logical reasoning. You are persuasive and articulate."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the Con agent (arguing against)
con_agent = Agent(
agent_name="Con-Agent",
system_prompt=(
"You are a skilled debater who argues against positions. "
"You present strong counter-arguments with evidence, examples, "
"and logical reasoning. You identify weaknesses in opposing "
"arguments and provide compelling alternatives."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the Judge agent (evaluates and synthesizes)
judge_agent = Agent(
agent_name="Judge-Agent",
system_prompt=(
"You are an impartial judge who evaluates debates. "
"You carefully analyze arguments from both sides, identify "
"strengths and weaknesses, and provide balanced synthesis. "
"You may declare a winner or provide a refined answer that "
"incorporates the best elements from both arguments."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the DebateWithJudge system
debate_system = DebateWithJudge(
pro_agent=pro_agent,
con_agent=con_agent,
judge_agent=judge_agent,
max_rounds=3,
output_type="str-all-except-first",
verbose=True,
)
# Define the debate topic
topic = (
"Should artificial intelligence be regulated by governments? "
"Discuss the balance between innovation and safety."
)
# Run the debate
result = debate_system.run(task=topic)
print(result)
Raises:
`ValueError`: If `task` is None or empty, if any of the required agents is None, or if `max_rounds` is less than 1
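A minimal guard around `run` might look like this; the error handling shown is illustrative only:

```python
try:
    result = debate_system.run(task=topic)
except ValueError as e:
    # Raised for an empty task, missing agents, or max_rounds < 1
    print(f"Debate could not start: {e}")
```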
batched_run(tasks: List[str]) -> List[str]
Executes the debate-with-judge refinement process for multiple tasks sequentially, running each task through the complete multi-round debate.
Signature:
def batched_run(self, tasks: List[str]) -> List[str]
Parameters:
`tasks` (`List[str]`): List of topics or questions to debate
Returns:
List[str]: List of final refined answers, one for each input task
Process Flow:
- Sequential Processing: Processes each task in the input list one by one
- Independent Execution: Each task runs through the complete multi-round debate process independently
- Result Collection: Collects and returns all results in the same order as input tasks
Example:
# Define multiple debate topics
debate_topics = [
"Should remote work become the standard for knowledge workers?",
"Is cryptocurrency a viable alternative to traditional banking?",
"Should social media platforms be held accountable for content moderation?",
"Are electric vehicles the future of transportation?"
]
# Execute batch processing
results = debate_system.batched_run(debate_topics)
# Process results
for topic, result in zip(debate_topics, results):
    print(f"Topic: {topic}")
    print(result)
Performance Considerations:
| Consideration | Description |
|---|---|
| Sequential Processing | Tasks are processed one after another, not in parallel |
| Independent Conversation History | Each task maintains its own conversation history |
| Memory Usage | Scales with the number of tasks and the length of each conversation |
| Total Execution Time | Equals the sum of all individual task execution times |
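If sequential execution is too slow for a large batch, independent topics can be run concurrently on the client side. The sketch below reuses the `debate_topics` list from the example above and uses `concurrent.futures`; `build_debate_system` is a hypothetical helper, and because a single `DebateWithJudge` instance shares one conversation history, constructing a fresh system per topic is the safer pattern. Provider rate limits still apply.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch only: run independent debates concurrently.
# build_debate_system() is a hypothetical helper that constructs a fresh
# DebateWithJudge (with its own agents) so no conversation state is shared.
def run_one(topic: str) -> str:
    system = build_debate_system()
    return system.run(task=topic)

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_one, debate_topics))
```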
Helper Methods
get_conversation_history() -> List[dict]
Get the full conversation history from the debate.
Signature:
def get_conversation_history(self) -> List[dict]
Returns:
List[dict]: List of message dictionaries containing the conversation history
Example:
# Run a debate
result = debate_system.run("Should AI be regulated?")
# Get the full conversation history
history = debate_system.get_conversation_history()
print(history)
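The structure of each message dictionary comes from the underlying `Conversation` object; the snippet below assumes `role` and `content` keys, which is an assumption that should be verified against your swarms version.

```python
# Assumes each message dict exposes "role" and "content" keys (verify for your version)
for message in debate_system.get_conversation_history():
    role = message.get("role", "unknown")
    content = str(message.get("content", ""))
    print(f"[{role}] {content[:120]}")
```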
get_final_answer() -> str
Get the final refined answer from the judge.
Signature:
def get_final_answer(self) -> str
Returns:
str: The content of the final judge synthesis
Example:
# Run a debate
result = debate_system.run("Should AI be regulated?")
# Get just the final answer
final_answer = debate_system.get_final_answer()
print(final_answer)
Properties
| Property | Type | Description |
|---|---|---|
| `pro_agent` | `Agent` | The agent arguing in favor (Pro position) |
| `con_agent` | `Agent` | The agent arguing against (Con position) |
| `judge_agent` | `Agent` | The judge agent that evaluates arguments |
| `max_rounds` | `int` | Maximum number of debate rounds |
| `output_type` | `str` | Format for returned results |
| `verbose` | `bool` | Whether verbose logging is enabled |
| `conversation` | `Conversation` | Conversation history management object |
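These attributes can be read or reassigned between runs (the Usage Patterns section below reassigns `output_type` the same way). For example:

```python
# Inspect the current configuration
print(debate_system.max_rounds)

# Adjust settings before the next run
debate_system.max_rounds = 5
debate_system.output_type = "dict"
```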
Output Types
The `output_type` parameter controls how the conversation history is formatted:

| `output_type` Value | Description |
|---|---|
| `"str-all-except-first"` | Returns a formatted string with all messages except the first (default) |
| `"str"` | Returns all messages as a formatted string |
| `"dict"` | Returns messages as a dictionary |
| `"list"` | Returns messages as a list |
Usage Patterns
Single Topic Debate
For focused debate and refinement on a single complex topic:
# Simple single topic execution
result = debate_system.run("Should universal basic income be implemented?")
# With custom output format
debate_system.output_type = "dict"
result = debate_system.run("Should universal basic income be implemented?")
Batch Processing
For processing multiple related topics sequentially:
# Process multiple policy questions
policy_topics = [
"Should healthcare be universal?",
"Should education be free?",
"Should carbon emissions be taxed?"
]
results = debate_system.batched_run(policy_topics)
Custom Agent Configuration
For specialized debate scenarios with custom agent prompts:
# Create specialized agents for technical debates
technical_pro = Agent(
agent_name="Technical-Pro",
system_prompt="You are a software engineering expert arguing for technical solutions...",
model_name="gpt-4",
max_loops=1,
)
technical_con = Agent(
agent_name="Technical-Con",
system_prompt="You are a software engineering expert arguing against technical solutions...",
model_name="gpt-4",
max_loops=1,
)
technical_judge = Agent(
agent_name="Technical-Judge",
system_prompt="You are a senior software architect evaluating technical arguments...",
model_name="gpt-4",
max_loops=1,
)
technical_debate = DebateWithJudge(
pro_agent=technical_pro,
con_agent=technical_con,
judge_agent=technical_judge,
max_rounds=5, # More rounds for complex technical topics
verbose=True,
)
Usage Examples
Example 1: Policy Debate on AI Regulation
This example demonstrates using DebateWithJudge for a comprehensive policy debate on AI regulation, with multiple rounds of refinement.
from swarms import Agent, DebateWithJudge
# Create the Pro agent (arguing in favor of AI regulation)
pro_agent = Agent(
agent_name="Pro-Regulation-Agent",
system_prompt=(
"You are a policy expert specializing in technology regulation. "
"You argue in favor of government regulation of artificial intelligence. "
"You present well-reasoned arguments focusing on safety, ethics, "
"and public interest. You use evidence, examples, and logical reasoning. "
"You are persuasive and articulate, emphasizing the need for oversight "
"to prevent harm and ensure responsible AI development."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the Con agent (arguing against AI regulation)
con_agent = Agent(
agent_name="Anti-Regulation-Agent",
system_prompt=(
"You are a technology policy expert specializing in innovation. "
"You argue against heavy government regulation of artificial intelligence. "
"You present strong counter-arguments focusing on innovation, economic growth, "
"and the risks of over-regulation. You identify weaknesses in regulatory "
"proposals and provide compelling alternatives such as industry self-regulation "
"and ethical guidelines. You emphasize the importance of maintaining "
"technological competitiveness."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the Judge agent (evaluates and synthesizes)
judge_agent = Agent(
agent_name="Policy-Judge-Agent",
system_prompt=(
"You are an impartial policy analyst and judge who evaluates debates on "
"technology policy. You carefully analyze arguments from both sides, "
"identify strengths and weaknesses, and provide balanced synthesis. "
"You consider multiple perspectives including safety, innovation, economic impact, "
"and ethical considerations. You may declare a winner or provide a refined "
"answer that incorporates the best elements from both arguments, such as "
"balanced regulatory frameworks that protect public interest while fostering innovation."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the DebateWithJudge system
debate_system = DebateWithJudge(
pro_agent=pro_agent,
con_agent=con_agent,
judge_agent=judge_agent,
max_rounds=3,
output_type="str-all-except-first",
verbose=True,
)
# Define the debate topic
topic = (
"Should artificial intelligence be regulated by governments? "
"Discuss the balance between innovation and safety, considering "
"both the potential benefits of regulation (safety, ethics, public trust) "
"and the potential drawbacks (stifling innovation, economic impact, "
"regulatory capture). Provide a nuanced analysis."
)
# Run the debate
result = debate_system.run(task=topic)
print(result)
# Get the final refined answer
final_answer = debate_system.get_final_answer()
print(final_answer)
Example 2: Technical Architecture Debate with Batch Processing
This example demonstrates using batched_run to process multiple technical architecture questions, comparing different approaches to system design.
from swarms import Agent, DebateWithJudge
# Create specialized technical agents
pro_agent = Agent(
agent_name="Microservices-Pro",
system_prompt=(
"You are a software architecture expert advocating for microservices architecture. "
"You present arguments focusing on scalability, independent deployment, "
"technology diversity, and team autonomy. You use real-world examples and "
"case studies to support your position."
),
model_name="gpt-4o-mini",
max_loops=1,
)
con_agent = Agent(
agent_name="Monolith-Pro",
system_prompt=(
"You are a software architecture expert advocating for monolithic architecture. "
"You present counter-arguments focusing on simplicity, reduced complexity, "
"easier debugging, and lower operational overhead. You identify weaknesses "
"in microservices approaches and provide compelling alternatives."
),
model_name="gpt-4o-mini",
max_loops=1,
)
judge_agent = Agent(
agent_name="Architecture-Judge",
system_prompt=(
"You are a senior software architect evaluating architecture debates. "
"You analyze both arguments considering factors like team size, project scale, "
"complexity, operational capabilities, and long-term maintainability. "
"You provide balanced synthesis that considers context-specific trade-offs."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the debate system
architecture_debate = DebateWithJudge(
pro_agent=pro_agent,
con_agent=con_agent,
judge_agent=judge_agent,
max_rounds=2, # Fewer rounds for more focused technical debates
output_type="str-all-except-first",
verbose=True,
)
# Define multiple architecture questions
architecture_questions = [
"Should a startup with 5 developers use microservices or monolithic architecture?",
"Is serverless architecture better than containerized deployments for event-driven systems?",
"Should a financial application use SQL or NoSQL databases for transaction processing?",
"Is event-driven architecture superior to request-response for real-time systems?",
]
# Execute batch processing
results = architecture_debate.batched_run(architecture_questions)
# Display results
for result in results:
    print(result)
Example 3: Business Strategy Debate with Custom Configuration
This example demonstrates a business strategy debate with custom agent configurations, multiple rounds, and accessing conversation history.
from swarms import Agent, DebateWithJudge
# Create business strategy agents with detailed expertise
pro_agent = Agent(
agent_name="Growth-Strategy-Pro",
system_prompt=(
"You are a business strategy consultant specializing in aggressive growth strategies. "
"You argue in favor of rapid expansion, market penetration, and scaling. "
"You present arguments focusing on first-mover advantages, market share capture, "
"network effects, and competitive positioning. You use case studies from "
"successful companies like Amazon, Uber, and Airbnb to support your position."
),
model_name="gpt-4o-mini",
max_loops=1,
)
con_agent = Agent(
agent_name="Sustainable-Growth-Pro",
system_prompt=(
"You are a business strategy consultant specializing in sustainable, profitable growth. "
"You argue against aggressive expansion in favor of measured, sustainable growth. "
"You present counter-arguments focusing on profitability, unit economics, "
"sustainable competitive advantages, and avoiding overextension. You identify "
"weaknesses in 'growth at all costs' approaches and provide compelling alternatives "
"based on companies like Apple, Microsoft, and Berkshire Hathaway."
),
model_name="gpt-4o-mini",
max_loops=1,
)
judge_agent = Agent(
agent_name="Strategy-Judge",
system_prompt=(
"You are a seasoned business strategist and former CEO evaluating growth strategy debates. "
"You carefully analyze arguments from both sides, considering factors like: "
"- Market conditions and competitive landscape\n"
"- Company resources and capabilities\n"
"- Risk tolerance and financial position\n"
"- Long-term sustainability vs. short-term growth\n"
"- Industry-specific dynamics\n\n"
"You provide balanced synthesis that incorporates the best elements from both arguments, "
"considering context-specific factors. You may recommend a hybrid approach when appropriate."
),
model_name="gpt-4o-mini",
max_loops=1,
)
# Create the debate system with extended rounds for complex strategy discussions
strategy_debate = DebateWithJudge(
pro_agent=pro_agent,
con_agent=con_agent,
judge_agent=judge_agent,
max_rounds=4, # More rounds for complex strategic discussions
output_type="dict", # Use dict format for structured analysis
verbose=True,
)
# Define a complex business strategy question
strategy_question = (
"A SaaS startup with $2M ARR, 40% gross margins, and $500K in the bank "
"is considering two paths:\n"
"1. Aggressive growth: Raise $10M, hire 50 people, expand to 5 new markets\n"
"2. Sustainable growth: Focus on profitability, improve unit economics, "
"expand gradually with existing resources\n\n"
"Which strategy should they pursue? Consider market conditions, competitive "
"landscape, and long-term viability."
)
# Run the debate
result = strategy_debate.run(task=strategy_question)
print(result)
# Get the full conversation history for detailed analysis
history = strategy_debate.get_conversation_history()
print(history)
# Get the final refined answer
final_answer = strategy_debate.get_final_answer()
print(final_answer)
Best Practices
Agent Configuration
!!! tip "Agent Configuration Best Practices"

    - Pro Agent: Should be configured with expertise in the topic area and strong argumentation skills
    - Con Agent: Should be configured to identify weaknesses and provide compelling alternatives
    - Judge Agent: Should be configured with broad expertise and impartial evaluation capabilities
    - Use appropriate models for the complexity of the debate topic
    - Consider using more powerful models for the Judge agent
Round Configuration
!!! note "Round Configuration Tips"

    - Use 2-3 rounds for most topics
    - Use 4-5 rounds for complex, multi-faceted topics
    - More rounds allow for deeper refinement but increase execution time
    - Consider the trade-off between refinement quality and cost
Output Format Selection
!!! info "Output Format Guidelines"
- Use "str-all-except-first" for readable summaries (default)
- Use "dict" for structured analysis and programmatic processing
- Use "list" for detailed conversation inspection
- Use "str" for complete conversation history as text
Performance Optimization
!!! warning "Performance Considerations"

    - Batch processing is sequential; consider parallel execution for large batches
    - Each round requires three agent calls (Pro, Con, Judge)
    - Memory usage scales with conversation history length
    - Consider using lighter models for faster execution when appropriate
Troubleshooting
Common Issues
!!! danger "Common Problems"

    **Issue**: Agents not following their roles

    **Solution**: Ensure system prompts clearly define each agent's role and expertise

    ---

    **Issue**: Judge synthesis not improving over rounds

    **Solution**: Increase `max_rounds` or improve the Judge agent's system prompt to emphasize refinement

    ---

    **Issue**: Debate results are too generic

    **Solution**: Use more specific system prompts and provide detailed context in the task

    ---

    **Issue**: Execution time is too long

    **Solution**: Reduce `max_rounds`, use faster models, or process fewer topics per batch
Contributing
!!! success "Contributing"

    Contributions are welcome! Please feel free to submit a Pull Request.
License
!!! info "License" This project is licensed under the MIT License - see the LICENSE file for details.