agent judge

pull/958/head^2
Kye Gomez 3 days ago
parent 89454e18e0
commit b474c1b7ad

@@ -24,36 +24,33 @@ Key capabilities:
 ```mermaid
 graph TD
-A[Input Task/Tasks] --> B[AgentJudge]
+A[Input Task] --> B[AgentJudge]
 B --> C{Evaluation Mode}
-C -->|step()| D[Single Evaluation]
-C -->|run()| E[Iterative Evaluation]
-C -->|run_batched()| F[Batch Processing]
+C -->|step()| D[Single Eval]
+C -->|run()| E[Iterative Eval]
+C -->|run_batched()| F[Batch Eval]
 D --> G[Agent Core]
-E --> H[Context Building Loop]
-F --> I[Independent Processing]
-G --> J[LLM Model]
-H --> J
-I --> J
-J --> K[Quality Analysis]
-K --> L[Feedback Generation]
-L --> M[Structured Output]
-subgraph "Evaluation Components"
-N[Strengths Analysis]
-O[Weakness Identification]
-P[Improvement Suggestions]
-Q[Factual Accuracy Check]
+E --> G
+F --> G
+G --> H[LLM Model]
+H --> I[Quality Analysis]
+I --> J[Feedback & Output]
+subgraph "Feedback Details"
+N[Strengths]
+O[Weaknesses]
+P[Improvements]
+Q[Accuracy Check]
 end
-L --> N
-L --> O
-L --> P
-L --> Q
+J --> N
+J --> O
+J --> P
+J --> Q
 ```
 ## Class Reference
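
The flow in the updated diagram, where `step()`, `run()`, and `run_batched()` all feed one shared core that produces four feedback sections, can be sketched roughly as follows. This is a toy illustration only: `ToyJudge`, `Feedback`, `_core`, and the `llm` callable are hypothetical names, not the real `AgentJudge` API, and the way `run()` carries context between loops is an assumption based on the "Context Building Loop" node that this commit removes from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    """Hypothetical container for the four feedback nodes (N, O, P, Q)."""
    strengths: str
    weaknesses: str
    improvements: str
    accuracy: str


class ToyJudge:
    """Illustrative stand-in; NOT the real AgentJudge implementation."""

    def __init__(self, llm: Callable[[str], str], max_loops: int = 2):
        self.llm = llm              # any callable mapping a prompt to text
        self.max_loops = max_loops  # number of iterations used by run()

    def _core(self, task: str, context: str = "") -> Feedback:
        # Shared path: Agent Core -> LLM Model -> Quality Analysis -> Feedback.
        prompt = f"{context}\n{task}".strip()
        return Feedback(
            strengths=self.llm(f"List strengths: {prompt}"),
            weaknesses=self.llm(f"List weaknesses: {prompt}"),
            improvements=self.llm(f"Suggest improvements: {prompt}"),
            accuracy=self.llm(f"Check accuracy: {prompt}"),
        )

    def step(self, task: str) -> Feedback:
        # Single evaluation: one pass through the core.
        return self._core(task)

    def run(self, task: str) -> List[Feedback]:
        # Iterative evaluation (assumed behavior): each loop sees the
        # weaknesses reported by the previous loop as extra context.
        history: List[Feedback] = []
        context = ""
        for _ in range(self.max_loops):
            feedback = self._core(task, context)
            history.append(feedback)
            context = f"Previous weaknesses: {feedback.weaknesses}"
        return history

    def run_batched(self, tasks: List[str]) -> List[Feedback]:
        # Batch evaluation: every task is judged independently.
        return [self._core(task) for task in tasks]


if __name__ == "__main__":
    # A stub model so the sketch runs without any network access.
    judge = ToyJudge(llm=lambda prompt: f"[stub reply to: {prompt}]")
    print(judge.step("The capital of France is Paris."))
```

Routing every mode through one `_core` method mirrors the simplification in this commit, where nodes `E` and `F` now point at `G[Agent Core]` instead of having their own paths to the model.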
