eval
Debug and test your agent prompts and tools
What it does
The eval plugin helps you debug agents during development:
Generate Expected (after_user_input)
Generates a description of what should happen for the task to be complete (unless one was already set by the re_act plugin).
Evaluate (on_complete)
After the agent finishes, evaluates whether the task was truly completed.
Quick Start
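A minimal main.py sketch follows. The import path and the plugins= argument are assumptions based on the plugin living in connectonion/useful_plugins/eval.py; check the source for the exact wiring.

```python
# main.py -- minimal sketch; the import path and `plugins=` argument are assumed, not verified.
from connectonion import Agent
from connectonion.useful_plugins import eval as eval_plugin  # assumed import; aliased to avoid shadowing builtin eval

def search_docs(query: str) -> str:
    """Toy tool so the agent has something concrete to do."""
    return f"Results for: {query}"

agent = Agent(
    "assistant",
    tools=[search_docs],
    plugins=[eval_plugin],  # assumed: plugins are attached at construction time
)

# after_user_input -> the plugin records what a correct completion should look like.
# on_complete      -> the plugin checks whether that actually happened.
agent.input("Find the installation instructions in the docs")
```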
Want to customize? Run `co copy eval` to get an editable copy.
Combined with re_act
When used with re_act, the eval plugin skips generating an expected outcome; re_act's plan serves as the expected outcome instead:
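A sketch of the combined setup, assuming both plugins are importable from connectonion.useful_plugins and that listing re_act first lets its plan stand in as the expected outcome:

```python
# Hypothetical sketch: re_act + eval together. Import path and `plugins=` argument are assumed.
from connectonion import Agent
from connectonion.useful_plugins import re_act, eval as eval_plugin

agent = Agent(
    "assistant",
    plugins=[re_act, eval_plugin],  # assumed: re_act plans first, so eval reuses that plan as "expected"
)

# eval skips its own generate-expected step here; on_complete it still
# checks whether the plan was actually carried out.
agent.input("Summarize the open issues and draft a reply")
```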
How it works
1. Generate Expected
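On after_user_input, the plugin asks the model what a correct completion would look like and stores it for later comparison. The sketch below is illustrative only: generate_expected matches the handler name in the Events Used table and llm_do is ConnectOnion's one-shot LLM helper, but the handler signature and the session fields are assumptions, not the plugin's actual source.

```python
# Illustrative sketch of the after_user_input handler (not the plugin's real source).
from connectonion import llm_do

def generate_expected(agent):
    """after_user_input: decide up front what a correct completion looks like."""
    session = agent.current_session              # assumed attribute holding per-task state
    if session.get("expected"):                  # re_act may already have set this
        return
    session["expected"] = llm_do(
        f"Task: {session['user_prompt']}\n"      # assumed field holding the user's request
        "Describe concretely what should happen for this task to be complete."
    )
```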
2. Evaluate Completion
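On on_complete, the plugin compares what the agent actually did (its tool calls and final answer) against the stored expected outcome and reports whether the task was truly finished. Again, only the handler name and event come from this page; everything else is an illustrative assumption.

```python
# Illustrative sketch of the on_complete handler (not the plugin's real source).
from connectonion import llm_do

def evaluate_completion(agent):
    """on_complete: judge whether what happened matches the expected outcome."""
    session = agent.current_session                      # assumed attribute
    verdict = llm_do(
        f"Expected outcome:\n{session.get('expected', '')}\n\n"
        f"What the agent actually did:\n{session.get('trace', [])}\n\n"  # assumed field: tool calls + answer
        "Was the task truly completed? Answer PASS or FAIL with a one-line reason."
    )
    print(f"[eval] {verdict}")                           # surfaced in the console during development
```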
Events Used
| Event | Handler | Purpose |
|---|---|---|
| after_user_input | generate_expected | Set expected outcome |
| on_complete | evaluate_completion | Evaluate if task complete |
Use Cases
- Development: Verify your agent completes tasks correctly
- Testing: Automated evaluation of agent responses (see the sketch after this list)
- Debugging: Identify incomplete or incorrect tool usage
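For the testing use case, one possible pattern (under the same assumed plugins= wiring as the Quick Start) is to run the agent inside a test and treat the plugin's on_complete report as a development-time signal:

```python
# test_agent.py -- hypothetical sketch; reuses the assumed import path and `plugins=` argument.
from connectonion import Agent
from connectonion.useful_plugins import eval as eval_plugin

def test_agent_completes_task():
    agent = Agent("assistant", plugins=[eval_plugin])
    result = agent.input("Summarize README.md in two sentences")
    # The eval plugin prints its pass/fail verdict on_complete; here we only
    # assert that the agent produced an answer at all.
    assert result
```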
Source
connectonion/useful_plugins/eval.py
