eval

Debug and test your agent prompts and tools

What it does

The eval plugin helps you debug agents during development:

Generate Expected (after_user_input)

Generates a description of what should happen to complete the task (unless the expected outcome is already set, e.g. by the re_act plugin).

Evaluate (on_complete)

After the agent finishes, evaluates whether the task was actually completed.

Quick Start

main.py
from connectonion import Agent
# Alias the plugin so it doesn't shadow Python's built-in eval()
from connectonion.useful_plugins import eval as eval_plugin

def calculate(expression: str) -> str:
    """Calculate a math expression."""
    return str(eval(expression))

agent = Agent("assistant", tools=[calculate], plugins=[eval_plugin])
agent.input("What is 25 * 4?")
output
[Expected: Should calculate 25 * 4 and return 100]
[Tool: calculate("25 * 4")]
Result: 100
/evaluating...
✓ Task complete: Calculated 25 * 4 = 100, which matches the expected result.
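
The plugin also stores what it generates on the agent's session. A minimal sketch of inspecting it after a run, assuming agent.current_session (with the 'expected' and 'evaluation' keys written by the handlers shown under "How it works") is still readable once input() returns:

main.py
# Continuing the Quick Start example above.
result = agent.input("What is 25 * 4?")
print(result)  # the agent's final answer

# Assumption: the session dict remains accessible after the run completes.
print(agent.current_session.get('expected'))    # what the plugin expected to happen
print(agent.current_session.get('evaluation'))  # the completion verdict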

Want to customize? Run co copy eval to get an editable copy.

Combined with re_act

When used with re_act, the eval plugin skips generating an expected outcome (re_act's plan serves as the expected outcome):

main.py
from connectonion import Agent
from connectonion.useful_plugins import re_act, eval

def search(query: str) -> str:
    """Search the web (placeholder tool for this example)."""
    return f"Results for {query}"

agent = Agent("assistant", tools=[search], plugins=[re_act, eval])
agent.input("Search for Python tutorials")

# re_act: plans the task (sets 'expected' in the session)
# Tools execute with reflection
# eval: evaluates completion (uses re_act's plan as the expected outcome)

How it works

1. Generate Expected

eval.py
@after_user_input
def generate_expected(agent):
    # Skip if already set by another plugin (e.g., re_act)
    if agent.current_session.get('expected'):
        return
    user_prompt = agent.current_session.get('user_prompt', '')
    tool_names = agent.tools.names()
    expected = llm_do(
        f"User request: {user_prompt}\nTools: {tool_names}\nWhat should happen?",
        model="co/gemini-2.5-flash"
    )
    agent.current_session['expected'] = expected

2. Evaluate Completion

eval.py
@on_complete
def evaluate_completion(agent):
    user_prompt = agent.current_session.get('user_prompt', '')
    result = agent.current_session.get('result', '')
    expected = agent.current_session.get('expected', '')
    trace = agent.current_session.get('trace', [])

    # Summarize actions taken
    actions = [f"- {t['tool_name']}: {t['result'][:100]}"
               for t in trace if t['type'] == 'tool_execution']

    evaluation = llm_do(
        f"Request: {user_prompt}\nExpected: {expected}\n"
        f"Actions: {actions}\nResult: {result}\n"
        f"Is this complete?",
        model="co/gemini-2.5-flash"
    )
    agent.current_session['evaluation'] = evaluation
    agent.logger.print(f"✓ {evaluation}")

Events Used

Event               Handler                Purpose
after_user_input    generate_expected      Set the expected outcome
on_complete         evaluate_completion    Evaluate whether the task is complete

Use Cases

  • Development: Verify your agent completes tasks correctly
  • Testing: Automated evaluation of agent responses (see the sketch below)
  • Debugging: Identify incomplete or incorrect tool usage
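
For the testing use case, a hypothetical pytest-style check is sketched below. It is not part of the library: it assumes input() returns the final response as a string and that the 'evaluation' entry written by the plugin is still readable on the session afterwards.

test_eval.py
from connectonion import Agent
# Alias the plugin so it doesn't shadow Python's built-in eval()
from connectonion.useful_plugins import eval as eval_plugin

def calculate(expression: str) -> str:
    """Calculate a math expression."""
    return str(eval(expression))

def test_calculation_task_completes():
    agent = Agent("assistant", tools=[calculate], plugins=[eval_plugin])
    result = agent.input("What is 25 * 4?")

    # The final answer should contain the correct value.
    assert "100" in result

    # Assumption: the plugin's verdict is still on the session after the run.
    evaluation = agent.current_session.get('evaluation', '')
    print(evaluation)  # visible with `pytest -s` for manual inspection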

Source

connectonion/useful_plugins/eval.py

eval.py
# The plugin is just a list of event handlers
eval = [generate_expected, evaluate_completion]
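
Because a plugin is just a list of event handlers, you can assemble a custom variant from the same pieces. A minimal sketch (the file and plugin names are illustrative; it assumes the handlers are importable from the module path listed above):

my_plugin.py
# Hypothetical: reuse only the completion evaluator, e.g. when 'expected'
# is already set elsewhere (by your own code or another plugin).
from connectonion.useful_plugins.eval import evaluate_completion

eval_only = [evaluate_completion]

# Used like any other plugin:
# agent = Agent("assistant", tools=[calculate], plugins=[eval_only])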
