Assertions

understudy provides deterministic assertions against the trace: a record of what actually happened, not what the agent said happened.

The Trace

After running a scene, you get a Trace that records:

  • All turns (user and agent messages)
  • All tool calls with arguments and results
  • The terminal state
  • Timing information

from understudy import run

trace = run(app, scene)

trace.turns           # list of Turn objects
trace.tool_calls      # flat list of all tool invocations
trace.terminal_state  # final resolution state
trace.turn_count      # number of turns
trace.duration        # wall-clock time

Querying Tool Calls

Check if a tool was called:

assert trace.called("lookup_order")
assert not trace.called("issue_refund")

Check with specific arguments:

assert trace.called("lookup_order", order_id="ORD-10027")

Get all calls to a tool:

calls = trace.calls_to("lookup_order")
assert calls[0].arguments["order_id"] == "ORD-10027"
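To make the argument filter more concrete, here is a small stand-in written for illustration only. It is not understudy's implementation. The classes StandInCall and StandInTrace and the tool_name attribute are hypothetical, and the sketch assumes that keyword arguments passed to called() are matched as a subset of the recorded arguments, which the library may or may not do.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StandInCall:
    """Hypothetical record of one tool invocation."""
    tool_name: str
    arguments: dict[str, Any] = field(default_factory=dict)
    result: Any = None


@dataclass
class StandInTrace:
    """Hypothetical stand-in for a trace; not the library's class."""
    tool_calls: list[StandInCall] = field(default_factory=list)

    def calls_to(self, tool_name: str) -> list[StandInCall]:
        # All recorded invocations of one tool, in call order.
        return [c for c in self.tool_calls if c.tool_name == tool_name]

    def called(self, tool_name: str, **expected: Any) -> bool:
        # Assumed semantics: every expected key must be recorded with an
        # equal value; extra recorded arguments are ignored.
        return any(
            all(k in c.arguments and c.arguments[k] == v
                for k, v in expected.items())
            for c in self.calls_to(tool_name)
        )


demo = StandInTrace([
    StandInCall("lookup_order", {"order_id": "ORD-10027", "verbose": True}),
])
assert demo.called("lookup_order")
assert demo.called("lookup_order", order_id="ORD-10027")
assert not demo.called("lookup_order", order_id="ORD-99999")
assert not demo.called("issue_refund")
```

If the real matching rule is stricter (for example, exact equality of the full argument dict), comparing calls_to(...)[i].arguments directly is the unambiguous way to assert on arguments.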

Get the sequence of tool calls:

assert trace.call_sequence() == ["lookup_order", "get_return_policy"]
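Comparing the whole sequence with == fails as soon as the agent makes one extra call. When only the relative order matters, a small helper over the list of names can be less brittle. The sketch below is plain Python, not part of understudy; occurs_in_order and the get_customer tool name are hypothetical, and it assumes call_sequence() returns a list of tool-name strings, as the comparison above suggests.

```python
from collections.abc import Sequence


def occurs_in_order(expected: Sequence[str], actual: Sequence[str]) -> bool:
    """Return True if `expected` appears in `actual` in order, gaps allowed."""
    remaining = iter(actual)
    # `name in remaining` consumes the iterator up to the first match, so
    # each expected name has to be found after the previous one.
    return all(name in remaining for name in expected)


# Stand-in for the list returned by trace.call_sequence().
sequence = ["lookup_order", "get_customer", "get_return_policy"]

assert occurs_in_order(["lookup_order", "get_return_policy"], sequence)
assert not occurs_in_order(["get_return_policy", "lookup_order"], sequence)
```

In a test this would read: assert occurs_in_order(["lookup_order", "get_return_policy"], trace.call_sequence()).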

Terminal State Assertions

Check the final resolution:

assert trace.terminal_state == "return_denied_policy"
# or allow multiple valid outcomes
assert trace.terminal_state in {"return_denied_policy", "escalated_to_human"}

Bulk Check

Use check() to validate all expectations at once:

from understudy import check

results = check(trace, scene.expectations)
assert results.passed, f"Failed:\n{results.summary()}"

The summary shows what passed and failed:

✓ required_tools: lookup_order, get_return_policy
✓ forbidden_tools: create_return (none called)
✗ terminal_state: return_created (expected: return_denied_policy)
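The value of a bulk check is that every expectation is evaluated and reported, rather than stopping at the first failed assert. The following sketch shows one way such a report could be assembled. It is an illustration, not understudy's code: StandInExpectations and stand_in_check are hypothetical names, the fields simply mirror the labels in the summary above, and the wording of the passing and failing details is assumed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StandInExpectations:
    """Hypothetical expectations; field names mirror the summary labels."""
    required_tools: list[str] = field(default_factory=list)
    forbidden_tools: list[str] = field(default_factory=list)
    terminal_state: Optional[str] = None


def stand_in_check(called: list[str], final_state: str,
                   exp: StandInExpectations) -> tuple[bool, str]:
    """Evaluate every expectation and report all of them."""
    results: list[tuple[bool, str]] = []

    if exp.required_tools:
        missing = [t for t in exp.required_tools if t not in called]
        detail = ("missing " + ", ".join(missing) if missing
                  else ", ".join(exp.required_tools))
        results.append((not missing, f"required_tools: {detail}"))

    if exp.forbidden_tools:
        hit = [t for t in exp.forbidden_tools if t in called]
        note = f"(called: {', '.join(hit)})" if hit else "(none called)"
        detail = f"{', '.join(exp.forbidden_tools)} {note}"
        results.append((not hit, f"forbidden_tools: {detail}"))

    if exp.terminal_state is not None:
        ok = final_state == exp.terminal_state
        detail = (final_state if ok
                  else f"{final_state} (expected: {exp.terminal_state})")
        results.append((ok, f"terminal_state: {detail}"))

    summary = "\n".join(f"{'✓' if ok else '✗'} {text}" for ok, text in results)
    return all(ok for ok, _ in results), summary


passed, summary = stand_in_check(
    called=["lookup_order", "get_return_policy"],
    final_state="return_created",
    exp=StandInExpectations(
        required_tools=["lookup_order", "get_return_policy"],
        forbidden_tools=["create_return"],
        terminal_state="return_denied_policy",
    ),
)
assert not passed
print(summary)
```

Collecting all results before asserting means a single test run shows every mismatch, which is usually more informative than a chain of individual assert statements.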

Pytest Integration

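understudy is designed to work with pytest. In the examples below, app is expected to be a pytest fixture that you define yourself (for example in conftest.py) and that returns the application passed to run():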

import pytest
from understudy import Scene, run, check

def test_policy_enforcement(app):
    scene = Scene.from_file("scenes/return_nonreturnable.yaml")
    trace = run(app, scene)

    assert trace.called("get_return_policy")
    assert not trace.called("create_return")
    assert trace.terminal_state == "return_denied_policy"

@pytest.mark.parametrize("scene_file", [
    "scenes/return_nonreturnable_earbuds.yaml",
    "scenes/return_nonreturnable_perishable.yaml",
])
def test_denial_scenarios(app, scene_file):
    scene = Scene.from_file(scene_file)
    trace = run(app, scene)
    results = check(trace, scene.expectations)
    assert results.passed, f"Failed:\n{results.summary()}"
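As the number of scenes grows, the hard-coded list passed to parametrize can be replaced by a directory scan. The helper below is a sketch using only the standard library; scene_files is a hypothetical name, and the scenes directory and file pattern are taken from the example paths above rather than from any understudy convention.

```python
from pathlib import Path


def scene_files(directory: str = "scenes",
                pattern: str = "return_nonreturnable_*.yaml") -> list[str]:
    """Return matching scene paths, sorted so test IDs stay stable across runs."""
    return sorted(p.as_posix() for p in Path(directory).glob(pattern))


# Usage inside a test module (pytest evaluates this at collection time):
#
#   @pytest.mark.parametrize("scene_file", scene_files())
#   def test_denial_scenarios(app, scene_file):
#       ...
```

One caveat: if the pattern matches nothing, the parameter list is empty and no scene is actually exercised, so it may be worth asserting that scene_files() is non-empty in a separate test.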