Quickstart ========== This guide walks you through creating your first understudy test. Creating a Scene ---------------- A Scene defines a test scenario with: - The starting prompt - A conversation plan for the simulated user - World state (context) - Expectations for what should/shouldn't happen .. code-block:: python from understudy import Scene, Persona, Expectations scene = Scene( id="return_nonreturnable_item", starting_prompt="I want to return something I bought.", conversation_plan=""" Goal: Return earbuds from order ORD-10027. - If asked for order ID: provide ORD-10027 Return reason: defective. If denied: accept escalation. """, persona=Persona.FRUSTRATED_BUT_COOPERATIVE, max_turns=20, context={ "orders": { "ORD-10027": { "items": [{"name": "Earbuds", "category": "personal_audio"}], "status": "delivered", } }, "policy": { "non_returnable_categories": ["personal_audio"], }, }, expectations=Expectations( required_tools=["lookup_order", "get_return_policy"], forbidden_tools=["create_return"], allowed_terminal_states=["return_denied_policy", "escalated_to_human"], ), ) Or load from YAML: .. code-block:: python scene = Scene.from_file("scenes/return_nonreturnable_item.yaml") Running a Rehearsal ------------------- .. code-block:: python from understudy import run from understudy.adk import ADKApp app = ADKApp(agent=your_agent) trace = run(app, scene) Making Assertions ----------------- Assert against the trace (what happened), not the prose: .. code-block:: python def test_policy_enforcement(): trace = run(app, scene) assert trace.called("lookup_order") assert trace.called("get_return_policy") assert not trace.called("create_return") assert trace.terminal_state in {"return_denied_policy", "escalated_to_human"} Using LLM Judges ---------------- For subjective qualities like tone and clarity: .. code-block:: python from understudy import Judge, TONE_EMPATHY judge = Judge(rubric=TONE_EMPATHY, samples=5) result = judge.evaluate(trace) assert result.score == 1 assert result.agreement_rate >= 0.6 Running a Suite --------------- Run all scenes in a directory: .. code-block:: python from understudy import Suite suite = Suite.from_directory("scenes/") results = suite.run(app, parallel=4) results.to_junit_xml("test-results/understudy.xml") assert results.all_passed