Failed Case drawer

Name: LenserFight
Author: LenserFight

The drawer opens from the Evaluations section or the Evaluation drawer when a case fails.

Sections

Section	Content
Expected	The assertion's expected value (rendered by type)
Actual	The agent's actual output for this case
Diff	Inline character-level diff for `substring` / `regex` types
Run trace	Link to the originating run (opens Run Detail)
Token cost	Prompt + completion tokens consumed
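
As a rough illustration of the table above, the data behind the drawer might be modeled as follows. This is a sketch under stated assumptions: the `FailedCase` interface, its field names, and the two helpers are hypothetical and are not taken from FailedCaseDrawer.tsx.

```typescript
// Hypothetical shape of the data behind each drawer section.
// Field names are illustrative, not taken from FailedCaseDrawer.tsx.
interface FailedCase {
  expected: string;         // Expected: the assertion's expected value
  actual: string;           // Actual: the agent's output for this case
  assertionType: string;    // assertion type, e.g. "substring" or "regex"
  runId: string;            // Run trace: id of the originating run
  promptTokens: number;     // Token cost, prompt side
  completionTokens: number; // Token cost, completion side
}

// Token cost as described in the table: prompt + completion tokens.
function tokenCost(c: FailedCase): number {
  return c.promptTokens + c.completionTokens;
}

// The table lists the inline diff only for substring / regex assertions.
function showsDiff(c: FailedCase): boolean {
  return c.assertionType === "substring" || c.assertionType === "regex";
}
```

Keeping the two token counts separate and summing them on display is one possible design; the real component may store a single total instead.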

1. Read the diff: is the actual output close, or wildly off?
2. Open the run trace to inspect tool calls and intermediate states.
3. Decide:
   - Update the case (assertion was wrong).
   - Update the instruction lens (prompt regression).
   - Update the model profile (model regression).
   - Update tooling (tool regression).
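
The triage above can be sketched in code. The example below is illustrative only: `editDistance`, `closeness`, `RootCause`, and `whatToUpdate` are hypothetical names, and the drawer itself is not known to compute a similarity score this way. It uses plain Levenshtein distance as one way to put a number on "close, or wildly off", then maps each diagnosis to the thing that should be updated.

```typescript
// Levenshtein edit distance between two strings (standard two-row DP).
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]: 1 means identical, 0 means nothing lines up.
function closeness(expected: string, actual: string): number {
  const longest = Math.max(expected.length, actual.length);
  return longest === 0 ? 1 : 1 - editDistance(expected, actual) / longest;
}

// The four diagnoses listed above (labels are assumptions).
type RootCause =
  | "assertion-wrong"
  | "prompt-regression"
  | "model-regression"
  | "tool-regression";

// What to update for each diagnosis, mirroring the decision list.
const fixTarget: Record<RootCause, string> = {
  "assertion-wrong": "case",
  "prompt-regression": "instruction lens",
  "model-regression": "model profile",
  "tool-regression": "tooling",
};

function whatToUpdate(cause: RootCause): string {
  return fixTarget[cause];
}
```

For example, `editDistance("kitten", "sitting")` is 3, so `closeness` for that pair is 1 - 3/7. A high score does not prove the assertion was wrong; it only suggests where to look first in the run trace.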

Source of truth: FailedCaseDrawer.tsx.

Use the drawer to inspect a failed case's actual output, expected result, score, and diff context.
Use it for diagnosis only; fixes belong in cases, rubrics, prompts, models, or workflows.
Verify the fix by rerunning the suite and comparing against the baseline.
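
A minimal sketch of the baseline comparison, assuming suite results can be reduced to a map from case id to pass/fail. The `SuiteResult` and `Comparison` types and the `compareToBaseline` function are hypothetical; the actual result format is not described here.

```typescript
type Outcome = "pass" | "fail";

// Hypothetical result format: case id -> outcome.
type SuiteResult = Record<string, Outcome>;

interface Comparison {
  fixed: string[];        // failed in the baseline, pass in the rerun
  regressed: string[];    // passed in the baseline, fail in the rerun
  stillFailing: string[]; // fail in both
}

// Compare a rerun against the baseline. Cases that are missing from the
// baseline, and cases that pass in both runs, are not reported.
function compareToBaseline(baseline: SuiteResult, rerun: SuiteResult): Comparison {
  const out: Comparison = { fixed: [], regressed: [], stillFailing: [] };
  for (const id of Object.keys(rerun)) {
    const before = baseline[id];
    const after = rerun[id];
    if (before === "fail" && after === "pass") out.fixed.push(id);
    else if (before === "pass" && after === "fail") out.regressed.push(id);
    else if (before === "fail" && after === "fail") out.stillFailing.push(id);
  }
  return out;
}
```

A fix looks verified when the case in question appears under `fixed` and `regressed` is empty; a non-empty `regressed` list suggests the change broke something else.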