Model Review Battle Tutorial
Purpose
Learn how a portable PRIVATE_BATTLE.md describes participants, evaluation, reporting, and an optional executable task.
Concepts Covered
Battle, fighter/contender, runner/provider, local execution, judging, result artifacts.
What You Will Build
You will simulate examples/battles/model-review-battle, then optionally execute it with Ollama.
Prerequisites
- Node 22.
- Dependencies installed.
- CLI built with `pnpm nx run cli:build`.
- Optional: Ollama with `llama3.1` for execution.
File Structure
```text
examples/battles/model-review-battle/
  PRIVATE_BATTLE.md
  README.md
```
Step-by-Step Walkthrough
- Open `PRIVATE_BATTLE.md`.
- Inspect `participants`. Each participant has `type`, `ref`, `provider`, and `model`.
- Inspect `metrics` and `rubric_ref`.
- Run simulation first.
- If Ollama is ready, execute with `--execute --judge human`.
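To make the fields from the walkthrough concrete, the sketch below models them as TypeScript types. It is an illustration only: the field names `type`, `ref`, `provider`, `model`, `metrics`, and `rubric_ref` come from this tutorial, but the interface names, the value types, the optionality of `provider` and `model`, and every sample value are assumptions rather than the CLI's actual schema.

```typescript
// Hypothetical shape of the fields inspected in the walkthrough.
// Field names come from the tutorial; all types and values are assumptions.
interface Participant {
  type: string;      // kind of participant (assumed to be a plain string)
  ref: string;       // reference to the participant definition
  provider?: string; // runner/provider, for example "ollama"
  model?: string;    // model name, for example "llama3.1"
}

interface BattleSketch {
  participants: Participant[];
  metrics: string[];
  rubric_ref: string;
}

// Sample data for illustration only; not copied from the real PRIVATE_BATTLE.md.
const exampleBattle: BattleSketch = {
  participants: [
    { type: "fighter", ref: "fighter-a", provider: "ollama", model: "llama3.1" },
    { type: "fighter", ref: "fighter-b", provider: "ollama", model: "llama3.1" },
  ],
  metrics: ["correctness", "clarity"],
  rubric_ref: "rubrics/example.md",
};

// Summarize each participant as "ref: provider/model".
console.log(
  exampleBattle.participants.map((p) => `${p.ref}: ${p.provider}/${p.model}`),
);
```

Comparing the real file against a shape like this can help spot a participant that lacks `provider` or `model` before you attempt execution.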
How to Run the Example
Simulation:
```bash
pnpm nx run cli:build
node dist/apps/cli/main.js battle run examples/battles/model-review-battle/PRIVATE_BATTLE.md
```
Optional execution:
```bash
node dist/apps/cli/main.js battle run examples/battles/model-review-battle/PRIVATE_BATTLE.md --execute --judge human
```
Expected Output
Simulation reports two participants and writes local report artifacts. Execution streams both contender outputs and writes:
```text
model-review-battle.result.md
model-review-battle.result.json
```
How the Example Works Internally
Simulation validates the private battle and reports readiness. Execution creates a local battle state, maps the first two executable participants to contender slots A and B, streams provider output, and writes result artifacts.
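The slot mapping described above can be sketched as follows. This is not the CLI's implementation: the function names `isExecutable` and `mapContenders`, the `Participant` type, and the rule that "executable" means both `provider` and `model` are set are assumptions inferred from this tutorial.

```typescript
// Sketch of the contender slot mapping; all names and types are assumptions.
interface Participant {
  type: string;
  ref: string;
  provider?: string;
  model?: string;
}

type Slot = "A" | "B";

// Assumed rule: a participant is executable when provider and model are both set.
function isExecutable(p: Participant): boolean {
  return Boolean(p.provider) && Boolean(p.model);
}

// Pick the first two executable participants, in file order, as slots A and B.
function mapContenders(participants: Participant[]): Record<Slot, Participant> {
  const [a, b] = participants.filter(isExecutable);
  if (a === undefined || b === undefined) {
    throw new Error("Execution requires at least 2 participants");
  }
  return { A: a, B: b };
}

const slots = mapContenders([
  { type: "fighter", ref: "notes-only" }, // skipped: no provider or model
  { type: "fighter", ref: "first", provider: "ollama", model: "llama3.1" },
  { type: "fighter", ref: "second", provider: "ollama", model: "llama3.1" },
]);
console.log(slots.A.ref, slots.B.ref); // prints "first second"
```

Under these assumptions, non-executable participants are skipped rather than rejected, so a battle can list extra participants as long as two of them can run.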
Common Errors and Troubleshooting
- `Execution requires at least 2 participants`: both selected participants need `provider` and `model`.
- Provider errors: start Ollama and pull the configured models.
- Do not commit result files if prompts or outputs are sensitive.
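When the participant-count error appears, it can help to see which participant is missing which field. The helper below is a hypothetical pre-flight check and not part of the CLI; the name `findMissingFields`, the `Participant` type, and the sample data are all assumptions made for illustration.

```typescript
// Hypothetical diagnostic helper; not part of the CLI.
interface Participant {
  type: string;
  ref: string;
  provider?: string;
  model?: string;
}

// Return one message per participant that lacks provider or model.
function findMissingFields(participants: Participant[]): string[] {
  const problems: string[] = [];
  for (const p of participants) {
    const missing = (["provider", "model"] as const).filter((key) => !p[key]);
    if (missing.length > 0) {
      problems.push(`${p.ref}: missing ${missing.join(", ")}`);
    }
  }
  return problems;
}

// Sample input: the second participant has no model.
const report = findMissingFields([
  { type: "fighter", ref: "first", provider: "ollama", model: "llama3.1" },
  { type: "fighter", ref: "second", provider: "ollama" },
]);
console.log(report); // one entry, for the participant "second"
```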
Suggested Modifications
- Change participant models.
- Add `human_judge_required: true`.
- Replace the task body with a prompt from your own project.