Multi-Round Merge Tournaments: Wide → Narrow → Dial-In
**Category:** workflow | **Tier:** Insider ($5) | **Estimated reading time:** 8 min
**Excerpt:** You ran a tournament with five candidate mer...
Related Tutorials
Two Hard Rules For Blind Evals: 5-Prompt Floor And Always-Control
You ran a blind eval, picked a winner, almost shipped it — then a verification round flipped the result entirely. The candidate that won two of three prompts placed fourth across five prompts. Three prompts felt like enough data; it was actually noise dressed up as signal. There are two specific design rules that prevent this failure: a hard floor on prompt count, and always including the previous version as a control. Cheap to apply, painful to ignore. Here's what each one buys you and the exact thresholds I use now.
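The flip described above is easy to reproduce in simulation. The sketch below is illustrative only (the candidate names, quality gaps, and uniform noise model are all invented, not from the tutorial): it picks a tournament winner from noisy per-prompt scores and counts how often a 3-prompt eval crowns the wrong candidate compared to a 5-prompt eval.

```python
import random

random.seed(0)

def pick_winner(true_quality, n_prompts):
    """Rank candidates by their mean per-prompt score; each prompt adds
    uniform noise in [-0.5, 0.5] to a candidate's true quality."""
    means = {
        name: sum(q + random.uniform(-0.5, 0.5) for _ in range(n_prompts)) / n_prompts
        for name, q in true_quality.items()
    }
    return max(means, key=means.get)

# Five hypothetical candidates; "m1" is genuinely best by a modest margin.
quality = {"m1": 1.0, "m2": 0.7, "m3": 0.4, "m4": 0.1, "m5": -0.2}

trials = 2000
wrong_3 = sum(pick_winner(quality, 3) != "m1" for _ in range(trials))
wrong_5 = sum(pick_winner(quality, 5) != "m1" for _ in range(trials))
print(f"wrong winner, 3 prompts: {wrong_3 / trials:.1%}")
print(f"wrong winner, 5 prompts: {wrong_5 / trials:.1%}")
```

With these made-up numbers the 3-prompt eval picks a wrong winner noticeably more often than the 5-prompt one; the exact rates depend entirely on the assumed noise and quality gaps.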
Weighted Scoring — When Your 3/2/1 Tournament Hides The Real Winner
Your blind eval came back with two models tied at the top: 21 points each across 10 prompts under standard 3/2/1 top-3 ranking. Looks like a coin flip. It probably isn't. The standard scoring scheme treats "never bombs" and "wins more often" as equivalent — but for production model selection, those are very different qualities. Here's how to re-score the same data under different weighting schemes to surface the real preference, why ties under standard scoring often resolve cleanly when you reweight, and how to pick a scoring scheme that matches what you'll actually do with the result.
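The re-scoring move this excerpt describes fits in a few lines. The placement data below is invented to reproduce the setup (it is not the tutorial's actual data): candidate A wins outright seven times but misses the top 3 on other prompts, candidate B places second almost every time, and both tie at 21 under standard 3/2/1 scoring.

```python
# Per-prompt top-3 placements for two hypothetical candidates.
# 1 = first, 2 = second, 3 = third, 0 = outside the top 3.
placements = {
    "A": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],  # wins often, bombs sometimes
    "B": [1, 2, 2, 2, 2, 2, 2, 2, 2, 2],  # almost always second, never bombs
}

def score(placement_lists, weights):
    """Total points per candidate under a {placement: points} weight map."""
    return {
        name: sum(weights.get(p, 0) for p in places)
        for name, places in placement_lists.items()
    }

standard = {1: 3, 2: 2, 3: 1}     # classic 3/2/1: A and B tie at 21
win_heavy = {1: 5, 2: 2, 3: 1}    # rewards outright wins: A pulls ahead
floor_heavy = {1: 3, 2: 3, 3: 1}  # rewards consistency: B pulls ahead

for label, weights in [("standard", standard), ("win-heavy", win_heavy),
                       ("floor-heavy", floor_heavy)]:
    print(label, score(placements, weights))
```

The point is that the tie is an artifact of the weights, not of the data: the same placements resolve in opposite directions depending on whether first place or a high floor is worth more.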
V2 Or A New Model? How To Decide When To Add A Version On Civitai
You've got an updated checkpoint or LoRA ready to ship. Same family as something you've already published — but it produces meaningfully different output. Do you click "Add Version" on the existing model page, or post it as a new model? It sounds like a small decision but it's actually a strategic one. Here's the rule I use, when I break it, and what each path actually costs.