V2 Or A New Model? How To Decide When To Add A Version On Civitai
**Category:** workflow | **Tier:** Free | **Estimated reading time:** 5 min
**Excerpt:** You've got an updated checkpoint or LoRA ...
Related Tutorials
Two Hard Rules For Blind Evals: 5-Prompt Floor And Always-Control
You ran a blind eval, picked a winner, almost shipped it — then a verification round flipped the result entirely. The candidate that won two of three prompts placed fourth across five. Three prompts felt like enough data; it was actually noise dressed up as signal. There are two specific design rules that prevent this failure: a hard floor on prompt count, and always including the previous version as a control. Cheap to apply, painful to ignore. Here's what each one buys you and the exact thresholds I use now.
Weighted Scoring — When Your 3/2/1 Tournament Hides The Real Winner
Your blind eval came back with two models tied at the top. 21 points each across 10 prompts under standard 3 / 2 / 1 top-3 ranking. Looks like a coin flip. It probably isn't. The standard scoring scheme treats 'never bombs' and 'wins more often' as equivalent — but for production model selection, those are very different qualities. Here's how to re-score the same data under different weighting schemes to surface the real preference, why ties under standard scoring often resolve cleanly when you reweight, and how to pick a scoring scheme that matches what you'll actually do with the result.
Multi-Round Merge Tournaments: Wide → Narrow → Dial-In
You ran a tournament with five candidate merges. Picked a winner. Shipped it. Two months later you wonder if the loser at slot 3 might have actually been better with slightly different weights — and you have no way to know without redoing everything. The fix is a multi-round tournament structure: wide net first, narrow on the winner's neighborhood, dial in along a single axis. Three rounds, ten or so total candidates, an answer you can defend. Here's how to design each round so the result is interpretable, not just a winner.
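To make the weighted-scoring teaser above concrete, here is a minimal sketch of re-scoring one set of placements under different weighting schemes. Everything in it is an illustrative assumption rather than the tutorial's actual data or method: the model names, the per-prompt placements, the `score` helper, and the two alternative weightings. The placements are made up so that both models land on 21 points across 10 prompts under standard 3 / 2 / 1 scoring.

```python
from typing import Dict, List, Optional

# Hypothetical placements across 10 prompts: 1, 2, 3 = rank in the top 3,
# None = finished outside the top 3 on that prompt.
PLACEMENTS: Dict[str, List[Optional[int]]] = {
    "model_a": [1, 1, 1, 1, 1, 2, 2, 2, None, None],  # wins often, bombs twice
    "model_b": [2, 2, 2, 2, 2, 1, 1, 3, 2, 2],        # rarely wins, never bombs
}

# Scoring schemes: points per rank, plus points for missing the top 3.
# Only "standard" comes from the text; the other two are assumed examples.
SCHEMES = {
    "standard": ({1: 3, 2: 2, 3: 1}, 0),
    "win_heavy": ({1: 5, 2: 2, 3: 1}, 0),       # rewards first places more
    "miss_penalty": ({1: 3, 2: 2, 3: 1}, -2),   # punishes finishing unplaced
}


def score(placements: List[Optional[int]], weights: Dict[int, int], miss: int = 0) -> int:
    """Sum the points for a list of per-prompt placements."""
    return sum(weights.get(rank, miss) for rank in placements)


def rescore_all() -> Dict[str, Dict[str, int]]:
    """Return {scheme_name: {model_name: total_points}} for every scheme."""
    return {
        scheme: {model: score(ranks, weights, miss) for model, ranks in PLACEMENTS.items()}
        for scheme, (weights, miss) in SCHEMES.items()
    }


if __name__ == "__main__":
    for scheme, totals in rescore_all().items():
        print(scheme, totals)
```

With these made-up placements the standard scheme ties, the win-heavy scheme favors `model_a`, and the miss-penalty scheme favors `model_b`. Which weighting is appropriate depends on whether a production pick should reward peaks or punish failures.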