JJCX Inc. · Tools
Annotation Scoring Calculator
Enter your task complexity and annotator accuracy to see how binary pass/fail scoring compares to per-decision scoring — and what it means for your program.
Your scoring system is the problem, not your annotators.
At 95% per-step accuracy and 8 decisions per job, binary scoring predicts a 66.3% pass rate (0.95^8), well below a 90% target. Annotators doing excellent work are being flagged as failing. This gap is structural: it comes from the scoring rule, not from annotator performance.
To reach 90% under binary scoring with 8 decisions per job, annotators would need 98.7% per-step accuracy.
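The figures above can be reproduced with a short sketch. It assumes that a job passes only if every decision in it is correct, and that decisions are independent with the same per-step accuracy. Both are simplifying assumptions, and the function names are illustrative rather than part of the calculator itself.

```python
def binary_pass_rate(per_step_accuracy: float, decisions_per_job: int) -> float:
    """Expected pass rate when a job passes only if all decisions are correct.

    Assumes independent decisions with identical accuracy (a simplification).
    """
    return per_step_accuracy ** decisions_per_job


def required_step_accuracy(target_pass_rate: float, decisions_per_job: int) -> float:
    """Per-step accuracy needed to hit a target binary pass rate (inverse of the above)."""
    return target_pass_rate ** (1.0 / decisions_per_job)


if __name__ == "__main__":
    # 95% per-step accuracy, 8 decisions per job
    print(f"Binary pass rate: {binary_pass_rate(0.95, 8):.1%}")                 # 66.3%
    # Accuracy needed to reach a 90% pass rate with 8 decisions per job
    print(f"Required per-step accuracy: {required_step_accuracy(0.90, 8):.1%}")  # 98.7%
```

Per-decision scoring under the same assumptions would simply report the 95% per-step accuracy, which is why the two scoring methods diverge more sharply as the number of decisions per job grows.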
Want the full breakdown? Get the binary scoring explainer — including the math, the operational fallout, and how to fix it.
If your annotation program is showing scores like this, the fix usually starts with a diagnostic.
Talk to Justin →