Confidence Scoring & Escalation
Every agent result includes a confidence score, which expresses how likely it is that the output is correct and useful.
How Confidence Is Used
- Above threshold: Results may auto-complete without human review.
- Below threshold: Tasks are routed to human reviewers.
- Borderline cases: Workspaces can define custom behavior (e.g., always review for certain task types).
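
The three cases above can be sketched as a small routing function. This is a minimal illustration under stated assumptions, not the product's implementation: the function name `route_result`, the `borderline_margin` band around the threshold, and the `review_when_borderline` set of task types are all hypothetical, and the score is assumed to lie between 0 and 1.

```python
from __future__ import annotations


def route_result(
    confidence: float,
    threshold: float,
    task_type: str,
    review_when_borderline: frozenset[str] = frozenset(),
    borderline_margin: float = 0.05,  # hypothetical width of the borderline band
) -> str:
    """Return "auto_complete" or "human_review" for one agent result."""
    # Clearly below the threshold: route to human reviewers.
    if confidence < threshold - borderline_margin:
        return "human_review"
    # Clearly above the threshold: the result may auto-complete.
    if confidence > threshold + borderline_margin:
        return "auto_complete"
    # Borderline: apply the workspace's custom rule first (assumed here to be
    # "always review these task types"), then fall back to the plain comparison.
    if task_type in review_when_borderline:
        return "human_review"
    return "auto_complete" if confidence >= threshold else "human_review"


# Hypothetical workspace rule: refunds near the threshold are always reviewed.
always_review = frozenset({"refund"})
print(route_result(0.90, 0.80, "summary", always_review))
print(route_result(0.82, 0.80, "refund", always_review))
```

Keeping the borderline rule as data (a set of task types) rather than code is one possible design; a workspace could equally supply a callback.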
Workspace Thresholds
Each workspace can define a minimum confidence threshold:
- More conservative teams may require higher confidence for auto-approval.
- Teams experimenting with automation may set lower thresholds to increase throughput.
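
As an illustration of how the same score can lead to different outcomes in different workspaces, here is a minimal sketch. The `WorkspaceSettings` class, its field names, the default value of 0.80, and the example workspace names are assumptions made for this example; the sketch also assumes scores range from 0 to 1 and that a score equal to the threshold counts as passing.

```python
from dataclasses import dataclass

DEFAULT_MIN_CONFIDENCE = 0.80  # hypothetical default, not a documented value


@dataclass(frozen=True)
class WorkspaceSettings:
    name: str
    min_confidence: float = DEFAULT_MIN_CONFIDENCE

    def __post_init__(self) -> None:
        # Reject thresholds outside the assumed 0..1 score range.
        if not 0.0 <= self.min_confidence <= 1.0:
            raise ValueError("min_confidence must be between 0 and 1")

    def auto_approves(self, confidence: float) -> bool:
        # Assumption: meeting the minimum exactly is enough to auto-approve.
        return confidence >= self.min_confidence


conservative = WorkspaceSettings("compliance", min_confidence=0.95)
experimental = WorkspaceSettings("growth-lab", min_confidence=0.60)

score = 0.85
print(conservative.auto_approves(score))  # False: goes to human review
print(experimental.auto_approves(score))  # True: may auto-complete
```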
Escalation
Low confidence combined with:
- A critical priority task, or
- A history of similar failures
can trigger escalations such as immediate review or assignment to specific reviewer groups.
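
The escalation conditions above could be expressed roughly as follows. This is a sketch only: the `Task` fields, the `failure_limit` parameter, and the pairing of each trigger with a specific escalation (critical priority with immediate review, repeated failures with a reviewer group) are assumptions chosen for illustration, since the text does not say which trigger leads to which action.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Escalation(Enum):
    IMMEDIATE_REVIEW = "immediate_review"
    ASSIGN_TO_GROUP = "assign_to_group"


@dataclass(frozen=True)
class Task:
    confidence: float
    priority: str          # hypothetical values, e.g. "critical" or "normal"
    similar_failures: int  # count of earlier failures on similar tasks


def choose_escalation(
    task: Task, threshold: float, failure_limit: int = 1
) -> Optional[Escalation]:
    """Return an escalation, or None if ordinary handling applies."""
    # Escalation only applies when confidence is low.
    if task.confidence >= threshold:
        return None
    # Low confidence plus critical priority (assumed mapping).
    if task.priority == "critical":
        return Escalation.IMMEDIATE_REVIEW
    # Low confidence plus a history of similar failures (assumed mapping).
    if task.similar_failures >= failure_limit:
        return Escalation.ASSIGN_TO_GROUP
    # Low confidence alone: regular human review, no escalation.
    return None


print(choose_escalation(Task(0.40, "critical", 0), threshold=0.80))
print(choose_escalation(Task(0.40, "normal", 3), threshold=0.80))
```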
Confidence scoring helps balance risk and speed in a way that fits each team's tolerance.