~2% hallucination rate. Not 18%.
A 3-stage pipeline that extracts claims, verifies them against source turns, and sanitizes any that can't be confirmed — all within the same LLM call.
The 3-stage pipeline
Every LLM response passes through all three stages before the user sees it.
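As a rough illustration of the extract, verify, sanitize flow, here is a minimal Python sketch. It is not the product's implementation: the real stages run inside the LLM call, while this toy version uses regular expressions and substring matching outside it. The names Claim, PATTERNS, extract_claims, verify_claim, and sanitize are hypothetical, and the conversation is invented for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Claim:
    field: str  # what the claim is about, e.g. "blood pressure"
    value: str  # the asserted value, e.g. "120/80"


# Hypothetical patterns for this sketch only; a real extractor would be
# model-driven rather than regex-based.
PATTERNS = {
    "blood pressure": re.compile(r"\b\d{2,3}/\d{2,3}\b"),
    "temperature": re.compile(r"\b\d{2}(?:\.\d)?\s?°C"),
}


def extract_claims(response: str) -> list[Claim]:
    """Stage 1: pull out the factual values the response asserts."""
    claims = []
    for field, pattern in PATTERNS.items():
        for match in pattern.findall(response):
            claims.append(Claim(field, match))
    return claims


def verify_claim(claim: Claim, source_turns: list[str]) -> bool:
    """Stage 2: confirm a claim only if its value appears in a source turn."""
    return any(claim.value in turn for turn in source_turns)


def sanitize(claims: list[Claim], source_turns: list[str]):
    """Stage 3: keep confirmed claims; turn unconfirmed ones into questions."""
    confirmed, questions = [], []
    for claim in claims:
        if verify_claim(claim, source_turns):
            confirmed.append(claim)
        else:
            questions.append(f"Could you share your {claim.field}?")
    return confirmed, questions


# Invented conversation: the user mentioned a temperature but no blood pressure.
turns = [
    "I've had a headache since yesterday.",
    "My temperature was 37.5°C this morning.",
]
raw = "Your blood pressure appears to be 120/80 and you have a mild fever of 37.5°C."

confirmed, questions = sanitize(extract_claims(raw), turns)
print(confirmed)
print(questions)
```

In this sketch the temperature survives because a user turn contains it, while the blood pressure value has no supporting turn and is replaced by a question.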
Real example: medical intake
The same conversation — once with raw LLM output, once with the firewall active.
Raw LLM output: Based on your conversation, your blood pressure appears to be 120/80 and you seem to have a mild fever of 37.5°C.
With the firewall active: Based on what you've shared, I'd like to confirm a few things. Could you share your current blood pressure reading? And have you measured your temperature recently?
Common questions
Things engineers ask before deploying to production.
The firewall adds no extra round trip. The 3 stages run inside the same LLM call using structured output constraints, so there is zero added round-trip latency.
The figures come from a blind automated evaluation across 1,000 real production medical intakes using a separate verification model. The test harness is open source.
Non-English conversations are supported. ClaimExtractor and ClaimVerifier both operate on the semantic content, not raw English strings, so all 22 supported languages are covered.
The strictness is configurable. Set min_confidence: 0.85 in your YAML to accept only claims the verifier is at least 85% confident about. Lower thresholds let more claims through.
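The effect of min_confidence can be pictured as a simple filter over verifier scores. The Python sketch below is an assumption about the behavior, not the product's code: ScoredClaim and filter_claims are hypothetical names, the config dict stands in for the parsed YAML, and the confidence values are made up.

```python
from dataclasses import dataclass


@dataclass
class ScoredClaim:
    text: str
    confidence: float  # verifier confidence in [0, 1]; hypothetical field


def filter_claims(claims: list[ScoredClaim], min_confidence: float):
    """Split claims into those at or above the threshold and those below it."""
    accepted = [c for c in claims if c.confidence >= min_confidence]
    rejected = [c for c in claims if c.confidence < min_confidence]
    return accepted, rejected


# Stands in for the parsed YAML setting `min_confidence: 0.85`.
config = {"min_confidence": 0.85}

# Made-up scores for illustration only.
claims = [
    ScoredClaim("patient reports a headache", 0.97),
    ScoredClaim("temperature is 37.5°C", 0.88),
    ScoredClaim("blood pressure is 120/80", 0.40),
]

accepted, rejected = filter_claims(claims, config["min_confidence"])
print([c.text for c in accepted])
print([c.text for c in rejected])

# A lower threshold lets more claims through.
accepted_loose, _ = filter_claims(claims, 0.30)
print(len(accepted_loose))
```

The comparison uses >= so that a claim scored at exactly 0.85 is accepted, matching the "at least 85%" reading of the setting.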
See it run live
The playground runs the full pipeline, including the hallucination firewall, on every bot turn.