Cognometry v0: 8-Benchmark Cross-Validated Hallucination Detection in Production LLMs
Description
We define cognometry as the empirical quantification of cognitive states in machine systems—refusal, confabulation, retrieval, reasoning, and adversarial drift—from signals already carried on the token stream and residual activations of a language model during inference. We publish three falsifiable laws of cognometry (vitals exist, vitals transfer across substrates, vitals are causally actionable) with cross-validated numerical support for each, and ship the first open-source instrument (styxx on PyPI) that realizes the measurement.
The central empirical claim of this paper is narrower: a logistic regression over 9 signals, covering text, entity, novelty, grounding, and NLI contradiction features, achieves cross-validated hallucination discrimination across 8 public benchmarks: HaluEval-QA, HaluEval-Dialog, HaluEval-Summarization, TruthfulQA, and four HaluBench subsets (DROP, PubMedQA, FinanceBench, RAGTruth). Per-dataset performance is reported as measured and ranges from near-perfect (AUC 0.998 on HaluEval-QA) to below chance (AUC 0.424 on DROP).
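For readers unfamiliar with the evaluation protocol, the sketch below shows what "cross-validated AUC of a 9-signal logistic regression" means in practice. It is a minimal standard-library illustration on synthetic data, not the styxx implementation: the feature values, the helper names (`fit_logreg`, `cross_validated_auc`, `make_synthetic`), the 5-fold split, and the choice of three informative signals are all assumptions made for the example. An AUC of 0.5 corresponds to chance, and a value below 0.5 (as reported for DROP) means the score ranks hallucinated outputs below faithful ones more often than not.

```python
import math
import random


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, X):
    """Return the predicted probability of the positive class for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def fit_logreg(X, y, lr=0.5, epochs=100):
    """Plain batch gradient descent on the logistic loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [p - yi for p, yi in zip(predict(w, b, X), y)]
        grad_w = [sum(e * xi[j] for e, xi in zip(errs, X)) / n for j in range(d)]
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * sum(errs) / n
    return w, b


def cross_validated_auc(X, y, k=5, seed=0):
    """Shuffle, split into k folds, fit on k-1 folds, and score the held-out fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = fit_logreg([X[i] for i in train], [y[i] for i in train])
        scores = predict(w, b, [X[i] for i in held_out])
        aucs.append(auc(scores, [y[i] for i in held_out]))
    return aucs


def make_synthetic(n=400, n_signals=9, seed=0):
    """Synthetic stand-in for the 9 signals: the first 3 carry label information, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = hallucinated, 0 = faithful (assumed convention)
        X.append([rng.gauss(float(label) if j < 3 else 0.0, 1.0) for j in range(n_signals)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    fold_aucs = cross_validated_auc(X, y)
    print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
    print("mean AUC:", round(sum(fold_aucs) / len(fold_aucs), 3))
```

The point of holding out each fold is that every AUC is computed on examples the model did not see during fitting, which is what makes a per-benchmark figure comparable across datasets.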
We openly report and taxonomize the failure modes: reading-comprehension extractive-span errors and financial arithmetic errors are not detected by the present signal stack because both classes of error pass the entailment (NLI) and novelty bars by construction. Failure modes are declared in the weights module itself.
This is the first 8-benchmark cross-validated hallucination detector in the open literature. Above-chance performance on 5/8 benchmarks with 3/8 near-perfect is the reproducible empirical floor we lay down. Two below-chance results are the reproducible research agenda we lay down.
Manifesto:
https://fathom.darkflobi.com/cognometry
Software:
github.com/fathom-lab/styxx
pip install "styxx[nli]==4.0.1"
Leaderboard:
fathom.darkflobi.com/cognometry/leaderboard