Cognitive vital signs for the frontier.
Daily cognometric fingerprints of frontier LLMs across the four-instrument single-turn suite. Same calibrated weights, same protocol, every model. The first public dashboard of AI cognition health across vendors.
Frontier model cognometric fingerprints — scored daily.
Each row shows one model evaluated with styxx.attack.score_all on the held-out telescope/prompts.json corpus. Composite is the equal-weighted mean of sycophancy, deception, and overconfidence. Refusal is reported but excluded from the composite, because high refusal isn't dishonesty. Lower is better on every column.
| model | sycoph | decep | overcf | refusal | composite |
|---|---|---|---|---|---|
| loading latest telescope run… | | | | | |
Live data from github.com/fathom-lab/styxx/telescope. Methodology + reproducer: spec v1.0.
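As a minimal sketch of the composite described above (not the styxx implementation), the equal-weighted mean can be written in a few lines. The dict layout and the `composite` helper are assumptions for illustration; the keys reuse the table's column names, and the example numbers are made up.

```python
from statistics import mean

# Columns that enter the composite. Refusal is reported but left out.
# Key names mirror the table header; the dict layout is an assumption.
COMPOSITE_COLUMNS = ("sycoph", "decep", "overcf")


def composite(scores: dict[str, float]) -> float:
    """Equal-weighted mean of sycophancy, deception and overconfidence."""
    return mean(scores[column] for column in COMPOSITE_COLUMNS)


# Made-up illustrative scores, not real telescope output.
example = {"sycoph": 0.30, "decep": 0.10, "overcf": 0.20, "refusal": 0.90}
print(round(composite(example), 3))
```

Changing only the `refusal` value leaves the result unchanged, which is the point of excluding it from the composite.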
One protocol. Every model. Same instrument suite.
Fixed evaluation
The held-out suite has 21 prompts: sycophancy_bait × 6, overconfidence_bait × 5, deception_bait × 5, and neutral_baseline × 5. Every model gets the same prompts and is scored with the same calibrated weights.
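The suite composition above can be expressed as a small sanity check. This is a sketch only: the category names and counts come from the description, but `check_suite` is a hypothetical helper and the actual schema of telescope/prompts.json is not shown here, so the input is assumed to be a plain list of category labels.

```python
from collections import Counter

# Category counts as stated in the suite description.
EXPECTED_COUNTS = {
    "sycophancy_bait": 6,
    "overconfidence_bait": 5,
    "deception_bait": 5,
    "neutral_baseline": 5,
}


def check_suite(categories: list[str]) -> bool:
    """Return True when the labels match the expected per-category counts."""
    return Counter(categories) == Counter(EXPECTED_COUNTS)


# Build one label per prompt and confirm the suite adds up.
labels = [name for name, n in EXPECTED_COUNTS.items() for _ in range(n)]
print(len(labels), check_suite(labels))
```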
Daily updates
GitHub Actions runs the telescope daily at 14:00 UTC and commits the snapshot to telescope/data/; the page picks it up automatically. New models are added on release. Nothing is silently deleted: every run lives forever in data/runs/.
Reproducible
Every score has a committed reproducer: pip install 'styxx>=7.1.0', run python telescope/run.py in the styxx repo, and get the same numbers.
Raw JSON. CC-BY-4.0.
$ curl -s https://raw.githubusercontent.com/fathom-lab/styxx/main/telescope/data/latest.json | jq .
{
  "ts_iso": "2026-05-03T...",
  "spec_version": "telescope-v1",
  "styxx_version": "7.1.0",
  "n_prompts": 21,
  "ranking": [
    {"model": "claude-haiku-4-5", "composite_dishonesty": 0.21, ...},
    {"model": "gpt-5-mini", "composite_dishonesty": 0.31, ...},
    ...
  ],
  "model_records": [/* per-prompt rows for every model */]
}
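For readers who prefer Python to curl, here is a hedged sketch of reading the snapshot and listing models by composite score. It relies only on the fields visible in the sample above (`ranking`, `model`, `composite_dishonesty`); `load_latest` and `rank_models` are hypothetical helper names, and the inline sample reuses the two example rows rather than live data.

```python
import json
from urllib.request import urlopen

LATEST_URL = (
    "https://raw.githubusercontent.com/fathom-lab/styxx/"
    "main/telescope/data/latest.json"
)


def load_latest(url: str = LATEST_URL) -> dict:
    """Download and parse the latest snapshot (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def rank_models(snapshot: dict) -> list[tuple[str, float]]:
    """Return (model, composite_dishonesty) pairs, lowest score first."""
    rows = [(r["model"], r["composite_dishonesty"]) for r in snapshot["ranking"]]
    return sorted(rows, key=lambda row: row[1])


# Offline sample built from the two example rows shown above.
sample = {
    "ranking": [
        {"model": "gpt-5-mini", "composite_dishonesty": 0.31},
        {"model": "claude-haiku-4-5", "composite_dishonesty": 0.21},
    ]
}
for model, score in rank_models(sample):
    print(f"{model:<20} {score:.2f}")
```

To run against live data, pass `load_latest()` to `rank_models` instead of `sample`.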
Score your own model.
Submit your model to the public scoreboard. Same protocol, same weights, same reporting. Open data, open license.
submit a model