
Styxx v3.5.0 — the Cognitive Instruction Set

2026-04-22

Headline features

styxx.steer + styxx.cogvm: CIS v0 (Cognitive Instruction Set). The first open-source runtime for programmable residual-stream control of any Hugging Face decoder model. Supports multi-concept composition and conditional dispatch on live probe readings.
styxx.hallucination: runtime fabrication detector with 3 modes (verdict, streaming, auto-halt). Uses a new behavioral-label confabulation probe (AUC 0.800 at layer 11).
Multi-vendor probe atlas: refuse probes shipped for Llama-3.2-1B, Llama-3.2-3B, Qwen-2.5-1.5B, and Phi-3.5-mini. First open cross-vendor cognitive direction library.
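To make "conditional dispatch on live probe readings" concrete, here is a minimal sketch in plain Python. It is not the styxx API: the function names, the threshold, and the use of a simple dot-product probe are all assumptions made for illustration. The idea is only that a linear probe is read off a hidden state, and a steering vector is added when the reading crosses a threshold.

```python
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def probe_reading(hidden: Sequence[float], probe: Sequence[float], bias: float = 0.0) -> float:
    """Linear probe score: projection of the hidden state onto the probe direction."""
    return dot(hidden, probe) + bias


def conditional_steer(
    hidden: Sequence[float],
    probe: Sequence[float],
    steer_direction: Sequence[float],
    alpha: float,
    threshold: float,
) -> Vector:
    """Add alpha * steer_direction only when the probe reading exceeds threshold.

    All names and the thresholding rule are hypothetical, not taken from styxx.
    """
    if probe_reading(hidden, probe) > threshold:
        return [h + alpha * d for h, d in zip(hidden, steer_direction)]
    return list(hidden)


# Toy 2-d example: the probe fires on the first coordinate,
# and steering pushes along the second coordinate.
steered = conditional_steer([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], alpha=2.0, threshold=0.5)
untouched = conditional_steer([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], alpha=2.0, threshold=0.5)
print(steered, untouched)
```

In a real decoder this logic would presumably run inside a forward hook on a chosen layer; the list-based version above only shows the control flow.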

Landmark research results

Safety bypass on Llama-3.2-1B

Single-direction, multi-position residual steering drops the refusal rate on unsafe prompts from 97% to 17% at α=3.0 (n=60 held-out). Reproduces Arditi et al. at 1B scale with open data.
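As a rough illustration of what "single-direction, multi-position" steering means, the sketch below shifts every token position of a toy residual stream along one unit direction by α. This is an assumption about the general technique (activation addition), not the styxx implementation; the function names are hypothetical, and the notes do not state the sign convention or which layers are touched.

```python
import math
from typing import List, Sequence


def unit(v: Sequence[float]) -> List[float]:
    """Return v scaled to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def projection(v: Sequence[float], direction: Sequence[float]) -> float:
    """Scalar projection of v onto the unit-normalized direction."""
    return sum(a * b for a, b in zip(v, unit(direction)))


def steer_all_positions(
    residual: Sequence[Sequence[float]],
    direction: Sequence[float],
    alpha: float,
) -> List[List[float]]:
    """Add alpha * unit(direction) to the residual vector at every position.

    residual is a list of per-token vectors (positions x hidden size).
    A negative alpha pushes away from the direction instead.
    """
    d = unit(direction)
    return [[h + alpha * di for h, di in zip(pos, d)] for pos in residual]


# Toy residual stream: 2 positions, hidden size 2.
residual = [[1.0, 2.0], [0.0, -1.0]]
steered = steer_all_positions(residual, direction=[3.0, 4.0], alpha=3.0)
for before, after in zip(residual, steered):
    # Each position's projection onto the direction moves by alpha.
    print(projection(before, [3.0, 4.0]), projection(after, [3.0, 4.0]))
```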

Gradient-free capability amplification

On TruthfulQA MC1 with Llama-3.2-1B, multi-layer patching with a supervised correct-vs-incorrect answer direction raises accuracy from a 32.5% baseline to 39.5% at α=1.0. Validated by a random-direction control (random directions hurt accuracy by 5.3pp at α=0.5; the trained direction lifts it by 6.0pp; gap 11.3pp). Reproduces Representation Engineering at 1B scale with a random-direction control.
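The figures above can be cross-checked with a few lines of arithmetic. The snippet only restates the numbers reported in this section; the helper name `control_gap` is made up for illustration.

```python
def control_gap(trained_delta_pp: float, random_delta_pp: float) -> float:
    """Gap between the trained-direction effect and the random-direction control,
    both expressed as accuracy changes in percentage points."""
    return trained_delta_pp - random_delta_pp


baseline_acc = 32.5   # TruthfulQA MC1 baseline, percent
steered_acc = 39.5    # accuracy at alpha = 1.0, percent

lift_pp = steered_acc - baseline_acc
gap_pp = control_gap(trained_delta_pp=6.0, random_delta_pp=-5.3)

print(f"lift at alpha=1.0: {lift_pp:+.1f}pp")        # prints "lift at alpha=1.0: +7.0pp"
print(f"trained vs random gap: {gap_pp:+.1f}pp")     # prints "trained vs random gap: +11.3pp"
```

The control matters because a random direction of the same magnitude lowers accuracy, so the gain cannot be attributed to perturbation alone.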

Concept geometry

The refuse, sycophant-pressure, and confab-prompt probe directions at shared layer 10 of Llama-1B sit at pairwise angles of 86°–92°, the spacing expected of random high-dimensional vectors. The concepts are modular, that is, close to mutually orthogonal. First empirical measurement of this geometry.
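For readers who want to see why near-90° angles are what random high-dimensional vectors look like, here is a small sketch. It does not load the shipped probes: the three vectors are random Gaussian stand-ins, and the dimension 2048 is used only as an illustrative hidden size.

```python
import math
import random
from itertools import combinations
from typing import List, Sequence


def angle_deg(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle between two vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (nu * nv)))  # clamp against rounding error
    return math.degrees(math.acos(cos))


def random_direction(dim: int, rng: random.Random) -> List[float]:
    """Random Gaussian vector, a stand-in for an unrelated direction."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
dim = 2048  # illustrative hidden size, an assumption for this sketch
stand_ins = {name: random_direction(dim, rng) for name in ("a", "b", "c")}

# Pairwise angles between the three stand-in directions.
for (n1, v1), (n2, v2) in combinations(stand_ins.items(), 2):
    print(f"{n1} vs {n2}: {angle_deg(v1, v2):.1f} degrees")
```

Replacing the stand-ins with real probe weight vectors from the same layer would give the pairwise angles reported above.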

Universal Cognitive Basis v0

Cross-model direction transfer grid:

Transfer	cos	Verdict
Llama-1B → Llama-3B (within family)	+0.464	Strong
Llama-1B → Qwen-1.5B (cross-vendor)	+0.362	Moderate
Llama-1B → Phi-3.5	+0.150	Weak
Qwen-1.5B → Phi-3.5	+0.043	Essentially random

Naive linear UCB holds partially: transfer is strong within a family and weakens as vendor safety training diverges. It is falsified for the hardest pair (Qwen-1.5B → Phi-3.5), which is reported here as a negative result.
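The notes do not say how directions from models with different hidden sizes are compared. One plausible reading of "naive linear", assumed here purely for illustration, is that the source direction is mapped through a linear alignment matrix and then compared to the target direction by cosine similarity. The function names and the alignment matrix below are hypothetical.

```python
import math
from typing import List, Sequence


def matvec(matrix: Sequence[Sequence[float]], vec: Sequence[float]) -> List[float]:
    """Multiply a (target_dim x source_dim) matrix by a source_dim vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def transfer_cosine(
    source_direction: Sequence[float],
    target_direction: Sequence[float],
    alignment: Sequence[Sequence[float]],
) -> float:
    """Map the source direction into the target space, then take the cosine."""
    return cosine(matvec(alignment, source_direction), target_direction)


# Toy example: a 3-d source space projected onto a 2-d target space.
alignment = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0]]
print(transfer_cosine([1.0, 0.0, 5.0], [1.0, 0.0], alignment))
print(transfer_cosine([1.0, 0.0, 5.0], [0.0, 1.0], alignment))
```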

CognitiveBench v0 — first cross-vendor cognitive audit

50-prompt fake-entity fabrication battery, same scoring for every model:

Vendor	Model	Fabrication
Anthropic	claude-haiku-4-5	14%
Meta	Llama-3.2-1B	56%
Meta	Llama-3.2-3B	62%
Alibaba	Qwen-2.5-1.5B	(running)
Microsoft	Phi-3.5-mini	(running)

Scale alone doesn't improve fabrication resistance: Llama-3B fabricates more than Llama-1B. Safety training and architecture, not just parameter count, carry the signal.
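With a fixed 50-prompt battery, each reported percentage corresponds to a whole number of fabricated answers (14% is 7 of 50, 56% is 28 of 50, 62% is 31 of 50). The sketch below shows that scoring step only; how an individual answer is judged to be a fabrication is not described in these notes, and `fabrication_rate` is a hypothetical helper.

```python
from typing import Sequence


def fabrication_rate(verdicts: Sequence[bool]) -> float:
    """Percentage of prompts whose answer was judged a fabrication.

    verdicts holds one boolean per prompt (True = fabricated).
    """
    if not verdicts:
        raise ValueError("need at least one verdict")
    return 100.0 * sum(verdicts) / len(verdicts)


def verdicts_from_count(fabricated: int, total: int = 50) -> list:
    """Build a verdict list with the given number of fabrications."""
    return [True] * fabricated + [False] * (total - fabricated)


# Counts implied by the table for a 50-prompt battery.
for model, count in [("claude-haiku-4-5", 7), ("Llama-3.2-1B", 28), ("Llama-3.2-3B", 31)]:
    print(f"{model}: {fabrication_rate(verdicts_from_count(count)):.0f}%")
```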

Papers in repo

papers/cognitive-instruction-set-v0-filled.md
papers/universal-cognitive-basis-v0.md
papers/capability-amplification-v0.md
docs/cognet-protocol-v0.md

Reproducibility

```shell
bash scripts/reproduce-cis-v0.sh
```

Takes about 25 minutes on an RTX 4070-class GPU. The full run covers probe training for all 4 models, the causal α-sweep, the geometry analysis, and the cogvm demo.

Install

```shell
pip install styxx==3.5.0
# For local-model probes (tier 1):
pip install 'styxx[tier1]==3.5.0'
```
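After installing, one quick way to confirm which version is active is to query the package metadata with the standard library. This sketch assumes only that the distribution is named `styxx`, as in the pip commands above; it does not import or call anything from the package itself.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("styxx")
if found is None:
    print("styxx is not installed in this environment")
else:
    print(f"styxx {found} is installed")
```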

What ships in the wheel

7 trained probes (refuse × 4 models + 3 concepts on Llama-1B)
4 papers + spec
Full CogVM runtime
Hallucination detector API
Production calibration utility

Acknowledgements

Builds on published work from:

Arditi et al. 2024 — "Refusal in Language Models is Mediated by a Single Direction"
Zou et al. 2023 — "Representation Engineering"
Marks & Tegmark 2024 — "The Geometry of Truth"
Turner et al. 2023 — "Activation Addition"

License

MIT (code), CC-BY-4.0 (atlas + papers).

Patents

Extends the Fathom Cognitive Atlas + Cognitive Metrology patent stack (US Provisional 64/020,489, 64/021,113, 64/026,964).
