Operationalization of the Complex Meaning Space framework, kept structurally separate from the research line per ADR.
Blocks 1 + 2 + 3 + 4 + 5 + 6 delivered. Per the ADR sequencing decision:
Block 1 — L1 observation layer
- packaging, layout, dependencies
L1Observationcanonical recordObservationServiceingestion path with optional explicitturn_index- SQLite storage with
ObservationStore - migration adapter from research-line
CMSPoint - equivalence tests proving bit-exact match against the research path
Block 2 — L2 episode layer
L2Episodedataclass with invariants- pluggable closure policies (
WindowedClosurePolicy,SurpriseClosurePolicy,CompositeClosurePolicy) - injectable surprise scorer (
EuclideanSurpriseScorer) EpisodeServicewith per-session open-episode tracking, flush semanticsEpisodeStorewith CRUD + time-range queries- schema v2 with idempotent migration
CMSEngineorchestrating L1 → L2
Block 3 — L3A memory evidence layer
MemoryEvidencedataclass with mandatory provenance- 8 filing rules (5 observation-level, 3 episode-level) with dead zones + mutual exclusion
EvidenceServicewith idempotency fast path + schema-level UNIQUE backstop- soft-canonical scope policing
- bounded, deterministic
support_score+ defaultrelevance_score= 1.0 EvidenceStorewith CRUD + provenance queries- schema v3 with UNIQUE (user_id, source_kind, source_id, rule_id)
- contradiction fields persisted but inert until Block 5
Block 4 — Retrieval and canonical state assembly
RuntimeStateView— single canonical, consumer-neutral view of persisted stateRetrievalPolicy— tunable limits (default: 5 obs, 3 eps, 5 evidence)RetrievalService— composes store queries with the locked deterministic orderingStateAssembler— buildsRuntimeStateViewwith signals/counts/freshness flags- store helpers:
latest_for_session, scopedsearchfor evidence - evidence ranking: pinned > scope-exact > subscope-exact > newer > stronger > id (deterministic tie-break)
- freshness flags reflect operational recency, NOT decay
- embedded records (not just ids) so consumers don't make second store calls
Block 5 — L3B profile beliefs
ProfileBeliefdataclass with strict per-dimension value semantics (DIMENSION_SPECSregistry)- 3 belief dimensions, scope-pure mapping:
epistemic_style← epistemic scope (signed [-1, +1])social_orientation← social scope (signed [-1, +1])pragmatic_style← pragmatic scope (magnitude [0, +1])
BeliefServicewith strict evidence-only input (never reads observations or episodes directly)BeliefStorewith split insert/update routing — UNIQUE (user_id, dimension) actually firesBeliefThresholdsconfigurable policy: tentative → active → stale → invalidated- staleness via discrete status transitions only (vocabulary: "staleness", never "decay")
- contradiction population:
counterevidence_idsnow populated; classification uses supporting-ledger direction (not value sign) so contradictions stay classified as contradictions even when value drifts to zero - contradiction can invalidate via two paths: count > supports, OR burst threshold within window
- engine wiring: optional
belief_serviceconstructor arg, push-based viaprocess_new_evidence recompute_for_userescape hatch for batch repair / threshold changessweep_stalenessfor periodic active → stale transitionsRuntimeStateViewextended withactive_beliefsandtentative_beliefs(stale and invalidated NOT surfaced as truth)StateAssemblerreads beliefs but never mutates them (guardrail B enforced by tests)- schema v4 with profile_beliefs table, UNIQUE (user_id, dimension)
dynamicsscope evidence is filed but feeds no belief in Block 5 — honest non-mapping
Block 6 — Scoped beliefs, cross-scope dimensions, supersession, events, explanations
context_keyper turn — caller-supplied lane string, propagates fromengine.process_turn(context_key=...)through evidence to beliefs;Nonemeans global, non-Nonemeans scoped- guardrail A:
context_keydoes NOT enter the evidence idempotency key — replays of the same source object don't duplicate evidence even if context handling changed - guardrail B: global and scoped beliefs coexist with no implicit reconciliation;
RuntimeStateViewexposes them in separate buckets interaction_stability— fourth belief dimension, fed bydynamicsscope (rupture → -1, sustained_regime → +1); the locked four-case test matrix covers scope-pure × global/scoped and cross-scope × global/scoped- filing-time supersession in
EvidenceService— when filing a new record, prior records older than 30 days (configurable) in the same(user_id, rule_id, context_key)lane are recorded in the new record'ssupersedeslist; old records remain in store as audit history - supersession-aware recompute — superseded support stays in
supporting_memory_idsfor full provenance but is excluded from value, confidence, stability, and threshold counting;metadata["superseded_support_count"]exposes the audit count - belief transition events — six event types (
belief_tentative_created,belief_activated,belief_staled,belief_invalidated,belief_recomputed,belief_scoped_created) emitted via callableevent_handler; shipsNullEventHandler(default) andLoggingEventHandler; events fire only on real status transitions, never on numeric tweaks BeliefExplanationdataclass +BeliefService.explain(belief_id, top_n=5)for on-demand structured explanations; ranked deterministically by (support_score DESC, created_at DESC, memory_id DESC); never carried onRuntimeStateViewRuntimeStateViewsplit:active_beliefs_global,active_beliefs_scoped,tentative_beliefs_global,tentative_beliefs_scoped; legacyactive_beliefs/tentative_beliefsproperties return the global lists for back-compatall_active_beliefsconvenience property combines both; consumers that use it accept reconciliation responsibility- schema v5 —
context_keycolumns onmemory_evidenceandprofile_beliefs, composite UNIQUE index(user_id, dimension, COALESCE(context_key, ''))so global and scoped beliefs are distinct lanes SQLiteBackend.bootstrap_schemaauto-applies v5 migration steps; all existing call sites work unchanged
cms_runtime/
├── pyproject.toml
├── requirements/
│ ├── base.txt
│ ├── research.txt
│ └── dev.txt
├── src/
│ └── cms/
│ ├── l1/
│ │ ├── observation.py L1Observation dataclass
│ │ ├── adapter.py LegacyExtractorAdapter
│ │ └── service.py ObservationService
│ ├── l2/
│ │ ├── episode.py L2Episode dataclass
│ │ ├── policies.py Closure policies + surprise scorer
│ │ └── service.py EpisodeService
│ ├── l3/
│ │ ├── evidence.py MemoryEvidence dataclass + CANONICAL_SCOPES
│ │ ├── rules.py 8 filing rules (pluggable)
│ │ ├── service.py EvidenceService (idempotent, scope-policed)
│ │ ├── belief.py ProfileBelief + DIMENSION_SPECS registry
│ │ ├── belief_policy.py BeliefThresholds + is_belief_stale
│ │ └── belief_service.py BeliefService (evidence-only, push-based)
│ ├── runtime/
│ │ ├── engine.py CMSEngine + TurnResult
│ │ ├── retrieval.py RetrievalService + ordering policy
│ │ ├── assembler.py StateAssembler
│ │ └── state.py RuntimeStateView + RetrievalPolicy
│ ├── storage/
│ │ ├── base.py StorageBackend protocol
│ │ ├── sqlite.py SQLiteBackend
│ │ ├── schema.py DDL (v1 + v2 + v3 + v4, idempotent)
│ │ ├── observation_store.py
│ │ ├── episode_store.py
│ │ ├── evidence_store.py
│ │ └── belief_store.py
│ └── migration/
│ └── from_cms_point.py Legacy CMSPoint → L1Observation
├── scripts/
│ └── bootstrap_db.py
└── tests/
├── unit/ (9 files)
└── integration/ (3 files)
- No long-term belief without explicit supporting evidence references — constraint now actively enforced. Every future belief (Block 5) will reference
memory_ids from L3A. - Evidence is not belief — summaries are rule-owned, deterministic, non-interpretive. No identity claims. Tests enforce this.
- Consumer-neutral core — no LLM prompt format, no agent routing, no dashboard rendering baked into the runtime.
- Provenance is mandatory — every evidence record carries
(source_kind, source_id, rule_id). Idempotency key =(user_id, source_kind, source_id, rule_id). - Soft-canonical scope — field types are free strings, but the service polices producer behavior. Block 3 scopes are locked to
{pragmatic, epistemic, social, dynamics}. - Inert contradiction fields —
supersedesandcontradicted_bypresent in dataclass and schema from day one. Block 3 never populates them; Block 5 will. - No premature taxonomy —
tags,subscope,trajectory_signaturestay flexible. Hardened taxonomy deferred until evidence patterns emerge from real data. - Research line stays intact — runtime does not modify
cms_research/.
pip install -e ".[dev]"
python scripts/bootstrap_db.py
pytest # 446 tests, ~3.5s
pytest --cov=cms # 95% coverageBy default, ObservationService uses an in-memory per-session counter to compute temporal_phase. This is fine for tests and single-process prototypes but resets on process restart and would diverge across multiple workers ingesting for the same session.
For durable phase semantics, callers can supply turn_index explicitly:
result = engine.process_turn(
user_id="alice",
session_id="sess_001",
turn_id="t42",
text="...",
turn_index=42, # explicit — survives process restart
)When turn_index is supplied, it overrides the internal counter. When absent, the counter is the fallback. This is the recommended pattern for any deployment where phase identity must be reproducible.
from cms.l1 import LegacyExtractorAdapter, ObservationService
from cms.l2 import EpisodeService, WindowedClosurePolicy
from cms.l3 import EvidenceService
from cms.l3.belief_service import BeliefService
from cms.runtime import (
CMSEngine, RetrievalService, StateAssembler, RetrievalPolicy,
)
from cms.storage import (
SQLiteBackend, ObservationStore, EpisodeStore, EvidenceStore,
FULL_SCHEMA_DDL,
)
from cms.storage.belief_store import BeliefStore
from cms_research.cms_core.models import LinguisticFeatureExtractor
backend = SQLiteBackend("data/sqlite/cms_runtime.db")
backend.bootstrap_schema(FULL_SCHEMA_DDL)
obs_store = ObservationStore(backend)
ep_store = EpisodeStore(backend)
ev_store = EvidenceStore(backend)
belief_store = BeliefStore(backend)
# Full L1+L2+L3A+L3B stack
adapter = LegacyExtractorAdapter(LinguisticFeatureExtractor())
obs_service = ObservationService(adapter=adapter, store=obs_store)
ep_service = EpisodeService(
store=ep_store, policy=WindowedClosurePolicy(max_size=10),
)
ev_service = EvidenceService(store=ev_store)
bf_service = BeliefService(belief_store=belief_store, evidence_store=ev_store)
engine = CMSEngine(
obs_service, ep_service,
evidence_service=ev_service,
belief_service=bf_service,
)
# Block 4 retrieval + state assembly, now with beliefs wired
retrieval = RetrievalService(obs_store, ep_store, ev_store)
assembler = StateAssembler(retrieval, belief_store=belief_store)
# Process turns through the engine
result = engine.process_turn(
user_id="alice", session_id="sess_001", turn_id="t0",
text="I'm absolutely certain we need to ship this today.",
)
print(f"Observation: {result.observation.obs_id}")
print(f"Evidence filed: {result.new_evidence_ids}")
print(f"Beliefs updated: {result.updated_belief_ids}")
# Build a canonical state view including beliefs
state = assembler.build("alice", "sess_001")
for b in state.active_beliefs:
print(f" ACTIVE {b.dimension}: value={b.value:+.2f} conf={b.confidence:.2f}")
for b in state.tentative_beliefs:
print(f" TENTATIVE {b.dimension}: value={b.value:+.2f} conf={b.confidence:.2f}")
# Periodic staleness sweep (e.g., daily cron)
bf_service.sweep_staleness("alice")
engine.end_session("alice", "sess_001")| rule_id | scope | subscope | fires when |
|---|---|---|---|
obs.pragmatic.high_ratio |
pragmatic | high_pragmatic_ratio | |Im(z1)|/|Re(z1)| ≥ 1.5 |
obs.epistemic.certainty |
epistemic | certainty | Re(z2) ≥ 0.75 |
obs.epistemic.hedging |
epistemic | hedging | Re(z2) ≤ 0.35 |
obs.social.self_reference |
social | self_reference | Im(z3) ≤ 0.35 |
obs.social.other_reference |
social | other_reference | Im(z3) ≥ 0.75 |
Dead zones (0.35 < x < 0.75) guarantee that certainty↔hedging and self↔other are mutually exclusive per observation.
| rule_id | scope | subscope | fires when |
|---|---|---|---|
ep.dynamics.rupture |
dynamics | rupture | length ≤ 5 AND closure_reason contains "surprise" |
ep.dynamics.sustained_regime |
dynamics | sustained_regime | length ≥ 10 AND natural closure |
ep.pragmatic.sustained_density |
pragmatic | sustained_pragmatic_density | trajectory_signature["mean_pragmatic_ratio"] ≥ 1.0 |
All support_score values are bounded in [0, 1] and monotonic with trigger strength.
Note on sustained_density: This rule reads from trajectory_signature which is populated by signature producers external to this slice (research-line code or a future runtime hook). In a default Block 3 deployment with no signature producer wired, this rule does not fire. The rule is correct; the trigger source is intentionally pluggable.
Evidence filing is safe under retry, replay, and duplicate processing:
- Fast path:
EvidenceServicecallsstore.has_evidence_for()before filing. If a record already exists with the same(user_id, source_kind, source_id, rule_id), the rule firing is skipped. - Backstop: The
memory_evidencetable hasUNIQUE (user_id, source_kind, source_id, rule_id). If the fast path is bypassed or races, theINSERTfails withIntegrityErrorrather than producing a duplicate.
Replay a whole session — observation, episode, evidence — and the evidence count stays constant.
Research can experiment without touching the service:
from cms.l3 import EvidencePayload, EvidenceService
def custom_rule(obs):
if some_condition(obs):
return EvidencePayload(
rule_id="custom.my_rule",
scope="epistemic", # must be in CANONICAL_SCOPES
subscope="my_narrow_label",
summary="observation showed X", # non-interpretive
support_score=0.7,
)
return None
service = EvidenceService(
store=ev_store,
observation_rules=[custom_rule], # override defaults
episode_rules=[],
)If the rule produces a scope outside CANONICAL_SCOPES, the service raises ValueError — producer behavior is policed even when field types are free.
# All evidence produced by a specific observation
records = ev_store.list_for_source("alice", "observation", "obs_xyz")
# All evidence in a specific scope
records = ev_store.list_by_scope("alice", "epistemic")
# Every evidence record carries its full provenance:
for r in records:
print(r.source_kind, r.source_id, r.rule_id, r.support_score)Evidence ranking from RetrievalService.search_evidence() is deterministic. Sort precedence:
pinned=Truefirst (Block 5 will populate the pinned flag — slot reserved, currently always False)- exact
scopematch before broader results - exact
subscopematch before broader results - newer records (higher
created_at) before older - higher
support_scorebefore lower memory_idDESC for deterministic tie-break
Repeated calls with the same inputs return identical results. The candidate pool size scales with the requested limit (4×, minimum 20) so the ordering policy can promote scope-exact records that wouldn't otherwise make the cut.
StateAssembler computes operational freshness, not semantic decay:
| flag | meaning |
|---|---|
has_recent_observations |
latest observation within recent_observation_seconds (default 5min) |
has_recent_episodes |
latest episode within recent_episode_seconds (default 30min) |
has_recent_evidence |
latest evidence within recent_evidence_seconds (default 1hr) |
These thresholds are constructor parameters on StateAssembler — no global config. They answer "is this state fresh enough to trust right now?", not "how much should this evidence weigh?". The latter is Block 5.
Three dimensions, scope-pure mapping:
| dimension | source scope | polarity | value range |
|---|---|---|---|
epistemic_style |
epistemic |
signed | [-1, +1] |
social_orientation |
social |
signed | [-1, +1] |
pragmatic_style |
pragmatic |
magnitude | [ 0, +1] |
For signed dimensions: -1 means full opposite-direction support (hedging / self-reference), +1 means full primary-direction support (certainty / other-reference). For pragmatic_style, value is a magnitude in [0, +1].
dynamics scope evidence is filed but feeds no Block 5 belief — honest non-mapping rather than inventing cross-scope coherence semantics prematurely.
Status transitions (defaults, all configurable via BeliefThresholds):
| Transition | Trigger |
|---|---|
| → tentative | first qualifying support (≥1 record above min_supporting_strength) |
| → active | ≥3 supporting records within active_window_days (30) AND confidence ≥ 0.5 |
| → stale | active belief with no support refresh in stale_window_days (14), via sweep_staleness() |
| → invalidated | contradiction count > support count, OR ≥3 contradictions within invalidation_burst_window_days (7) |
Vocabulary contract: Block 5 uses staleness exclusively. Decay is reserved for nothing — discrete status transitions only, no continuous score mutation.
Boundary: BeliefService reads only from evidence (never observations or episodes directly). The supporting ledger is append-only — recomputation re-reads the ledger but never rewrites it. Idempotency is preserved across replays via memory_id check.
Engine wiring: Push-based via optional belief_service arg on CMSEngine. Each turn's new evidence triggers a focused belief update for the affected dimensions only. recompute_for_user(user_id) is the offline escape hatch for batch repair or threshold changes.
Assembler invariant: StateAssembler reads beliefs from the store but never mutates them. Tested explicitly. RuntimeStateView exposes active_beliefs and tentative_beliefs separately. Stale and invalidated beliefs are NOT surfaced as truth.
Context lanes. Each turn carries an optional context_key: str | None. None means global; non-None ("research", "ops", "personal", whatever the caller chooses) routes evidence and beliefs into a scoped lane keyed on (user_id, dimension, context_key). The runtime stores and compares context_key but does not interpret it — semantics are entirely caller-defined.
Two locked guardrails:
- (A)
context_keydoes NOT enter the evidence idempotency key. Replaying the same(user_id, source_kind, source_id, rule_id)twice with different context_keys does not duplicate evidence. - (B) Global and scoped beliefs coexist with no implicit reconciliation. Filing scoped evidence does not write to global beliefs and vice versa. Consumers that want a combined view use
state.all_active_beliefsand accept reconciliation responsibility.
Four belief dimensions (three scope-pure + one cross-scope):
| dimension | source scope | polarity | value range |
|---|---|---|---|
epistemic_style |
epistemic |
signed | [-1, +1] |
social_orientation |
social |
signed | [-1, +1] |
pragmatic_style |
pragmatic |
magnitude | [ 0, +1] |
interaction_stability |
dynamics |
signed | [-1, +1] |
interaction_stability is the first cross-scope dimension — it consumes dynamics evidence (rupture/sustained_regime episode-level rules). The DIMENSION_SPECS registry makes the mapping explicit; new dimensions must be declared there.
Filing-time supersession. When a new evidence record is filed, the service checks for prior records older than supersession_window_days (default 30) in the same (user_id, rule_id, context_key) lane and records their ids in the new record's supersedes list. Lane awareness is strict: global and scoped lanes are distinct; different rule_ids are different lanes. Old records remain in the store as audit history. Belief recompute reads them but excludes them from primary value/confidence/threshold counting; belief.metadata["superseded_support_count"] exposes the count.
Events. BeliefService accepts an optional event_handler callable. Six event types fire on real status transitions and recompute markers (never on numeric tweaks): belief_tentative_created, belief_activated, belief_staled, belief_invalidated, belief_recomputed, belief_scoped_created. NullEventHandler drops everything (default); LoggingEventHandler writes to Python logging; custom handlers wire any persistence/observability strategy. No belief events table exists — events are derived state.
Explanations. Computed on demand via belief_service.explain(belief_id, top_n=5), returning a BeliefExplanation dataclass. Top supporting and counterevidence ids are ranked deterministically by (support_score DESC, created_at DESC, memory_id DESC). Superseded records are excluded from active counts but reported via superseded_count. Explanations are NOT carried on RuntimeStateView — consumers fetch on demand.
| Block | Status | Contents |
|---|---|---|
| 1 | ✅ done | packaging, L1 observation, SQLite storage, migration adapter |
| 2 | ✅ done | L2 episodes, closure policies, engine orchestration |
| 3 | ✅ done | L3A evidence, filing rules, idempotency, scope policing |
| 4 | ✅ done | retrieval, canonical state view, deterministic ordering, freshness flags |
| 5 | ✅ done | L3B beliefs, dimension specs, contradiction, staleness, engine wiring |
| 6 | ✅ done | scoped beliefs, cross-scope dimensions, supersession, events, explanations |
Still deferred to future blocks:
- consolidation (multi-belief arbitration, belief-to-belief reasoning)
- continuous decay curves (Blocks 5-6 do discrete status transitions only)
- pinned evidence (slot reserved in retrieval ordering, always False today)
- LLM prompt builders
- analyst dashboards
- consumer-specific adapters
- finalized semantic memory taxonomy (still soft-canonical)
- learned ranking
- diversity-aware retrieval
- multi-process / multi-worker durability
- LLM-generated belief summaries
- psychometric scoring
- persistent belief event table (events emit via callback, no
belief_eventstable) - agent routing logic, multi-agent arbitration, hidden routing brain