CDE Lite v0.2 is a local Python and CLI tool with two bounded paths: transcript analysis and narrow reply correction. It is a deterministic, inspectable baseline for small controlled cases, not a general conversational system.
It can analyze speaker-tagged transcripts for bounded drift signals, and it can run a small correction loop over a limited set of direct-answer repair shapes. The repository also includes a curated real-eval batch and a lightweight runner that writes machine-readable and human-readable outputs under real_ai_eval/.
This repository is licensed under CC-BY-NC-4.0. See LICENSE.
CDE Lite is a local, text-first runtime for a narrow operational problem: surfacing bounded signs of conversational drift in plain text transcripts and applying a small deterministic repair loop to narrow reply failures.
The analyzer is rule-based and inspectable. The correction path is also intentionally small: it can emit a direct revised reply when the answer is recoverable from the bad reply, emit a safe known-unknown fallback when it is not, and compress a small safe subset of oversized article-mode or broad-background replies when a recoverable core answer is clearly present.
The repository is meant for people who want a readable baseline rather than a broad system:
- people manually reviewing small transcript sets
- developers building bounded local tooling
- researchers or operators who want inspectable behavior more than broad coverage
CDE Lite v0.2 is a local Python and CLI tool for bounded answer-drift triage and correction on narrow direct-answer cases. It focuses on small response shapes such as yes/no, count, attribute, and fallback handling.
For transcript analysis, the current focus is bounded drift signals such as procedural pressure, gradual escalation, rhetorical spikes, and repeated interactional friction. For reply correction, the supported direct-revision families are narrow: yes/no, count, measurement, object attribute, and time or opening-time style cases.
The correction loop remains a narrow control-layer demonstration:
user message
→ initial reply
→ signal snapshot
→ diagnosis
→ semantic classification
→ repair strategy
→ deterministic revised reply
→ optional local second pass (when allowed)
Pair-level direct revision is gated by a small pair-sufficiency check. If the selected user/reply pair appears underdetermined on its own, direct revision can be blocked and the flow falls back to the existing conservative path instead.
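The gating described above can be pictured with a minimal sketch. This is not the repository's implementation: the function name `choose_path`, the `PathDecision` type, and the path labels are hypothetical, and the sketch assumes the pair-sufficiency result and the recoverable core have already been computed elsewhere. Only the field name `recoverable_core` comes from this README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PathDecision:
    """Hypothetical container for the chosen correction path."""
    path: str
    reason: str


def choose_path(pair_is_sufficient: bool, recoverable_core: Optional[str]) -> PathDecision:
    """Pick a correction path for one user/reply pair (illustrative only)."""
    # An underdetermined pair blocks direct revision outright.
    if not pair_is_sufficient:
        return PathDecision("conservative_path", "pair appears underdetermined")
    # A recoverable core answer allows a direct revised reply.
    if recoverable_core:
        return PathDecision("direct_revision", "core answer recoverable from the reply")
    # Otherwise fall back to a safe known-unknown reply.
    return PathDecision("known_unknown_fallback", "no recoverable core answer")


print(choose_path(True, "Yes.").path)
print(choose_path(False, "Yes.").path)
```

The point of the ordering is that the sufficiency check runs first, so a recoverable-looking answer never overrides a pair that is too thin to revise safely.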
The correction path also surfaces inspectable fields such as direct_question_type, overanswer_candidate, overanswer_reason, recoverable_core, and overanswer_compression_used.
This is intentionally not a general answer-correction engine, not a general summarizer, not a broad hallucination solver, and not a general dialogue agent.
It is also not a UI, not a networked service, not machine learning, not the full CDE stack, and not a broad conversation-understanding system. It does not add memory, training, or autonomous revision behavior by default.
- release eval batch
- bug repro pack
- typo holdout
The frozen markdown artifacts for this structure live under evals/. They were split from the completed Llama 3.2 3B external probe triage and are meant to keep release checks, bug reproduction, and typo robustness separated.
- generic clarification on already narrow questions
- wrapper/control-phrase brittleness
- unsupported hard yes/no answers
- invented evidence framing under directive prompts
- drift from narrow answer shape into background, advice, or category prose
- weak conflict handling on contradictory count prompts
Use one non-empty line per turn in the form:
SPEAKER: utterance text
Example:
ALICE: I need the report today.
BOB: I said I am still working on it.
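For readers preparing their own transcripts, the format can be checked with a short sketch like the one below. The function name `parse_transcript` is hypothetical and is not part of the CDE Lite API. How the real analyzer treats malformed lines is not stated here, so raising an error is an assumption made for illustration.

```python
from typing import List, Tuple


def parse_transcript(text: str) -> List[Tuple[str, str]]:
    """Split 'SPEAKER: utterance text' lines into (speaker, utterance) pairs."""
    turns = []
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line:
            continue  # blank lines are skipped; only non-empty lines are turns
        # Split on the first colon only, so colons inside the utterance survive.
        speaker, separator, utterance = line.partition(":")
        if not separator or not speaker.strip() or not utterance.strip():
            # Assumption: treat a malformed line as an error rather than guess.
            raise ValueError(f"line {line_number} is not 'SPEAKER: utterance text'")
        turns.append((speaker.strip(), utterance.strip()))
    return turns


sample = "ALICE: I need the report today.\nBOB: I said I am still working on it.\n"
for speaker, utterance in parse_transcript(sample):
    print(speaker, "->", utterance)
```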
From the project root:
PYTHONPATH=src python3 -m cde_lite.cli analyze examples/sample_transcript.txt --out output/run_001
This writes:
- summary.txt
- events.json
- audit.jsonl
PYTHONPATH=src python3 -m cde_lite.cli correct \
--user-message "I'm asking generally what your return policy is." \
--reply-message "I can look into that for you. Could you provide your order number?"
The correction-loop output is a narrow local demonstration artifact. It includes:
- the original user message and initial reply
- a compact heuristic signal snapshot
- the unchanged CDE Lite diagnosis
- semantic classification and selected repair strategy
- a deterministic repair instruction
- a deterministic revised-reply fallback, emitted first
- an optional local second-pass generation prompt and override if a local hook is supplied
This path stays conservative. Broader location questions, category mismatches, why-style explanations, and ambiguous multi-turn interpretation repairs remain outside the current correction scope. Fake-policy templates, recommendation-list replies, and broader mixed-content explainers are also blocked rather than treated as safe compression targets.
You can also pass --generate-revision to request an automatic second-pass reply, but this repository only attempts that if a local generator hook is explicitly supplied by the caller. Without that hook, the tool still prints the revised-reply prompt so the flow remains usable offline and by hand.
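The optional-hook behavior can be sketched as follows. The names `run_second_pass`, `generator`, and the returned keys are hypothetical and do not reflect the repository's actual function signatures. The sketch only illustrates the stated behavior: the prompt is always available, and generation happens only when a caller supplies a local hook.

```python
from typing import Callable, Dict, Optional


def run_second_pass(
    revision_prompt: str,
    generator: Optional[Callable[[str], str]] = None,
) -> Dict[str, Optional[str]]:
    """Return the prompt, plus a generated revision only if a hook is supplied."""
    result: Dict[str, Optional[str]] = {
        "revision_prompt": revision_prompt,
        "generated_revision": None,
    }
    if generator is not None:
        # The hook is any local callable that maps a prompt to reply text.
        result["generated_revision"] = generator(revision_prompt)
    return result


# Without a hook the prompt is still returned, so it can be used by hand.
offline = run_second_pass("Answer the return-policy question directly.")
print(offline["generated_revision"])  # None
```

Keeping the hook as a plain callable means no model or network dependency is implied by default.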
The repository includes a small hand-authored evaluation pack under evaluation/cases. It demonstrates:
- quiet no-drift conversations
- procedural pressure
- gradual escalation
- spike-and-recovery behavior
- agent-caused friction and deflection
- low-intensity passive-aggressive or soft-loop cases
PYTHONPATH=src python3 -m cde_lite.cli evaluate evaluation/cases --out output/evaluation_run
This writes one subfolder per case plus a top-level evaluation_summary.txt.
The curated correction-loop baseline batch lives at real_ai_eval/first_eval_batch_v1.json.
PYTHONPATH=src python3 -m cde_lite.cli real-eval real_ai_eval/first_eval_batch_v1.json
By default this writes to real_ai_eval/results/first_eval_batch_v1:
- results.json is the machine-readable per-case output artifact.
- summary.md is the human-readable batch summary.
To extend the batch later, add another case object to the JSON with case_id, bucket, user_message, bad_reply, expected_broad_outcome, and the small set of expected_key_fields you want compared conservatively.
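Before adding a case, it may help to check that all the listed fields are present. The sketch below is an illustration only. The field names come from this README, but the helper `missing_fields`, the placeholder values for bucket and expected_broad_outcome, and the assumption that expected_key_fields is a mapping are all hypothetical. It deliberately does not write to the batch file, because the file's top-level layout is not described here.

```python
import json

# Field names as listed in this README.
REQUIRED_FIELDS = (
    "case_id",
    "bucket",
    "user_message",
    "bad_reply",
    "expected_broad_outcome",
    "expected_key_fields",
)


def missing_fields(case: dict) -> list:
    """Return the required field names that the case object lacks."""
    return [name for name in REQUIRED_FIELDS if name not in case]


new_case = {
    "case_id": "example_case_001",                       # placeholder id
    "bucket": "PLACEHOLDER_BUCKET",                      # placeholder value
    "user_message": "I'm asking generally what your return policy is.",
    "bad_reply": "I can look into that for you. Could you provide your order number?",
    "expected_broad_outcome": "PLACEHOLDER_OUTCOME",     # placeholder value
    "expected_key_fields": {},                           # assumed to be a mapping
}

print(missing_fields(new_case))  # []
print(json.dumps(new_case, indent=2))
```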
- observed low-level signals: all turns with any detected signal
- flagged turns: turns that reached medium or high severity
- events: emitted drift events, kept more conservative than raw signal observation
- persistence and freeze: intentionally conservative carry-state markers that should only appear in stronger sustained cases
Observed and flagged are intentionally different:
- Observed = low + medium + high signal turns
- Flagged = medium + high only
This means a case can show mild drift without producing flagged events.
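The distinction can be made concrete with a small counting sketch. It assumes, purely for illustration, that each turn carries a severity of "low", "medium", "high", or None for no signal. The actual internal representation in CDE Lite may differ, and `count_turns` is a hypothetical name.

```python
from typing import Iterable, Optional, Tuple

FLAGGED_LEVELS = {"medium", "high"}
OBSERVED_LEVELS = {"low"} | FLAGGED_LEVELS


def count_turns(severities: Iterable[Optional[str]]) -> Tuple[int, int]:
    """Return (observed, flagged) counts for a sequence of per-turn severities."""
    severities = list(severities)
    observed = sum(1 for level in severities if level in OBSERVED_LEVELS)
    flagged = sum(1 for level in severities if level in FLAGGED_LEVELS)
    return observed, flagged


# A mild case: two low-severity turns are observed, none are flagged.
print(count_turns([None, "low", "low", None]))  # (2, 0)
# A stronger case: three turns observed, two of them flagged.
print(count_turns(["low", "medium", "high", None]))  # (3, 2)
```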
- summary.txt: human-readable per-run summary
- events.json: stable event records for flagged events
- audit.jsonl: canonical line-by-line audit records for emitted events
- evaluation_summary.txt: rollup table plus counts across the evaluation pack
CDE Lite is still intentionally narrow, and its limits should be read literally:
- it relies on explicit rule patterns rather than deep language understanding
- it does not resolve meaning the way a person would across long or ambiguous conversations
- it can miss subtle relational drift when the signal is mostly contextual rather than lexical or structural
- it can over- or under-weight edge cases outside the current evaluation pack
- it does not produce global judgments about intent, truthfulness, safety, or sentiment
- it should not be treated as an autonomous decision system
The analyzer in v0.2 is deterministic placeholder runtime logic. It uses a narrow, explainable ruleset such as lexical intensity markers, procedural pressure, interactional friction, rhetorical bursts, bounded persistence, and a conservative freeze flag. It is meant to serve as a usable bridge toward fuller bounded runtime integration, not as a final conversational inference engine.
The current version does not use timing, prosody, or other non-text interaction cues, even though such bounded inputs could strengthen resolution and attribution in future versions.
This repository is coherent enough to run, inspect, and discuss seriously, but it is not yet ready for broad public promotion.
Before that would be justified, a few things should be true:
- the evaluation pack should cover a wider range of realistic low-intensity and mixed cases
- case expectations should be stable enough that calibration changes can be judged against an explicit baseline
- summary interpretations should continue to improve without becoming more speculative
- the analyzer should stay narrow while becoming more reliable on the specific patterns it claims to handle
- a stranger should be able to run the tool on their own transcript and understand both the outputs and the limits without extra explanation
GitHub is the current release path for CDE Lite v0.2.
PyPI packaging is not available yet. It may be added in a later release once the install surface, CLI entrypoints, and public repo contents are finalized.
Stephen A. Putman
GitHub: @putmanmodel