CDE Lite v0.2

CDE Lite v0.2 is a local Python and CLI tool with two bounded paths: transcript analysis and narrow reply correction. It is a deterministic, inspectable baseline for small controlled cases, not a general conversational system.

It can analyze speaker-tagged transcripts for bounded drift signals, and it can run a small correction loop over a limited set of direct-answer repair shapes. The repository also includes a curated real-eval batch and a lightweight runner that writes machine-readable and human-readable outputs under real_ai_eval/.

This repository is licensed under CC-BY-NC-4.0. See LICENSE.

What CDE Lite v0.2 Is

CDE Lite is a local, text-first runtime for a narrow operational problem: surfacing bounded signs of conversational drift in plain text transcripts and applying a small deterministic repair loop to narrow reply failures.

The analyzer is rule-based and inspectable. The correction path is also intentionally small: it can emit a direct revised reply when the answer is recoverable from the bad reply, emit a safe known-unknown fallback when it is not, and compress a small safe subset of oversized article-mode or broad-background replies when a recoverable core answer is clearly present.

The repository is meant for people who want a readable baseline rather than a broad system:

people manually reviewing small transcript sets
developers building bounded local tooling
researchers or operators who want inspectable behavior more than broad coverage

v0.2 Scope

v0.2 covers bounded answer-drift triage and correction on narrow direct-answer cases. It focuses on small response shapes such as yes/no, count, and attribute answers, plus fallback handling.

For transcript analysis, the current focus is bounded drift signals such as procedural pressure, gradual escalation, rhetorical spikes, and repeated interactional friction. For reply correction, the supported direct-revision families are narrow: yes/no, count, measurement, object attribute, and time or opening-time style cases.

The correction loop remains a narrow control-layer demonstration:

user message
→ initial reply
→ signal snapshot
→ diagnosis
→ semantic classification
→ repair strategy
→ deterministic revised reply
→ optional local second pass (when allowed)

Pair-level direct revision is gated by a small pair-sufficiency check. If the selected user/reply pair appears underdetermined on its own, direct revision can be blocked and the flow falls back to the existing conservative path instead.

The correction path also surfaces inspectable fields such as direct_question_type, overanswer_candidate, overanswer_reason, recoverable_core, and overanswer_compression_used.

What It Does Not Try To Do

This is intentionally not a general answer-correction engine, not a general summarizer, not a broad hallucination solver, and not a general dialogue agent.

It is also not a UI, not a networked service, not a machine-learning system, not the full CDE stack, and not a broad conversation-understanding system. It does not add memory, training, or autonomous revision behavior by default.

Current Eval Structure

release eval batch
bug repro pack
typo holdout

The frozen markdown artifacts for this structure live under evals/. They were split from the completed Llama 3.2 3B external probe triage and are meant to keep release checks, bug reproduction, and typo robustness separated.

Known Failure Patterns

generic clarification on already narrow questions
wrapper/control-phrase brittleness
unsupported hard yes/no answers
invented evidence framing under directive prompts
drift from narrow answer shape into background, advice, or category prose
weak conflict handling on contradictory count prompts

Input Format

Use one non-empty line per turn in the form:

SPEAKER: utterance text

Example:

ALICE: I need the report today.
BOB: I said I am still working on it.
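
The line format above can be parsed with a few lines of Python. This is a minimal sketch, not the parser CDE Lite itself uses: the function name parse_transcript is hypothetical, and splitting on the first colon and skipping blank lines are assumptions about how the format is meant to be read.

```python
from typing import List, Tuple


def parse_transcript(text: str) -> List[Tuple[str, str]]:
    """Parse 'SPEAKER: utterance text' lines into (speaker, utterance) pairs.

    Hypothetical helper for illustration; not part of cde_lite.
    """
    turns = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored rather than rejected
        # Assumption: the first colon separates the speaker tag from the utterance.
        speaker, sep, utterance = line.partition(":")
        if not sep or not speaker.strip() or not utterance.strip():
            raise ValueError(f"line {line_no}: expected 'SPEAKER: utterance text'")
        turns.append((speaker.strip(), utterance.strip()))
    return turns


sample = (
    "ALICE: I need the report today.\n"
    "BOB: I said I am still working on it.\n"
)
turns = parse_transcript(sample)
for speaker, utterance in turns:
    print(f"{speaker!r} said {utterance!r}")
```

A check like this can be useful before running the analyzer on hand-edited transcripts, since a line without a speaker tag fails early with its line number.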

Analyze One Transcript

From the project root:

PYTHONPATH=src python3 -m cde_lite.cli analyze examples/sample_transcript.txt --out output/run_001

This writes:

summary.txt
events.json
audit.jsonl

Run The Minimal Correction Loop

PYTHONPATH=src python3 -m cde_lite.cli correct \
  --user-message "I'm asking generally what your return policy is." \
  --reply-message "I can look into that for you. Could you provide your order number?"

The correction-loop output is a narrow local demonstration artifact. It includes:

the original user message and initial reply
a compact heuristic signal snapshot
the unchanged CDE Lite diagnosis
semantic classification and selected repair strategy
a deterministic repair instruction
a deterministic revised reply, emitted first as the fallback
an optional local second-pass generation prompt and override if a local hook is supplied

This path stays conservative. Broader location questions, category mismatches, why-style explanations, and ambiguous multi-turn interpretation repairs remain outside the current correction scope. Fake-policy templates, recommendation-list replies, and broader mixed-content explainers are also blocked rather than treated as safe compression targets.

You can also pass --generate-revision to request an automatic second-pass reply, but this repository only attempts that if a local generator hook is explicitly supplied by the caller. Without that hook, the tool still prints the revised-reply prompt so the flow remains usable offline and by hand.
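
The hook behavior described above can be pictured as a small gate around the deterministic reply. The sketch below is illustrative only: choose_revised_reply, GeneratorHook, and the result keys are hypothetical names, and the repository's real hook interface may look different.

```python
from typing import Callable, Optional

# Assumption: a hook is any callable that maps a prompt string to a reply string.
GeneratorHook = Callable[[str], str]


def choose_revised_reply(
    deterministic_reply: str,
    second_pass_prompt: str,
    generate_revision: bool = False,
    hook: Optional[GeneratorHook] = None,
) -> dict:
    """Return the deterministic reply unless a second pass is requested AND a hook exists."""
    result = {
        "revised_reply": deterministic_reply,
        # The prompt is always kept so the flow stays usable offline and by hand.
        "second_pass_prompt": second_pass_prompt,
        "second_pass_used": False,
    }
    if generate_revision and hook is not None:
        result["revised_reply"] = hook(second_pass_prompt)
        result["second_pass_used"] = True
    return result


# Without a hook, requesting a second pass changes nothing about the reply.
offline = choose_revised_reply("fallback reply", "rewrite prompt", generate_revision=True)
print(offline["second_pass_used"])
```

Under this design the deterministic reply is always available, and the second pass can only override it when the caller opts in twice: once with the flag and once by supplying the hook.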

Run The Evaluation Pack

The repository includes a small hand-authored evaluation pack under evaluation/cases. It demonstrates:

quiet no-drift conversations
procedural pressure
gradual escalation
spike-and-recovery behavior
agent-caused friction and deflection
low-intensity passive-aggressive or soft-loop cases

PYTHONPATH=src python3 -m cde_lite.cli evaluate evaluation/cases --out output/evaluation_run

This writes one subfolder per case plus a top-level evaluation_summary.txt.

Run The Curated Real Eval Batch

The curated correction-loop baseline batch lives at real_ai_eval/first_eval_batch_v1.json.

PYTHONPATH=src python3 -m cde_lite.cli real-eval real_ai_eval/first_eval_batch_v1.json

By default this writes to real_ai_eval/results/first_eval_batch_v1:

results.json is the machine-readable per-case output artifact.
summary.md is the human-readable batch summary.

To extend the batch later, add another case object to the JSON with case_id, bucket, user_message, bad_reply, expected_broad_outcome, and the small set of expected_key_fields you want compared conservatively.
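
Before adding a case, it can help to check that it carries every field named above. The following is a hedged sketch: the field names come from this README, but missing_case_fields is a hypothetical helper, every value in new_case is a placeholder, and the type of expected_key_fields (shown here as an empty object) is an assumption.

```python
import json

# Field names as listed in this README.
REQUIRED_CASE_FIELDS = (
    "case_id",
    "bucket",
    "user_message",
    "bad_reply",
    "expected_broad_outcome",
    "expected_key_fields",
)


def missing_case_fields(case: dict) -> list:
    """Return the required field names that are absent from a case object."""
    return [name for name in REQUIRED_CASE_FIELDS if name not in case]


# Placeholder values only; real bucket and outcome labels live in the batch file.
new_case = {
    "case_id": "example_case_001",
    "bucket": "example_bucket",
    "user_message": "I'm asking generally what your return policy is.",
    "bad_reply": "I can look into that for you. Could you provide your order number?",
    "expected_broad_outcome": "example_outcome",
    "expected_key_fields": {},  # assumption: a mapping of field name to expected value
}

print(missing_case_fields(new_case))
print(json.dumps(new_case, indent=2))
```

Copying the shape of an existing case in first_eval_batch_v1.json remains the safer guide for the actual value conventions.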

How To Read The Outputs

observed low-level signals: all turns with any detected signal
flagged turns: turns that reached medium or high severity
events: emitted drift events, kept more conservative than raw signal observation
persistence and freeze: intentionally conservative carry-state markers that should only appear in stronger sustained cases

Observed and flagged are intentionally different:

Observed = low + medium + high signal turns
Flagged = medium + high only

This means a case can show mild drift without producing flagged events.
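
The difference between the two counts can be shown with a short sketch. It assumes each turn carries one severity label, with "none" as a hypothetical label for turns without any signal; the function name is also hypothetical and the real runtime may represent severity differently.

```python
from typing import List, Tuple

OBSERVED_LEVELS = {"low", "medium", "high"}
FLAGGED_LEVELS = {"medium", "high"}


def split_observed_and_flagged(severities: List[str]) -> Tuple[List[int], List[int]]:
    """Return 1-based turn numbers that count as observed and as flagged."""
    observed = [i for i, s in enumerate(severities, start=1) if s in OBSERVED_LEVELS]
    flagged = [i for i, s in enumerate(severities, start=1) if s in FLAGGED_LEVELS]
    return observed, flagged


# A mild case: two low-severity turns are observed, but nothing is flagged.
observed, flagged = split_observed_and_flagged(["none", "low", "low", "none"])
print(observed, flagged)
```

Every flagged turn is also an observed turn, so the flagged count can never exceed the observed count.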

Output Files

summary.txt: human-readable per-run summary
events.json: stable event records for flagged events
audit.jsonl: canonical line-by-line audit records for emitted events
evaluation_summary.txt: rollup table plus counts across the evaluation pack
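
For scripted inspection, audit.jsonl can be loaded one record per line. This sketch assumes the file follows the usual JSON Lines convention (one JSON value per non-empty line) and makes no assumption about which fields each record contains; read_jsonl is a hypothetical helper, and the path is the one from the analyze example above.

```python
import json
from pathlib import Path
from typing import Any, List, Union


def read_jsonl(path: Union[str, Path]) -> List[Any]:
    """Load a JSON Lines file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Path produced by the analyze example; nothing is printed if the run does not exist.
    audit_path = Path("output/run_001/audit.jsonl")
    if audit_path.exists():
        for record in read_jsonl(audit_path):
            print(record)
```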

Current Limits

CDE Lite is still intentionally narrow, and its limits should be read literally:

it relies on explicit rule patterns rather than deep language understanding
it does not resolve meaning the way a person would across long or ambiguous conversations
it can miss subtle relational drift when the signal is mostly contextual rather than lexical or structural
it can over- or under-weight edge cases outside the current evaluation pack
it does not produce global judgments about intent, truthfulness, safety, or sentiment
it should not be treated as an autonomous decision system

The analyzer in v0.2 is deterministic placeholder runtime logic. It uses a narrow, explainable ruleset covering lexical intensity markers, procedural pressure, interactional friction, rhetorical bursts, bounded persistence, and a conservative freeze flag. It is meant to serve as a usable bridge toward fuller bounded runtime integration, not as a final conversational inference engine.

The current version does not use timing, prosody, or other non-text interaction cues, even though such bounded inputs could strengthen resolution and attribution in future versions.

Release Readiness

This repository is coherent enough to run, inspect, and discuss seriously, but it is not yet ready for broad public promotion.

Before that would be justified, a few things should be true:

the evaluation pack should cover a wider range of realistic low-intensity and mixed cases
case expectations should be stable enough that calibration changes can be judged against an explicit baseline
summary interpretations should continue to improve without becoming more speculative
the analyzer should stay narrow while becoming more reliable on the specific patterns it claims to handle
a stranger should be able to run the tool on their own transcript and understand both the outputs and the limits without extra explanation

Packaging Status

GitHub is the current release path for CDE Lite v0.2.

PyPI packaging is not available yet. It may be added in a later release once the install surface, CLI entrypoints, and public repo contents are finalized.

Project Contact

Stephen A. Putman
Email: [email protected]
GitHub: @putmanmodel
