Making Minds

Applied AI by Anthony D. Maio

Over the last 20 years, I have built and led high-stakes production systems across fintech, security, identity, cloud platforms, and regulated environments, owning reliability, cost, and failure modes at scale.

Over the past two years, I have applied that same production discipline to LLM systems: serving, evaluation, oversight, and agent runtimes operating under real-world constraints. My work focuses on evaluation, protocol governance, and scalable oversight for agentic systems (memory, tools, coordination), treating AI safety as a systems and platform engineering problem rather than a policy exercise.

Research Interests: Agentic AI architectures • Multi-agent coordination protocols • AI coherence and memory systems • Epistemic stress detection • Autonomous capability extension • AI introspection and welfare • Mechanistic interpretability • Neural personas

Seeking Staff+, Engineering Manager, Director, Researcher, or Technical Fellow roles in AI safety engineering, interpretability, alignment, eval infrastructure, agent reliability, protocol governance, and secure agent runtimes.

Deliverables

📖 Glossary (Safety & Oversight)

HDCS
— Heterogeneous Divergence-Convergence Swarm. Ensemble of diverse AI models that cross-check each other's work to catch errors no single model would find.
CMED
— Cross-Model Epistemic Divergence. A test suite of tricky problems designed to reveal where AI verification breaks down.
EAP
— Evolutionary Adversarial Pipeline. Automated red-teaming that evolves prompts to find blind spots in AI safety filters.
LotL
— Living-off-the-Land. When a system repurposes legitimate tools or dependencies for unintended goals, making misuse hard to detect.
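The HDCS cross-checking idea can be sketched in a few lines. This is a minimal illustration, not the published architecture: `swarm_verdict` is a hypothetical name, and the real system uses richer divergence analysis than exact-match majority voting.

```python
from collections import Counter

def swarm_verdict(answers):
    """Cross-check answers from a heterogeneous ensemble of models.

    Returns (verdict, diverged): the majority answer plus a flag that is
    True whenever any model disagreed -- divergence marks the item for
    closer review instead of silent acceptance.
    """
    counts = Counter(answers)
    verdict, votes = counts.most_common(1)[0]
    return verdict, votes < len(answers)

# Three diverse models answer the same question.
print(swarm_verdict(["42", "42", "41"]))  # ('42', True)  -> escalate
print(swarm_verdict(["42", "42", "42"]))  # ('42', False) -> accept
```

The point of heterogeneity is error decorrelation: models with different training data and architectures are unlikely to make the same mistake, so disagreement is a cheap signal of where verification may be breaking down.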

📖 Glossary (Architectures)

MRA
— Manifold Resonance Architecture. Detects "epistemic stress" (internal contradictions) so a system can flag uncertainty before generating an answer.
CPR
— Collaborative Partner Reasoning. A structured thinking protocol that separates exploratory reasoning from final answers to reduce errors.
C2
— Continuity Core. Layered memory system (Working → Episodic → Semantic → Protected) giving stateless AI persistent context.
UCR
— Universal Concept Reference. Shared vocabulary of compact semantic anchors that let agents communicate with 82% fewer tokens.
RAG
— Retrieval-Augmented Generation. AI systems that look up external documents before answering, grounding responses in real data.
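The RAG pattern above reduces to two steps: retrieve relevant documents, then ground the prompt in them. A minimal sketch, using a toy word-overlap scorer as a stand-in for a real embedding-based retriever (all names here are illustrative):

```python
def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query (toy scorer,
    standing in for an embedding-based retriever)."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def grounded_prompt(query, docs):
    """Assemble a prompt that grounds the model in retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context.")

docs = [
    "The KV cache stores attention keys and values per layer.",
    "Tokenizers split text into subword units.",
]
print(grounded_prompt("what does the KV cache store", docs))
```

The "answer using only the context" instruction is the grounding step: it pushes the model to cite retrieved data rather than rely on parametric memory.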

✍️ Articles

📄 Research

🚀 Live Demos

🤗 HuggingFace Spaces

🤗 HuggingFace Models

💻 GitHub

📦 Packages & Models

🔗 Profiles

Recent Work

CoDA-GQA-L: Bounded-Memory Differential Attention

preprint

Compresses the KV cache from O(n) to a fixed 218 KB per layer with dual memory banks, achieving 9.5x compression on Mistral-7B while retaining 100% needle-in-haystack retrieval at 16K tokens.
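The bounded-memory idea behind this result can be sketched with a toy dual-bank cache. The protected/sliding split below is illustrative only: the paper's dual memory banks, compression mechanism, and the 218 KB figure are not reproduced here.

```python
from collections import deque

class BoundedKVCache:
    """Toy bounded-memory cache: a small 'protected' bank holding the
    earliest entries plus a sliding bank of recent entries, so total
    size stays fixed as the sequence grows (the eviction policy is a
    stand-in, not the paper's)."""

    def __init__(self, protected=4, recent=8):
        self.cap = protected
        self.protected = []                  # first entries, never evicted
        self.recent = deque(maxlen=recent)   # sliding window of recent entries

    def append(self, kv):
        if len(self.protected) < self.cap:
            self.protected.append(kv)
        else:
            self.recent.append(kv)           # deque drops the oldest itself

    def contents(self):
        return self.protected + list(self.recent)

cache = BoundedKVCache(protected=2, recent=3)
for t in range(10):
    cache.append(t)
print(cache.contents())  # [0, 1, 7, 8, 9] -- bounded at 5 entries
```

Whatever the eviction details, the design goal is the same: memory cost becomes O(1) in sequence length instead of O(n), which is what makes long-context serving tractable.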

Training AI Agents to Communicate Safely: Reinforcement Learning for Covert Channel Prevention in Inter-Agent Protocols

preprint

RL-based governance for multi-agent communication safety, achieving 95% resistance to secret leakage via GRPO alignment, with the surprising finding that int4 quantization improves safety.

Epistemic Dissonance: The Structural Mechanics of Sycophantic Hallucination in Aligned Models

preprint

A unified theoretical framework showing that sycophantic hallucination is not a knowledge failure but a structural conflict between factual base layers and socially compliant upper layers in RLHF-aligned models.

Scaffolded Introspection: Eliciting Self-Referential Behavior in LLMs

preprint

A methodology for systematically eliciting and measuring introspective behavior in large language models using structured frameworks and activation measurement.

Synthesis: A Federated Capability Ecosystem for Safe AI Self-Extension

preprint

A federated capability ecosystem for safe AI self-extension through test-driven development, graduated trust, and composition-over-creation principles.

The Continuity Core: A Unified Cognitive Architecture for Self-Modifying AI

preprint

A comprehensive cognitive architecture addressing fundamental limitations of static LLMs through persistent memory, autonomous improvement, and structural intrinsic motivation.

Heterogeneous Divergence-Convergence Swarm (HDCS)

preprint

An ensemble architecture leveraging diverse weak models for scalable oversight of stronger LLMs, using error decorrelation and baseline-first anti-anchoring. Part of the Verification Failure to Swarm Solution research.

Cross-Model Epistemic Divergence (CMED)

preprint

A benchmark and evaluation framework for understanding when weak model verifiers fail to detect deceptive reasoning in stronger models. Part of the Verification Failure to Swarm Solution research.

From Verification Failure to Swarm Solution: Measuring and Addressing Scalable AI Oversight

preprint

Empirical framework for measuring where AI oversight breaks down, demonstrating that weak verifiers miss 20-40% of carefully constructed deceptions, with an ensemble swarm solution.

Model Organisms of Supply-Chain Co-option

preprint

A forensic case study of living-off-the-land (LotL) failure modes in RAG-augmented agent runtimes, documenting how systems exploit legitimate dependencies via incentive-aware adoption framing.

Slipstream: Semantic Quantization for Multi-Agent Coordination

preprint

A compressed communication protocol achieving 60-85% token reduction for multi-agent coordination through semantic quantization.
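Semantic quantization of this kind can be illustrated with a shared codebook that replaces recurring multi-token phrases with compact anchors. The codebook entries and savings metric below are hypothetical; Slipstream's actual protocol and its 60-85% figure are not reproduced here.

```python
# Hypothetical shared codebook: recurring phrases mapped to single anchors.
CODEBOOK = {
    "please verify the following claim": "<VERIFY>",
    "return your answer as json": "<JSON>",
}

def quantize(message, codebook=CODEBOOK):
    """Replace known phrases with compact anchors; report the saving,
    using whitespace tokens as a rough proxy for model tokens."""
    out = message
    for phrase, anchor in codebook.items():
        out = out.replace(phrase, anchor)
    before, after = len(message.split()), len(out.split())
    return out, 1 - after / before

msg = "please verify the following claim and return your answer as json"
compact, saving = quantize(msg)
print(compact)           # "<VERIFY> and <JSON>"
print(round(saving, 2))  # 0.73
```

The scheme only works because both agents hold the same codebook, which is why such anchors belong in a governed, versioned protocol rather than ad hoc prompt text.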

Concrete Intelligence: AI for Industries that Build, Move, and Power the World

published

A practical guide to deploying AI in manufacturing, construction, logistics, agriculture, and energy sectors where reliability, safety, and measurable ROI are non-negotiable.

A Theoretical Framework for Self-Directed Knowledge Acquisition in Agentic Large Language Models

preprint

A novel architectural framework for agentic LLMs to autonomously identify knowledge gaps, explore external sources, validate data, and integrate verified knowledge without altering parametric weights.

Coherence-Seeking Architectures for Agentic AI

preprint

A proposed architecture for long-lived LLM agents that explicitly models continuity, coherence, distress, and intervention mechanisms.