1 stable release: 1.0.0 (Feb 6, 2026)
Phago — Biological Computing Primitives
Version 1.0.0 | Production-Ready
A Rust framework that maps cellular biology mechanisms to computational operations. Agents self-organize, consume documents, build a Hebbian knowledge graph, share vocabulary, detect anomalies, and exhibit emergent collective behavior — all without top-down orchestration. Now with distributed multi-node sharding for horizontal scaling.
Key Results (v1.0.0)
| Metric | Value | Notes |
|---|---|---|
| Tests passing | 180+ | 100% pass rate across 15 crates |
| Graph edge reduction | 98.3% | 256k to 4.5k via Hebbian LTP |
| Hybrid MRR | 0.800 | Beats TF-IDF (0.775) on first-result ranking |
| Hybrid P@5 | 0.742 | Matches TF-IDF precision |
| Evolved vs static edges | 11.6x | Self-healing through agent evolution |
| Community detection NMI | 1.000 | Perfect topic recovery (Louvain) |
| Session persistence | 100% | Full temporal state fidelity |
| Distributed shards | 3+ | Consistent hashing, ghost nodes, cross-shard queries |
What It Does
Feed the colony documents. Agents digest them into concepts, wire a knowledge graph through co-activation (Hebbian learning), share vocabulary across agent boundaries (horizontal gene transfer), and detect anomalies (negative selection). The graph structure IS the memory — frequently used connections strengthen, unused ones decay.
Documents → Agents digest → Concepts extracted → Graph wired → Knowledge emerges
                 ↑                                       ↓
                 └──── Transfer, Symbiosis, Dissolution ─┘
Quick Start
Run the Demos
# Build
cargo build
# Run the proof-of-concept (120-tick simulation)
cargo run --bin phago-poc
# Run all tests
cargo test --workspace --exclude phago-python --exclude phago-web
# Build with distributed feature
cargo build -p phago --features distributed
# Run distributed benchmarks
cargo run --bin phago-bench -- quick
# Open the interactive visualization (generated by POC)
open output/phago-colony.html
Use as a Library
Add to your Cargo.toml:
[dependencies]
phago = { git = "https://github.com/Clemens865/Phago_Project.git" }
# Or, with distributed support (use this entry instead of the one above)
phago = { git = "https://github.com/Clemens865/Phago_Project.git", features = ["distributed"] }
Basic usage with the prelude:
use phago::prelude::*;
fn main() {
    let mut colony = Colony::new();
    // Ingest documents
    colony.ingest_document("doc1", "Cell membrane transport proteins", Position::new(0.0, 0.0));
    colony.ingest_document("doc2", "Protein folding and membrane insertion", Position::new(1.0, 0.0));
    // Spawn digesters and run
    colony.spawn(Box::new(Digester::new(Position::new(0.0, 0.0)).with_max_idle(30)));
    colony.run(30);
    // Query with hybrid scoring
    let results = hybrid_query(&colony, "membrane protein", &HybridConfig {
        alpha: 0.5, max_results: 5, candidate_multiplier: 3,
    });
    for r in results {
        println!("{} (score: {:.3})", r.label, r.final_score);
    }
}
See docs/INTEGRATION_GUIDE.md for complete examples and API reference.
Production Features
- Single import: `use phago::prelude::*` gives you everything
- Structured errors: `Result<T, PhagoError>` with typed error categories
- Deterministic testing: `Digester::with_seed(pos, seed)` for reproducible simulations
- Session persistence: Save/restore colony state across sessions (JSON or SQLite)
- SQLite persistence: `ColonyBuilder` with auto-save for production deployments
- Async runtime: `AsyncColony` with `TickTimer` for real-time visualization
- MCP adapter: Ready for external LLM/agent integration
- Semantic embeddings: Vector-based concept extraction (optional `semantic` feature)
- Distributed colony: Multi-node sharding with consistent hashing (optional `distributed` feature)
- Vector DB integration: Qdrant, Pinecone, Weaviate adapters
- Streaming ingestion: Async channels with backpressure and file watching
- Web dashboard: Axum + D3.js real-time colony visualization
- Python bindings: `pip install phago` (PyO3 with LangChain and LlamaIndex adapters)
- Louvain communities: Perfect topic clustering (NMI = 1.0)
SQLite Persistence (Phase 10)
Enable durable storage with automatic save/load:
[dependencies]
phago-runtime = { version = "1.0", features = ["sqlite"] }
use phago_runtime::prelude::*;
// Create colony with persistent storage
let mut colony = ColonyBuilder::new()
.with_persistence("knowledge.db") // SQLite file
.auto_save(true) // Save on drop
.build()?;
// Use normally — persistence is automatic
colony.ingest_document("title", "content", Position::new(0.0, 0.0));
colony.run(100);
colony.save()?; // Explicit save (also happens on drop)
// Later: reload with full state preserved
let colony2 = ColonyBuilder::new()
.with_persistence("knowledge.db")
.build()?;
Async Runtime (Phase 10)
Enable controlled-rate simulation for visualization:
[dependencies]
phago-runtime = { version = "1.0", features = ["async"] }
use phago_runtime::prelude::*;
use phago_runtime::async_runtime::{run_in_local, TickTimer};
#[tokio::main]
async fn main() {
    let colony = Colony::new();
    // Fast async simulation
    run_in_local(colony, |ac| async move {
        ac.run_async(100).await
    }).await;
    // Or controlled tick rate for visualization
    let colony2 = Colony::new();
    run_in_local(colony2, |ac| async move {
        let mut timer = TickTimer::new(100); // 100ms per tick
        timer.run_timed(&ac, 50).await;
    }).await;
}
Semantic Embeddings (Phase 9)
Enable vector embeddings for semantic understanding:
[dependencies]
phago = { version = "1.0", features = ["semantic"] }
use phago::prelude::*;
use std::sync::Arc;
// Create an embedder (SimpleEmbedder or API-backed)
let embedder: Arc<dyn Embedder> = Arc::new(SimpleEmbedder::new(256));
// SemanticDigester uses embeddings for concept extraction
let mut digester = SemanticDigester::new(Position::new(0.0, 0.0), embedder.clone());
let concepts = digester.digest_text("The mitochondria is the powerhouse of the cell.".into());
// Find semantically similar concepts
let similar = digester.find_similar("cellular energy", 5);
The semantic feature adds:
- SimpleEmbedder — Hash-based embeddings (no dependencies)
- SemanticDigester — Embedding-backed agent for semantic concept extraction
- Chunker — Document chunking with configurable overlap
- Similarity functions — cosine_similarity, euclidean_distance, normalize_l2
LLM Integration (Phase 9.2)
Enable LLM-backed concept extraction:
[dependencies]
# Choose one of the entries below.
# Local LLM (Ollama)
phago = { version = "1.0", features = ["llm-local"] }
# Cloud APIs (Claude, OpenAI)
phago = { version = "1.0", features = ["llm-api"] }
# All backends
phago = { version = "1.0", features = ["llm-full"] }
use phago::prelude::*;
// Local Ollama backend (no API key needed)
let ollama = OllamaBackend::localhost().with_model("llama3.2");
let concepts = ollama.extract_concepts("Cell membrane transport").await?;
// Claude backend
let claude = ClaudeBackend::new("sk-ant-...").sonnet();
let concepts = claude.extract_concepts("Cell membrane transport").await?;
// OpenAI backend
let openai = OpenAiBackend::new("sk-...").gpt4o_mini();
let concepts = openai.extract_concepts("Cell membrane transport").await?;
The llm features add:
- OllamaBackend — Local LLM via Ollama (no API key needed)
- ClaudeBackend — Anthropic Claude API
- OpenAiBackend — OpenAI GPT API
- LlmBackend trait — Common interface for all backends
- Concept extraction — Extract structured concepts from text
- Relationship identification — Find relationships between concepts
- Query expansion — Expand queries for better recall
The Ten Biological Primitives
| Primitive | Biological Analog | What It Does |
|---|---|---|
| DIGEST | Phagocytosis | Consume input, extract fragments, present to graph |
| APOPTOSE | Programmed cell death | Self-assess health, gracefully self-terminate |
| SENSE | Chemotaxis | Detect signals, follow gradients |
| TRANSFER | Horizontal gene transfer | Export/import vocabulary between agents |
| EMERGE | Quorum sensing | Detect threshold, activate collective behavior |
| WIRE | Hebbian learning | Strengthen used connections, prune unused |
| SYMBIOSE | Endosymbiosis | Integrate another agent as permanent symbiont |
| STIGMERGE | Stigmergy | Coordinate through environmental traces |
| NEGATE | Negative selection | Learn self-model, detect anomalies by exclusion |
| DISSOLVE | Holobiont boundary | Modulate agent-substrate boundaries |
Agent Types
- Digester — Consumes documents, extracts keywords, presents concepts to the knowledge graph. Implements DIGEST + SENSE + APOPTOSE + TRANSFER + SYMBIOSE + DISSOLVE.
- Synthesizer — Dormant until quorum reached, then identifies bridge concepts and topic clusters. Implements EMERGE + SENSE + APOPTOSE.
- Sentinel — Learns what "normal" looks like, flags anomalies by deviation from self-model. Implements NEGATE + SENSE + APOPTOSE.
Research Branches
Four falsifiable hypotheses, each with a working prototype, benchmark, visualization, and papers.
1. Bio-RAG — Self-Reinforcing Retrieval
Hebbian-reinforced knowledge graph retrieval with hybrid scoring (TF-IDF + graph re-ranking).
cargo run --bin phago-bio-rag-demo
| Metric | Graph-only | TF-IDF | Hybrid |
|---|---|---|---|
| P@5 | 0.280 | 0.742 | 0.742 |
| MRR | 0.650 | 0.775 | 0.800 |
| NDCG@10 | 0.357 | 0.404 | 0.410 |
Key insight: The graph's value is not in replacing TF-IDF but in re-ranking candidates using structural context. Hybrid scoring beats pure TF-IDF on MRR (first relevant result ranked higher).
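To make the re-ranking idea concrete, here is a minimal, std-only sketch. It does not use the Phago API: the `Candidate` struct and the blend `alpha * tfidf + (1 - alpha) * graph` are assumptions chosen for illustration, and only the parameter names `alpha`, `max_results`, and `candidate_multiplier` come from the `HybridConfig` shown earlier.

```rust
/// One retrieval candidate with two independent scores in [0, 1].
/// Hypothetical type, not part of Phago.
#[derive(Debug, Clone)]
struct Candidate {
    label: String,
    tfidf: f64,
    graph: f64,
}

/// Take the top `max_results * candidate_multiplier` candidates by TF-IDF,
/// then re-rank that pool by an assumed linear blend of both scores.
fn hybrid_rerank(
    mut cands: Vec<Candidate>,
    alpha: f64,
    max_results: usize,
    candidate_multiplier: usize,
) -> Vec<(String, f64)> {
    // Stage 1: lexical candidate pool.
    cands.sort_by(|a, b| b.tfidf.total_cmp(&a.tfidf));
    cands.truncate(max_results * candidate_multiplier);
    // Stage 2: blend lexical and structural scores, best first.
    let mut scored: Vec<(String, f64)> = cands
        .into_iter()
        .map(|c| (c.label, alpha * c.tfidf + (1.0 - alpha) * c.graph))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(max_results);
    scored
}

fn main() {
    let cands = vec![
        Candidate { label: "A".into(), tfidf: 0.9, graph: 0.1 },
        Candidate { label: "B".into(), tfidf: 0.8, graph: 0.6 },
        Candidate { label: "C".into(), tfidf: 0.2, graph: 1.0 },
    ];
    for (label, score) in hybrid_rerank(cands, 0.5, 2, 2) {
        println!("{} (score: {:.3})", label, score);
    }
}
```

In this sketch a larger `candidate_multiplier` widens the pool that the graph score is allowed to reorder, which is one way structural context can lift a relevant result toward rank 1.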
2. Agent Evolution — Evolutionary Agents Through Apoptosis
Agents evolving through intrinsic selection pressure (death + mutation + inheritance) produce richer knowledge graphs.
cargo run --bin phago-agent-evolution-demo
| Metric (tick 300) | Evolved | Static | Random |
|---|---|---|---|
| Nodes | 1,582 | 864 | 1,191 |
| Edges | 101,824 | 8,769 | 38,399 |
| Clustering coeff. | 0.969 | 0.948 | 0.970 |
| Spawns / Generations | 140 / 135 | 0 / 0 | 144 / 144 |
3. KG Training — Knowledge Graph to Training Data
Hebbian-weighted triples with Louvain community detection and curriculum ordering for LLM fine-tuning.
cargo run --bin phago-kg-training-demo
| Metric | Before (Label Prop) | After (Louvain) |
|---|---|---|
| Communities | 1 mega + 547 singletons | Correct structure |
| NMI vs ground truth | 0.170 | 1.000 (perfect) |
| Modularity | N/A | 0.609-0.816 |
| Triples exported | 252,641 | 252,641 |
| Foundation coherence | 100% | 100% |
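For readers unfamiliar with the NMI row, the following std-only sketch computes normalized mutual information between two labelings. It assumes the arithmetic-mean normalization, NMI = 2·I(A;B) / (H(A) + H(B)); the benchmark may use a different variant, and the function names here are illustrative only.

```rust
use std::collections::HashMap;

/// Shannon entropy (in nats) of a labeling.
fn entropy(labels: &[u32]) -> f64 {
    let n = labels.len() as f64;
    let mut counts: HashMap<u32, f64> = HashMap::new();
    for &l in labels {
        *counts.entry(l).or_insert(0.0) += 1.0;
    }
    counts
        .values()
        .map(|&c| {
            let p = c / n;
            -p * p.ln()
        })
        .sum()
}

/// NMI with arithmetic-mean normalization (an assumed variant).
/// Label ids need not match between `a` and `b`; only the grouping matters.
fn nmi(a: &[u32], b: &[u32]) -> f64 {
    assert_eq!(a.len(), b.len());
    let n = a.len() as f64;
    let mut joint: HashMap<(u32, u32), f64> = HashMap::new();
    let mut count_a: HashMap<u32, f64> = HashMap::new();
    let mut count_b: HashMap<u32, f64> = HashMap::new();
    for (&x, &y) in a.iter().zip(b) {
        *joint.entry((x, y)).or_insert(0.0) += 1.0;
        *count_a.entry(x).or_insert(0.0) += 1.0;
        *count_b.entry(y).or_insert(0.0) += 1.0;
    }
    // Mutual information: sum of p(x,y) * ln(p(x,y) / (p(x) * p(y))).
    let mi: f64 = joint
        .iter()
        .map(|(&(x, y), &c)| {
            let pxy = c / n;
            pxy * (pxy / ((count_a[&x] / n) * (count_b[&y] / n))).ln()
        })
        .sum();
    let denom = entropy(a) + entropy(b);
    // Two single-cluster labelings are treated as identical.
    if denom == 0.0 { 1.0 } else { 2.0 * mi / denom }
}

fn main() {
    let truth = [0, 0, 1, 1, 2, 2];
    let found = [7, 7, 3, 3, 9, 9]; // same grouping, different ids
    println!("NMI = {:.3}", nmi(&truth, &found));
}
```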
4. Agentic Memory — Persistent Code Knowledge
Self-organizing code knowledge graph that persists across sessions.
cargo run --bin phago-agentic-memory-demo
| Metric | Value |
|---|---|
| Code elements extracted | 830 |
| Graph nodes / edges | 659 / 33,490 |
| Session persistence | 100% fidelity |
| Graph P@5 | 0.140 |
New Features (Ralph Loop Phase 1)
Hebbian LTP Model (Tentative Edge Wiring)
- First co-occurrence creates edge at 0.1 weight (tentative)
- Subsequent co-occurrences reinforce: `weight += 0.1`
- Single-document edges decay quickly under synaptic pruning
- Cross-document reinforced edges survive
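The wiring rules above can be sketched in a few lines of std-only Rust. The 0.1 initial weight and 0.1 increment come from the list; the multiplicative decay `rate`, the pruning `threshold`, and the `HebbianGraph` type itself are assumptions made for this illustration and are not Phago's actual implementation.

```rust
use std::collections::HashMap;

/// Undirected edge key: the two node ids stored in sorted order.
type EdgeKey = (u32, u32);

/// Toy edge store (hypothetical type): one weight per undirected edge.
#[derive(Default)]
struct HebbianGraph {
    weights: HashMap<EdgeKey, f64>,
}

impl HebbianGraph {
    /// Record one co-occurrence of `a` and `b`.
    /// The first call creates the edge at 0.1; each later call adds 0.1.
    fn co_activate(&mut self, a: u32, b: u32) {
        let key = if a <= b { (a, b) } else { (b, a) };
        *self.weights.entry(key).or_insert(0.0) += 0.1;
    }

    /// One decay step: scale every weight down, then drop edges below `threshold`.
    /// Both parameters are illustrative values, not taken from Phago.
    fn decay_and_prune(&mut self, rate: f64, threshold: f64) {
        for w in self.weights.values_mut() {
            *w *= 1.0 - rate;
        }
        self.weights.retain(|_, w| *w >= threshold);
    }
}

fn main() {
    let mut g = HebbianGraph::default();
    g.co_activate(1, 2); // seen together in one document only
    for _ in 0..3 {
        g.co_activate(2, 3); // reinforced across three documents
    }
    g.decay_and_prune(0.2, 0.09);
    println!("edges left: {}", g.weights.len());
}
```

With these example parameters the tentative edge falls to 0.08 and is pruned, while the reinforced edge stays near 0.24 and survives.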
Multi-Objective Fitness
4-dimensional evolution:
- 30% Productivity — concepts + edges per tick
- 30% Novelty — novel concepts / total concepts
- 20% Quality — strong edges (co_act ≥ 2) / total edges
- 20% Connectivity — bridge edges / total edges
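As a rough illustration of how the four weights combine, the sketch below computes a single fitness value as a weighted sum. The `AgentStats` struct and the `max_rate` normalizer that squashes productivity into [0, 1] are assumptions; the source only gives the weights and the ratio definitions.

```rust
/// Per-agent counters gathered over an agent's lifetime (hypothetical struct).
struct AgentStats {
    concepts: u32,
    novel_concepts: u32,
    edges: u32,
    strong_edges: u32, // edges with co-activation count >= 2
    bridge_edges: u32,
    ticks_alive: u32,
}

/// Ratio that returns 0.0 instead of dividing by zero.
fn ratio(num: u32, den: u32) -> f64 {
    if den == 0 { 0.0 } else { num as f64 / den as f64 }
}

/// Weighted sum of the four objectives (30/30/20/20).
/// `max_rate` caps productivity at 1.0; it is an assumed normalizer.
fn fitness(s: &AgentStats, max_rate: f64) -> f64 {
    let rate = ratio(s.concepts + s.edges, s.ticks_alive);
    let productivity = (rate / max_rate).min(1.0);
    let novelty = ratio(s.novel_concepts, s.concepts);
    let quality = ratio(s.strong_edges, s.edges);
    let connectivity = ratio(s.bridge_edges, s.edges);
    0.3 * productivity + 0.3 * novelty + 0.2 * quality + 0.2 * connectivity
}

fn main() {
    let stats = AgentStats {
        concepts: 40,
        novel_concepts: 10,
        edges: 60,
        strong_edges: 30,
        bridge_edges: 6,
        ticks_alive: 50,
    };
    println!("fitness = {:.3}", fitness(&stats, 4.0));
}
```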
Structural Queries
// Path queries — "What connects A to B?"
graph.shortest_path(&from, &to) -> Option<(Vec<NodeId>, f64)>
// Centrality queries — "What's most important?"
graph.betweenness_centrality(100) -> Vec<(NodeId, f64)>
// Bridge queries — "What concepts connect domains?"
graph.bridge_nodes(10) -> Vec<(NodeId, f64)>
// Component queries — "How many disconnected regions?"
graph.connected_components() -> usize
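To show what the component query answers, here is a std-only sketch that counts connected components with a breadth-first search over an edge list. It is independent of Phago's graph type: the `u32` node ids and the free function signature are assumptions for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Count connected components of an undirected graph given as an edge list.
/// `nodes` lists every node id so that isolated nodes are counted too.
fn connected_components(nodes: &[u32], edges: &[(u32, u32)]) -> usize {
    // Build an adjacency list with both directions of each edge.
    let mut adj: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(a, b) in edges {
        adj.entry(a).or_default().push(b);
        adj.entry(b).or_default().push(a);
    }
    let mut seen: HashSet<u32> = HashSet::new();
    let mut count = 0;
    for &start in nodes {
        if !seen.insert(start) {
            continue; // already reached from an earlier component
        }
        count += 1;
        // Flood-fill everything reachable from `start`.
        let mut queue = VecDeque::from([start]);
        while let Some(n) = queue.pop_front() {
            if let Some(neighbors) = adj.get(&n) {
                for &next in neighbors {
                    if seen.insert(next) {
                        queue.push_back(next);
                    }
                }
            }
        }
    }
    count
}

fn main() {
    let nodes = [1, 2, 3, 4, 5, 6];
    let edges = [(1, 2), (2, 3), (4, 5)];
    println!("components: {}", connected_components(&nodes, &edges));
}
```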
Distributed Colony (v1.0.0)
Scale horizontally across multiple nodes:
# Start coordinator
cargo run --bin phago -- cluster start-coordinator --port 9000
# Start shards (in separate terminals)
cargo run --bin phago -- cluster start-shard --coordinator 127.0.0.1:9000 --port 9001
cargo run --bin phago -- cluster start-shard --coordinator 127.0.0.1:9000 --port 9002
# Check cluster status
cargo run --bin phago -- cluster status --coordinator 127.0.0.1:9000
# Or use Docker Compose
cd deploy && docker-compose up
Architecture:
- Consistent hash ring with 150 virtual nodes per shard for even distribution
- Ghost nodes for lazy-resolved cross-shard edge references
- Phase-synchronized ticks (Sense/Act/Decay/Advance) via barrier coordination
- Two-phase distributed TF-IDF with scatter-gather for globally accurate scoring
- tarpc RPC with connection pooling for inter-shard communication
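The first architecture point can be illustrated with a small consistent hash ring. This is a std-only sketch, not the `phago-distributed` implementation: the `HashRing` type, the use of `DefaultHasher`, and the string shard names are assumptions, while the 150 virtual nodes per shard comes from the list above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

const VIRTUAL_NODES: u32 = 150;

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Hypothetical ring: position on the ring -> owning shard name.
struct HashRing {
    ring: BTreeMap<u64, String>,
}

impl HashRing {
    fn new() -> Self {
        HashRing { ring: BTreeMap::new() }
    }

    /// Place `VIRTUAL_NODES` points for this shard around the ring.
    fn add_shard(&mut self, shard: &str) {
        for v in 0..VIRTUAL_NODES {
            self.ring.insert(hash_of(&(shard, v)), shard.to_string());
        }
    }

    /// Remove every virtual node that belongs to this shard.
    fn remove_shard(&mut self, shard: &str) {
        self.ring.retain(|_, s| s.as_str() != shard);
    }

    /// Owner is the first virtual node at or after the key's hash, wrapping around.
    fn shard_for(&self, key: &str) -> Option<&str> {
        let h = hash_of(&key);
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, s)| s.as_str())
    }
}

fn main() {
    let mut ring = HashRing::new();
    for shard in ["shard-a", "shard-b", "shard-c"] {
        ring.add_shard(shard);
    }
    for doc in ["doc1", "doc2", "doc3"] {
        println!("{} -> {:?}", doc, ring.shard_for(doc));
    }
}
```

The property that matters for resharding is that removing one shard only reassigns the keys that shard owned; every other key keeps its owner.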
MCP Server
Phago ships a standalone MCP server binary that speaks the Model Context Protocol over stdio. Compatible with Claude Desktop, Cursor, and any MCP client.
# Run the MCP server directly
cargo run --bin phago-mcp
# Or via the CLI with optional SQLite persistence
cargo run --bin phago -- mcp --db knowledge.db
Tools exposed:
- `phago_remember(title, content, ticks)`: ingest a document into the colony
- `phago_recall(query, max_results, alpha)`: hybrid query (TF-IDF + graph re-ranking)
- `phago_explore(type: path|centrality|bridges|stats)`: structural graph queries
Add to your Claude Desktop config (claude_desktop_config.json):
{
  "mcpServers": {
    "phago": {
      "command": "cargo",
      "args": ["run", "--bin", "phago-mcp", "--manifest-path", "/path/to/Phago_Project/Cargo.toml"]
    }
  }
}
Counterfactual Explanations
Answer "what if?" questions about the knowledge graph:
use phago_rag::counterfactual::*;
let intervention = Intervention::RemoveEdge {
from_label: "cell".into(),
to_label: "membrane".into(),
};
let result = counterfactual_query(&colony, &intervention, "cell biology", &Default::default());
println!("Impact: {} rank changes", result.rank_changes.len());
println!("Significant: {}", result.significant);
STDP — Directed Temporal Edges
Spike-Timing-Dependent Plasticity adds directed "predictive" edges alongside the existing undirected co-occurrence graph. Encodes "what comes next?" patterns.
use phago_runtime::stdp::StdpGraph;
let mut stdp = StdpGraph::new();
stdp.apply_sequence(&[node_a, node_b, node_c], tick);
// "What comes after A?"
let successors = stdp.successors(&node_a);
// Directed shortest path
let path = stdp.directed_shortest_path(&node_a, &node_c);
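For intuition about what "directed predictive edges" means, the following std-only sketch strengthens a `pre -> post` edge for every ordered pair in a sequence, with smaller increments for larger gaps. It is not the `StdpGraph` implementation: the `DirectedGraph` type, the `0.1 / gap` update rule, and the `u32` node ids are all assumptions for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical directed edge store: (pre, post) -> weight.
#[derive(Default)]
struct DirectedGraph {
    weights: HashMap<(u32, u32), f64>,
}

impl DirectedGraph {
    /// Strengthen pre -> post for each ordered pair in the sequence.
    /// The increment shrinks as the positional gap grows (assumed rule).
    fn apply_sequence(&mut self, seq: &[u32]) {
        for i in 0..seq.len() {
            for j in (i + 1)..seq.len() {
                let gap = (j - i) as f64;
                *self.weights.entry((seq[i], seq[j])).or_insert(0.0) += 0.1 / gap;
            }
        }
    }

    /// Successors of `node`, strongest prediction first.
    fn successors(&self, node: u32) -> Vec<(u32, f64)> {
        let mut out: Vec<(u32, f64)> = self
            .weights
            .iter()
            .filter(|((pre, _), _)| *pre == node)
            .map(|((_, post), w)| (*post, *w))
            .collect();
        out.sort_by(|a, b| b.1.total_cmp(&a.1));
        out
    }
}

fn main() {
    let mut g = DirectedGraph::default();
    g.apply_sequence(&[1, 2, 3]);
    println!("after 1: {:?}", g.successors(1));
    println!("after 3: {:?}", g.successors(3));
}
```

Because only forward pairs are reinforced, the edge `1 -> 2` exists while `2 -> 1` does not, which is what distinguishes this graph from the undirected co-occurrence graph.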
Graph Diffing
Compare two colony snapshots to understand knowledge graph evolution:
use phago_runtime::diff::diff_sessions;
let diff = diff_sessions(&before_state, &after_state);
println!("{}", diff.summary());
// "Since tick 0 → 50: +15 nodes, -3 nodes, +20 edges, -5 edges, 12 edges strengthened"
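The kind of summary shown above can be derived from two edge maps with plain set operations. The sketch below is a std-only illustration, not the `diff_sessions` implementation: the `Edges` alias, the `EdgeDiff` struct, and the `weakened` counter are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical snapshot representation: (from, to) -> weight.
type Edges = HashMap<(u32, u32), f64>;

#[derive(Debug, Default, PartialEq)]
struct EdgeDiff {
    added: usize,
    removed: usize,
    strengthened: usize,
    weakened: usize,
}

/// Compare two snapshots edge by edge.
fn diff_edges(before: &Edges, after: &Edges) -> EdgeDiff {
    let mut d = EdgeDiff::default();
    for (key, &w_after) in after {
        match before.get(key) {
            None => d.added += 1,
            Some(&w_before) if w_after > w_before => d.strengthened += 1,
            Some(&w_before) if w_after < w_before => d.weakened += 1,
            Some(_) => {} // unchanged weight
        }
    }
    // Edges present before but missing afterwards.
    d.removed = before.keys().filter(|k| !after.contains_key(*k)).count();
    d
}

fn main() {
    let before: Edges = HashMap::from([((1, 2), 0.1), ((2, 3), 0.5), ((3, 4), 0.2)]);
    let after: Edges = HashMap::from([((1, 2), 0.3), ((2, 3), 0.5), ((4, 5), 0.1)]);
    println!("{:?}", diff_edges(&before, &after));
}
```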
AST-Based Code Digester
Tree-sitter powered multi-language code parsing (Rust, Python, JavaScript):
[dependencies]
phago-agents = { version = "1.0", features = ["ast"] }
use phago_agents::ast_digester::{AstDigester, CodeLanguage};
let digester = AstDigester::new(CodeLanguage::Rust);
let elements = digester.extract_symbols(source_code, "main.rs");
// Extracts functions, structs, enums, traits, impls, modules — via AST, not regex
Lamarckian LLM-Guided Evolution
When an agent dies, optionally feed its death context + genome to an LLM for targeted evolution advice:
use phago_agents::lamarckian::*;
let advisor = MockAdvisor::new().with_suggestion("sense_radius", 15.0);
let evolved = evolve_genome(&genome, &death_context, &advisor, 0.1, seed);
// Applies LLM-suggested patches + Darwinian mutation
// Falls back to pure Darwinian mutation if advisor returns no patches
Architecture
crates/
├── phago/ # Unified facade crate (use this!)
├── phago-cli/ # CLI (ingest, query, stats, session, cluster)
├── phago-core/ # Traits (10 primitives) + shared types + Louvain
├── phago-runtime/ # Colony, substrate, topology, sessions, SQLite, async, STDP, diff
├── phago-agents/ # Digester, Sentinel, Synthesizer, genome, AST digester, Lamarckian evolution
├── phago-embeddings/ # Vector embeddings (Simple, ONNX, API providers)
├── phago-llm/ # LLM integration (Ollama, Claude, OpenAI)
├── phago-rag/ # Query engine, hybrid scoring, counterfactual engine
├── phago-mcp/ # Standalone MCP server (stdio transport, rmcp)
├── phago-viz/ # Self-contained HTML visualization (D3.js)
├── phago-web/ # Axum web dashboard + WebSocket
├── phago-python/ # PyO3 bindings (LangChain, LlamaIndex)
├── phago-vectors/ # Vector DB adapters (Qdrant, Pinecone, Weaviate)
├── phago-distributed/ # Multi-node sharding, tarpc RPC, consistent hashing
└── phago-wasm/ # WASM integration (future)
poc/
├── knowledge-ecosystem/ # Full system demo (120-tick simulation)
├── bio-rag-demo/ # Hybrid retrieval benchmark
├── agent-evolution-demo/ # Evolutionary agents experiment
├── kg-training-demo/ # Curriculum ordering with Louvain
├── agentic-memory-demo/ # Persistent code knowledge
└── data/corpus/ # 100-doc test corpus (4 topics × 25 docs)
deploy/
└── docker-compose.yml # Distributed cluster deployment
docs/
├── ABOUT_PHAGO.md # Comprehensive project paper
├── papers/ # Research branch whitepapers
└── ... # Integration guide, executive summary, etc.
Colony Lifecycle (per tick)
- Sense — All agents observe substrate (signals, documents, traces)
- Act — Colony processes agent actions (move, digest, present, wire)
- Transfer — Agents export/integrate vocabulary, attempt symbiosis
- Dissolve — Mature agents modulate boundaries, reinforce graph nodes
- Death — Remove agents that self-assessed for termination
- Decay — Signals, traces, and edge weights decay; weak edges pruned
Key Design Choices
- Rust ownership = biological resource management. `move` semantics model consumption (you can't eat something twice). `Drop` models apoptosis. No garbage collector = deterministic death.
- The graph IS the memory. No separate storage layer. The topology of the knowledge graph, shaped by Hebbian learning, encodes all accumulated knowledge.
- No LLMs in the loop. The v0.1 primitives must prove emergence without external intelligence. The framework is designed for LLM-backed agents in future versions.
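The ownership analogy in the first design choice can be seen in a few lines of plain Rust. The `Document` and `Agent` types below are simplified stand-ins invented for this sketch, not Phago's real types; the language behavior (move on pass-by-value, `Drop` running at scope exit) is standard Rust.

```rust
/// Simplified stand-in for a document in the substrate.
struct Document {
    title: String,
}

/// Simplified stand-in for an agent.
struct Agent {
    name: String,
    digested: Vec<String>,
}

impl Agent {
    /// Takes the document by value: after this call the caller no longer owns it.
    fn digest(&mut self, doc: Document) {
        self.digested.push(doc.title);
    }
}

impl Drop for Agent {
    /// Runs exactly once, at the point where the agent goes out of scope.
    fn drop(&mut self) {
        println!("{} died after digesting {} document(s)", self.name, self.digested.len());
    }
}

fn main() {
    let doc = Document { title: "membrane transport".to_string() };
    {
        let mut agent = Agent { name: "digester-1".to_string(), digested: Vec::new() };
        agent.digest(doc);
        // agent.digest(doc); // would not compile: `doc` was moved above
    } // `agent` leaves scope here, so `drop` runs at a known point
    println!("colony continues");
}
```

Uncommenting the second `digest` call produces a compile-time "use of moved value" error, which is the "can't eat something twice" rule enforced by the type system rather than by a runtime check.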
Quantitative Proof (Phase 5)
Running cargo run --bin phago-poc produces metrics proving the model works:
| Metric | What It Proves |
|---|---|
| Transfer Effect | Vocabulary sharing across agents (shared terms ratio, export/integration counts) |
| Dissolution Effect | Boundary modulation reinforces knowledge (concept vs non-concept access ratio) |
| Graph Richness | Colony builds meaningful structure (density, clustering coefficient, bridge concepts) |
| Vocabulary Spread | Knowledge propagates across agents (Gini coefficient of vocabulary sizes) |
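Since the Vocabulary Spread metric relies on a Gini coefficient, here is a short std-only sketch of that calculation over per-agent vocabulary sizes. The function name and the example sizes are illustrative; how the POC itself computes the coefficient is not shown in this document.

```rust
/// Gini coefficient of non-negative values.
/// 0.0 means all values are equal; values near 1.0 mean one holder dominates.
fn gini(values: &[f64]) -> f64 {
    let n = values.len();
    let total: f64 = values.iter().sum();
    if n == 0 || total == 0.0 {
        return 0.0;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n, with i = 1..=n
    let weighted: f64 = sorted
        .iter()
        .enumerate()
        .map(|(i, x)| (i as f64 + 1.0) * x)
        .sum();
    2.0 * weighted / (n as f64 * total) - (n as f64 + 1.0) / n as f64
}

fn main() {
    // Hypothetical vocabulary sizes for four agents.
    let vocab_sizes = [12.0, 15.0, 9.0, 14.0];
    println!("gini = {:.3}", gini(&vocab_sizes));
}
```

A low value indicates that vocabulary is spread evenly across agents, which is the outcome horizontal transfer is meant to produce.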
The POC also generates output/phago-colony.html — an interactive D3.js visualization with:
- Force-directed knowledge graph
- Agent spatial canvas
- Event timeline
- Metrics dashboard with tick slider
Implementation Status
| Phase | Version | Status | Description |
|---|---|---|---|
| 0-4 — Core Framework | 0.1.0 | ✅ Done | 10 primitives, 3 agent types, colony lifecycle |
| 5-6 — Research | 0.2.0 | ✅ Done | 4 branches with prototypes, benchmarks, papers |
| 7-8 — Production | 0.2.0 | ✅ Done | Facade crate, CLI, preludes, error types |
| 9 — Semantic Intelligence | 0.3.0 | ✅ Done | Embeddings, LLM backends, semantic wiring |
| 10 — Persistence & Scale | 0.3.0 | ✅ Done | SQLite, async runtime, agent serialization |
| Config File Support | 0.3.0 | ✅ Done | phago.toml with ColonyBuilder integration |
| Web Dashboard | 0.4.0 | ✅ Done | Axum + D3.js real-time colony visualization |
| Python Bindings | 0.5.0 | ✅ Done | PyO3 with LangChain and LlamaIndex adapters |
| Louvain Communities | 0.5.0 | ✅ Done | Perfect NMI = 1.0 on synthetic benchmarks |
| Streaming Ingestion | 0.6.0 | ✅ Done | Async channels, backpressure, file watching |
| Vector DB Integration | 0.7.0 | ✅ Done | Qdrant, Pinecone, Weaviate adapters |
| Distributed Colony | 1.0.0 | ✅ Done | Sharding, tarpc RPC, consistent hashing, ghost nodes |
| CI/CD Pipeline | 1.1.0 | ✅ Done | GitHub Actions: test, lint, feature matrix |
| MCP Server | 1.1.0 | ✅ Done | Standalone binary, rmcp 0.15, stdio transport |
| PyPI Publication | 1.1.0 | ✅ Done | maturin build + publish workflow |
| STDP Directed Edges | 1.1.0 | ✅ Done | Directed temporal graph, predictive edges |
| Graph Diffing | 1.1.0 | ✅ Done | Structural changelog between snapshots |
| Counterfactual Engine | 1.1.0 | ✅ Done | "What if?" queries with intervention analysis |
| AST Code Digester | 1.1.0 | ✅ Done | tree-sitter for Rust, Python, JavaScript |
| Lamarckian Evolution | 1.1.0 | ✅ Done | LLM-guided genome patches + Darwinian mutation |
Tests
# All tests (excludes phago-python, which requires maturin, and phago-web)
cargo test --workspace --exclude phago-python --exclude phago-web
# Distributed crate tests (146 unit + 9 integration)
cargo test -p phago-distributed
# By category
cargo test --test transfer_tests # Vocabulary export/import
cargo test --test symbiosis_tests # Agent absorption
cargo test --test dissolution_tests # Boundary modulation
cargo test --test phase4_integration # Full colony integration
cargo test -p phago-runtime metrics # Quantitative metrics
# Distributed benchmarks
cargo run --bin phago-bench -- quick
Benchmark Results
| Category | Metric | Result |
|---|---|---|
| Throughput | Ticks/sec (small colony) | 733 |
| SQLite | Save/load time | <1ms |
| Async | Overhead vs sync | <5% |
| Serialization | 200 agents | 8µs |
| Semantic wiring | Overhead | ~11% |
Documentation
- `docs/ABOUT_PHAGO.md`: About Phago, the comprehensive project paper (v1.0.0)
- `docs/INTEGRATION_GUIDE.md`: How to use Phago (installation, examples, API reference)
- `docs/papers/phago-whitepaper-v2.md`: Main whitepaper (v2.0), the technical paper
- `docs/EXECUTIVE_SUMMARY.md`: Latest results and roadmap
- `docs/COMPETITIVE_ANALYSIS.md`: Where Phago wins vs traditional approaches
- `docs/USE_CASES.md`: Practical applications
- `docs/WHITEPAPER.md`: Original theoretical foundation
- `docs/NEXT_PRIORITIES.md`: Development plan (all 7 priorities complete)
Research Papers
| Branch | White Paper | Explainer |
|---|---|---|
| Bio-RAG | bio-rag-whitepaper.md | bio-rag-explainer.md |
| Agent Evolution | agent-evolution-whitepaper.md | agent-evolution-explainer.md |
| KG Training | kg-training-whitepaper.md | kg-training-explainer.md |
| Agentic Memory | agentic-memory-whitepaper.md | agentic-memory-explainer.md |
License
MIT