Thanks to visit codestin.com
Credit goes to github.com

Skip to content

v3: semantic query with embeddings #1

@safishamsi

Description

@safishamsi

Problem

Current /graphify query is BFS keyword matching - same as grep with graph traversal. Searching "find what handles authentication" only works if the word "auth" appears in node labels.

Goal

Replace keyword BFS with embedding-based semantic search so queries find concepts by meaning, not exact string match.

Plan

Embedding backend (local by default):

  • sentence-transformers with all-MiniLM-L6-v2 (80MB, no API key, works offline)
  • Optional: OpenAI embeddings API, nomic-embed via ollama

What changes:

  • On graph build, embed every node label + source context, store vectors in graph.json
  • /graphify query computes query embedding, ranks nodes by cosine similarity, then does BFS from top-k hits
  • semantically_similar_to edge detection can use embeddings instead of LLM (faster, cheaper)
  • Node similarity surfaced in graph visualization

New optional dependency:

pip install graphifyy[embeddings]

Why this matters

This is the difference between a search tool and an understanding tool. "Find what connects the optimizer to the attention mechanism" should work even if those exact words don't appear together anywhere in the codebase.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions