Feature Request: Pluggable Retriever Interface and External HTTP Retriever Adapter #16

@0x-Professor

TL;DR
Let anyone plug in their own retriever (including private/proprietary ones) via a simple, secure HTTP contract or a lightweight in-repo adapter. Registering a new retriever requires no code changes, and the feature works with the existing dynamic flow, voting/Elo, and the leaderboard.

Why this matters

  • Teams want to compare their private retrievers without exposing IP or forking.
  • Maintainers need a safe, scalable way to add lots of retrievers with consistent I/O and limits.
  • Unlocks the core value: easy, apples-to-apples RAG evaluations using real user feedback.

What we’re proposing

  • Standard plugin contract
    • Input: query, topK, metadata
    • Output: docs[] with text/score/metadata, optional latency
    • Health check endpoint
    • Strict schema validation (e.g., Zod)
  • External HTTP Retriever adapter
    • POST standard input to a user’s HTTPS endpoint; expect standard output
    • Timeouts, retries, response-size limits, and per-retriever rate limiting
    • Domain allowlist to prevent SSRF; request logging for auditability
  • Config-driven registry
    • Supabase table for retrievers (id, key, type, config, limits, enabled)
    • Seed/migrations included; admin UI to add/edit, health-check, enable/disable
    • Dynamic loading so new retrievers appear without code changes
  • Seamless scoring/leaderboard
    • Keep current voting/Elo; associate votes with retriever ids for clean history
  • Good developer experience (DX)
    • Example external server (Flask/Worker) + copy-pasteable schema
    • Clear docs: how to register, secure, and troubleshoot
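To make the proposed contract concrete, here is a minimal TypeScript sketch of the input/output shapes, a validator, and a domain allowlist check. It is an illustration under assumptions, not the project's code: the names `parseRetrieverOutput`, `isAllowedEndpoint`, and `latencyMs` are hypothetical, and the validator uses plain checks to stay dependency-free where the proposal suggests a schema library such as Zod.

```typescript
// Hypothetical contract shapes. Only query/topK/metadata and docs with
// text/score/metadata come from the proposal; everything else is assumed.
type Metadata = Record<string, unknown>;

interface RetrieverInput {
  query: string;
  topK: number;
  metadata?: Metadata;
}

interface RetrievedDoc {
  text: string;
  score: number;
  metadata?: Metadata;
}

interface RetrieverOutput {
  docs: RetrievedDoc[];
  latencyMs?: number; // assumed name for the optional latency field
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a typed output, or throws an Error naming the first invalid field.
function parseRetrieverOutput(raw: unknown, topK: number): RetrieverOutput {
  if (!isRecord(raw)) throw new Error("response must be an object");
  const rawDocs = raw.docs;
  if (!Array.isArray(rawDocs)) throw new Error("docs must be an array");
  if (rawDocs.length > topK) {
    throw new Error(`expected at most ${topK} docs, got ${rawDocs.length}`);
  }
  const docs = rawDocs.map((d: unknown, i: number): RetrievedDoc => {
    if (!isRecord(d)) throw new Error(`docs[${i}] must be an object`);
    const text = d.text;
    const score = d.score;
    const meta = d.metadata;
    if (typeof text !== "string") throw new Error(`docs[${i}].text must be a string`);
    if (typeof score !== "number" || !Number.isFinite(score)) {
      throw new Error(`docs[${i}].score must be a finite number`);
    }
    let metadata: Metadata | undefined;
    if (meta !== undefined) {
      if (!isRecord(meta)) throw new Error(`docs[${i}].metadata must be an object`);
      metadata = meta;
    }
    return { text, score, metadata };
  });
  let latencyMs: number | undefined;
  const latency = raw.latencyMs;
  if (latency !== undefined) {
    if (typeof latency !== "number" || latency < 0) {
      throw new Error("latencyMs must be a non-negative number");
    }
    latencyMs = latency;
  }
  return { docs, latencyMs };
}

// Accepts only HTTPS URLs whose hostname exactly matches an allowlist entry.
function isAllowedEndpoint(endpoint: string, allowlist: string[]): boolean {
  let url: URL;
  try {
    url = new URL(endpoint);
  } catch {
    return false; // not a parseable absolute URL
  }
  return url.protocol === "https:" && allowlist.includes(url.hostname);
}
```

A hostname allowlist on its own does not cover every SSRF path (for example, a listed host that resolves to an internal address), so it would sit alongside the timeouts and response-size limits listed above rather than replace them.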

How it works (at a glance)

  1. Admin registers a retriever (built-in or HTTP) in the registry.
  2. Dynamic route loads retrievers from config.
  3. Arena calls retrievers in parallel with standard input.
  4. Results are validated, rate-limited, and logged.
  5. Users vote; Elo updates by retriever id.
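Steps 3 to 5 could look roughly like the TypeScript sketch below: enabled retrievers are called in parallel with a per-retriever timeout, failures are isolated, and a vote updates Elo ratings keyed by retriever id. All names here (`RegisteredRetriever`, `callAll`, `updateElo`) are hypothetical, and the K-factor of 32 and starting rating of 1500 are conventional Elo defaults assumed for illustration; the arena's existing Elo settings may differ.

```typescript
type Docs = { text: string; score: number }[];
type Retrieve = (query: string, topK: number) => Promise<{ docs: Docs }>;

// Hypothetical in-memory view of one registry row.
interface RegisteredRetriever {
  id: string;
  enabled: boolean;
  timeoutMs: number;
  retrieve: Retrieve;
}

type CallResult =
  | { id: string; ok: true; docs: Docs }
  | { id: string; ok: false; error: string };

// Rejects if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Calls every enabled retriever in parallel; one failure does not affect the rest.
async function callAll(
  retrievers: RegisteredRetriever[],
  query: string,
  topK: number,
): Promise<CallResult[]> {
  const active = retrievers.filter((r) => r.enabled);
  const settled = await Promise.allSettled(
    active.map((r) => withTimeout(r.retrieve(query, topK), r.timeoutMs)),
  );
  return settled.map((s, i): CallResult =>
    s.status === "fulfilled"
      ? { id: active[i].id, ok: true, docs: s.value.docs }
      : { id: active[i].id, ok: false, error: String(s.reason) },
  );
}

// Standard Elo update for one pairwise vote, keyed by retriever id.
function updateElo(
  ratings: Map<string, number>,
  winnerId: string,
  loserId: string,
  k = 32, // assumed K-factor
): void {
  const rw = ratings.get(winnerId) ?? 1500; // assumed starting rating
  const rl = ratings.get(loserId) ?? 1500;
  const expectedWin = 1 / (1 + 10 ** ((rl - rw) / 400));
  const delta = k * (1 - expectedWin);
  ratings.set(winnerId, rw + delta);
  ratings.set(loserId, rl - delta);
}
```

Using `Promise.allSettled` rather than `Promise.all` means a slow or failing external endpoint shows up as an error entry for that retriever id instead of failing the whole comparison.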

Alternatives considered

  • PRs per retriever: doesn’t scale; risks IP leakage.
  • Sidecar containers: heavier ops.
  • Arbitrary code hooks: security risk.

Impact

  • Fast onboarding for proprietary retrievers (minutes, not days).
  • Safer, scalable growth for maintainers.
  • Richer, more meaningful comparisons for everyone.

Priority
High
