
Feature Request: per-chat prompt caching #14470

Open
@betweenus

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

A separate prompt cache for each chat, plus the default cache (as it works now).

Motivation

When switching between chats in the web interface (or via the API), the prompt is recalculated each time. Caching does not help here because the prompt in each chat is radically different. If a request could specify a string identifier under which a separate cache is allocated and reused, there would be a noticeable performance gain, especially when several chats are in use and the user switches between them frequently.

Possible Implementation

Add a string parameter `cacheid` to the server endpoints (including the OpenAI-compatible ones). If it is specified, a separate cache allocated under that identifier is used. If the parameter is empty or absent from the request, the default cache is used, as it works now. A minimal client-side sketch is shown below.
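
To make the proposal concrete, here is a minimal sketch of what a client request could look like, assuming the parameter is named `cacheid` and is accepted by the existing `/completion` endpoint. The field itself is the proposed addition and is not part of the current server API.

```python
# Sketch only: "cacheid" is the proposed parameter, not an existing field.
import json
import urllib.request

SERVER = "http://localhost:8080"  # assumed llama-server address

def complete(prompt, cacheid=None):
    """Send a completion request, optionally tagged with a per-chat cache id."""
    payload = {"prompt": prompt, "n_predict": 64}
    if cacheid is not None:
        # Proposed behaviour: use the prompt cache allocated under this
        # identifier; omitting the field falls back to the default cache.
        payload["cacheid"] = cacheid
    req = urllib.request.Request(
        f"{SERVER}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

# Switching between chats would no longer evict each other's prompt cache:
print(complete("Chat A history ...", cacheid="chat-a"))
print(complete("Chat B history ...", cacheid="chat-b"))
print(complete("Chat A history ... follow-up", cacheid="chat-a"))  # reuses chat-a cache
```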

Labels

enhancement (New feature or request)
