kaydotai/freak

FREAK

Finance RAG Evaluation for Assessing Kay

FREAK aims to provide an evaluation framework for running analyses against Kay's finance data RAG. Plug in your API call and call run — we'll take care of the rest.

Quick Start

  1. In tester.py, fill out the call_my_code function with a call to your API and convert the response into a RagResult (more details below; a sketch follows at the end of this section).
  2. Get a Cohere API key (if you don't have one, you can create one here - cohere signup).
  3. Run python tester.py --cohere-api-key <COHERE_KEY> --kay-api-key <KAY_API_KEY> [--verbose] [--save-chunks-file </path/output_file_name.json>] [--query-override-file </path/to/query.txt>]

If you want to add custom queries, use the --query-override-file flag. To see all the chunks that each API outputs, use the --save-chunks-file flag.

For a full breakdown of the command line flags available, run python tester.py --help.
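
For step 1, a filled-in call_my_code might look roughly like the sketch below. The client import, its search method, the "passages" field, and the query argument are placeholders for whatever your retriever actually exposes, not part of FREAK; RagResult and RagDocument (described under Detailed Usage below) are assumed to already be in scope in tester.py.

# Sketch only: MyRetrieverClient, search(), and the "passages" field are
# hypothetical stand-ins for your own API and its response shape.
from my_api_client import MyRetrieverClient

def call_my_code(query):
    client = MyRetrieverClient(api_key="<YOUR_API_KEY>")
    response = client.search(query)  # assumed to return retrieved passages

    # Wrap each retrieved passage in a RagDocument, then bundle them into a RagResult.
    docs = [RagDocument(text=passage) for passage in response["passages"]]
    return RagResult(docs=docs)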

Detailed Usage

Details on RagResult

RagResult helps us bring retrieved results into a common structure for easy comparison.

RagResult is an array of RagDocument objects.

RagResult(
    docs=[
        RagDocument(text="<TEXT_OF_DOC_1_HERE>"),
        RagDocument(text="<TEXT_OF_DOC_2_HERE>"),
    ],
)

The full definition is here - RagResult

Details on RagDocument

RagDocument holds the raw text of the retrieved context, along with optional metadata.

doc1 = RagDocument(text="<TEXT_OF_YOUR_DOC_HERE>")

The full definition is here - RagDocument

Why do I need a cohere API key?

We use Cohere re-ranking scores as a proxy for the relevance of retrieved context. While we acknowledge the evident shortcomings, this is a quick way to sanity-check two retriever systems against each other without a golden test set. If you have a golden test set internally, we can add more metrics to compare the two retriever systems.
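
For intuition, the snippet below illustrates the general idea of using rerank scores as a proxy metric with the Cohere Python SDK: score each system's retrieved chunks against the query and compare the averages. This is a standalone sketch rather than FREAK's implementation, and the model name and example inputs are assumptions.

# Standalone sketch of the proxy metric, not FREAK's implementation:
# rerank each system's chunks against the query and compare mean relevance scores.
import cohere

def mean_relevance(co, query, chunks):
    # The model name is an assumption; use whichever rerank model you have access to.
    response = co.rerank(model="rerank-english-v3.0", query=query, documents=chunks)
    scores = [result.relevance_score for result in response.results]
    return sum(scores) / len(scores) if scores else 0.0

co = cohere.Client("<COHERE_KEY>")
query = "What was the revenue growth mentioned in the latest 10-K?"
score_a = mean_relevance(co, query, ["<chunk from retriever A>", "<another chunk>"])
score_b = mean_relevance(co, query, ["<chunk from retriever B>", "<another chunk>"])
print(f"retriever A: {score_a:.3f}  retriever B: {score_b:.3f}")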

Contribution

At Kay, we are pushing the boundaries of RAG. One of the biggest challenges is continuously and accurately evaluating a retriever system. The intention behind this library is twofold:

  1. We use it internally to test improvements confidently and track changes.
  2. We have made it publicly available so that our users can test Kay's retriever system against their own.

With that in mind, we would love for you to contribute to this package.
