A next-generation LLM evaluation framework powered by Vitest.
Documentation | Getting Started | Examples
- ✅ Vitest-like API: easily define and run evals the same way you run tests.
- ✅ Local Dataset: quickly define and generate datasets that are stored locally or in a database.
- ✅ Scorer framework: easily define and use scorers to evaluate your LLM.
- ✅ CI ready: easily integrate with your CI pipeline and run evals the same way you run tests.
- 🧪 UI for viewing the results of Evals (alpha)
- (SOON) Report Exports: custom reporters to help you visualize your evals or upload them to a third-party service.
- (SOON) Remote Datasets: easily define and use datasets stored in a database, an S3 bucket, or a third-party service.
```sh
npm install viteval
```
You can use viteval to evaluate your LLM and add tests to your CI in a single file.
```ts
import { evaluate, scorers } from 'viteval';
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

evaluate('Color detection', {
  data: async () => [
    { input: 'What color is the sky?', expected: 'Blue' },
    { input: 'What color is grass?', expected: 'Green' },
  ],
  task: async (input) => {
    // generateText takes an options object with a model and a prompt.
    const result = await generateText({
      model: openai('gpt-4o'),
      prompt: input,
    });
    return result.text;
  },
  scorers: [scorers.levenshtein],
  threshold: 0.8,
});
```
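The `scorers.levenshtein` scorer used above grades the model's output by string similarity to the expected answer. As a mental model (a sketch of the general technique, not viteval's actual implementation), a normalized Levenshtein score computes the edit distance between the two strings and scales it into the 0–1 range:

```typescript
// Normalized Levenshtein similarity: 1 means identical, 0 means maximally different.
// Classic single-row dynamic-programming edit distance, scaled by the longer length.
function levenshteinScore(output: string, expected: string): number {
  const m = output.length;
  const n = expected.length;
  if (m === 0 && n === 0) return 1;
  // dp[j] holds the edit distance for the current row.
  const dp: number[] = Array.from({ length: n + 1 }, (_, j) => j);
  for (let i = 1; i <= m; i++) {
    let prev = dp[0]; // corresponds to dp[i-1][j-1]
    dp[0] = i;
    for (let j = 1; j <= n; j++) {
      const tmp = dp[j];
      const cost = output[i - 1] === expected[j - 1] ? 0 : 1;
      dp[j] = Math.min(dp[j] + 1, dp[j - 1] + 1, prev + cost);
      prev = tmp;
    }
  }
  return 1 - dp[n] / Math.max(m, n);
}
```

A score of 1 for `"Blue"` vs `"Blue"` passes the `threshold: 0.8` in the example, while a partially wrong answer degrades proportionally to how many character edits it is away from the expected text.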
Now you can run the eval with:

```sh
npx viteval
```
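Because evals run like tests, wiring them into CI is just a matter of invoking the CLI in your pipeline. A minimal GitHub Actions sketch (the workflow name, Node version, and secret name are assumptions, not part of viteval):

```yaml
name: evals
on: [pull_request]
jobs:
  evals:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci
      # Provide whatever model credentials your tasks need.
      - run: npx viteval
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```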