Arona is Elysia's AI-powered documentation search, built with Retrieval-Augmented Generation (RAG)
Standard RAG
When a user asks a question, Arona queries the vector database for the most relevant content, then uses that content to answer the question
It clones the Elysia documentation and indexes the content into both an embedding index and a BM25 index
The content is chunked and vectorized, then stored in Postgres. Indexing is diff-aware: only new or changed content is re-indexed
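A minimal sketch of the diff-aware indexing step. The `chunks` table layout, the heading-based chunker, and the helper names are illustrative assumptions, not Arona's actual code; the embeddings endpoint and `text-embedding-3-small` model are OpenAI's documented API:

```ts
import postgres from 'postgres'
import { createHash } from 'node:crypto'

const sql = postgres(process.env.DATABASE_URL!)

const hash = (text: string) => createHash('sha256').update(text).digest('hex')

// Naive chunker: split markdown at headings (Arona's real chunking may differ)
const chunk = (markdown: string) => markdown.split(/\n(?=#{1,3} )/)

// OpenAI embeddings API with the "small" model mentioned in the stack below
const embed = async (input: string) => {
    const res = await fetch('https://api.openai.com/v1/embeddings', {
        method: 'POST',
        headers: {
            'content-type': 'application/json',
            authorization: `Bearer ${process.env.OPENAI_API_KEY}`
        },
        body: JSON.stringify({ model: 'text-embedding-3-small', input })
    })
    const { data } = (await res.json()) as { data: { embedding: number[] }[] }
    return data[0].embedding
}

// Assumes a unique constraint on (path, chunk_index) and a pgvector column
export const indexPage = async (path: string, markdown: string) => {
    for (const [i, content] of chunk(markdown).entries()) {
        const contentHash = hash(content)

        // Diff awareness: skip chunks whose content hash hasn't changed
        const [existing] = await sql`
            SELECT content_hash FROM chunks
            WHERE path = ${path} AND chunk_index = ${i}`
        if (existing?.content_hash === contentHash) continue

        const embedding = JSON.stringify(await embed(content))
        await sql`
            INSERT INTO chunks (path, chunk_index, content, content_hash, embedding)
            VALUES (${path}, ${i}, ${content}, ${contentHash}, ${embedding}::vector)
            ON CONFLICT (path, chunk_index) DO UPDATE
            SET content = EXCLUDED.content,
                content_hash = EXCLUDED.content_hash,
                embedding = EXCLUDED.embedding`
    }
}
```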
The indexing process is automated by:
- A webhook that fires when the Elysia documentation updates (sketched below)
- A cron job every 6 hours as a fallback
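A minimal sketch of both triggers with Elysia, assuming a shared-secret header for webhook authentication; the route, header name, and `reindex` stub are hypothetical:

```ts
import { Elysia } from 'elysia'

// Hypothetical helper: pulls the docs repo and re-runs indexPage() per file
const reindex = async () => {
    /* git pull + indexPage(path, markdown) for each changed file */
}

new Elysia()
    .post('/webhook/docs-updated', async ({ headers, set }) => {
        // Reject calls that don't carry the shared secret (assumed scheme)
        if (headers['x-webhook-secret'] !== process.env.WEBHOOK_SECRET) {
            set.status = 401
            return 'unauthorized'
        }
        await reindex()
        return 'ok'
    })
    .listen(3000)

// Fallback: re-index every 6 hours in case a webhook is missed
// (the official @elysiajs/cron plugin would also work here)
setInterval(() => void reindex(), 6 * 60 * 60 * 1000)
```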
- Off-the-shelf RAG is expensive
We looked for an API-based RAG provider, but couldn't find one that fit our needs and budget
Kapa, Mendable, and other RAG providers don't list their prices upfront, and we heard it can cost up to $1,000 per month, which is not affordable for us, especially at an early stage when we need to be frugal with our expenses
We don't want to move to a full API documentation provider because we have already invested in the VitePress and Vue ecosystem for building documentation, e.g. the Interactive Tutorial, the Playground, etc.
- We have already invested in VitePress
We don't want to lose the benefits of having our own documentation site by moving to a third-party provider that may not offer the same level of customization and control as we have now
Because of all this, we decided to build our own RAG system, which is more cost-effective and gives us more control over the data and the features we want to implement
There are several ways to build RAG; we chose the most cost-effective one
This also gives us the freedom to customize the system to our specific needs and use cases, rather than being limited by the features and capabilities of a third-party provider
How a question is answered:
- The user requests a Proof of Work challenge to prove they are not a bot, which prevents abuse and spam
- The user sends the question along with the PoW solution, a Turnstile token, and a checksum to prevent abuse (verification is sketched after this list)
- The question is then normalized by a small model for semantic search and caching (see the normalization sketch below)
- The question is sent to the main model for answering; the model has tools for searching and reading pages
- The normalized question is used to query with both BM25 and vector search (see the hybrid search sketch below)
- Relevant artifacts are cached: the normalized query, its embedding, and the question-answer pair
- The answer is returned to the user and also stored in the cache for future reference
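A minimal sketch of the abuse checks. The sha256 leading-zeros PoW scheme, the field names, and the difficulty are illustrative assumptions; the Turnstile `siteverify` call is Cloudflare's documented API:

```ts
import { Elysia, t } from 'elysia'
import { createHash } from 'node:crypto'

// Hypothetical PoW scheme: the client must find a nonce such that
// sha256(challenge + nonce) starts with `difficulty` zero hex digits
const verifyPoW = (challenge: string, nonce: string, difficulty = 4) =>
    createHash('sha256')
        .update(challenge + nonce)
        .digest('hex')
        .startsWith('0'.repeat(difficulty))

// Cloudflare Turnstile server-side check (documented siteverify endpoint)
const verifyTurnstile = async (token: string) => {
    const res = await fetch(
        'https://challenges.cloudflare.com/turnstile/v0/siteverify',
        {
            method: 'POST',
            body: new URLSearchParams({
                secret: process.env.TURNSTILE_SECRET!,
                response: token
            })
        }
    )
    const { success } = (await res.json()) as { success: boolean }
    return success
}

new Elysia().post(
    '/ask',
    async ({ body, set }) => {
        // The checksum check from the flow above is omitted for brevity
        if (
            !verifyPoW(body.challenge, body.nonce) ||
            !(await verifyTurnstile(body.token))
        ) {
            set.status = 403
            return { error: 'verification failed' }
        }
        return { ok: true } // continue to normalization and retrieval
    },
    {
        body: t.Object({
            question: t.String(),
            challenge: t.String(),
            nonce: t.String(),
            token: t.String()
        })
    }
)
```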
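A sketch of normalization and caching, assuming GPT OSS 20B is served behind an OpenAI-compatible `/chat/completions` endpoint and Dragonfly is reached through a Redis client. Only an exact-match cache on the normalized question is shown here; Arona's semantic cache also matches by embedding similarity:

```ts
import Redis from 'ioredis'

const cache = new Redis(process.env.DRAGONFLY_URL!) // Dragonfly speaks the Redis protocol

// Normalize the question with the small model (OpenAI-compatible API assumed)
const normalize = async (question: string) => {
    const res = await fetch(`${process.env.LLM_BASE_URL}/chat/completions`, {
        method: 'POST',
        headers: {
            'content-type': 'application/json',
            authorization: `Bearer ${process.env.LLM_API_KEY}`
        },
        body: JSON.stringify({
            model: 'gpt-oss-20b',
            messages: [
                {
                    role: 'system',
                    content: 'Rewrite the question as a short, canonical search query.'
                },
                { role: 'user', content: question }
            ]
        })
    })
    const data = (await res.json()) as {
        choices: { message: { content: string } }[]
    }
    return data.choices[0].message.content.trim().toLowerCase()
}

export const answerWithCache = async (
    question: string,
    answer: (normalized: string) => Promise<string>
) => {
    const normalized = await normalize(question)
    const key = `answer:${normalized}`

    // Exact-match cache hit on the normalized form
    const cached = await cache.get(key)
    if (cached) return cached

    // Miss: run retrieval + the main model, then cache for a day
    const result = await answer(normalized)
    await cache.set(key, result, 'EX', 60 * 60 * 24)
    return result
}
```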
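A sketch of the hybrid retrieval, reusing the `chunks` table and `embed()` helper assumed in the indexing sketch. ParadeDB's `@@@` BM25 operator and pgvector's `<=>` distance are real; the table layout and fusion constant are assumptions:

```ts
import postgres from 'postgres'

const sql = postgres(process.env.DATABASE_URL!)

declare const embed: (input: string) => Promise<number[]> // from the indexing sketch

export const hybridSearch = async (normalized: string, limit = 5) => {
    // BM25 full-text search via ParadeDB's pg_search extension
    const bm25 = await sql`
        SELECT id, content, paradedb.score(id) AS score
        FROM chunks
        WHERE content @@@ ${normalized}
        ORDER BY score DESC
        LIMIT 10`

    // Vector search via pgvector (cosine distance, ascending)
    const vector = JSON.stringify(await embed(normalized))
    const knn = await sql`
        SELECT id, content
        FROM chunks
        ORDER BY embedding <=> ${vector}::vector
        LIMIT 10`

    // Reciprocal Rank Fusion: merge the two ranked lists without a reranker
    const scores = new Map<number, { content: string; score: number }>()
    for (const list of [bm25, knn])
        list.forEach((row, i) => {
            const entry = scores.get(row.id) ?? { content: row.content, score: 0 }
            entry.score += 1 / (60 + i)
            scores.set(row.id, entry)
        })

    return [...scores.values()]
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
}
```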
We don't use a reranking API because it's expensive
Most of the stack is self-hosted where possible:
- Elysia (obviously) - For building the API and backend
- ParadeDB - BM25 + Vector Search
- Dragonfly - Caching, Semantic Cache Search
- GPT OSS 120B - Main model for answering questions
- GPT OSS 20B - Query normalization for semantic search/caching
- OpenAI Embedding Small - The most cost-effective embedding model for vector search
- Axiom - Logging and monitoring
- Turnstile - Bot verification to prevent abuse and spam
To self-host Arona:
- Set the `.env` variables; refer to `.env.example` for the required variables
- Run `docker compose up -d` to start the services
- Set up the index using `scripts/setup.ts`, modifying the script to fit your documentation repo