

BENCHY

Benchmarks you can feel

We all love benchmarks, but there's nothing like a hands-on vibe check. What if we could meet somewhere in the middle?

Enter BENCHY. A chill, live benchmark tool that lets you see the performance, price, and speed of LLMs in a side-by-side comparison for SPECIFIC use cases.

Watch the latest development video here

  • deepseek-r1
  • o1-ai-coding-limit-testing
  • m4-mac-book-pro
  • parallel-function-calling
  • pick-two

Benchy Micro Apps

Important Files

  • .env - Client environment variables (API keys)
  • server/.env - Server environment variables (API keys)
  • package.json - Front-end dependencies
  • server/pyproject.toml - Server dependencies
  • src/store/* - All front-end state and prompts
  • src/api/* - API layer for all requests
  • src/pages/* - Front-end per-app pages
  • src/components/* - Front-end components
  • server/server.py - Server routes
  • server/modules/llm_models.py - All LLM models
  • server/modules/openai_llm.py - OpenAI LLM
  • server/modules/anthropic_llm.py - Anthropic LLM
  • server/modules/gemini_llm.py - Gemini LLM
  • server/modules/ollama_llm.py - Ollama LLM
  • server/modules/deepseek_llm.py - Deepseek LLM
  • server/benchmark_data/* - Benchmark data
  • server/reports/* - Benchmark results
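
The per-provider modules under server/modules/ each wrap one vendor's SDK behind a common shape, so the front end can chart every model the same way. A minimal sketch of what such a module might look like — the names, fields, and `prompt()` signature here are hypothetical illustrations, not the repo's actual API:

```python
# Hypothetical shape of a provider module like server/modules/openai_llm.py:
# one prompt() call that returns the completion text plus the timing
# numbers BENCHY charts. `send` stands in for the vendor SDK call.
import time
from dataclasses import dataclass


@dataclass
class BenchResult:
    """One benchmarked completion (illustrative, not the repo's type)."""
    model: str
    response: str
    total_duration_ms: float


def prompt(model: str, text: str, send) -> BenchResult:
    """Time a single completion call made through `send(model, text)`."""
    start = time.perf_counter()
    response = send(model, text)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return BenchResult(model=model, response=response, total_duration_ms=elapsed_ms)
```

Keeping the vendor call behind a uniform return type is what lets one results table compare OpenAI, Anthropic, Gemini, Ollama, and Deepseek side by side.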

Setup

Get API Keys & Models

  • Anthropic
  • Google Cloud
  • OpenAI
  • Deepseek
  • Ollama
    • After installing Ollama, pull the required models:
    # Pull Llama 3.2 1B model
    ollama pull llama3.2:1b
    
    # Pull Llama 3.2 latest (3B) model
    ollama pull llama3.2:latest
    
    # Pull Qwen2.5 Coder 14B model
    ollama pull qwen2.5-coder:14b
    
    # Pull Deepseek R1 models: 1.5B, 7B (latest), 8B, 14B, 32B, 70B
    ollama pull deepseek-r1:1.5b
    ollama pull deepseek-r1:latest
    ollama pull deepseek-r1:8b
    ollama pull deepseek-r1:14b
    ollama pull deepseek-r1:32b
    ollama pull deepseek-r1:70b
    
    # Pull mistral-small 3
    ollama pull mistral-small:latest
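
After pulling, you can verify everything is in place against the Ollama daemon's local REST API (`GET /api/tags` on its default port 11434 lists pulled models). A small sketch — the `REQUIRED` set mirrors the pull commands above:

```python
# Check which required Ollama models are still missing locally,
# using the daemon's /api/tags endpoint (default port 11434).
import json
import urllib.request

REQUIRED = {
    "llama3.2:1b", "llama3.2:latest", "qwen2.5-coder:14b",
    "deepseek-r1:1.5b", "deepseek-r1:latest", "deepseek-r1:8b",
    "deepseek-r1:14b", "deepseek-r1:32b", "deepseek-r1:70b",
    "mistral-small:latest",
}


def missing_models(tags: dict, required: set[str]) -> set[str]:
    """Given the /api/tags payload, return required models not yet pulled."""
    pulled = {m["name"] for m in tags.get("models", [])}
    return required - pulled


def check_local_ollama(host: str = "http://localhost:11434") -> set[str]:
    """Query the running daemon and return models still to pull."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return missing_models(json.load(resp), REQUIRED)
```

With the daemon running, `check_local_ollama()` returns an empty set once every model above has been pulled.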

Client Setup

# Install dependencies using bun (recommended)
bun install

# Or using npm
npm install

# Or using yarn
yarn install

# Start development server
bun dev  # or npm run dev / yarn dev

Server Setup

# Move into server directory
cd server

# Create virtual environment and install dependencies using uv
uv sync

# Set up environment variables (run these from the repo root)
cp .env.sample .env                    # client
cp server/.env.sample server/.env      # server

# Fill in EVERY key in both .env files with your API keys and settings
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
GEMINI_API_KEY=
DEEPSEEK_API_KEY=
FIREWORKS_API_KEY=

# Start server
uv run python server.py

# Run tests (beware: these hit live APIs and cost money)
uv run pytest
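
Before starting the server, it can save a confusing failure later to confirm no key was left blank. A minimal sanity-check sketch — the key names come from the .env entries above; checking them this way (rather than via the server's own startup) is my assumption:

```python
# Fail-fast check that every API key from .env is present and non-empty.
# Key names match the .env entries listed above.
import os

REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "DEEPSEEK_API_KEY",
    "FIREWORKS_API_KEY",
]


def missing_keys(env: dict) -> list:
    """Return the required keys that are unset or empty in `env`."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]


gaps = missing_keys(dict(os.environ))
if gaps:
    print("Missing API keys:", ", ".join(gaps))
```

Run it in the same shell session (with the .env values exported or loaded) so `os.environ` reflects what the server will actually see.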

Resources
