
Self-hosted AI platform with multi-model chat, persistent memory,
and an extensible skills engine. One Python entry point. Zero data leakage.

Built for how you work

Whether you're shipping code, managing knowledge, or securing AI for your team.

⌨️ Developer Tool

Self-host your AI backend

Multi-model routing — Claude, Grok, Codex, Ollama
Agent SDK with tool use and streaming
61 REST API endpoints + WebSocket
Extensible skill system — drop in a SKILL.md
One Python entry point (apex.py). No npm. No build step.

💬 Personal AI

Your AI, your data, your server

Native iOS app + responsive web
Memory that persists across sessions
Switch models per conversation
Voice input, file uploads, search
Real-time alerts and notifications

🏢 Business Integration

AI infrastructure for your team

mTLS certificate-based auth
Admin dashboard with health monitoring
Credential management, audit trail
Session management and token tracking
Backup/restore, log streaming

See it in action

Every feature, transparent. Nothing hidden behind a login wall.

Multi-Model Routing

Claude for deep coding. Grok for live web research. Local models for free, private inference. Each conversation targets a specific model — run them all simultaneously.

Use your existing subscriptions. Claude Pro/Max, ChatGPT Plus/Pro — Apex connects through your existing accounts. Only Grok requires a separate API key. Local models are free.

Supported Models

Provider	Models	Connection
Claude (Anthropic)	`opus-4-6`, `sonnet-4-6`, `haiku-4-5`	Agent SDK — uses existing subscription
Codex (OpenAI)	`gpt-5.4`, `gpt-5.3`, `o3`, `o4-mini`	CLI — uses existing subscription
Grok (xAI)	`grok-4`, `grok-4-fast`	API key — pay per use
Local (Ollama/MLX)	Qwen, Gemma, Llama, Mistral, etc.	Local — zero cost, no internet
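
As a rough illustration of per-conversation routing, the sketch below maps a model id from the table above to its provider. The `PROVIDERS` table and the `provider_for` helper are hypothetical names for this example, not Apex's actual `model_dispatch.py` API, and the fallback to local models is an assumption.

```python
# Hypothetical routing table built from the Supported Models list above.
PROVIDERS = {
    "claude": ("opus-4-6", "sonnet-4-6", "haiku-4-5"),
    "codex": ("gpt-5.4", "gpt-5.3", "o3", "o4-mini"),
    "grok": ("grok-4", "grok-4-fast"),
}


def provider_for(model: str) -> str:
    """Return the provider name for a model id."""
    # Accept the long form used by APEX_MODEL, e.g. "claude-sonnet-4-6".
    name = model.removeprefix("claude-")
    for provider, models in PROVIDERS.items():
        if name in models:
            return provider
    # Assumption: any other id is served by a local Ollama/MLX backend.
    return "local"


if __name__ == "__main__":
    for model in ("claude-sonnet-4-6", "o3", "grok-4-fast", "qwen3.5"):
        print(model, "->", provider_for(model))
```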

Note: Using existing subscriptions through Apex is for personal, non-commercial use only. For commercial use, use the providers' API plans.

Persistent Memory

Memory system — automatic context recall

The AI remembers your projects, decisions, and past conversations. Whisper injection silently finds and injects relevant context, so you never have to ask "can you remind me what we discussed?"

APEX.md — project rules injected into every session
MEMORY.md — accumulated knowledge across sessions
Semantic search — vector embeddings with hybrid recall
Session recovery — survives restarts and compaction
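
One common way to implement hybrid recall is to blend a vector-similarity score with a keyword-overlap score. The sketch below shows that idea only; the function names, the `alpha` weight, and the toy two-dimensional vectors are assumptions, and Apex's `memory_search.py` may combine signals differently.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_rank(query, query_vec, memories, alpha=0.5):
    """Rank (text, vector) memories by a weighted mix of both scores."""
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_overlap(query, text), text)
        for text, vec in memories
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


if __name__ == "__main__":
    memories = [
        ("deploy script uses rsync", [1.0, 0.0]),
        ("favorite color is blue", [0.0, 1.0]),
    ]
    print(hybrid_rank("deploy script", [0.9, 0.1], memories))
```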

Extensible Skills

Slash commands for search, delegation, analysis. Build custom skills with just a markdown file.

/recall — search all conversation transcripts
/codex — delegate tasks to a background agent
/grok — live web research via xAI
/evaluate — sandbox-assess any GitHub repo
/first-principles — 4-layer deep analysis
+ Build your own — drop a SKILL.md, auto-discovered

Production Security

Zero-trust security — mTLS, TLS, atomic writes

Your AI conversations and API keys deserve production-grade protection.

mTLS — client certificate auth, no passwords
TLS everywhere — HTTPS with built-in cert generation
Atomic writes — no partial state on crash
Secrets isolation — credentials in .env only
QR onboarding — scan, install cert, connect in 60s
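
The "atomic writes" guarantee is usually achieved by writing to a temporary file in the same directory and then renaming it over the target. The sketch below shows that general pattern; `atomic_write` is a hypothetical helper for illustration, not a function taken from the Apex codebase.

```python
import os
import tempfile


def atomic_write(path: str, data: str) -> None:
    """Replace `path` with `data` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic swap of old file for new
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process crashes mid-write, the target file still holds its previous contents, because only the temporary file was touched.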

Admin Dashboard

Full web-based management at /admin — health monitoring, credential management, TLS certificates, session control, live log streaming. 61 REST endpoints usable by both humans and AI agents.

Server status, uptime, model reachability
Per-provider green/red health dots
Database stats, TLS certificate status
Edit project instructions from the browser
Active sessions with token usage, force compaction

Alert System

Multi-channel push — in-app via WebSocket, Telegram bot, and REST API. Custom categories, severity levels (info/warning/critical), ack/unack tracking. Any script, cron job, or service can POST alerts into Apex.
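
A script or cron job could post an alert along these lines. This is a minimal sketch: the `/api/alerts` path and the JSON field names are assumptions made for illustration, so check the server's route definitions for the real endpoint and payload. Only the three severity levels come from the description above.

```python
import json
import urllib.request

SEVERITIES = {"info", "warning", "critical"}


def build_alert_request(base_url, category, severity, message):
    """Build (but do not send) a POST request carrying one alert."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    body = json.dumps(
        {"category": category, "severity": severity, "message": message}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/alerts",  # hypothetical path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_alert_request(
        "https://localhost:8300", "backup", "warning", "disk at 90%"
    )
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req, context=<your ssl.SSLContext>)
```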

Multi-Agent Orchestration

Screenshot: Apex War Room, multiple agents collaborating on desktop

The War Room — four agents (Operations, Architect, Codex, Designer) collaborating in a single conversation. Each agent has its own model, its own persona, its own specialty. Direct them with @mentions. See cost and token usage in real time.
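
Directing agents with @mentions comes down to pulling agent names out of a message. The sketch below shows one plausible way to do that; the regex, the case-insensitive matching, and the `mentioned_agents` helper are assumptions, and only the four agent names come from the description above.

```python
import re

AGENTS = ("Operations", "Architect", "Codex", "Designer")
MENTION = re.compile(r"@(\w+)")


def mentioned_agents(message, agents=AGENTS):
    """Return known agents mentioned in the message, in order, without repeats."""
    by_lower = {agent.lower(): agent for agent in agents}
    found = []
    for token in MENTION.findall(message):
        agent = by_lower.get(token.lower())
        if agent and agent not in found:
            found.append(agent)
    return found


if __name__ == "__main__":
    print(mentioned_agents("@architect sketch the schema, then @Codex implement it"))
```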

The sidebar tells the story: dedicated channels for Claude, Grok, Codex, a trading room, a marketing agent, local models — an entire AI organization in one interface.

Screenshot: three AI agents working simultaneously on iOS

Same experience on mobile. Three agents spinning simultaneously on an iPhone — native SwiftUI, not a web view.

Quick Start

From zero to running in under 2 minutes.

git clone https://github.com/use-ash/apex.git ~/.apex
cd ~/.apex
bash install.sh

The installer creates a virtual environment, installs all dependencies, generates TLS certificates, and walks you through first-time setup. When it finishes, open https://localhost:8300. Your Claude subscription is detected automatically.

Add local models (free)

# Install Ollama (https://ollama.ai)
ollama pull qwen3.5

# Start Apex — Ollama is detected automatically
python3 apex.py

Create a Claude channel for heavy tasks, an Ollama channel for quick questions.

Add Grok (web search)

export XAI_API_KEY=xai-...
python3 apex.py

Full stack with mTLS

export APEX_ENABLE_WHISPER=1
export XAI_API_KEY=xai-...
export GOOGLE_API_KEY=AIza...
APEX_SSL_CERT=cert.pem APEX_SSL_KEY=key.pem APEX_SSL_CA=ca.pem python3 apex.py

All models active. Memory with semantic search. Whisper injection. mTLS auth.

What 30 minutes looks like

Screenshot: fresh install, three AI providers in a group chat within 30 minutes

Fresh install. Three AI providers — Claude, Grok, and ChatGPT — collaborating in a group chat, @mentioning each other, completing a task loop. Guided onboarding channel visible in the sidebar. This is what you get in half an hour.

Free & Premium

The Apex server and web app are free and open source. Run it on your machine, use it in your browser, no limits, no costs beyond your own AI subscriptions.

Everything is free through September 30, 2026. Every feature, every model, every platform. No license key required. After that, premium features (group channels, multi-agent orchestration, custom personas, native app connectivity) will require a license key.

Feature	Free (forever)	Premium (after Sept 30)
Apex Server	✅
Web App (Desktop & Mobile)	✅
All AI Models	✅
Memory, Skills, Alerts	✅
mTLS / Certificate Auth	✅
Basic Dashboard	✅
Group Channels	✅ Free until Sept 30	License required
Multi-Agent Orchestration	✅ Free until Sept 30	License required
Custom Personas	✅ Free until Sept 30	License required
iOS App (free download)	✅ Free until Sept 30	License required
Android App		🚧 Coming soon
Desktop App		📋 Planned

How the license works

One license key, one gate. The key lives on your server. When it's valid, premium features are unlocked and native apps can connect. The iOS app is a free download that just needs your server to have an active license.

Right now, everything is unlocked with no license required. After September 30, 2026, a license key ($29.99/mo, $249/yr, or $499 lifetime for the first 500 users) will be needed for premium features. The web app always works, licensed or not. If your license expires, you keep everything in the free tier: chat, multi-model routing, memory, skills, alerts, and the dashboard.
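
The gating rule described above can be summarized in a few lines. This is a sketch of the stated policy, not the contents of `license.py`; the feature identifiers and the `feature_allowed` helper are hypothetical.

```python
from datetime import date

FREE_PERIOD_END = date(2026, 9, 30)  # everything is free through this date

# Hypothetical identifiers for the premium features listed in this section.
PREMIUM = {
    "group_channels",
    "multi_agent_orchestration",
    "custom_personas",
    "native_app_connectivity",
}


def feature_allowed(feature: str, today: date, license_valid: bool) -> bool:
    """Apply the policy: free tier always works, premium needs a license later."""
    if feature not in PREMIUM:
        return True  # chat, routing, memory, skills, alerts, dashboard
    if today <= FREE_PERIOD_END:
        return True  # introductory period: no license key required
    return license_valid
```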

Platform Availability

Platform	Status	Tier
🌐 Web App (Desktop)	✅ Available	Free
🌐 Web App (Mobile)	✅ Available	Free
📱 iOS (iPhone)	✅ Free Download	Free until Sept 30, then License
🤖 Android	🚧 In Development	License (after Sept 30)
🖥️ Desktop (Electron)	📋 Planned	License (after Sept 30)

Architecture

server/
├── apex.py                  ← entry point, startup, router registration
├── ws_handler.py            ← WebSocket connections, streaming, session mgmt
├── agent_sdk.py             ← Claude SDK integration, auth, turn execution
├── backends.py              ← Codex, Grok, Ollama/MLX dispatch
├── model_dispatch.py        ← model routing and provider selection
├── routes_chat.py           ← chat REST endpoints
├── routes_alerts.py         ← alert ingestion, APNs push
├── routes_profiles.py       ← persona management
├── routes_models.py         ← model listing and config
├── routes_setup.py          ← guided onboarding wizard
├── routes_misc.py           ← models, usage, license, misc endpoints
├── db.py                    ← SQLite schema, all database helpers
├── state.py                 ← shared in-memory state, accessor functions
├── streaming.py             ← broadcast helpers, WS send utilities
├── config.py                ← constants, version, build metadata
├── env.py                   ← all os.environ reads (single source of truth)
├── mtls.py                  ← TLS + mTLS certificate handling
├── context.py               ← conversation context assembly
├── memory_extract.py        ← memory tag extraction and persistence
├── memory_search.py         ← semantic search and recall
├── skills.py                ← skill discovery and dispatch
├── tasks.py                 ← background task management
├── license.py               ← license validation and trial gating
├── chat_html.py             ← embedded web UI (chat SPA)
├── dashboard.py             ← admin dashboard backend
├── dashboard_html.py        ← admin dashboard UI
├── setup_html.py            ← onboarding wizard UI
├── alert_client.py          ← Telegram + push notification delivery
└── log.py                   ← logging

35 modules. No frameworks. No npm. No build step. The frontend is embedded in the Python server — python3 apex.py and everything runs.

Configuration Reference

Variable	Default	Description
`APEX_HOST`	`0.0.0.0`	Bind address
`APEX_PORT`	`8300`	Port
`APEX_MODEL`	`claude-sonnet-4-6`	Default model for new chats
`APEX_WORKSPACE`	current dir	Working directory for AI tools
`APEX_SSL_CERT`	—	TLS certificate path
`APEX_SSL_KEY`	—	TLS private key path
`APEX_SSL_CA`	—	CA cert for mTLS client verification
`APEX_ENABLE_WHISPER`	`false`	Enable memory whisper injection
`APEX_OLLAMA_URL`	`http://localhost:11434`	Ollama server address
`APEX_MLX_URL`	`http://localhost:8400`	MLX server address
`XAI_API_KEY`	—	xAI API key for Grok
`GOOGLE_API_KEY`	—	Google API key for embedding index
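
The table above translates naturally into a single function that reads the environment once and applies defaults. The sketch below is illustrative rather than a copy of `env.py`; the `load_config` name, the returned dictionary keys, and the set of values treated as "true" are assumptions.

```python
import os

# Defaults copied from the configuration table above.
DEFAULTS = {
    "APEX_HOST": "0.0.0.0",
    "APEX_PORT": "8300",
    "APEX_MODEL": "claude-sonnet-4-6",
    "APEX_ENABLE_WHISPER": "false",
    "APEX_OLLAMA_URL": "http://localhost:11434",
    "APEX_MLX_URL": "http://localhost:8400",
}
TRUTHY = {"1", "true", "yes", "on"}  # assumption: accepted boolean spellings


def load_config(environ=None):
    """Read settings from the environment, falling back to the defaults."""
    env = os.environ if environ is None else environ
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "host": raw["APEX_HOST"],
        "port": int(raw["APEX_PORT"]),
        "model": raw["APEX_MODEL"],
        "whisper": raw["APEX_ENABLE_WHISPER"].strip().lower() in TRUTHY,
        "ollama_url": raw["APEX_OLLAMA_URL"],
        "mlx_url": raw["APEX_MLX_URL"],
        "workspace": env.get("APEX_WORKSPACE", os.getcwd()),
        "xai_api_key": env.get("XAI_API_KEY"),  # None when Grok is unused
    }


if __name__ == "__main__":
    print(load_config())
```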

Build Your Own Skills

Skills are directories with a SKILL.md file. Drop one in skills/, restart, and it's live.

skills/my-skill/
├── SKILL.md          # Metadata + instructions (required)
├── run.sh            # Executable entry point (optional)
├── feedback.log      # User corrections (auto-generated)
└── metrics.json      # Usage tracking (auto-generated)

Two types:

Executable skills — have a run.sh. Server executes it, passes results to the AI.
Thinking skills — no script. The AI reads instructions and follows them.
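
Auto-discovery of the two skill types could look roughly like the sketch below, which scans a skills directory and classifies each entry by whether a run.sh is present. The `discover_skills` helper is hypothetical and stands in for whatever `skills.py` actually does.

```python
from pathlib import Path


def discover_skills(root):
    """Map each skill directory name to 'executable' or 'thinking'."""
    skills = {}
    for skill_dir in sorted(Path(root).iterdir()):
        # A skill is a directory that contains a SKILL.md file.
        if not skill_dir.is_dir() or not (skill_dir / "SKILL.md").is_file():
            continue
        has_script = (skill_dir / "run.sh").is_file()
        skills[skill_dir.name] = "executable" if has_script else "thinking"
    return skills


if __name__ == "__main__":
    root = Path("skills")
    if root.is_dir():
        for name, kind in discover_skills(root).items():
            print(f"/{name}: {kind}")
```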

Risk tiers control execution:

Tier	Behavior	Examples
1	Auto-approve	Read-only analysis, search, formatting
2	Notify	File modifications, new dependencies
3	Require approval	API calls, credential access, external writes
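
One way to read the tier table is as a small decision function. The sketch below assumes that tier 2 runs the skill and notifies the user, and that tier 3 blocks until the user approves; those behaviors are an interpretation of the table, and the names are hypothetical.

```python
from enum import IntEnum


class RiskTier(IntEnum):
    AUTO_APPROVE = 1
    NOTIFY = 2
    REQUIRE_APPROVAL = 3


def decide(tier: int, approved: bool = False):
    """Return (run, notify) for a skill at the given risk tier."""
    tier = RiskTier(tier)  # raises ValueError for unknown tiers
    if tier is RiskTier.AUTO_APPROVE:
        return (True, False)  # run silently
    if tier is RiskTier.NOTIFY:
        return (True, True)  # run, and tell the user what happened
    return (approved, True)  # run only after explicit approval
```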

Self-improving: /improve reads a skill's metrics and feedback, then proposes concrete changes.

Requirements

Python 3.10+
At least one model provider:
- Claude subscription (Pro/Max/Code), or
- ChatGPT subscription (Plus/Pro), or
- Ollama or MLX for free local inference, or
- xAI API key for Grok
Mix and match per conversation.

Optional:

Google API key for semantic search embeddings (free tier sufficient)
Telegram bot token for mobile alert delivery

FAQ

Is this a wrapper around the API?

No. Claude runs through the Agent SDK with full tool access (read, write, bash, search). Codex runs through the CLI with sandbox permissions. Local models get a custom tool-calling loop. It's closer to Claude Code than to a simple chat interface.

Can I use it on my phone?

The web app works in mobile browsers for free. The iOS app is a free download; after September 30, 2026, it will need your server to have a valid license key to connect. Android is in development.

What's free vs. paid?

Everything is free through September 30, 2026. No license key required. After that, the server, web app, all AI model integrations, memory system, skills engine, alerts, admin dashboard, and mTLS security remain free forever. Group channels, multi-agent orchestration, and custom personas will require a license key ($29.99/mo, $249/yr, or $499 lifetime for the first 500 users). The iOS app is a free download but will need a valid server license to connect after September 30, 2026.

Can multiple people use one server?

The current architecture is single-user. Multi-user with RBAC is on the roadmap.

How much does it cost to run?

If you already pay for Claude and/or ChatGPT, Apex adds zero cost for those models. Local models are free. Only Grok requires a separate API key. Hosting is your own hardware.

What if I only want local models?

That works. Install Ollama, pull a model, start Apex. No API keys, no accounts, no internet needed. Full memory system, skills, and dashboard included.

Can I build my own skills?

Yes. Drop a directory with a SKILL.md into skills/ and restart. Skills are auto-discovered, usage-tracked, and self-improving via the /improve meta-skill.

Getting Started · Personas · Groups · Contributing · Changelog · License

Elastic License 2.0: free to use, modify, and self-host. Cannot be offered as a hosted service.
