flaviomalavazi/agentic-data-stack

Agentic Data Stack

The open-source stack for ClickHouse's suite of agentic analytic tools — your chat, your models, your data.
Powered by ClickHouse Cloud, LibreChat, and Langfuse Cloud.

Learn more at clickhouse.ai

Overview

This project runs an agentic analytics environment with Docker Compose. It connects a chat UI (LibreChat) to your data (ClickHouse Cloud) via MCP, with full LLM observability via Langfuse Cloud — all in a single docker compose up command.

ClickHouse and Langfuse are both managed cloud services, so Docker is the only software you need to install locally.

What's included

| Component | Purpose | Hosted |
|---|---|---|
| LibreChat | Modern chat UI with multi-model / provider support (OpenAI, Anthropic, Google) | Local (3081) |
| ClickHouse MCP | MCP server that gives agents access to ClickHouse | ClickHouse Cloud |
| Langfuse | LLM observability: traces, evals, prompt management | Langfuse Cloud |
| Langfuse Enricher | Patches AgentRun traces with human-readable agent names from LibreChat | Local (sidecar) |
| MongoDB | Transactional database for LibreChat | Local (27017) |
| Meilisearch | Full-text search for LibreChat | Local (7700) |
| pgvector | Vector database for RAG | Local (5433) |
| RAG API | Retrieval-augmented generation service for LibreChat | Local (8022) |

Quick Start

Prerequisites

  • Docker and Docker Compose v2+
  • A ClickHouse Cloud account — get your MCP auth token from the ClickHouse Cloud console
  • A Langfuse Cloud account — get your public and secret API keys from Project Settings → API Keys

1. Prepare the environment

./scripts/prepare-demo.sh

This generates a .env file with random credentials for all local services, then presents an interactive menu to configure your API keys (OpenAI, Anthropic, Google). Any providers you skip will remain as user_provided, letting users enter their own keys in the LibreChat UI.

You will also need to set the following cloud service credentials in your .env:

# ClickHouse Cloud MCP
CLICKHOUSE_MCP_AUTH_TOKEN=<your token from ClickHouse Cloud>

# Langfuse Cloud (US region — adjust LANGFUSE_BASE_URL for EU)
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_BASE_URL=https://us.cloud.langfuse.com
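
Before starting the stack, it can help to confirm that these values actually made it into your .env. The following is a minimal sketch, not part of the repository: `langfuse_base_url` and `missing_env_keys` are hypothetical helpers, and the EU base URL is an assumption you should verify against the Langfuse documentation.

```shell
# Map a Langfuse Cloud region to its base URL.
# The US URL is the one shown above; the EU URL is an assumption to verify.
langfuse_base_url() (
  case "$1" in
    us) echo "https://us.cloud.langfuse.com" ;;
    eu) echo "https://cloud.langfuse.com" ;;
    *)  echo "unknown region: $1" >&2; exit 1 ;;
  esac
)

# Print every required key that is missing or empty in an env file.
# Hypothetical helper: it only checks for presence, not for validity.
missing_env_keys() (
  env_file=$1
  shift
  for key in "$@"; do
    # Use the last KEY=... assignment and strip any surrounding quotes.
    value=$(grep -E "^${key}=" "$env_file" 2>/dev/null | tail -n 1 | cut -d= -f2- | tr -d "\"'")
    [ -n "$value" ] || printf '%s\n' "$key"
  done
)

# Usage:
#   missing_env_keys .env CLICKHOUSE_MCP_AUTH_TOKEN \
#     LANGFUSE_PUBLIC_KEY LANGFUSE_SECRET_KEY LANGFUSE_BASE_URL
```

An empty result means all listed keys have a value. Placeholder values such as `pk-lf-...` still count as set, so this catches omissions rather than invalid keys.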

You can also generate credentials separately and customize the initial administrator account:

USER_EMAIL="<your email>" USER_PASSWORD="supersecret" USER_NAME="YourName" ./scripts/generate-env.sh

Note: To use LibreChat's file search / RAG features, the RAG API needs a real API key for embeddings — user_provided won't work because the RAG API calls the embeddings endpoint directly. If OPENAI_API_KEY is set to user_provided, set RAG_OPENAI_API_KEY to a valid OpenAI key (it overrides OPENAI_API_KEY for RAG only). You can also switch embedding providers via EMBEDDINGS_PROVIDER (openai, azure, huggingface, huggingfacetei, ollama). See the RAG API docs for details.
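
The override rule in the note can be sketched as a small check. This assumes the default openai embeddings provider, and `rag_effective_key` is a hypothetical helper for illustration; the RAG API itself resolves the key internally.

```shell
# Decide which OpenAI key the RAG API would end up using, following the note above:
# RAG_OPENAI_API_KEY takes precedence, and "user_provided" is not a usable key.
# Arguments: $1 = OPENAI_API_KEY value, $2 = RAG_OPENAI_API_KEY value (may be empty).
rag_effective_key() (
  openai_key=$1
  rag_key=$2
  if [ -n "$rag_key" ] && [ "$rag_key" != "user_provided" ]; then
    printf '%s\n' "$rag_key"
  elif [ -n "$openai_key" ] && [ "$openai_key" != "user_provided" ]; then
    printf '%s\n' "$openai_key"
  else
    echo "RAG needs a real embeddings key: set RAG_OPENAI_API_KEY" >&2
    exit 1
  fi
)

# Usage:
#   rag_effective_key "$OPENAI_API_KEY" "$RAG_OPENAI_API_KEY" >/dev/null \
#     || echo "file search / RAG will not work with the current .env"
```

The function exits non-zero when neither variable holds a real key, which is the case where file search would fail.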

2. Start the stack

docker compose up -d
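
Containers can take a moment to become reachable after `docker compose up -d` returns. A minimal retry sketch follows; `wait_for` is a hypothetical helper, and the usage line assumes that curl is installed and that LibreChat answers HTTP on local port 3081, as listed in the component table.

```shell
# Retry a command until it succeeds or the attempts run out.
# Arguments: $1 = max attempts, $2 = delay in seconds, rest = command to run.
wait_for() (
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && exit 0
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1
)

# Usage (assumption: LibreChat responds on http://localhost:3081):
#   wait_for 30 2 curl -fsS -o /dev/null http://localhost:3081 \
#     && echo "LibreChat is up"
```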

3. Access the services

LibreChat is served locally on port 3081 (see the table under "What's included"). An admin user is created automatically on first startup using the credentials from your .env file.

Architecture

LibreChat connects to ClickHouse Cloud through the managed MCP endpoint, allowing AI agents to query and analyze your data. All LLM interactions are traced in Langfuse Cloud for observability, evaluation, and prompt management. A local enricher sidecar automatically tags each trace with the agent's display name.

Observability

Traces are sent automatically to Langfuse Cloud from LibreChat. The langfuse-enricher sidecar runs alongside the stack and enriches every AgentRun trace with:

  • Tag: agent:<AgentName> (e.g. agent:Varejão)
  • Metadata field: agent_name: <AgentName>

This makes it easy to filter traces by agent in the Langfuse UI. The enricher polls every 60 seconds and backfills the last 7 days on startup.
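
To make the tagging and schedule concrete, here is a rough sketch of the values involved. It is not the enricher's actual code: the function names are hypothetical, and the windowing logic is an assumption based only on the 60-second poll and 7-day backfill described above.

```shell
POLL_INTERVAL_SECONDS=60
BACKFILL_SECONDS=$((7 * 24 * 60 * 60))   # 7 days

# Tag applied to a trace for a given agent display name.
agent_tag() { printf 'agent:%s\n' "$1"; }

# Lower bound (epoch seconds) of the window to scan.
# Arguments: $1 = current epoch time, $2 = "yes" on the first pass after startup.
# Assumption: the first pass looks back 7 days, later passes one poll interval.
window_start() (
  now=$1
  first_run=$2
  if [ "$first_run" = "yes" ]; then
    echo $((now - BACKFILL_SECONDS))
  else
    echo $((now - POLL_INTERVAL_SECONDS))
  fi
)

# Usage:
#   agent_tag "Varejão"
#   window_start "$(date +%s)" yes
```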

Scripts

| Script | Description |
|---|---|
| scripts/prepare-demo.sh | Generate .env and interactively configure API keys |
| scripts/generate-env.sh | Generate .env with random credentials |
| scripts/reset-all.sh | Stop all containers and wipe all local data/volumes |
| scripts/create-librechat-user.sh | Manually create a LibreChat admin user |
| scripts/init-librechat-user.sh | Auto-init user on container startup (used internally) |

Configuration

  • LibreChat: librechat.yaml configures endpoints, MCP servers, and agent capabilities
  • Environment: .env holds all credentials and service configuration (see .env.example for reference)
  • Docker: docker-compose.yml includes librechat-compose.yml, which defines all local services, including the Langfuse enricher sidecar

Reset Everything

To tear down all containers and delete all local data:

./scripts/reset-all.sh

Then set up again and start fresh:

./scripts/prepare-demo.sh
docker compose up -d
