Production-ready RAG stack (Retrieval-Augmented Generation) with local Docker Compose, Kubernetes manifests, and automatic environment generation. CI is handled by GitLab.
## Features

- Auto-generated env files per environment via the `generators/` module.
- Local dev with `docker compose` (see `devops/docker-compose/`).
- Kubernetes deploy manifests under `devops/kubernetes/`.
- GitLab CI pipeline configured in `.gitlab-ci.yml`.
- One-shot `entrypoint.sh` script to run/stop/clean the stack locally.
## Repository Structure

```
.
├── .vscode/                          # Workspace settings (tasks, launch configs, etc.)
├── ci/
│   ├── images/                       # (Optional) CI container images & Dockerfiles
│   └── utils/                        # Helper scripts used by CI/deploy
│       ├── build.sh
│       ├── clean.sh
│       ├── deploy.sh
│       ├── README.md
│       └── redirect-service.sh
├── devops/
│   ├── docker-compose/
│   │   └── docker-compose.yml        # Local development stack
│   └── kubernetes/                   # K8s manifests
│       ├── deployments/
│       │   ├── ai-frontend-deployment.yaml
│       │   └── ai-stt-deployment.yaml
│       └── services/
│           ├── ai-frontend-service.yaml
│           └── ai-stt-service.yaml
├── generators/                       # Environment generator
│   ├── configurations/               # Templates/configs per environment
│   ├── entrypoint.sh                 # Usage: ./entrypoint.sh <env>|clean
│   └── generate_env.py               # Writes .env files consumed by services
├── scripts/                          # (Optional) ad-hoc utilities
├── services/                         # Microservices
│   ├── ai-backend-py                 # Backend API (FastAPI/Flask, etc.)
│   ├── ai-frontend-rag               # Next.js RAG frontend (current)
│   ├── ai-frontend-rag-py            # Legacy Python-based frontend (if needed)
│   ├── ai-llm                        # vLLM-based serving / on-prem LLM wrapper
│   ├── ai-postgres                   # Dev/Postgres artifacts for local use
│   └── ai-rag                        # RAG pipeline/orchestrator
├── stacks/
│   └── ingress.trainingdev1.ai.yaml  # Example Ingress for cluster
├── .gitignore
├── .gitlab-ci.yml                    # CI pipeline (build/test/deploy)
├── entrypoint.sh                     # Local runner (run|stop|clean)
└── Readme.md
```
## Environment Generation

This project generates environment variables automatically for each environment using the `generators/` directory.

- Templates and configuration live in `generators/configurations/`.
- The generator writes the `.env` files expected by services and tooling.
Run it directly:

```sh
cd generators
./entrypoint.sh dev     # or staging/prod if defined
# ...
./entrypoint.sh clean   # remove generated env files
```

The exact output paths and variables are defined in `generate_env.py` and the `configurations/` templates.
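For orientation, the generator's approach can be sketched as a small template-substitution step. This is a hedged sketch with hypothetical template content and variable names — the authoritative paths and variables live in `generate_env.py`:

```python
# Minimal sketch of template-based .env generation (hypothetical template
# and variable names; the real logic lives in generators/generate_env.py).
from pathlib import Path
from string import Template


def render_env(template_text: str, values: dict) -> str:
    """Fill ${VAR} placeholders in a template with concrete values."""
    return Template(template_text).safe_substitute(values)


def generate(env_name: str, out_dir: Path) -> Path:
    """Write a .env file for one environment and return its path."""
    # In the real generator, per-environment values would be read from
    # generators/configurations/ rather than hard-coded here.
    values = {"ENVIRONMENT": env_name, "POSTGRES_HOST": "localhost"}
    template = "ENVIRONMENT=${ENVIRONMENT}\nPOSTGRES_HOST=${POSTGRES_HOST}\n"
    out_file = out_dir / f".env.{env_name}"
    out_file.write_text(render_env(template, values))
    return out_file
```

Calling `generate("dev", some_dir)` would emit a `.env.dev` file with the substituted values; the actual script derives its inputs from the `configurations/` templates.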
## Local Development

The easiest way to spin up the whole stack locally is the root `entrypoint.sh`, which handles env generation and Compose:

```sh
# from repo root
./entrypoint.sh run     # generates envs (dev) and starts docker compose
./entrypoint.sh stop    # stops the compose stack
./entrypoint.sh clean   # cleans generated env files
```

What it does under the hood (simplified):
```sh
# 1) Generate envs for dev
pushd generators && ./entrypoint.sh dev && popd

# 2) Bring up the local stack
pushd devops/docker-compose && docker compose up --build && popd
```

You can also run Compose manually:

```sh
# Ensure envs exist first:
cd generators && ./entrypoint.sh dev && cd -

cd devops/docker-compose
docker compose up --build
```

## Kubernetes Deployment

Manifests live under `devops/kubernetes/`:
- `deployments/` – Deployments for services (e.g., `ai-frontend`, `ai-stt`).
- `services/` – ClusterIP/Service definitions.
- Optional Ingress examples under `stacks/` (e.g., `ingress.trainingdev1.ai.yaml`).
Apply them in your target cluster:

```sh
# Make sure the right kube-context is selected
kubectl config get-contexts
kubectl config use-context <your-context>

# Deploy workloads & services
kubectl apply -f devops/kubernetes/deployments/
kubectl apply -f devops/kubernetes/services/

# (Optional) Ingress for your environment
kubectl apply -f stacks/ingress.trainingdev1.ai.yaml

# Check rollout
kubectl rollout status deployment/ai-frontend
kubectl get svc,deploy,pods -n <namespace-if-used>
```

You can also wire these steps into your CI using the scripts in `ci/utils/` (e.g., `deploy.sh`, `redirect-service.sh`).
## CI/CD

The pipeline is defined in `.gitlab-ci.yml`. Typical stages include:

- Build: containerize services (optionally using `ci/images/` Dockerfiles).
- Test/Lint: run unit/integration checks.
- Package/Push: push images to your registry.
- Deploy: use `kubectl` (and scripts from `ci/utils/`) to apply manifests to the cluster.
Make sure your project variables (registry credentials, kubeconfig or cluster integration, environment names) are configured in GitLab.
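A skeleton matching those stages might look like this. It is illustrative only — job names and scripts are placeholders (though `ci/utils/build.sh` and `ci/utils/deploy.sh` do exist in this repo), and the real pipeline is whatever `.gitlab-ci.yml` defines:

```yaml
# Illustrative skeleton only — see .gitlab-ci.yml for the real pipeline.
stages: [build, test, push, deploy]

build:
  stage: build
  script:
    - ci/utils/build.sh

test:
  stage: test
  script:
    - echo "run unit/integration checks here"

push:
  stage: push
  script:
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"

deploy:
  stage: deploy
  script:
    - ci/utils/deploy.sh
```

`CI_REGISTRY_IMAGE` and `CI_COMMIT_SHORT_SHA` are GitLab's predefined CI variables; registry credentials and cluster access come from the project settings mentioned above.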
## Services

- `ai-backend-py` – Core backend API (Python).
- `ai-rag` – Retrieval-Augmented Generation pipeline (indexing, retrieval, orchestration).
- `ai-frontend-rag` – Next.js UI for the RAG experience (current frontend).
- `ai-frontend-rag-py` – Legacy Python-based frontend (kept for reference/transition).
- `ai-llm` – vLLM-based serving for on-prem/controlled inference.
- `ai-postgres` – Development database artifacts for local runs.
- `ai-stt` – Speech-to-Text component (deployed via K8s manifests).
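To illustrate how the generated env files tie these services together under Compose, a fragment might look like the following. The service wiring, image, and `env_file` path are all hypothetical — the actual definitions are in `devops/docker-compose/docker-compose.yml`, and the real `.env` output paths come from `generate_env.py`:

```yaml
# Illustrative fragment — see devops/docker-compose/docker-compose.yml
# for the real stack; the env_file path here is a placeholder.
services:
  ai-postgres:
    image: postgres:16
    env_file: ../../generators/.env   # generated by generators/entrypoint.sh

  ai-backend-py:
    build: ../../services/ai-backend-py
    env_file: ../../generators/.env
    depends_on:
      - ai-postgres
```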
## Prerequisites

- Docker & Docker Compose
- `kubectl` with access to your cluster (for deploys)
- GitLab project with CI runners (for pipelines)
- Python 3.x (if you run the generator scripts locally)
## Quick Reference

```sh
# One-shot local bring-up
./entrypoint.sh run

# Stop local services
./entrypoint.sh stop

# Clean generated env files
./entrypoint.sh clean

# Manual Kubernetes deploy
kubectl apply -f devops/kubernetes/deployments/
kubectl apply -f devops/kubernetes/services/
```

## Notes

- Keep `generators/configurations/` updated whenever services add/remove environment variables.
- Prefer the Next.js app in `services/ai-frontend-rag` over the legacy frontends.
- For production, build and deploy images via GitLab CI rather than local Compose.