"Es ist mir wurst." Germans 🇩🇪
It's a German expression meaning "It doesn't matter to me", literally "This is sausage to me". 🌭 (yeah, there's no sausage emoji, so a hotdog will have to do)
Casual in attitude, serious about DWH architecture.
This is a monorepo of a DWH for personal data analytics and AI. Everything in it is expected to run 100% locally.
A core idea here is being tool-agnostic: any tooling in the modern data stack is abstracted and materialized in places like the folder structure. Open-source tooling is prioritized.
The philosophy behind this can be found in Data Biz.
Run the following command to spin up the database, pull the AI model, and launch the AI Agent interface:
```bash
make up
```

This will:

- Start Postgres (Docker).
- Pull the required LLM (`qwen2.5:3b`).
- Launch the Streamlit web interface at `http://localhost:8501`.
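Under the hood, `make up` bundles a few steps. The sketch below is only an approximation; it assumes Docker Compose under `docker/`, Ollama for the model pull, and a hypothetical `app.py` Streamlit entry point. The actual targets live in the Makefile.

```bash
# Rough, hypothetical equivalent of `make up`; the real targets live in the Makefile.
docker compose -f docker/docker-compose.yml up -d  # start Postgres (compose file path is an assumption)
ollama pull qwen2.5:3b                             # fetch the local LLM (assumes Ollama)
streamlit run app.py --server.port 8501            # launch the AI Agent UI (entry point is an assumption)
```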
You can connect to the local PostgreSQL instance with:
- Host: `localhost`
- Port: `5432`
- User / Password: `jimwurst_user` / `jimwurst_password`
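As a quick smoke test, you can connect with `psql`. The database name below is a placeholder; use whatever is configured in `docker/` (e.g. the `.env` file):

```bash
# Connect to the local warehouse; <database> is a placeholder for the configured DB name
PGPASSWORD=jimwurst_password psql -h localhost -p 5432 -U jimwurst_user -d <database>
```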
The following schemas are initialized by default:
`marts`, `intermediate`, `staging`, and `s_<app_name>` (ODS).
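To confirm the schemas exist after startup, you can list them from `psql` (same placeholder caveat for the database name):

```bash
# List all schemas in the warehouse; <database> is a placeholder
PGPASSWORD=jimwurst_password psql -h localhost -p 5432 -U jimwurst_user -d <database> -c "\dn"
```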
This project uses uv for fast, reliable Python dependency management.
Install uv:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

After installation, restart your shell or run:

```bash
source $HOME/.local/bin/env
```

Create a virtual environment:

```bash
uv venv
```

Install dependencies:

```bash
uv pip install -r requirements.txt
```

Add a new dependency:

```bash
uv pip install <package>
uv pip freeze > requirements.txt
```

Sync dependencies (ensure exact match with requirements.txt):
```bash
uv pip sync requirements.txt
```

The default tech stack:

- Containerization: Docker
- CI/CD: GitHub Actions
- Job Orchestration: Python / Makefile
- DWH: Postgres
- Package Manager: uv
- Data Ingestion: Python / SQL
- Data Transformation: dbt Core
- Data Activation: Metabase
For larger-scale data operations, the following tools can be integrated:
- Job Orchestration: Apache Airflow
- Data Ingestion: Airbyte
Each application follows a strict modular structure using snake_case. Tooling is materialized through the folder structure:
```
.
├── .github/                  # GitHub Actions workflows and CI config
├── apps/                     # Tool-specific configurations and deployments
│   ├── data_ingestion/       # Ingestion tools
│   │   └── airbyte/
│   ├── data_transformation/  # Transformation tools
│   │   └── dbt/              # Central dbt project
│   ├── data_activation/      # BI & activation tools
│   │   └── metabase/
│   └── job_orchestration/    # Orchestration tools
│       └── airflow/
├── docker/                   # Local orchestration (Docker Compose, .env)
├── docs/                     # Documentation, diagrams, and architecture RFCs
├── prompts/                  # AI system prompts and LLM context files
└── utils/                    # Shared internal packages (Python utils, custom operators)
```
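For example, a new ingestion source would get its own snake_case folder under `apps/data_ingestion/` and a matching `s_<app_name>` ODS schema in Postgres. The name below is purely hypothetical:

```bash
# Hypothetical example: onboarding a new ingestion source called "some_app"
mkdir -p apps/data_ingestion/some_app   # one snake_case folder per tool/source
# Its raw data would then land in the s_some_app (ODS) schema in the warehouse
```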