Skyulf: The Visual MLOps Builder
Skyulf is a self-hosted, privacy-first visual MLOps builder, designed to be the "glue" that holds your data science workflow together (with a project export option coming soon). Bring your data, clean it visually, engineer features on a node-based canvas, and train models, all in one place.
I named it Skyulf after two ideas. Sky is the open space above Earth, where the sun, moon, stars, and clouds live. Ulf means "wolf", a name with Nordic roots, and the wolf is also a strong symbol in Turkic tradition. Together they fit the project: independent and helpful to the community.
- Quick Start
- Using Skyulf as a Library
- Key Features
- Roadmap
- Version History
- Workflow Overview
- Development
- Contributing
- License
Prerequisites: Python 3.10+
Using pip:
```
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements-fastapi.txt
python run_skyulf.py
```

Using uv (Faster):

```
uv venv
.\.venv\Scripts\Activate.ps1
uv pip install -r requirements-fastapi.txt
python run_skyulf.py
```

The run_skyulf.py script will automatically start the FastAPI server.
Optional: Celery & Redis By default, Skyulf uses Celery and Redis for robust background task management. However, for simple local testing or environments where you cannot run Redis, you can disable this dependency.
Add this to your .env file:
```
USE_CELERY=false
```

When disabled, background tasks (training, ingestion) will run in background threads within the main application process instead of a separate worker.
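To make that behavior concrete, here is a generic sketch of this kind of dispatch, with a hypothetical submit_training_job helper; this is not Skyulf's actual implementation:

```python
import os
import threading

# Mirror of the USE_CELERY flag from .env (hypothetical reading of it).
USE_CELERY = os.getenv("USE_CELERY", "true").lower() == "true"


def submit_training_job(job_fn, *args):
    """Hypothetical dispatcher: Celery when enabled, an in-process thread otherwise."""
    if USE_CELERY:
        # With Celery, the job is handed to a separate worker process;
        # here job_fn would be a registered Celery task.
        job_fn.delay(*args)
    else:
        # Without Celery, the job runs in a daemon thread inside the API process,
        # so the request handler still returns immediately.
        threading.Thread(target=job_fn, args=args, daemon=True).start()
```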
To enable S3 integration for data and artifacts, add these to your .env:
```
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-bucket
# Optional: Upload local training artifacts to S3
UPLOAD_TO_S3_FOR_LOCAL_FILES=true
# Optional: Force local storage even for S3 data
SAVE_S3_ARTIFACTS_LOCALLY=false
```

Or run everything with Docker Compose:

```
docker compose up --pull=always --build
```

This will start the full stack:
- FastAPI Backend (Port 8000)
- Redis (Port 6379)
- Celery Worker (Background jobs)
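If you want to confirm the Redis container is reachable before submitting jobs, a quick check from Python looks like this (assuming the redis package is installed; it is only an illustration):

```python
import redis

# Connect to the Redis container started by docker compose (default port 6379).
client = redis.Redis(host="127.0.0.1", port=6379)
print(client.ping())  # True if the container is up and accepting connections
```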
Open:
- API health → http://127.0.0.1:8000/health
- Docs (dev mode) → http://127.0.0.1:8000/docs
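Once the server is running, you can also hit the health endpoint from Python as a quick smoke test (assuming the requests package is available; any HTTP client works):

```python
import requests

# Ping the health endpoint of a locally running Skyulf instance.
response = requests.get("http://127.0.0.1:8000/health", timeout=5)
print(response.status_code, response.text)
```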
The core machine learning logic of Skyulf (preprocessing, modeling, tuning) is available as a standalone library on PyPI. You can use it to build reproducible pipelines in your own scripts or notebooks, independent of the web platform.
```
pip install skyulf-core
# or
uv add skyulf-core
```

Skyulf isn't just a web application: you can use skyulf-core in your own scripts or Jupyter notebooks for powerful EDA and pipeline building. The EDA module is a great way to get started and is easy to use.
```python
import polars as pl

from skyulf.profiling.analyzer import EDAAnalyzer
from skyulf.profiling.visualizer import EDAVisualizer

# 1. Load Data
df = pl.read_csv("your_dataset.csv")

# 2. Run Analysis
analyzer = EDAAnalyzer(df)
profile = analyzer.analyze(
    target_col="target",
    task_type="Classification",  # Optional: Force "Classification" or "Regression"
    date_col="timestamp",        # Optional: Manually specify if auto-detection fails
    lat_col="latitude",          # Optional
    lon_col="longitude",         # Optional
)

# 3. Visualize Results (The Easy Way)
# This single class handles all the rich terminal output and matplotlib plots
viz = EDAVisualizer(profile, df)

# Print the dashboard
viz.summary()

# Show the plots
viz.plot()
```

For detailed examples including Time Series, Geospatial Analysis, and Causal Inference, see the EDA User Guide.
- Visual Feature Canvas: A node-based editor to clean, transform, and engineer features without writing spaghetti code (25+ built-in nodes).
- Automated EDA: Professional-grade Exploratory Data Analysis with interactive charts, causal discovery (DAGs), decision trees for rule extraction, segmentation, outlier detection, and statistical alerts.
- Drift Analysis: Built on the EDA engine to monitor data and model drift over time with statistical tests and visualizations.
- High-Performance Engine: Built on FastAPI and Polars for lightning-fast data processing and easy API extension.
- Async by Default: Heavy training jobs run in the background via Celery & Redis (or background threads), so your UI never freezes.
- Flexible Data: Ingest CSV, Excel, JSON, Parquet, or SQL. Start with SQLite (zero-config) and scale to PostgreSQL.
- S3 Integration: Full support for S3-compatible storage (AWS, MinIO) for data ingestion, artifact storage, and model registry.
- Model Training: Built-in support for Scikit-Learn models with hyperparameter search (Grid/Random/Halving) and optional Optuna integration (see the sketch after this list).
- Model Registry & Deployment: Version control your models, track metrics, and deploy them to a live inference API with a single click.
- Experiment Tracking: Compare multiple runs side-by-side with interactive charts, confusion matrices, and ROC curves.
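For context on the search styles named in the Model Training feature above, here is a plain scikit-learn sketch of a halving grid search; this shows the underlying scikit-learn API, not Skyulf's node configuration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
# Halving search is still behind an experimental flag in scikit-learn.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV

X, y = load_breast_cancer(return_X_y=True)

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}

# Successive halving evaluates many candidates on small budgets first,
# then promotes only the best performers to larger budgets.
search = HalvingGridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    factor=3,
    cv=5,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```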
We have a clear vision to turn Skyulf into a complete App Hub for AI.
- Phase 1: Polish & Stability (Done) - Architecture, type safety, and documentation.
- Phase 2: Deepening Data Science (Current Focus) - Advanced EDA, Ethics/Fairness checks, Synthetic Data, Public Data Hubs, more models, NLP, and more.
- Phase 3: The "App Hub" Vision - Plugin system, GenAI/LLM Builders, and Deployment.
- Phase 4: Expansion - Real-time collaboration, Edge/IoT export, and Audio support.
View the full ROADMAP.md for details.
We maintain a detailed changelog of all major updates, new features, and architectural changes.
View the full VERSION_UPDATE.md for the complete history.
The high-level flow from dataset to model training inside Skyulf:
Dataset source → train/val/test split → background model training (Celery)
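Outside the UI, the split step corresponds roughly to the following Polars sketch (the 80/10/10 ratios are only an example, and this is not Skyulf's internal code):

```python
import polars as pl

df = pl.read_csv("your_dataset.csv")

# Shuffle once, then slice into 80/10/10 train/validation/test splits.
df = df.sample(fraction=1.0, shuffle=True, seed=42)
n_train = int(df.height * 0.8)
n_val = int(df.height * 0.1)

train_df = df.slice(0, n_train)
val_df = df.slice(n_train, n_val)
test_df = df.slice(n_train + n_val)

print(train_df.height, val_df.height, test_df.height)
```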
- Configuration via config.py with sane defaults (SQLite, dev CORS)
- Lifespan hooks initialize the async DB engine automatically (see the sketch below)
- Tests under tests/ cover core feature engineering and training helpers
- docker-compose.yml to run API + Redis (+ Celery worker)
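The lifespan bullet above refers to FastAPI's standard lifespan pattern. A generic sketch of that pattern, with a placeholder SQLite URL rather than Skyulf's actual startup code, looks like this:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from sqlalchemy.ext.asyncio import create_async_engine


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Create the async DB engine once at startup and keep it on the app state.
    app.state.db_engine = create_async_engine("sqlite+aiosqlite:///./example.db")
    yield
    # Dispose of the engine cleanly on shutdown.
    await app.state.db_engine.dispose()


app = FastAPI(lifespan=lifespan)
```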
We welcome contributions! See CONTRIBUTING.md for setup and workflow guidance, and read our Code of Conduct.
Skyulf uses a split licensing model to balance open standards with sustainable development:
- Backend & Core: Apache 2.0 (Permissive) - Ideal for integration and enterprise use.
- Frontend (Feature Canvas): GNU AGPLv3 (Copyleft) - Ensures UI improvements are shared back to the community.
Commercial Use:
No separate commercial license is required for internal use or building proprietary plugins on the backend.
However, if you are building a proprietary SaaS that modifies the frontend and cannot comply with AGPLv3, please see COMMERCIAL-LICENSE.md for partnership options.
If you'd like to contribute, sponsor, or request a commercial license, please star the repo, open a Discussion or issue, or see .github/FUNDING.yml for sponsorship options.
I'm building this because I love it, but I can't do it alone forever.
- Try it out: Clone the repo, run it, break it.
- Give Feedback: Tell me what sucks. Tell me what you love.
- Contribute: Even a typo fix in the README helps.
Let's build the simplest, most powerful MLOps hub together.
© 2025 Murat Unsal – Skyulf Project
SPDX-License-Identifier: Apache-2.0