Build and ship production ML pipelines faster: a pipeline library with an optional self-hosted visual layer for modular, reproducible workflows, local testing, and experiment tracking.

flyingriverhorse/Skyulf

Skyulf

Skyulf: The Visual MLOps Builder

Skyulf is a self-hosted, privacy-first visual MLOps builder. It is designed to be the "glue" that holds your data science workflow together (a project export option is coming soon). Bring your data, clean it visually, engineer features on a node-based canvas, and train models, all in one place.

What is the meaning of Skyulf?

I named it Skyulf after two ideas. Sky is the open space above Earth, where the sun, moon, stars, and clouds live. Ulf means "wolf," with Nordic roots, and the wolf is also a strong symbol in Turkic tradition. Together they fit the project: independent and helpful to the community.

Quick Start

Prerequisites: Python 3.10+

On Windows PowerShell

Using pip:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements-fastapi.txt
python run_skyulf.py

Using uv (Faster):

uv venv
.\.venv\Scripts\Activate.ps1
uv pip install -r requirements-fastapi.txt
python run_skyulf.py

The run_skyulf.py script will automatically start the FastAPI server.

Optional: Celery & Redis

By default, Skyulf uses Celery and Redis for robust background task management. However, for simple local testing or environments where you cannot run Redis, you can disable this dependency.

Add this to your .env file:

USE_CELERY=false

When disabled, background tasks (training, ingestion) will run in background threads within the main application process instead of a separate worker.
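
The toggle described above can be pictured roughly as follows. This is a minimal sketch, not Skyulf's actual implementation: `celery_enabled`, `submit_background_task`, and the `celery_dispatch` parameter are hypothetical names, and both the "enabled when unset" default and the accepted spellings of a false value are assumptions. Only the `USE_CELERY` variable name comes from this README.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional

# Hypothetical in-process pool used when Celery is switched off.
_executor = ThreadPoolExecutor(max_workers=2)


def celery_enabled() -> bool:
    """Read USE_CELERY from the environment; unset is treated as enabled (assumption)."""
    raw = os.environ.get("USE_CELERY", "true")
    return raw.strip().lower() not in {"false", "0", "no", "off"}


def submit_background_task(
    func: Callable[..., Any],
    *args: Any,
    celery_dispatch: Optional[Callable[..., Any]] = None,
) -> Any:
    """Hand func to a Celery dispatcher if enabled, else run it in a worker thread.

    celery_dispatch stands in for whatever enqueues the job on a real Celery
    worker; it is a placeholder, not part of any Skyulf API.
    """
    if celery_enabled():
        if celery_dispatch is None:
            raise RuntimeError("USE_CELERY is on but no Celery dispatcher was given")
        return celery_dispatch(func, *args)
    # Fallback: a background thread inside the main application process.
    return _executor.submit(func, *args)


if __name__ == "__main__":
    os.environ["USE_CELERY"] = "false"
    future = submit_background_task(sum, [1, 2, 3])
    print(future.result())
```

In the thread-based path the caller gets a `concurrent.futures.Future` back immediately, which is what keeps a long-running job from blocking the request that started it.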

S3 Configuration (Optional)

To enable S3 integration for data and artifacts, add these to your .env:

AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-bucket
# Optional: Upload local training artifacts to S3
UPLOAD_TO_S3_FOR_LOCAL_FILES=true
# Optional: Force local storage even for S3 data
SAVE_S3_ARTIFACTS_LOCALLY=false
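
For readers who want to consume the same variables in their own scripts, here is one possible way to load them into a typed object. It is a sketch under stated assumptions: `S3Settings`, `load_s3_settings`, and `_as_bool` are hypothetical helpers, not Skyulf's configuration code, and treating all four credential/bucket variables as required is a guess.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy spellings; the accepted set is an assumption."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class S3Settings:
    """Hypothetical container for the variables listed above."""
    access_key_id: str
    secret_access_key: str
    region: str
    bucket_name: str
    upload_local_artifacts: bool
    save_s3_artifacts_locally: bool


def load_s3_settings(env: Mapping[str, str] = os.environ) -> Optional[S3Settings]:
    """Return settings if the four core variables are set, otherwise None."""
    required = (
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_REGION",
        "S3_BUCKET_NAME",
    )
    if any(not env.get(name) for name in required):
        return None  # Treat S3 integration as disabled.
    return S3Settings(
        access_key_id=env["AWS_ACCESS_KEY_ID"],
        secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        region=env["AWS_REGION"],
        bucket_name=env["S3_BUCKET_NAME"],
        upload_local_artifacts=_as_bool(env.get("UPLOAD_TO_S3_FOR_LOCAL_FILES")),
        save_s3_artifacts_locally=_as_bool(env.get("SAVE_S3_ARTIFACTS_LOCALLY")),
    )
```

Passing the environment in as a mapping keeps the loader easy to test with a plain dictionary instead of real credentials.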

With Docker Compose (Recommended)

docker compose up --pull=always --build

This will start the full stack:

  • FastAPI Backend (Port 8000)
  • Redis (Port 6379)
  • Celery Worker (Background jobs)
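
Once the stack is running, a quick way to confirm that the listed ports accept connections is a small TCP probe. This is a generic sketch, not a Skyulf utility: `port_is_open` is a hypothetical helper, and it assumes the containers publish ports 8000 and 6379 on `localhost`.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports come from the list above; "localhost" is an assumption.
    for name, port in [("FastAPI backend", 8000), ("Redis", 6379)]:
        status = "up" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {status}")
```

A successful connection only shows that something is listening on the port; it does not verify that the service behind it is healthy.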

Open:

Skyulf Core Library

The core machine learning logic of Skyulf (preprocessing, modeling, tuning) is available as a standalone library on PyPI. You can use it to build reproducible pipelines in your own scripts or notebooks, independent of the web platform.

pip install skyulf-core
# or
uv add skyulf-core

Using Skyulf as a Library

Skyulf isn't just a web application: with skyulf-core installed, you can use it in your own scripts or Jupyter notebooks for EDA and pipeline building. EDA is the easiest place to start.

Example: Automated EDA

import polars as pl
from skyulf.profiling.analyzer import EDAAnalyzer
from skyulf.profiling.visualizer import EDAVisualizer

# 1. Load Data
df = pl.read_csv("your_dataset.csv")

# 2. Run Analysis
analyzer = EDAAnalyzer(df)
profile = analyzer.analyze(
    target_col="target",
    task_type="Classification", # Optional: Force "Classification" or "Regression"
    date_col="timestamp",       # Optional: Manually specify if auto-detection fails
    lat_col="latitude",         # Optional
    lon_col="longitude"         # Optional
)

# 3. Visualize Results (The Easy Way)
# This single class handles all the rich terminal output and matplotlib plots
viz = EDAVisualizer(profile, df)

# Print the dashboard
viz.summary()

# Show the plots
viz.plot()

For detailed examples including Time Series, Geospatial Analysis, and Causal Inference, see the EDA User Guide.

Key Features

  • 🎨 Visual Feature Canvas: A node-based editor with 25+ built-in nodes to clean, transform, and engineer features without writing spaghetti code.
  • Automated EDA: Professional-grade Exploratory Data Analysis with interactive charts, causal discovery (DAGs), decision trees for rule extraction, segmentation, outlier detection, and statistical alerts.
  • Drift Analysis: Built on the EDA engine to monitor data and model drift over time with statistical tests and visualizations.
  • High-Performance Engine: Built on FastAPI and Polars for lightning-fast data processing and easy API extension.
  • ⚡ Async by Default: Heavy training jobs run in the background via Celery & Redis (or background threads), so your UI never freezes.
  • 💾 Flexible Data: Ingest CSV, Excel, JSON, Parquet, or SQL. Start with SQLite (zero-config) and scale to PostgreSQL.
  • ☁️ S3 Integration: Full support for S3-compatible storage (AWS, MinIO) for data ingestion, artifact storage, and model registry.
  • 🧠 Model Training: Built-in support for Scikit-Learn models with hyperparameter search (Grid/Random/Halving) and optional Optuna integration.
  • 📦 Model Registry & Deployment: Version control your models, track metrics, and deploy them to a live inference API with a single click.
  • 📊 Experiment Tracking: Compare multiple runs side-by-side with interactive charts, confusion matrices, and ROC curves.

Roadmap

We have a clear vision to turn Skyulf into a complete App Hub for AI.

  • Phase 1: Polish & Stability (Done) - Architecture, type safety, and documentation.
  • Phase 2: Deepening Data Science (Current Focus) - Advanced EDA, Ethics/Fairness checks, Synthetic Data, Public Data Hubs, more models, NLP, and more.
  • Phase 3: The "App Hub" Vision - Plugin system, GenAI/LLM Builders, and Deployment.
  • Phase 4: Expansion - Real-time collaboration, Edge/IoT export, and Audio support.

👉 View the full ROADMAP.md for details.

Version History

We maintain a detailed changelog of all major updates, new features, and architectural changes.

👉 View the full VERSION_UPDATE.md for the complete history.

Workflow Overview

The high-level flow from dataset to model training inside Skyulf:

Dataset source → train/val/test split → background model training (Celery)
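
To make the middle stage of this flow concrete, here is a minimal, library-free sketch of a seeded train/val/test split. It illustrates the general technique only: `train_val_test_split`, its default fractions, and the fixed seed are assumptions for this example and do not describe how Skyulf's splitter is implemented.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_val_test_split(
    rows: Sequence[T],
    val_fraction: float = 0.2,
    test_fraction: float = 0.2,
    seed: int = 42,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle rows with a fixed seed and cut them into train, val, and test lists."""
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # Seeded so the split is reproducible.
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = train_val_test_split(list(range(10)))
    print(len(train), len(val), len(test))
```

Fixing the seed means the same rows land in the same partition on every run, which is what makes results from the later training stage comparable across experiments.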

Development

  • Configuration via config.py with sane defaults (SQLite, dev CORS)
  • Lifespan hooks initialize the async DB engine automatically
  • Tests under tests/ cover core feature engineering and training helpers
  • docker-compose.yml to run API + Redis (+ Celery worker)

Contributing

We welcome contributions! See CONTRIBUTING.md for setup and workflow guidance, and read our Code of Conduct.

License

Skyulf uses a split licensing model to balance open standards with sustainable development:

  • Backend & Core: Apache 2.0 (Permissive) - Ideal for integration and enterprise use.
  • Frontend (Feature Canvas): GNU AGPLv3 (Copyleft) - Ensures UI improvements are shared back to the community.

Commercial Use: No separate commercial license is required for internal use or building proprietary plugins on the backend. However, if you are building a proprietary SaaS that modifies the frontend and cannot comply with AGPLv3, please see COMMERCIAL-LICENSE.md for partnership options.


If you'd like to contribute, sponsor, or request a commercial license, please star the repo, open a Discussion or issue, or see .github/FUNDING.yml for sponsorship options.


🀝 Join the Journey

I'm building this because I love it, but I can't do it alone forever.

  • Try it out: Clone the repo, run it, break it.
  • Give Feedback: Tell me what sucks. Tell me what you love.
  • Contribute: Even a typo fix in the README helps.

Let's build the simplest, most powerful MLOps hub together.

"Not all those who wander are lost." - J.R.R. Tolkien 💍


© 2025 Murat Unsal - Skyulf Project
SPDX-License-Identifier: Apache-2.0