Stars
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
A Datacenter Scale Distributed Inference Serving Framework
A high-throughput and memory-efficient inference and serving engine for LLMs
Kubectl plugin to ease sniffing on Kubernetes pods using tcpdump and wireshark
Tile primitives for speedy kernels
APM (Application Performance Monitoring) system
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
A tool for automatically generating markdown documentation for Helm charts
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
[CNCF Sandbox Project] Managing your Kubernetes clusters (including public, private, edge, etc.) as easily as visiting the Internet
Go module providing a unified interface and efficient clients to work with various object storage providers such as GCS, S3, Azure, SWIFT, COS and more.
DevSpace - The Fastest Developer Tool for Kubernetes ⚡ Automate your deployment workflow with DevSpace and develop software directly inside Kubernetes.
This is a place for various problem detectors running on Kubernetes nodes.
Module, Model, and Tensor Serialization/Deserialization
Terraform Crusoe Cloud provider
Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4-3.13. You feed it your Python app, it does a lot of clever things, and spits out an executable or exte…
Fast container image distribution plugin with lazy pulling
Kubernetes Image Puller is used for caching images on a cluster. It creates a DaemonSet that downloads and runs the relevant container images on each node.
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).
Kubernetes Operator to automate Helm, DaemonSet, StatefulSet & Deployment updates