Numaflow is a Kubernetes-native platform for running massively parallel data processing and streaming jobs.
A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices.
Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platform.
- Kubernetes-native: If you know Kubernetes, you already know how to use Numaflow.
- Language agnostic: Use your favorite programming language.
- Exactly-Once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
- Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever the workload requires.
- Data aggregation (e.g. group-by)
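
To make the pipeline model concrete, here is a minimal conceptual sketch in plain Python of a source, a processing step, a group-by aggregation, and a sink chained together. It does not use Numaflow or its SDKs: every name in it (`source`, `double`, `sum_by_key`, `sink`, `run_pipeline`) is hypothetical and chosen only for illustration, and it runs in a single process, whereas a real pipeline is declared as a Kubernetes custom resource and runs each vertex separately.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical names throughout: this is NOT the Numaflow SDK, only an
# in-process illustration of source -> processing -> aggregation -> sink.
Message = tuple[str, int]  # (key, value)


def source() -> Iterator[Message]:
    """Source vertex: emits a small, fixed stream of keyed messages."""
    yield from [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]


def double(messages: Iterable[Message]) -> Iterator[Message]:
    """Processing vertex: transforms each message independently (a map)."""
    for key, value in messages:
        yield key, value * 2


def sum_by_key(messages: Iterable[Message]) -> dict[str, int]:
    """Aggregation vertex: group messages by key and sum each group."""
    totals: dict[str, int] = defaultdict(int)
    for key, value in messages:
        totals[key] += value
    return dict(totals)


def sink(result: dict[str, int]) -> list[str]:
    """Sink vertex: formats results; a real sink would write them out."""
    return [f"{key}={total}" for key, total in sorted(result.items())]


def run_pipeline() -> list[str]:
    """Wire the vertices together in order, like edges in a pipeline."""
    return sink(sum_by_key(double(source())))


if __name__ == "__main__":
    print(run_pipeline())  # prints ['a=18', 'b=12']
```

The sketch aggregates over the whole finite input for simplicity. On an unbounded stream, a group-by has to be scoped to some bounded slice of the data, such as a time window, before a result can be emitted.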