Deep-Learning-Profiling-Tools/triton-viz

Triton-Viz: A Visualization Toolkit for programming with Triton


Welcome to Triton-Viz, a visualization and profiling toolkit for deep learning applications. It is built to make kernel programming in tile-based DSLs like Triton more intuitive.

Visit our site to see our tool in action!

Table of Contents
  1. About
  2. Getting Started
  3. Working with Examples
  4. DSL Frontends
  5. Analysis Clients
  6. License

About

Triton-Viz helps developers inspect Triton kernels with visualization, profiling, and memory-safety analysis tools. It can run many examples through Triton's interpreter, so GPU access is not required for basic debugging workflows.

Getting Started

Prerequisites

  • Python >= 3.10

Installation of Triton-Viz

Windows Note: Triton-Viz depends on Triton, which on Windows can only be installed inside Windows Subsystem for Linux (WSL). Once WSL is set up, follow the instructions below inside WSL.

Most users can install directly from PyPI:

pip install triton-viz

If you want to run examples from this repo, contribute, or build the web UI, install from source instead:

git clone https://github.com/Deep-Learning-Profiling-Tools/triton-viz.git
cd triton-viz
uv sync # or "uv sync --extra test" if you're running tests

Web UI Build

The PyPI package ships with prebuilt web UI assets in triton_viz/static, so you do not need npm to run the visualizer. If you want to modify the web UI, rebuild the TypeScript sources:

npm install
npm run build:frontend

Optional: Enable NKI Support

For PyPI installs, request the nki extra and add the AWS Neuron package index (the quotes keep shells such as zsh from interpreting the square brackets):

pip install "triton-viz[nki]" --extra-index-url https://pip.repos.neuron.amazonaws.com

For source installs:

uv sync --extra nki # or "uv sync --extra nki --extra test" if also running NKI-related tests

Note that uv sync installs only the extras named in that one invocation, so you must list every feature you want in a single command. For example, to get both NKI and testing support, run uv sync --extra nki --extra test. Running the two commands below one after the other does not combine them: the second sync removes the NKI install while installing the test packages:

uv sync --extra nki # NKI support but no testing
uv sync --extra test # tests but no NKI support
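
For scripted setups, the single-invocation rule can be captured in a small helper. The following is a minimal Python sketch, not part of Triton-Viz; the function name uv_sync_command is hypothetical, and it only assembles the argument list rather than running uv.

```python
from typing import Iterable, List


def uv_sync_command(extras: Iterable[str]) -> List[str]:
    """Build one `uv sync` invocation that requests every extra at once."""
    command = ["uv", "sync"]
    for extra in extras:
        # Each extra gets its own --extra flag on the same command line.
        command += ["--extra", extra]
    return command


# Both extras go into a single call instead of two separate syncs.
print(" ".join(uv_sync_command(["nki", "test"])))  # uv sync --extra nki --extra test
```

The resulting list can be passed to subprocess.run if you want the script to perform the sync itself.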

Testing

  • To run core Triton-viz tests, run pytest tests/.
  • If NKI is installed, run NKI-specific tests with pytest tests/ -m nki.
  • To run all tests (Triton + NKI), run pytest tests/ -m "".
  • To run visualizer web UI tests, run npm run test:frontend.

Working with Examples

Run an example directly with Python:

python examples/visualizer/matmul.py

Use the decorator API when writing or modifying a Triton kernel:

import triton
import triton.language as tl
import triton_viz


@triton_viz.trace("sanitizer")  # also supports "tracer" and "profiler"
@triton.jit
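def kernel(x_ptr, out_ptr, BLOCK: tl.constexpr):
    # Copy one block of BLOCK elements from x_ptr to out_ptr.
    # No mask is applied, so both buffers must hold at least BLOCK elements.
    offsets = tl.arange(0, BLOCK)
    values = tl.load(x_ptr + offsets)
    tl.store(out_ptr + offsets, values)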

Use the CLI wrappers to run an existing Python script without editing it. These wrappers patch plain @triton.jit kernels, so use them with scripts that do not already apply @triton_viz.trace(...).

triton-sanitizer examples/sanitizer/oob_cli.py
triton-profiler examples/profiler/load_store_cli.py
triton-visualizer trace.tvz

For visualizer workflows, save a trace and launch the UI from Python:

import triton_viz

triton_viz.save("trace.tvz")
triton_viz.launch()

DSL Frontends

Triton is the default DSL frontend. NKI support is optional and selected with the frontend argument:

triton_viz.trace("tracer")  # Triton
triton_viz.trace("tracer", frontend="nki")  # NKI
triton_viz.trace("tracer", frontend="nki_beta2")  # NKI Beta 2

The runtime integration code lives under triton_viz/core/frontend/. NKI simulation runtimes live under triton_viz/core/simulation/.

Analysis Clients

Analyze kernels across visualization, profiling, and sanitization with a single line of code.

  • Visualizer: currently supports load, store, and matmul operations for 1/2/3D tensors (more operations and dimensions coming soon).
  • Profiler: flags non-unrolled loops, inefficient mask usage, and missing buffer_load optimizations while tracking load/store byte counts with low-overhead sampling.
  • Sanitizer: symbolically checks tensor memory accesses for out-of-bounds errors and emits reports with tensor metadata, call stack, and expression trees; optional fake-memory storage avoids real reads.
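
To make "out-of-bounds" concrete, here is a minimal Python sketch of the condition the sanitizer looks for. It is an illustration only: it enumerates concrete offsets, whereas the sanitizer reasons about accesses symbolically, and the helper name out_of_bounds_offsets and its parameters are hypothetical rather than part of the Triton-Viz API.

```python
from typing import List


def out_of_bounds_offsets(block: int, numel: int, base: int = 0) -> List[int]:
    """Return the offsets in [base, base + block) that fall outside [0, numel)."""
    return [off for off in range(base, base + block) if off < 0 or off >= numel]


# A block of 8 offsets over a 6-element tensor overruns by two elements.
print(out_of_bounds_offsets(block=8, numel=6))  # [6, 7]

# A block that fits entirely inside the tensor reports nothing.
print(out_of_bounds_offsets(block=4, numel=6))  # []
```

In a real kernel, the usual remedy is to pass a mask to tl.load and tl.store so that offsets beyond the tensor are never accessed.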

Save and load traces

import triton_viz

triton_viz.save("trace.tvz")
# load() clears existing records by default; pass append=True to keep them
triton_viz.load("trace.tvz")
triton_viz.launch()

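From the command line, open a saved trace with triton-visualizer trace.tvz. The archive is a zip file containing manifest.json plus tensors.npz, and triton_viz.load(...) restores the in-memory trace state, so existing consumers work on a loaded trace as they would on a freshly recorded one.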
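
Because the archive is a plain zip file, it can be inspected with the Python standard library alone. The sketch below relies only on the two member names given above; the contents of manifest.json are not documented here, so it is returned as whatever JSON value it holds. The helper name inspect_trace_archive is hypothetical and not part of Triton-Viz.

```python
import json
import zipfile
from typing import Any, List, Tuple


def inspect_trace_archive(path: str) -> Tuple[List[str], Any]:
    """List the archive members and parse manifest.json; tensors.npz is not read."""
    with zipfile.ZipFile(path) as archive:
        members = archive.namelist()
        # The manifest schema is not assumed; json.load returns it as-is.
        with archive.open("manifest.json") as handle:
            manifest = json.load(handle)
    return members, manifest
```

For example, members, manifest = inspect_trace_archive("trace.tvz") gives a quick way to check that a saved trace is well formed before opening it in the visualizer.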

Environment variables

Triton-Viz uses a small set of environment variables to configure runtime behavior. Unless noted, boolean flags are enabled only when set to 1.

  • TRITON_VIZ_VERBOSE (default: 0): enable verbose logging and extra debug output.
  • TRITON_VIZ_NUM_SMS (default: 1): number of concurrent SMs to emulate for the CPU interpreter (min 1).
  • TRITON_VIZ_PORT (default: 8000 with share=True, 5001 with share=False): port for the Flask server.
  • ENABLE_SANITIZER (default: 1): enable the sanitizer pipeline that checks memory accesses.
  • ENABLE_PROFILER (default: 1): enable the profiler pipeline that collects performance data.
  • ENABLE_TIMING (default: 0): collect timing data during execution.
  • REPORT_GRID_EXECUTION_PROGRESS (default: 0): report per-program block execution progress in the interpreter.
  • SANITIZER_ENABLE_FAKE_TENSOR (default: 0): use fake tensor storage for sanitizer runs to avoid real memory reads.
  • PROFILER_ENABLE_LOAD_STORE_SKIPPING (default: 1): skip redundant load/store checks to reduce profiling overhead.
  • PROFILER_ENABLE_BLOCK_SAMPLING (default: 1): sample a subset of blocks to reduce profiling overhead.
  • PROFILER_DISABLE_BUFFER_LOAD_CHECK (default: 0): disable buffer load checks in the profiler.
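
The "enabled only when set to 1" rule and the minimum for TRITON_VIZ_NUM_SMS can be sketched in a few lines of Python. This is an illustration of the documented behavior, not Triton-Viz's actual parsing code: the helper names flag_enabled and num_sms are hypothetical, and clamping values below 1 (rather than rejecting them) is an assumption.

```python
import os
from typing import Mapping, Optional


def flag_enabled(name: str, default: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """A flag is on only when its value is exactly "1"; unset falls back to the default."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value is None:
        return default
    return value == "1"


def num_sms(env: Optional[Mapping[str, str]] = None) -> int:
    """Read TRITON_VIZ_NUM_SMS (default 1); clamping to 1 is an assumption."""
    env = os.environ if env is None else env
    return max(1, int(env.get("TRITON_VIZ_NUM_SMS", "1")))


# Dictionaries stand in for the process environment in this demonstration.
print(flag_enabled("ENABLE_TIMING", False, {"ENABLE_TIMING": "1"}))      # True
print(flag_enabled("ENABLE_SANITIZER", True, {}))                        # True
print(flag_enabled("ENABLE_PROFILER", True, {"ENABLE_PROFILER": "0"}))   # False
print(num_sms({"TRITON_VIZ_NUM_SMS": "0"}))                              # 1
```

In practice the variables are set in the shell before launching a script, for example ENABLE_TIMING=1 python examples/visualizer/matmul.py.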

More Puzzles

If you're interested in fun puzzles to work through in Triton, check out Triton Puzzles.

License

Triton-Viz is licensed under the MIT License. See the LICENSE file for details.

Publication

If you find this repo useful for your research, please cite our paper:

@inproceedings{ramesh2025tritonviz,
  author    = {Ramesh, Tejas and Rush, Alexander and Liu, Xu and Yin, Binqian and Zhou, Keren and Jiao, Shuyin},
  title     = {Triton-Viz: Visualizing GPU Programming in AI Courses},
  booktitle = {Proceedings of the 56th ACM Technical Symposium on Computer Science Education (SIGCSE TS '25)},
  numpages  = {7},
  location  = {Pittsburgh, Pennsylvania, United States},
  series    = {SIGCSE TS '25}
}

@inproceedings{wu2026tritonsanitizer,
  author    = {Wu, Hao and Zhao, Qidong and Chen, Songqing and Chen, Yang and Hao, Yueming and Liu, Tony C. W. and Chen, Sijia and Aziz, Adnan and Zhou, Keren},
  title     = {Triton-Sanitizer: A Fast and Device-Agnostic Memory Sanitizer for Triton with Rich Diagnostic Context},
  year      = {2026},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  location  = {Pittsburgh, PA, USA},
  booktitle = {Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems},
  series    = {ASPLOS '26},
  keywords  = {GPU, Debugging, Symbolic Execution, Memory Safety, Triton, Memory Access Errors}
}
