v0.1.5

[CI] Add norm and layout_plot (tile-ai#534)

* [CI] Add norm and layout_plot

* fix lint

* Remove obsolete test files for RMS normalization and plot layout, streamlining the testing suite.

* Add make_mma_load_base_layout function to create MMA result layouts

- Introduced a new function `make_mma_load_base_layout` for generating layout functions for storing MMA results in fragment buffers.
- Added detailed docstring explaining parameters, return values, and potential exceptions.
- Implemented logic for handling different data types and matrix configurations, including assertions for input validation.
- Defined internal functions for mapping fragment indices to threads and local indices, enhancing the layout functionality.

* Enhance MMA load test with additional imports and functionality

- Added imports for `tilelang.language`, `Literal`, `Callable`, `DataType`, `IndexMap`, and `get_mma_micro_size` to support extended functionality.
- Improved the `make_mma_load_base_layout` function by ensuring it can handle various data types and configurations.
- Updated the test function `test_mma_load_base_layout` to validate the layout for float16 matrix A.

* Fix formatting in test_fragment_mma_load_a.py by adding a blank line for improved readability.

* Add RMS normalization functions to test_rms_norm.py

- Introduced `rms_norm` and `rms_norm_splitk` functions for RMS normalization, enhancing the testing capabilities.
- Implemented kernel functions with shared memory allocation and parallel processing for improved performance.
- Updated the test function to validate the new RMS normalization implementations.

* Add reference program for RMS normalization in test_rms_norm.py

- Introduced `ref_program` function to provide a reference implementation for RMS normalization.
- This addition enhances the testing framework by allowing comparisons against a known reference output.
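
To illustrate what such a reference comparison might look like, here is a minimal pure-Python sketch of row-wise RMS normalization. It is not the repository's `ref_program`; the names `rms_norm_reference` and `max_abs_diff` and the `eps` stabilizer are assumptions made only for this example.

```python
import math


def rms_norm_reference(rows, eps=1e-12):
    """Normalize each row by its root-mean-square value.

    `eps` is an assumed stabilizer to avoid division by zero; the
    original commit message does not mention one.
    """
    out = []
    for row in rows:
        # Mean of squares over the row (the reduction dimension).
        mean_sq = sum(v * v for v in row) / len(row)
        inv_rms = 1.0 / math.sqrt(mean_sq + eps)
        out.append([v * inv_rms for v in row])
    return out


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D lists."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    expected = rms_norm_reference([[3.0, 4.0], [1.0, 1.0]])
    # A kernel's output would be compared against `expected` here;
    # the reference is compared with itself purely as a placeholder.
    print(max_abs_diff(expected, expected))
```

A kernel under test would typically be accepted when `max_abs_diff` against the reference stays below a tolerance chosen for the data type.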

* Enhance RMS normalization tests with additional imports and formatting

- Added import for `tilelang.language` to support extended functionality in `test_rms_norm.py`.
- Improved code readability by adding blank lines for better separation of code sections.

* Update RMS normalization test parameters and enhance layout plotting

- Increased matrix dimensions in `test_rms_norm` to 8192 for improved performance testing.
- Removed obsolete test functions in `test_fragment_mma_load_a.py` to streamline the test suite.
- Enhanced layout plotting functionality by ensuring proper visualization of base, warp, and block layouts in `test_fragment_mma_load_a.py`.

* Refactor RMS normalization test parameters and improve layout plotting readability

- Simplified the parameters in `test_rms_norm` by removing `blk_k` for clarity.
- Enhanced code readability in `test_fragment_mma_load_a.py` by adjusting the formatting of the `block_layout` definition and removing the unused `warp_cols` variable.

* Enhance RMS normalization with split-k implementation and additional profiling

- Added a new function `test_rms_norm_splitk` to test the split-k variant of RMS normalization.
- Updated the main RMS normalization script to include profiling for the split-k implementation.
- Ensured all checks pass with appropriate latency measurements for both reference and tile-lang implementations.

* Remove obsolete test file `test_fragment_mma_load_a.py` to streamline the test suite.

* Refactor `rms_norm.py` to streamline benchmarking output and remove redundant code. Comment out the `plot_layout` call in `fragment_mma_load_a.py` for clarity.

* Refactor `test_rms_norm.py` by removing redundant test function `test_rms_norm_splitk` to streamline the test suite and improve clarity.

---------

Co-authored-by: Your Name <[email protected]>

Jun 4, 2025
a32009b
zip
tar.gz

v0.1.4

[Documentation] Fix Installation Documentation (tile-ai#405)

* Update Installation.md

* Update installation prerequisites in documentation

---------

Co-authored-by: Lei Wang <[email protected]>

Apr 18, 2025
a41a473
zip
tar.gz

v0.1.3

[Release] Bump version to 0.1.3 (tile-ai#264)

* Bump version to 0.1.3

* Refactor Docker script to streamline installation commands

- Removed the installation of the Python environment and CMake from the Docker run command, simplifying the execution process.
- Updated the command to focus on pip installation and running tox for testing across multiple Python versions.

Mar 23, 2025
f308c8a
zip
tar.gz

v0.1.2.post1

[Example] Implement NSA Decode tilelang examples (tile-ai#168)

* [Refactor] Update BitBLAS Benchmark with TileLang Carver Imports and Roller Hints Generation

- Replace BitBLAS imports with TileLang Carver imports in benchmark_matmul.py
- Modify roller hints generation using new TileLang Carver template and utility functions
- Update get_roller_hints_from_func to handle None cases and improve return logic
- Adjust DefaultPolicy to handle different codegen dictionary formats

* [Refactor] Update Thread Binding and Import Statements in TileLang Kernels

- Replace T.thread_binding() with T.get_thread_binding() across multiple kernel test files
- Update import statements for MMA layout and macro generator in dequantize GEMM and FP8 examples
- Move map_torch_type utility function to tilelang.utils.tensor
- Remove unnecessary imports and improve code organization

* Refactor Native Sparse Attention Example with Enhanced Triton Kernel

- Update parallel_nsa_fwd_kernel to support more flexible sparse attention computation
- Add support for block counts and offsets in the Triton kernel
- Modify kernel grid and computation logic for improved performance
- Update example script to use naive_nsa_simple reference implementation
- Improve type hints and kernel configuration

* Add Native Sparse Attention Examples with Tilelang and Triton Implementations

- Introduce new example scripts for native sparse attention:
  * example_tilelang_nsa_fwd.py: Forward pass implementation using TileLang
  * example_tilelang_nsa_decode.py: Decoding-specific sparse attention implementation
  * example_triton_nsa_fwd.py: Triton-based sparse attention forward pass
- Update reference.py with naive implementations for sparse attention
- Support different sparse attention scenarios including forward pass and inference
- Add comprehensive testing and validation against reference implementations

* lint fix

Mar 7, 2025
d8a06c0
zip
tar.gz

v0.1.2

[Release] Bump Version to v0.1.2 (tile-ai#155)

* Remove Torch CPP backend and update execution backend options

- Remove TorchCPPKernelAdapter and related code from JIT modules
- Update execution backend options in jit/__init__.py, kernel.py, and adapter/__init__.py
- Remove "torch_cpp" from supported execution backend literals
- Simplify backend validation and remove unused torch_cpp-related code

* lint fix

* Add block sparse attention implementations for TileLang and Triton

- Implement block sparse attention kernels for TileLang and Triton
- Add example scripts for block sparse attention with top-k and threshold-based masking
- Include utility functions for generating sparse attention masks
- Demonstrate causal attention with block-level sparsity
- Add test cases to validate sparse attention implementations against PyTorch reference
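
As a rough illustration of the two masking strategies named above, the following sketch builds block-level masks from a matrix of per-block scores. It is not taken from the repository's examples; `topk_block_mask`, `threshold_block_mask`, and the score layout (one row per query block, one column per key block) are hypothetical choices for this example.

```python
def topk_block_mask(scores, k, causal=True):
    """Keep the k highest-scoring key blocks for each query block.

    With causal=True, a query block may only attend to key blocks at
    the same or an earlier position.
    """
    mask = []
    for qi, row in enumerate(scores):
        candidates = [kj for kj in range(len(row)) if not causal or kj <= qi]
        # Sort candidate key blocks by score, highest first, and keep k.
        keep = set(sorted(candidates, key=lambda kj: row[kj], reverse=True)[:k])
        mask.append([kj in keep for kj in range(len(row))])
    return mask


def threshold_block_mask(scores, threshold, causal=True):
    """Keep every key block whose score exceeds the threshold."""
    return [
        [row[kj] > threshold and (not causal or kj <= qi) for kj in range(len(row))]
        for qi, row in enumerate(scores)
    ]


if __name__ == "__main__":
    scores = [
        [0.9, 0.1, 0.5],
        [0.2, 0.8, 0.3],
        [0.4, 0.6, 0.7],
    ]
    for row in topk_block_mask(scores, k=2):
        print(row)
    for row in threshold_block_mask(scores, threshold=0.5):
        print(row)
```

A sparse attention kernel would then skip every key block whose mask entry is false; validating against a dense PyTorch reference would apply the same mask before the softmax.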

* Bump version to 0.1.1

* Bump version to 0.1.2

Mar 6, 2025
c8c7dec
zip
tar.gz

v0.1.1

[Release] Bump version to v0.1.1 (tile-ai#107)

* Remove Torch CPP backend and update execution backend options

- Remove TorchCPPKernelAdapter and related code from JIT modules
- Update execution backend options in jit/__init__.py, kernel.py, and adapter/__init__.py
- Remove "torch_cpp" from supported execution backend literals
- Simplify backend validation and remove unused torch_cpp-related code

* lint fix

* Add block sparse attention implementations for TileLang and Triton

- Implement block sparse attention kernels for TileLang and Triton
- Add example scripts for block sparse attention with top-k and threshold-based masking
- Include utility functions for generating sparse attention masks
- Demonstrate causal attention with block-level sparsity
- Add test cases to validate sparse attention implementations against PyTorch reference

* Bump version to 0.1.1

* Refactor block sparse attention examples for improved code quality

- Apply consistent code formatting and style in TileLang and Triton block sparse attention implementations
- Add ruff linter ignore comment for specific line in Triton implementation
- Improve readability by adjusting indentation and line breaks
- Standardize sparse mask generation and test function implementations
- Minor optimizations in test case configurations

* lint

Feb 23, 2025
59342bb
zip
tar.gz

v0.1.0

Bump version to v0.1.0 (tile-ai#76)

Feb 12, 2025
02a2cba
zip
tar.gz

v0.0.1

[Release] Bump Version to 0.0.1 (#18)

* Remove CodeQL CI

* update comprehensive tilelang-benchmark link

* Bump Version to 0.0.1

* fix setup sdist issues

Jan 20, 2025
473977b
zip
tar.gz

Tags: tile-ai/tilescale