ZionK1/DynamiQK
DynamiQK

This is a Chisel generator that produces an optimized Query-Key (Q×K) attention hardware module capable of detecting and exploiting sparse attention patterns (Grid, A‑shape, Tri‑shape, Vertical‑slash) for both inter‑modality and intra‑modality attention. By tailoring the datapath to each pattern, it reduces computation and memory bandwidth in vision‑language and multimodal transformer architectures.

The generator reads pattern metadata, builds Compressed Sparse Row (CSR) masks, and emits a fully parameterized Chisel module for the QK compute engine as well as a Pattern Detection Module.
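As a rough illustration of the CSR mask step described above, here is a minimal Scala sketch that packs a dense boolean attention mask into row pointers and column indices. The names (`CsrMask`, `fromDense`) are illustrative assumptions, not the generator's actual API.

```scala
// Hypothetical sketch: packing a dense boolean attention mask into CSR
// (row pointers + column indices). Names are illustrative only.
object CsrMask {
  // rowPtr(i) .. rowPtr(i + 1) delimit the active column indices of row i
  case class Csr(rowPtr: Seq[Int], colIdx: Seq[Int])

  def fromDense(mask: Seq[Seq[Boolean]]): Csr = {
    val rowPtr = scala.collection.mutable.ArrayBuffer(0)
    val colIdx = scala.collection.mutable.ArrayBuffer[Int]()
    for (row <- mask) {
      for ((keep, j) <- row.zipWithIndex if keep) colIdx += j
      rowPtr += colIdx.length
    }
    Csr(rowPtr.toSeq, colIdx.toSeq)
  }
}

// Example: a 3x3 lower-triangular (causal) mask
val tri = Seq(
  Seq(true, false, false),
  Seq(true, true,  false),
  Seq(true, true,  true)
)
val csr = CsrMask.fromDense(tri)
// csr.rowPtr == Seq(0, 1, 3, 6); csr.colIdx == Seq(0, 0, 1, 0, 1, 2)
```

CSR keeps only the surviving column indices per row, which is what lets a pattern-tailored datapath skip the masked-out Q×K products entirely.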

Authors

Current Implementation

  • Pattern Detection Module
    • Chisel Component
    • Tester
  • QK Module
    • Chisel component
      • Performs optimized computations for inter/intramodal sparse pattern inputs
    • Scala Model
    • Tester
      • Reports the performance benefit of pattern-aware Q×K multiplication
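To make the Scala Model's role concrete, here is a hedged sketch of a golden model for pattern-aware Q×K: only score entries allowed by a boolean mask are computed, and everything else stays zero. The function name and mask layout are assumptions for illustration, not the repo's actual model.

```scala
// Illustrative golden model: compute Q x K^T scores only where the
// sparsity mask allows; masked-out entries cost no multiply-accumulates.
def sparseQK(q: Seq[Seq[Int]], k: Seq[Seq[Int]],
             mask: Seq[Seq[Boolean]]): Seq[Seq[Int]] = {
  val d = q.head.length // head dimension
  q.indices.map { i =>
    k.indices.map { j =>
      if (mask(i)(j)) (0 until d).map(t => q(i)(t) * k(j)(t)).sum
      else 0 // masked-out entry: skipped by the hardware datapath
    }
  }
}

// 2 queries, 2 keys, head dim 2, with only the diagonal unmasked
val q = Seq(Seq(1, 2), Seq(3, 4))
val k = Seq(Seq(5, 6), Seq(7, 8))
val diag = Seq(Seq(true, false), Seq(false, true))
val scores = sparseQK(q, k, diag)
// scores == Seq(Seq(17, 0), Seq(0, 53))
```

A tester can compare such a model's output against the Chisel QK module entry by entry while counting how many dot products the dense baseline would have needed.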

Testing

To run the included tests, run the following from an sbt shell:

sbt test

Note: With the default parameters, the test takes a few minutes to run. Lowering the input dimensions BM and BN significantly reduces the runtime, but the reported speedup also shrinks, since there is less work for the sparse datapath to skip. Conversely, raising BM, BN, and ITERS increases the reported speedup, because the dense baseline scales worse as the required computation grows.
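A back-of-the-envelope Scala sketch of why larger tiles raise the reported speedup: for a pattern like Vertical-slash, each row keeps roughly a fixed number of columns (a few vertical stripes plus a diagonal band), so sparse work grows about linearly with BM while dense work grows with BM×BN. The kept-width `w` below is an illustrative parameter, not one of the repo's.

```scala
// Dense QK does one dot product per (query, key) pair.
def denseDots(bm: Int, bn: Int): Int = bm * bn

// A Vertical-slash-like pattern keeps ~w columns per row (capped at BN).
def slashDots(bm: Int, bn: Int, w: Int): Int = bm * math.min(w, bn)

// Ratio of dense to sparse work grows with the tile width BN:
val smallRatio = denseDots(16, 16).toDouble / slashDots(16, 16, 8) // 2.0
val largeRatio = denseDots(64, 64).toDouble / slashDots(64, 64, 8) // 8.0
```

This is only an operation-count argument; the measured speedup in the tester also depends on ITERS and on per-pattern overheads.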

Future Goals

  • Implement pattern detection for all sparsity patterns (currently only A-shape and Vertical-slash are supported)
  • Generate appropriate multimodal matrix inputs
    • Option 1: Run the MMInference flow to reproduce matrix inputs, or fetch pre-existing input data from the MMInference repo
    • Option 2: Generate sparse matrix input data with generative AI
  • Benchmark performance and functionality of DynamiQK vs. SpAtten (related work) for the QK computation step; currently the tester does not support softmax-normalized attention score validation

Related Works

About

Dynamic pattern-driven optimizations for QxK
