Pull requests: HabanaAI/vllm-hpu-extension

Pull requests list

add step3p5 specific silu
#412 opened May 6, 2026 by ranzhejiang Contributor
[aice/v1.22.0][WIP] add static moe swiglustep for bf16
#411 opened Mar 11, 2026 by ranzhejiang Contributor
Block matmul and kv_cache in dynamic quantization
#395 opened Dec 3, 2025 by HolyFalafel
[WA] bypass the GLM OOM issue
#380 opened Oct 15, 2025 by czhu15
pass chunk_size and global_num_experts to the MoE kernel
#369 opened Sep 19, 2025 by yangulei Contributor
Enable chunked prefill
#362 opened Sep 14, 2025 by jzhoulon
[HS-6944] Fix for deepseek distill models
#359 opened Sep 10, 2025 by nazneenn
[aice/v.1.22] refactor chunk size code
#354 opened Sep 1, 2025 by ranzhejiang Contributor
Fix for Llama4 models (targets main)
#341 opened Aug 19, 2025 by vidyasiv
Add support for block_softmax_const_max
#327 opened Aug 7, 2025 by mswiniarsk Contributor Draft
Add flag pin_memory to call from hpu.py in vllm
#325 opened Aug 5, 2025 by xuechendi Contributor
Add Calibration Script for SGLang FP8
#318 opened Jul 29, 2025 by SKRohit
Fix the fusedsdpa with sliding window alignment issue
#298 opened Jul 17, 2025 by libinta Contributor
Draft: Proper chunked prefill bucketing
#295 opened Jul 16, 2025 by kzawora-intel Collaborator Draft
Add block_softmax_adjustment and block_softmax kernels
#289 opened Jul 16, 2025 by czhu15
Introduce block_softmax_adjustment kernel (#163)
#263 opened Jul 8, 2025 by kdamaszk Contributor Draft
Enable block_softmax_adjustment on Gaudi2
#254 opened Jul 2, 2025 by kdamaszk Contributor Draft
Add pre-commit static checks
#247 opened Jun 30, 2025 by kzawora-intel Collaborator
Exponential bucketing tweaks
#224 opened Jun 13, 2025 by madamczyk-intel Contributor
Add useful internal vllm test
#200 opened May 27, 2025 by nirda7 Contributor Draft