Pull requests: vllm-project/vllm
Pull requests list
[Model][Perf] Use cos and sin cache in QwenVL
qwen
Related to Qwen models
#28798
opened Nov 16, 2025 by
gcanlin
[Misc] Add backup hash algorithm for FIPS constrained environments
kv-connector
new-model
Requests to new models
#28795
opened Nov 16, 2025 by
geodavic
5 tasks
Add INT4 + LoRA support with tensor materialization
documentation
Improvements or additions to documentation
performance
Performance-related issues
#28793
opened Nov 16, 2025 by
sheikheddy
[Metrics] Fix KV cache usage percent metric multiproc
v1
#28792
opened Nov 16, 2025 by
jaywonchung
3 of 5 tasks
Add INT4 compressed-tensors + LoRA support (including MoE)
documentation
Improvements or additions to documentation
#28791
opened Nov 16, 2025 by
sheikheddy
[Bugfix] Fix Llama3JsonToolParser to support deeply nested JSON parameters
documentation
Improvements or additions to documentation
frontend
llama
Related to Llama models
tool-calling
v1
#28789
opened Nov 15, 2025 by
ym820
[Build] Add OpenAI triton_kernels
ci/build
#28788
opened Nov 15, 2025 by
varun-sundar-rabindranath
Draft
[Perf] Optimize multi-token incremental detokenization
v1
#28786
opened Nov 15, 2025 by
OthmanMohammad
docs: prefix caching seems quite outdated
documentation
Improvements or additions to documentation
#28784
opened Nov 15, 2025 by
longregen
[Disagg] Support large batch size in proxy server and update NixlConnector doc for DP
documentation
Improvements or additions to documentation
kv-connector
v1
#28782
opened Nov 15, 2025 by
minosfuture
5 tasks
Fix: align vllm bench serve ignore_eos behavior with legacy benchmark…
performance
Performance-related issues
#28780
opened Nov 15, 2025 by
Amitjoiya
5 tasks
[Bugfix]: nccl connnector memory leak
kv-connector
#28779
opened Nov 15, 2025 by
weichengz0616
5 tasks
[NIXL][XPU] update install script of NIXL
ci/build
kv-connector
#28778
opened Nov 15, 2025 by
zhenwei-intel
[Model] Add support for openPangu_Pro_Moe_v2
documentation
Improvements or additions to documentation
new-model
Requests to new models
v1
#28775
opened Nov 15, 2025 by
yt0428
5 tasks
fix a corner case that could cause out-of-sync with async scheduling and dp >1
v1
#28774
opened Nov 15, 2025 by
bangshengtang
[Model][QwenVL] Optimize Qwen2_5_VisionAttention q,k preparation
qwen
Related to Qwen models
#28769
opened Nov 15, 2025 by
lgeiger
Fix gpt oss weight loading with EP + bf16
gpt-oss
Related to GPT-OSS models
ready
ONLY add when PR is ready to merge/full CI is needed
#28765
opened Nov 15, 2025 by
ashors1
5 tasks
[not4land] Test CI
ready
ONLY add when PR is ready to merge/full CI is needed
#28764
opened Nov 15, 2025 by
jerryzh168
5 tasks
[DO NOT MERGE][Attention] FlashAttention ViT support
ci/build
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#28763
opened Nov 15, 2025 by
MatthewBonanni
3 of 5 tasks
[Bugfix][cache_kernels]: Fix OOB in cache_kernels.cu
#28760
opened Nov 14, 2025 by
Flink-ddd