Pull requests: vllm-project/vllm

Pull requests list

[Model][Perf] Use cos and sin cache in QwenVL (labels: qwen)
#28798 opened Nov 16, 2025 by gcanlin
[Misc] Add backup hash algorithm for FIPS constrained environments (labels: kv-connector, new-model)
#28795 opened Nov 16, 2025 by geodavic
[DNM] kimi k2 thinking tool calling (labels: frontend, gpt-oss)
#28794 opened Nov 16, 2025 by qandrew (draft)
Add INT4 + LoRA support with tensor materialization (labels: documentation, performance)
#28793 opened Nov 16, 2025 by sheikheddy
[Metrics] Fix KV cache usage percent metric (labels: multiproc, v1)
#28792 opened Nov 16, 2025 by jaywonchung
Add INT4 compressed-tensors + LoRA support (including MoE) (labels: documentation)
#28791 opened Nov 16, 2025 by sheikheddy
[Bugfix] Fix Llama3JsonToolParser to support deeply nested JSON parameters (labels: documentation, frontend, llama, tool-calling, v1)
#28789 opened Nov 15, 2025 by ym820
[BugFix] Fix async scheduling + chunked prefill + preemption (labels: bug, ready, v1)
#28787 opened Nov 15, 2025 by njhill (milestone: v0.11.1)
docs: prefix caching seems quite outdated (labels: documentation)
#28784 opened Nov 15, 2025 by longregen
[Disagg] Support large batch size in proxy server and update NixlConnector doc for DP (labels: documentation, kv-connector, v1)
#28782 opened Nov 15, 2025 by minosfuture
Fix: align vllm bench serve ignore_eos behavior with legacy benchmark… (labels: performance)
#28780 opened Nov 15, 2025 by Amitjoiya
[Bugfix]: nccl connector memory leak (labels: kv-connector)
#28779 opened Nov 15, 2025 by weichengz0616
[Model] Add support for openPangu_Pro_Moe_v2 (labels: documentation, new-model, v1)
#28775 opened Nov 15, 2025 by yt0428
[Model][QwenVL] Optimize Qwen2_5_VisionAttention q,k preparation (labels: qwen)
#28769 opened Nov 15, 2025 by lgeiger
[BugFix] Fix PP performance and PP kv connector output regression (labels: bug, ready, v1)
#28768 opened Nov 15, 2025 by njhill (milestone: v0.11.1)
Fix gpt oss weight loading with EP + bf16 (labels: gpt-oss, ready)
#28765 opened Nov 15, 2025 by ashors1
[not4land] Test CI (labels: ready)
#28764 opened Nov 15, 2025 by jerryzh168
[DO NOT MERGE][Attention] FlashAttention ViT support (labels: ci/build, nvidia, ready, v1)
#28763 opened Nov 15, 2025 by MatthewBonanni
add support for --fully-sharded-loras in fused_moe
#28761 opened Nov 14, 2025 by gnovack