Pull requests: vllm-project/vllm
Pull requests list
fix broken test vllm:test_kernels - test_attention_selector.py::test_flash_attn
#17873 opened May 9, 2025 by tracelogfb
measure peak memory correctly by removing already used memory
v1
#17872 opened May 8, 2025 by MiladInk
[FP8][ROCm][Attention] Enable FP8 KV cache on ROCm for V1
v1
#17870 opened May 8, 2025 by gshtras
Don't load generation config if generation_config=vllm
#17868 opened May 8, 2025 by yinghai
[V1] Fast decode prepare path for prepare_inputs logic
documentation
Improvements or additions to documentation
needs-rebase
v1
#17866 opened May 8, 2025 by alexm-redhat
[V1] Add minItems, maxItems support with xgrammar
ci/build
structured-output
v1
#17865 opened May 8, 2025 by russellb
[BugFix][AMD] Compatible patch for latest AITER(05/07/2025)
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#17864 opened May 8, 2025 by qli88
[Model] Broadcast Ovis2 implementation to fit Ovis1.5 and Ovis1.6
documentation
frontend
multi-modality
Related to multi-modality (#4194)
[CI] Make JSON output tests less likely to fail
ready
v1
#17859 opened May 8, 2025 by russellb
[BugFix] [ROCm]: Bugfix and handle addition case of input for rocm_aiter_rms_norm
#17857 opened May 8, 2025 by tjtanaa
[CI/Build] Automatically retry flaky tests
ready
v1
#17856 opened May 8, 2025 by DarkLight1337
[Bugfix]: v1 engine - consider lora adapters in allowed_token_ids
v1
#17855 opened May 8, 2025 by bbrowning
Fix Whisper crash caused by invalid max_num_batched_tokens config
#17853 opened May 8, 2025 by inkcherry
[Frontend]: always try to load lora adapter from reslover
frontend
#17851 opened May 8, 2025 by CormickKneey
[Feature]Add support for models quantized with AutoRound
#17850 opened May 8, 2025 by wenhuach21
[Doc] Update several links in reasoning_outputs.md
documentation
#17846 opened May 8, 2025 by windsonsea
[Feature] Support tool_choice: required when using Xgrammar as the StructuredOutputBackend.
ci/build
structured-output
v1
#17845 opened May 8, 2025 by chaunceyjiang • Draft
[Bugfix] Fix QKVCrossParallelLinear::sync_weight_attrs for PyTorch compile
#17844 opened May 8, 2025 by anko-intel
[Misc] add jsonargment to support --hf-overrides
ci/build
frontend
needs-rebase
#17842 opened May 8, 2025 by lengrongfu
[P/D][V1] Add generic KV Connector for delegation to external implementations
#17840 opened May 8, 2025 by sdavidbd
[V1][Structured Output] Update llguidance (>= 0.7.11) to avoid AttributeError (no StructTag)
ci/build
ready
#17839 opened May 8, 2025 by shen-shanshan
[Core] Parallel multi-modal processor
multi-modality
v1
#17831 opened May 8, 2025 by DarkLight1337