Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[TRTQA-2802][fix]: add --host for mgmn serve examples script
#4175 opened May 9, 2025 by xinhe-nv
Breaking change: perf: Enable scheduling overlap by default
#4174 opened May 9, 2025 by kaiyux
fix: draft target README and assertion for logits-based acceptance
#4167 opened May 8, 2025 by mayani-nv
[TRTLLM-5054][fix] Removing repeated loading of input processor
#4161 opened May 8, 2025 by rakib-hasan
[TRTLLM-5050][feat] Enable per-request stats with PyT backend
#4156 opened May 8, 2025 by pcastonguay
Feat: support exporting softmax statistics and update the kernel-selection heuristic
#4155 opened May 8, 2025 by PerkzZheng
[TRTLLM-4911] feat(scaffolding): make sampling_params only setable by controller
#4151 opened May 8, 2025 by dc3671