Insights: InternLM/lmdeploy
Overview
21 Pull requests merged by 9 people
- fix reward model api (#3703, merged Jul 2, 2025)
- disable torch.compile in cuda graph runner (#3691, merged Jul 1, 2025)
- fix profile_throughput.py (#3692, merged Jul 1, 2025)
- support partial quant (#3694, merged Jul 1, 2025)
- Defer build cache engine (#3695, merged Jul 1, 2025)
- Defer build cache engine (#3693, merged Jul 1, 2025)
- Fix convert bf16 to numpy (#3686, merged Jul 1, 2025)
- [ci] change flash attention installation in pr test (#3688, merged Jul 1, 2025)
- support partial quant (#3682, merged Jul 1, 2025)
- defer building cache engine until weight migration is done (#3683, merged Jun 30, 2025)
- fix pt engine stop & cancel (#3681, merged Jun 30, 2025)
- add reward model api (#3665, merged Jun 30, 2025)
- support do_preprocess=False for chat.completions (#3645, merged Jun 27, 2025)
- upgrade torch and triton (#3677, merged Jun 27, 2025)
- Reduce sampling memory usage (#3666, merged Jun 26, 2025)
- Separate api_server and pytorch engine into different processes (#3627, merged Jun 26, 2025)
- raise ImportError when EP is enabled but dlblas is not installed (#3636, merged Jun 26, 2025)
- Support loading fused moe weights (#3672, merged Jun 26, 2025)
- set ray envs (#3643, merged Jun 26, 2025)
- move import transformers in patch (#3660, merged Jun 26, 2025)
- [ascend] use custom transdata in python kernel (#3671, merged Jun 26, 2025)
8 Pull requests opened by 5 people
- ray close waits for forward to finish (#3676, opened Jun 26, 2025)
- bump version to v0.9.1 (#3685, opened Jun 30, 2025)
- support sleep/wakeup for pt engine (#3687, opened Jun 30, 2025)
- [ascend] support deepseek eager_mode (#3696, opened Jul 1, 2025)
- Relax FP8 TP requirement (#3697, opened Jul 1, 2025)
- consume the weight tensors located on the local_rank when updating model weights (#3698, opened Jul 1, 2025)
- refactor vl inputs split (#3699, opened Jul 2, 2025)
- Preliminary Blackwell (sm_120a, RTX 50 series) support (#3701, opened Jul 2, 2025)
5 Issues closed by 4 people
- [Bug] errors after a large volume of requests (#3673, closed Jun 30, 2025)
- [Bug] install with pip will reinstall cpu version of torch (#3549, closed Jun 30, 2025)
- [Feature] support qwen3 (non-MoE) FP8 in turbomind (#3684, closed Jun 29, 2025)
- [Feature] support pytorch >= 2.7 (#3674, closed Jun 27, 2025)
- [Bug] `lmdeploy` hangs on a single Atlas 300I Duo card with no relevant log output (#3668, closed Jun 26, 2025)
8 Issues opened by 8 people
- prompts should be a list (#3704, opened Jul 2, 2025)
- [Feature] gradio does not support think output; can we add support for it? (#3702, opened Jul 2, 2025)
- [Feature] is GLM-4.1V-Thinking supported? (#3700, opened Jul 2, 2025)
- cannot import name 'ImagesKwargs' from 'transformer.processing_utils' (#3689, opened Jun 30, 2025)
- [Feature] support Qwen3 Embedding and Reranker (#3679, opened Jun 27, 2025)
- [Feature] launch on multiple machines with a single card each via Ray (#3678, opened Jun 26, 2025)
- [Bug] long response times on long text inputs (#3675, opened Jun 26, 2025)
13 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Add Gloo communication to turbomind (#3362, commented on Jun 27, 2025 • 4 new comments)
- [Feature] metrics support (#3534, commented on Jul 2, 2025 • 2 new comments)
- Improve turbomind's prefix cache (#3332, commented on Jun 26, 2025 • 1 new comment)
- [Bug] during large-batch offline inference: [safe_run] exception caught: AttributeError: 'NoneType' object has no attribute 'get' (#3658, commented on Jun 26, 2025 • 0 new comments)
- [Feature] W4A8-FP8 support in AWQ quantization (#2766, commented on Jun 27, 2025 • 0 new comments)
- [Bug] tool-call-parser and reasoning-parser cannot be used together (#3655, commented on Jun 27, 2025 • 0 new comments)
- [Bug] InternVL2.5 78B stuck during inference (#3529, commented on Jun 30, 2025 • 0 new comments)
- Stuck during parallel inference (#3057, commented on Jun 30, 2025 • 0 new comments)
- [Bug] Qwen3 Reasoning Parser parsing error (#3664, commented on Jul 1, 2025 • 0 new comments)
- [Bug] deploying qwen2-vl-7b on 300I DUO fails: RuntimeError: numel: integer multiplication overflow (#3629, commented on Jul 1, 2025 • 0 new comments)
- [Bug] after running for a while, the server accepts requests but does not process them (#3616, commented on Jul 2, 2025 • 0 new comments)
- custom triton cache manager (#3659, commented on Jun 26, 2025 • 0 new comments)
- fix free cache in MPEngine branch (#3670, commented on Jul 2, 2025 • 0 new comments)