update mla rope mcore>=0.18 (0.15-0.18 compat) #114

Merged
Jintao-Huang merged 3 commits into modelscope:main from Jintao-Huang:update_rope_0605
Jun 5, 2026
Conversation

@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the README files to correct the categorization of Kimi models and introduces a new RoPE utility module (rope.py) supporting both conventional (bshd) and packed sequence (thd) formats. Feedback on the new RoPE implementation highlights several critical issues: a hardcoded 4D tensor slicing in _rotate_half that could cause runtime errors on other dimensionalities, potential AttributeError and ValueError exceptions when context parallel (cp_group) is not enabled or initialized, and performance degradation due to multiple synchronous host-device transfers when accessing cu_seqlens on the GPU.
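The 4D-slicing concern can be illustrated with a minimal sketch. This is not the code from `rope.py`; the function name `rotate_half` and its `interleaved` flag are assumptions made for illustration, and NumPy is used only so the example runs without a GPU. The point is that ellipsis indexing (`x[..., ::2]`) touches only the last dimension, so the same function accepts both a 4D bshd-style array and a 3D thd-style array. PyTorch tensors support the same ellipsis indexing.

```python
import numpy as np


def rotate_half(x, interleaved=False):
    """Rotate the last dimension of x, for any number of leading dimensions.

    Hypothetical helper for illustration; assumes the last dimension is even.
    """
    if not interleaved:
        # Split the last dim into two contiguous halves: [x1 | x2] -> [-x2 | x1]
        half = x.shape[-1] // 2
        x1, x2 = x[..., :half], x[..., half:]
        return np.concatenate((-x2, x1), axis=-1)
    # Interleaved layout: even/odd pairs (a, b) -> (-b, a)
    x1 = x[..., ::2]
    x2 = x[..., 1::2]
    stacked = np.stack((-x2, x1), axis=-1)  # shape (..., d/2, 2)
    return stacked.reshape(x.shape)  # no hardcoded number of dims


# 4D (bshd-like) and 3D (thd-like) views of the same data
bshd = np.arange(2 * 3 * 1 * 4, dtype=np.float32).reshape(2, 3, 1, 4)
thd = bshd.reshape(6, 1, 4)

out_bshd = rotate_half(bshd, interleaved=True)
out_thd = rotate_half(thd, interleaved=True)
```

Because nothing in the function indexes a fixed number of leading axes, the packed and unpacked layouts produce the same values up to a reshape.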

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/mcore_bridge/utils/rope.py Outdated
Comment thread src/mcore_bridge/utils/rope.py Outdated
Comment thread src/mcore_bridge/utils/rope.py Outdated
Comment thread src/mcore_bridge/utils/rope.py Outdated
@Jintao-Huang
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the README documentation to categorize kimi_k25 as a VL model and adjusts the import path of apply_rotary_pos_emb in deepseek_v4.py. It also introduces a patch for rope_utils.apply_rotary_pos_emb in patcher.py to handle mla_rotary_interleaved. Feedback points out that the current patching mechanism can cause a TypeError on older Megatron-Core versions if unsupported keyword arguments are passed, and suggests a robust argument-filtering approach using signature inspection.
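The argument-filtering idea mentioned in the review can be sketched as follows. This is a minimal illustration, not the code in `patcher.py`: the decorator name `filter_supported_kwargs` and the stand-in function `old_apply_rotary_pos_emb` (with its simplified signature) are hypothetical. It uses `inspect.signature` to drop keyword arguments that the wrapped function does not declare, so a caller can always pass newer keywords such as `mla_rotary_interleaved` without triggering a `TypeError` on an older implementation.

```python
import functools
import inspect


def filter_supported_kwargs(func):
    """Wrap func so that keyword arguments it does not declare are dropped."""
    params = inspect.signature(func).parameters
    # If func already takes **kwargs, nothing needs filtering.
    accepts_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    # Names that may legally be passed by keyword.
    keyword_names = {
        name for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not accepts_var_kw:
            kwargs = {k: v for k, v in kwargs.items() if k in keyword_names}
        return func(*args, **kwargs)

    return wrapper


# Hypothetical stand-in for an older function that lacks the newer keyword.
def old_apply_rotary_pos_emb(t, freqs, config=None):
    return ("old", t, freqs)


patched = filter_supported_kwargs(old_apply_rotary_pos_emb)
# The unsupported keyword is silently dropped instead of raising TypeError.
result = patched("t", "freqs", config=None, mla_rotary_interleaved=True)
```

One trade-off of this design is that silently dropping a keyword can hide a real behavioral difference between versions, so a patch built this way may still need to handle the dropped option itself.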

Comment thread src/mcore_bridge/patcher.py
@Jintao-Huang Jintao-Huang changed the title from "update rope mcore>=0.18 (0.15-0.18 compat)" to "update mla rope mcore>=0.18 (0.15-0.18 compat)" on Jun 5, 2026
@Jintao-Huang Jintao-Huang merged commit 1c7b106 into modelscope:main Jun 5, 2026
1 check passed