
@hanq-moreh hanq-moreh commented Nov 10, 2025

Motivation

The goal of this pull request is to enable hot-swapping of the draft model without restarting the serving server.
Currently, updating the speculative draft model requires a full server restart, which interrupts ongoing requests and complicates integration with runtime speculative model training pipelines.

By allowing the draft model to be reloaded dynamically at runtime, we can:

  • Continuously update and redeploy the draft model while the service is running.
  • Avoid service downtime and reduce operational overhead.

Modifications

  • Added an is_draft_model field to UpdateWeightFromDiskReqInput in io_struct.py to distinguish draft model updates.
  • Implemented update_weights_from_disk() in eagle_worker.py.
  • Refactored the lm_head and embedding setup for the draft model into set_embed_and_head() in eagle_worker.py.
  • Updated update_weights_from_disk() in scheduler_update_weights_mixin.py to handle is_draft_model=True for draft model weight updates.
  • Added self.pending_weight_update_queue and maybe_process_pending_weight_update() in scheduler.py to defer weight updates until no running batch is active.
  • Added test code in test/srt/rl/test_update_weights_from_disk.py.
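
To illustrate the first two modifications, here is a minimal sketch of how a request carrying the new is_draft_model field could be routed to either the target or the draft worker. The class and field names follow the PR description; the dispatch helper and worker interface are assumptions for illustration, not the actual SGLang implementation.

```python
from dataclasses import dataclass


@dataclass
class UpdateWeightFromDiskReqInput:
    """Sketch of the request shape described above (io_struct.py)."""
    model_path: str
    is_draft_model: bool = False  # new field: route the update to the draft worker


def dispatch_update(req, target_worker, draft_worker):
    """Hypothetical helper: pick the worker based on is_draft_model."""
    worker = draft_worker if req.is_draft_model else target_worker
    return worker.update_weights_from_disk(req.model_path)
```

With this shape, the scheduler mixin only needs to branch on one flag, keeping the target-model update path unchanged.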
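
The deferral described in the scheduler change can be sketched as follows. Only pending_weight_update_queue and maybe_process_pending_weight_update() come from the PR description; the surrounding Scheduler skeleton and the _apply/applied names are assumptions for illustration.

```python
from collections import deque


class Scheduler:
    """Minimal sketch: defer weight updates until no batch is running."""

    def __init__(self):
        self.running_batch = None  # non-None while a batch is in flight
        self.pending_weight_update_queue = deque()
        self.applied = []  # record of applied updates (for illustration)

    def request_weight_update(self, req):
        # Queue the update; it is applied only once the scheduler is idle.
        self.pending_weight_update_queue.append(req)
        self.maybe_process_pending_weight_update()

    def maybe_process_pending_weight_update(self):
        # Defer while a batch is active to avoid swapping weights mid-decode.
        if self.running_batch is not None:
            return
        while self.pending_weight_update_queue:
            self._apply(self.pending_weight_update_queue.popleft())

    def _apply(self, req):
        # Placeholder for the actual on-disk weight reload.
        self.applied.append(req)
```

Calling maybe_process_pending_weight_update() again after the running batch finishes drains the queue, which is what lets updates land without interrupting ongoing requests.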

Accuracy Tests

Benchmarking and Profiling

Checklist
