feat(deps): Bump torch upper bound (<2.7.0) + set allowable range for vllm versions (>=0.8.0,<0.9.0)
#3283
Conversation
E2E (NVIDIA L40S x4) workflow launched on this PR: View run

@RobotSail when the large E2E job passes, will we be okay to merge this change and release it? Or, would you like any additional testing performed to ensure that training works as intended?

e2e workflow succeeded on this PR: View run, congrats!

We need to validate this change in an E2E test with

Please see: #3288

Should we merge #3288 instead of this?

E2E (NVIDIA L40S x4) LLAMA workflow launched on this PR: View run

e2e workflow failed on this PR: View run, please investigate.
E2E (NVIDIA L40S x4) LLAMA workflow launched on this PR: View run

e2e workflow failed on this PR: View run, please investigate.
requirements-vllm-cuda.txt (Outdated)

-# vLLM only supports Linux platform (including WSL)
-vllm==0.7.3 ; sys_platform == 'linux' and platform_machine == 'x86_64'
+# vLLM only supports Linux platform (including WSL). Do not cap this dependency here. Cap in constraints-dev.txt
+vllm>=0.8.0 ; sys_platform == 'linux' and platform_machine == 'x86_64'
We're building for aarch64 now. Why is this limited to x86_64?
Good catch. Let me update this to remove the x86_64 limitation.
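As a sketch, the fix would presumably drop the `platform_machine` marker so the dependency also resolves on aarch64 (a hypothetical line; the actual follow-up commit isn't shown here):

```text
# vLLM only supports Linux platform (including WSL). Do not cap this dependency here. Cap in constraints-dev.txt
# (hypothetical) Linux-only marker, with the x86_64 restriction removed:
vllm>=0.8.0 ; sys_platform == 'linux'
```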
Add `constraints.txt` to restrict the CI to using torch==2.6.0 and vllm<0.9.0. Also bump the minimum `torch` version because vLLM 0.8.z is only compatible with `torch==2.6.0` (so far). Also update the large Llama job to use fallback logic to try other availability zones. Signed-off-by: Courtney Pacheco <[email protected]>
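For context, a constraints file of roughly this shape would achieve what the commit message describes (a minimal sketch; the real `constraints-dev.txt` contents aren't reproduced on this page):

```text
# constraints-dev.txt (sketch) -- CI-only pins, applied via pip's -c flag
torch==2.6.0
vllm>=0.8.0,<0.9.0
```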
This PR did not auto-merge because the large Llama E2E job failed (it has a bug in it), so Mergify detected a CI failure and will not auto-merge this PR as is. Since the required CI checks all pass, I'm going to merge this PR manually. The Llama large E2E job will be fixed in a future PR.
Checklist:
- Commit messages follow conventional commits.
Changes
- Replace `vllm==0.7.3` with `vllm>=0.8.0`, and utilize a new `constraints-dev.txt` file to restrict the upper bound on `vllm` in the CI only.
- Bump `torch>=2.3.0,<2.6.0` to `torch>=2.6.0,<2.7.0`, and set `torch==2.6.0` in the new `constraints-dev.txt` file for the CI only.

Note: `vllm>=0.8.0,<0.9.0` is only compatible with `torch>=2.6.0` right now, hence we have bumped the `torch` lower bound.
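As a usage sketch, CI would then install against the constraints file so the caps apply there without restricting end users (hypothetical invocation; the actual CI wiring isn't shown here):

```shell
# Install runtime requirements, letting constraints-dev.txt cap versions in CI
pip install -r requirements-vllm-cuda.txt -c constraints-dev.txt

# Sanity-check the resolved versions
python -c "import torch, vllm; print(torch.__version__, vllm.__version__)"
```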