Conversation

@booxter
Contributor

@booxter booxter commented Jan 31, 2025

Closes #1463

Signed-off-by: Ihar Hrachyshka [email protected]

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the
    conventional commits.
  • Changelog updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Functional tests have been added, if necessary.
  • E2E Workflow tests have been added, if necessary.

@mergify mergify bot added CI/CD Affects CI/CD configuration container Affects containization aspects testing Relates to testing ci-failure PR has at least one CI failure labels Jan 31, 2025
@booxter booxter force-pushed the py312-burn-baby-burn branch from cd3f4b6 to 456b29b Compare January 31, 2025 23:35
@mergify mergify bot added ci-failure PR has at least one CI failure and removed ci-failure PR has at least one CI failure labels Jan 31, 2025
@booxter
Contributor Author

booxter commented Feb 1, 2025

@mergify rebase

@mergify
Contributor

mergify bot commented Feb 1, 2025

rebase

✅ Branch has been successfully rebased

@booxter booxter force-pushed the py312-burn-baby-burn branch from 456b29b to fcd51fe Compare February 1, 2025 02:48
@mergify mergify bot removed the ci-failure PR has at least one CI failure label Feb 1, 2025
@booxter
Contributor Author

booxter commented Feb 1, 2025

The CUDA job failure here is caused by #3111, which is unrelated to Python 3.12.

@github-actions

github-actions bot commented Feb 1, 2025

E2E (NVIDIA L40S x4) workflow launched on this PR: View run

@mergify mergify bot added the ci-failure PR has at least one CI failure label Feb 1, 2025
@github-actions

github-actions bot commented Feb 1, 2025

e2e workflow succeeded on this PR: View run, congrats!

@booxter
Contributor Author

booxter commented Feb 3, 2025

@RobotSail so the NVIDIA L40S x4 job passed. What else is missing / broken that we could validate?

@fabiendupont
Contributor

@booxter, it's probably worth rebasing to use the prebuilt vLLM wheel and get the failed test to pass.

@RobotSail
Member

@booxter This test didn't exercise the training code path with use_dolomite: true, which is the part that blocked us from migrating last time.

@booxter
Contributor Author

booxter commented Feb 7, 2025

@RobotSail thanks for this. Would the following change be enough to trigger it?

diff --git a/src/instructlab/defaults.py b/src/instructlab/defaults.py
index 67f4695e..7afb9eda 100644
--- a/src/instructlab/defaults.py
+++ b/src/instructlab/defaults.py
@@ -123,7 +123,7 @@ class _InstructlabDefaults:
         "lora_alpha": 32,
         "lora_dropout": 0.1,
         "lora_target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
-        "use_dolomite": False,
+        "use_dolomite": True,
     }
     SUPPORTED_CONTENT_FORMATS = ["json"]
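
A minimal sketch of an alternative to flipping the shared default, assuming the goal is only to exercise the use_dolomite path in one run. The TRAIN_DEFAULTS dict and the with_overrides helper are hypothetical names introduced for illustration; they reproduce only the keys visible in the diff above and are not part of instructlab.

```python
from copy import deepcopy

# Hypothetical stand-in for the training defaults touched by the diff above.
# Only the keys visible in the diff are reproduced; the real dict in
# src/instructlab/defaults.py is not shown in full in this thread.
TRAIN_DEFAULTS = {
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "lora_target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "use_dolomite": False,
}


def with_overrides(defaults, **overrides):
    """Return a copy of `defaults` with `overrides` applied.

    Unknown keys raise KeyError so that a typo cannot silently leave
    the original default in place.
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"unknown training option(s): {unknown}")
    merged = deepcopy(defaults)
    merged.update(overrides)
    return merged


# Flip the flag for one run without editing the shared defaults.
dolomite_run = with_overrides(TRAIN_DEFAULTS, use_dolomite=True)
print(dolomite_run["use_dolomite"])    # True
print(TRAIN_DEFAULTS["use_dolomite"])  # False
```

Copying the dict before updating it keeps the shared defaults untouched, so other jobs in the same process would still see use_dolomite as False.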


@RobotSail
Member

Yeah that should work

@RobotSail
Member

@booxter If you are already running on 4x L40S, make sure to disable LoRA so that the run does a full fine-tune.

@booxter
Contributor Author

booxter commented Feb 7, 2025

@RobotSail probably a stupid question, but :) how do I do this in CI? Isn't it already disabled for the 4x L40S job?

@github-actions

This pull request has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.

@github-actions github-actions bot added the stale label Apr 10, 2025
@mergify
Contributor

mergify bot commented Apr 10, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @booxter please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase This Pull Request needs to be rebased label Apr 10, 2025
@ktdreyer ktdreyer removed the stale label Apr 11, 2025
@booxter
Contributor Author

booxter commented Apr 12, 2025

We can revive this when there's a request and the staffing to make it happen.

@booxter booxter closed this Apr 12, 2025
@booxter booxter reopened this Apr 16, 2025
@booxter
Contributor Author

booxter commented Apr 16, 2025

FYI: the functional failure in the latest run is due to an infra issue, with Quay returning 502. Unrelated.

@github-actions

E2E (NVIDIA L40S x4) workflow launched on this PR: View run

@github-actions

e2e workflow failed on this PR: View run, please investigate.

@github-actions

E2E (NVIDIA L40S x4) workflow launched on this PR: View run

@github-actions

e2e workflow succeeded on this PR: View run, congrats!

@booxter booxter force-pushed the py312-burn-baby-burn branch from 6852b8b to 4523b85 Compare April 28, 2025 20:50
@mergify mergify bot added ci-failure PR has at least one CI failure and removed ci-failure PR has at least one CI failure labels Apr 28, 2025
@booxter booxter force-pushed the py312-burn-baby-burn branch from 4523b85 to 6de0066 Compare April 28, 2025 20:59
@mergify mergify bot removed the ci-failure PR has at least one CI failure label Apr 28, 2025
@mergify
Contributor

mergify bot commented Apr 28, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @booxter please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added needs-rebase This Pull Request needs to be rebased ci-failure PR has at least one CI failure labels Apr 28, 2025
@github-actions

E2E (NVIDIA L40S x4) - Python 3.12 workflow launched on this PR: View run

@github-actions

e2e workflow failed on this PR: View run, please investigate.

@booxter
Contributor Author

booxter commented Apr 28, 2025

Sigh. I hit the bug with symbols because the workflow files from Cortney's branch for #3161 didn't include it. I will have to wait for #3161 to land before retrying.

@booxter booxter force-pushed the py312-burn-baby-burn branch from 6de0066 to 61a4014 Compare April 28, 2025 23:14
@mergify mergify bot added ci-failure PR has at least one CI failure and removed needs-rebase This Pull Request needs to be rebased ci-failure PR has at least one CI failure labels Apr 28, 2025
@github-actions

E2E (NVIDIA L40S x4) - Python 3.12 workflow launched on this PR: View run

@booxter booxter force-pushed the py312-burn-baby-burn branch from 61a4014 to c59d331 Compare April 30, 2025 16:43
@mergify mergify bot removed the ci-failure PR has at least one CI failure label Apr 30, 2025
@github-actions

E2E (NVIDIA L40S x4) - Python 3.12 workflow launched on this PR: View run

@mergify mergify bot added the ci-failure PR has at least one CI failure label Apr 30, 2025
@booxter booxter closed this Jun 18, 2025

Labels

CI/CD Affects CI/CD configuration ci-failure PR has at least one CI failure container Affects containization aspects documentation Improvements or additions to documentation testing Relates to testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support Python 3.12

4 participants