Releases: AI-Hypercomputer/xpk

v0.14.0

22 Oct 13:12
5b5f972

What's Changed

New Features

Bug fixes

Full Changelog: v0.13.0...v0.14.0

v0.13.0

24 Sep 14:17
97de171

Full Changelog: v0.12.0...v0.13.0

v0.12.0

19 Sep 14:08
c7c45d9

What's Changed

New Features

  • Add TAS support for workloads on DWS clusters by @FIoannides in #615
  • feat: Add link to GCP Workloads list under xpk workload list by @jamOne- in #621
  • feat: Make workloads link navigate to aiml jobsets list by @jamOne- in #624
  • Increase kubectl wait times by @scaliby in #625
  • Add pathways + nap support by @scaliby in #623

Bug fixes

  • Fix error when updating a cluster that already has JobSet by @lukebaumann in #622
  • Don't set --enable-queued-provisioning for flex multislice by @SikaGrr in #631

Full Changelog: v0.11.0...v0.12.0

v0.11.0

01 Sep 07:09
c96ea9d

What's Changed

New Features

Bug fixes

Full Changelog: v0.10.1...v0.11.0

Release v0.10.1

24 Jul 09:16
3288f8b

Bugs:

  • Fixed kueue installation for tpu clusters #546

Full Changelog: v0.10.0...v0.10.1

v0.10.0

18 Jul 13:58

Highlights

  • DWS Flex support for GPUs and TPUs
  • Managed Lustre storage attach support

What's Changed

New Features

Bug fixes

  • Fix issue in control_plane_endpoints_config.dns_endpoint_config.allow… by @SujeethJinesh in #499
  • Fix broken A3 High workloads by @gcie in #494
  • Bring back shared_memory volume for A3 Mega and A3 High by @gcie in #512
  • Provided the required permissions for JAX to list the pods by @sharabiani in #509
  • fix the incorrect number of chips per VM for v5litepod-8 by @gcie in #513
  • Update Kueue and Jobset controller default limit value by @ycchenzheng in #502
  • Fix cluster creation from reservation by @pawloch00 in #522

Full Changelog: v0.9.0...v0.10.0

v0.9.0

13 Jun 09:08
ad39147

Highlights

  • GPUDirect-TCPX support for H100 accelerator (A3-High VMs)
  • A command to adapt a cluster to XPK expected config (xpk cluster adapt)
  • DWS Calendar Mode Reservations

What's Changed

New Features

Bug fixes

  • Merge main to develop by @gcie in #458
  • Update pathways.py with worker component type. by @RoshaniN in #456
  • Fix error when xpk storage attach --type=gcpfilestore without --mount-options by @gcie in #463
  • Update README.md - text edit in Advanced usage section by @kzmyslona in #473
  • Update PathwaysJob Version to v0.1.1 To Fix RM OOM by @SujeethJinesh in #477
  • Placement Policy removed from A3-Mega blueprints with --spot by @sharabiani in #478
  • Enable DNS Access to Prevent Connection Timeout Errors by @SujeethJinesh in #483
  • Fix DWS Calendar Mode Reservations for A3 Mega by @gcie in #484

Full Changelog: v0.8.0...v0.9.0

v0.8.0

14 Apr 17:33
7e24869

Highlights

  • Support for provisioning A4 GKE clusters
  • PathwaysJob integration
  • Storage support (Parallelstore and Hyperdisk)

What's Changed

New Features

  • Add the option to use Multi-tier checkpointing in workloads by @abhinavclemson in #447
  • Integrate PathwaysJob into XPK. by @RoshaniN in #448
  • Implement Parallelstore and Hyperdisk storages attach by @sharabiani in #436
  • A4 support for prod by @gcie in #412
  • Add --mount-options parameter to xpk storage attach/create by @gcie in #450

Bug fixes

  • Update JOBSET_VERSION from 0.7.2 to 0.8.0 by @SujeethJinesh in #425
  • fix yaml alignment when remote-python-sidecar-image is passed by @sadikneipp in #426
  • Bring back manual manifest specification for attaching storage by @gcie in #427
  • Fix XPK version in Pypi release by @sharabiani in #428
  • Remove sudo requirement from make by @sharabiani in #435
  • fix: workloads not scheduling on A3 Ultra clusters by @gcie in #441
  • Disable creating additional networks for L4 and A2 clusters by @gcie in #444
  • Fix xpk workload create for L4 and A100 by @gcie in #452

Full Changelog: v0.7.2...v0.8.0

v0.7.2

27 Mar 20:05
6ba0019

What's Changed

Bug fixes

Full Changelog: v0.7.1...v0.7.2

v0.7.1

25 Mar 14:08
745df78

What's Changed

Bug fixes

  • fix yaml alignment when remote-python-sidecar-image is passed by @sadikneipp in #426
  • Bring back manual manifest specification for attaching storage by @gcie in #427
  • Fix XPK version in Pypi release by @sharabiani in #428