@janpfeifer
Contributor

  • Added install_linux_amd64_amazonlinux.sh and pre-built libraries for amazonlinux (built using old glibc support).
  • Fixed installation scripts: s/sudo/$_SUDO. Also made them more verbose.
  • Removed dependency on xargs in installation script for Linux.
  • Improved documentation on Nvidia GPU card detection, and error message if not found.
  • Updated GitHub action (go.yaml) to update README.md with the result of the change only when pushing to the
    main branch.
  • Added pjrt.arena to avoid costly allocations for CGO calls, and merged some CGO calls for general speed-ups.
    The following functions improved by more than 50% in their fixed-cost execution time (measured on transfers
    with 1 value and on minimal programs; the variable part of the cost is not included):
    • Buffer.ToHost()
    • Client.BufferFromHost()
    • LoadedExecutable.Execute()
  • Added BufferToHost and BufferFromHost benchmarks.
  • Added support for environment variable XLA_DEBUG_OPTIONS: if set, it is parsed as a DebugOptions proto that
    is passed to the JIT-compilation of a computation graph.
  • LoadedExecutable.Execute() now waits for the end of the execution (by setting
    PJRT_LoadedExecutable_Execute_Args.device_complete_events).
    The previous behavior was undefined (not documented) and led to odd results.
  • Package dtypes:
    • Added tests;
    • Added SizeForDimensions() to be used for dtypes that use fractions of a byte (like 4 bits).
  • Added Client.NewSharedBuffer (and the lower level client.CreateViewOfDeviceBuffer()) to create buffers that share
    memory with the host, for faster input.
    • Added AlignedAlloc and AlignedFree required by client.CreateViewOfDeviceBuffer.
  • Added Buffer.Data for direct access to a buffer's data. Undocumented in PJRT, and likely only works on CPU.
  • Fixed coverage script.
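
To illustrate why dtypes that use fractions of a byte need a dedicated size calculation, here is a minimal sketch. The helper name `sizeForDimensions`, its signature, and the tight bit-packing rule are assumptions made for illustration; they are not taken from the dtypes package and may differ from what `SizeForDimensions()` actually does.

```go
package main

import "fmt"

// sizeForDimensions returns the number of bytes needed to store a tensor with
// the given dimensions when each element occupies bitsPerElement bits.
// Hypothetical helper: the name, signature, and the assumption that elements
// are tightly packed are for illustration only.
func sizeForDimensions(bitsPerElement int, dims ...int) int {
	numElements := 1 // a scalar (no dimensions) holds exactly one element
	for _, d := range dims {
		numElements *= d
	}
	totalBits := numElements * bitsPerElement
	return (totalBits + 7) / 8 // round up to a whole number of bytes
}

func main() {
	fmt.Println(sizeForDimensions(4, 3))     // 3 values * 4 bits = 12 bits, rounded up to 2 bytes
	fmt.Println(sizeForDimensions(4, 2, 5))  // 10 values * 4 bits = 40 bits = 5 bytes
	fmt.Println(sizeForDimensions(32, 2, 3)) // 6 values * 32 bits = 24 bytes
}
```

Multiplying the element count by a per-element byte size would round 4-bit types down to 0 bytes or up to 1 byte per element, which is why the computation is done in bits first.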

Added BufferToHost benchmarks.
Consolidated the CGO calls in Buffer.ToHost into fewer calls, for significant gains on small tensors.
Added benchmarks and their results.
Added arena benchmarks.
Updated Add1 benchmarks.
Updated benchmark and CHANGELOG.
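
The arena idea referenced in the description and in these commits can be sketched as a bump allocator: reserve one block up front, hand out slices of it per call, and reset it afterwards instead of freeing each piece. This is a pure-Go illustration under stated assumptions; the type `arena` and its methods `alloc` and `reset` are hypothetical names, and the actual implementation in the PR is not shown here and may manage memory differently (for example, memory suitable for passing to C).

```go
package main

import "fmt"

// arenaAlign is the alignment (in bytes) applied to offsets inside the buffer.
const arenaAlign = 8

// arena is a hypothetical bump allocator: one buffer is allocated up front and
// sub-slices are handed out sequentially, so each request costs only an offset
// update rather than a separate allocation.
type arena struct {
	buf    []byte
	offset int // index of the first unused byte in buf
}

func newArena(capacity int) *arena {
	return &arena{buf: make([]byte, capacity)}
}

// alloc returns n bytes from the arena, starting at an offset rounded up to a
// multiple of arenaAlign. It returns nil if the arena has no room left.
func (a *arena) alloc(n int) []byte {
	start := (a.offset + arenaAlign - 1) &^ (arenaAlign - 1)
	if n < 0 || start+n > len(a.buf) {
		return nil
	}
	a.offset = start + n
	// Cap the slice so appends cannot spill into the next allocation.
	return a.buf[start : start+n : start+n]
}

// reset makes the whole buffer available again; slices handed out earlier
// must no longer be used.
func (a *arena) reset() { a.offset = 0 }

func main() {
	a := newArena(64)
	x := a.alloc(3)  // occupies bytes [0, 3)
	y := a.alloc(16) // starts at offset 8 after alignment, occupies [8, 24)
	fmt.Println(len(x), len(y), a.offset)
	fmt.Println(a.alloc(100) == nil) // request larger than the remaining space
	a.reset()
	fmt.Println(a.offset)
}
```

The trade-off is that individual allocations cannot be released; the arena is reset as a whole, which suits short-lived argument structures built for a single call.
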
@janpfeifer janpfeifer merged commit 3e4e41d into main Dec 19, 2024
1 check passed
@janpfeifer janpfeifer deleted the io branch August 25, 2025 05:27