Merged
38 commits
a47bc17
Cache buffer dimensions;
janpfeifer Dec 10, 2024
9333d9a
Updated CHANGELOG.
janpfeifer Dec 10, 2024
392b736
Merge CGO calls for `Client.BufferFromHost()`
janpfeifer Dec 11, 2024
3670131
Destroy events after waiting for them.
janpfeifer Dec 11, 2024
8ce7b73
Added malloc benchmark. Replaced C.malloc by C.calloc.
janpfeifer Dec 11, 2024
1e47e0f
Added a simple arena object to PJRT to speedup small C/C++ allocations.
janpfeifer Dec 11, 2024
d47fa8d
Fixed cases with 0 input/output buffers.
janpfeifer Dec 11, 2024
79c38f9
Updated XLA/PJRT dependency.
janpfeifer Dec 11, 2024
4301519
Removed spurious debug message.
janpfeifer Dec 12, 2024
41c3116
Added a second computation to benchmark: f(x) = (x+1)/2
janpfeifer Dec 12, 2024
02bce69
Improved more buffer methods by using arena.
janpfeifer Dec 12, 2024
cc9f5b3
Added median/5%-tile based benchmarking.
janpfeifer Dec 12, 2024
f257082
Converted benchmarks to use mean/5%-tile.
janpfeifer Dec 13, 2024
38865fd
Updated benchmarks for Arena tests.
janpfeifer Dec 13, 2024
0d46bdb
Added support for ENV variable XLA_DEBUG_OPTIONS.
janpfeifer Dec 15, 2024
881ee45
Added documentation on environment variables used.
janpfeifer Dec 15, 2024
df59ffb
Changed benchmarks to use github.com/janpfeifer/go-benchmarks.
janpfeifer Dec 15, 2024
bbba9b9
Added --bench_duration flag.
janpfeifer Dec 16, 2024
83bfa5e
Expose count of live Buffer and LoadedExecutable objects: LoadedExecu…
janpfeifer Dec 16, 2024
99c2ec2
Updated pjrt_c_api.h file.
janpfeifer Dec 17, 2024
f9d9621
Moved benchmarks results to spreadsheet.
janpfeifer Dec 17, 2024
4694ba7
Added C.ExecuteAndWait.
janpfeifer Dec 17, 2024
941260f
go mod tidy.
janpfeifer Dec 17, 2024
1dfbf6d
go mod tidy
janpfeifer Dec 17, 2024
f78f088
Disable running benchmark tests if -test.short is set.
janpfeifer Dec 17, 2024
9e99ee0
Make execution wait for it to finish.
janpfeifer Dec 17, 2024
15606fd
BufferFromHost -> use arena (and save some 50ns)
janpfeifer Dec 17, 2024
3ffd966
Updated CHANGELOG.
janpfeifer Dec 17, 2024
762f472
dtypes: added tests and SizeForDimensions.
janpfeifer Dec 17, 2024
1738795
Cache the dtype of a buffer.
janpfeifer Dec 17, 2024
7de3b6d
Fixed missing nil error.
janpfeifer Dec 17, 2024
029a8ae
Initial version of Client.CreateViewOfDeviceBuffer
janpfeifer Dec 17, 2024
7487e8d
Added AlignedAlloc and AlignedFree.
janpfeifer Dec 17, 2024
1bde60d
Finished implementing, documenting and testing client.CreateViewOfDev…
janpfeifer Dec 17, 2024
c85659d
- Added NewSharedBuffer().
janpfeifer Dec 18, 2024
1ea4426
Added Buffer.Data.
janpfeifer Dec 19, 2024
f10f5d9
Fixed coverage script.
janpfeifer Dec 19, 2024
4ef585a
Bumped version number in CHANGELOG.
janpfeifer Dec 19, 2024
2 changes: 1 addition & 1 deletion .github/workflows/go.yaml
@@ -42,7 +42,7 @@ jobs:
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: "1.22.x"
go-version: "1.23.x"

- name: Install Gopjrt C library gomlx_xlabuilder and PJRT plugin
shell: bash
14 changes: 13 additions & 1 deletion README.md
@@ -329,7 +329,7 @@ Also, see [this blog post](https://opensource.googleblog.com/2024/03/pjrt-plugin
Because of https://github.com/golang/go/issues/13467 : C API's cannot be exported across packages, even within the same repo.
Even a function as simple as `func Add(a, b C.int) C.int` in one package cannot be called from another.
So we need to wrap everything, and more than that, one cannot create separate sub-packages to handle separate concerns.
THis is also the reason the library `chelper.go` is copied in both `pjrt` and `xlabuilder` packages.
This is also the reason the library `chelper.go` is copied in both `pjrt` and `xlabuilder` packages.
* **Why does PJRT spit out so much logging? Can we disable it?**
This is a great question ... imagine if every library we use decided they also want to clutter our stderr?
I have [an open question in Abseil about it](https://github.com/abseil/abseil-cpp/discussions/1700).
@@ -340,6 +340,18 @@ Also, see [this blog post](https://opensource.googleblog.com/2024/03/pjrt-plugin
before calling `pjrt.GetPlugin`. But it may have unintended consequences, if some other library is depending
on the fd 2 to work, or if a real exceptional situation needs to be reported and is not.

## Environment Variables

Environment variables that help control or debug how **gopjrt** works:

* `PJRT_PLUGIN_LIBRARY_PATH`: Path to search for PJRT plugins. **gopjrt** also searches in `/usr/local/lib/gomlx/pjrt`,
the standard library paths for the system and `$LD_LIBRARY_PATH`.
* `XLA_DEBUG_OPTIONS`: If set, it is parsed as a `DebugOptions` proto that
is passed during the JIT-compilation (`Client.Compile()`) of a computation graph.
How PJRT uses it is not documented (e.g. I observed a large slowdown when it is set,
even to the default values), but [the proto has some documentation](https://github.com/gomlx/gopjrt/blob/main/protos/xla.proto#L40).
* `GOPJRT_INSTALL_DIR` and `GOPJRT_NOSUDO`: used by the install scripts, see the "Installing" section above.
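
As a rough illustration of the plugin search described in the first bullet, the sketch below assembles a list of candidate directories. The helper `pluginSearchPaths` is hypothetical and not part of **gopjrt**. The ordering, and the treatment of `PJRT_PLUGIN_LIBRARY_PATH` as a path list split on the OS list separator, are assumptions. The system's standard library paths are left out because they are platform dependent.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pluginSearchPaths is a hypothetical helper (not gopjrt's API). It collects
// the directories named in the README, in an assumed order of precedence.
// getenv is injected so the function can be exercised without touching the
// real process environment.
func pluginSearchPaths(getenv func(string) string) []string {
	var paths []string
	// Assumption: PJRT_PLUGIN_LIBRARY_PATH may hold several directories,
	// separated by the OS path-list separator (':' on Unix).
	paths = append(paths, filepath.SplitList(getenv("PJRT_PLUGIN_LIBRARY_PATH"))...)
	// Default install location mentioned in the README.
	paths = append(paths, "/usr/local/lib/gomlx/pjrt")
	// Directories listed in LD_LIBRARY_PATH, if any.
	paths = append(paths, filepath.SplitList(getenv("LD_LIBRARY_PATH"))...)
	return paths
}

func main() {
	for _, dir := range pluginSearchPaths(os.Getenv) {
		fmt.Println(dir)
	}
}
```

Passing `os.Getenv` uses the real environment; a map-backed function can be substituted to check the ordering in isolation.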

## Links to documentation

* [Google Drive Directory with Design Docs](https://drive.google.com/drive/folders/18M944-QQPk1E34qRyIjkqDRDnpMa3miN): Some links are outdated or redirected, but very valuable information.
5 changes: 3 additions & 2 deletions c/WORKSPACE
@@ -21,11 +21,12 @@ http_archive(
# Notice bazel.sh scrapes the line below for the OpenXLA version, the format
# of the line should remain the same (the hash in between quotes), or bazel.sh
# must be changed accordingly.
OPENXLA_XLA_COMMIT_HASH = "90af2896ab4992ff14a1cd2a75ce02e43f46c090" # From 2024-11-24
# OPENXLA_XLA_COMMIT_HASH = "90af2896ab4992ff14a1cd2a75ce02e43f46c090" # From 2024-11-24
OPENXLA_XLA_COMMIT_HASH = "e2e8952ad0fac8833e9a78f9b3689e803ff8524f" # From 2024-12-11

http_archive(
name = "xla",
sha256 = "a910124d546bc79edb685612edaa3d56153f0e0927f967e8defaf312b833d404", # From 2024-11-24
sha256 = "5ec6919a25952fa790904983481ccb51ebbe20bbc53e15ddbb6d3e0b3aa3dfe1", # From 2024-12-11
strip_prefix = "xla-" + OPENXLA_XLA_COMMIT_HASH,
urls = [
"https://github.com/openxla/xla/archive/{hash}.zip".format(hash = OPENXLA_XLA_COMMIT_HASH),
8 changes: 3 additions & 5 deletions chelper.go
@@ -31,17 +31,15 @@ func cSizeOf[T any]() C.size_t {
// It must be manually freed with cFree() by the user.
func cMalloc[T any]() (ptr *T) {
size := cSizeOf[T]()
cPtr := (*T)(C.malloc(size))
C.memset(unsafe.Pointer(cPtr), 0, size)
cPtr := (*T)(C.calloc(1, size))
return cPtr
}

// cMallocArray allocates space to hold n copies of T in the C heap and initializes it to zero.
// It must be manually freed with C.free() by the user.
func cMallocArray[T any](n int) (ptr *T) {
size := cSizeOf[T]() * C.size_t(n)
cPtr := (*T)(C.malloc(size))
C.memset(unsafe.Pointer(cPtr), 0, size)
size := cSizeOf[T]()
cPtr := (*T)(C.calloc(C.size_t(n), size))
return cPtr
}

4 changes: 2 additions & 2 deletions cmd/run_coverage.sh
@@ -2,6 +2,6 @@

# Run this from the root of gopjrt repository to generate docs/coverage.out with the coverage data.

PACKAGE_COVERAGE="./pjrt ./xlabuilder"
go test -v -cover -coverprofile docs/coverage.out -coverpkg ${PACKAGE_COVERAGE}
PACKAGE_COVERAGE="github.com/gomlx/gopjrt/pjrt,github.com/gomlx/gopjrt/xlabuilder"
go test -cover -coverprofile docs/coverage.out -coverpkg="${PACKAGE_COVERAGE}" ./... -test.count=1 -test.short
go tool cover -func docs/coverage.out -o docs/coverage.out
22 changes: 21 additions & 1 deletion docs/CHANGELOG.md
@@ -1,11 +1,31 @@
# Next
# v0.5.0 - 2024/12/19 - Adding direct access to PJRT buffers for CPU.

* Added `install_linux_amd64_amazonlinux.sh` and pre-built libraries for amazonlinux (built using old glibc support).
* Fixed installation scripts: s/sudo/$_SUDO. Also made them more verbose.
* Removed dependency on `xargs` in installation script for Linux.
* Improved documentation on Nvidia GPU card detection, and error message if not found.
* Updated GitHub action (`go.yaml`) to only change the README.md with the result of the change, if pushing to the
`main` branch.
* Added `pjrt.arena` to avoid costly allocations for CGO calls, and merged some of the CGO calls for general speed-ups.
The following functions improved by more than 50% in their fixed-cost execution time
(measured on transfers with 1 value and minimal programs; **not the variable part**):
* `Buffer.ToHost()`
* `Client.BufferFromHost()`
* `LoadedExecutable.Execute()`
* Added `BufferToHost` and `BufferFromHost` benchmarks.
* Added support for environment variable `XLA_DEBUG_OPTIONS`: if set, it is parsed as a `DebugOptions` proto that
is passed to the JIT-compilation of a computation graph.
* `LoadedExecutable.Execute()` now waits for the end of the execution (by setting
`PJRT_LoadedExecutable_Execute_Args.device_complete_events`).
The previous behavior led to odd results and was undefined (not documented).
* Package `dtypes`:
* Added tests;
* Added `SizeForDimensions()` to be used for dtypes that use fractions of a byte (like 4 bits).
* Added `Client.NewSharedBuffer` (and the lower level `client.CreateViewOfDeviceBuffer()`) to create buffers that share
memory with the host, for faster input.
* Added `AlignedAlloc` and `AlignedFree`, required by `client.CreateViewOfDeviceBuffer`.
* Added `Buffer.Data` for direct access to a buffer's data. Undocumented in PJRT, and likely only works on CPU.
* Fixed coverage script.
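
The `SizeForDimensions()` entry above concerns dtypes whose elements take less than a byte, where multiplying the element count by a whole-byte size no longer works. A minimal sketch of that arithmetic follows. The function `sizeForDimensions` and its signature are hypothetical stand-ins; the real method in package `dtypes` may differ. The sketch also assumes elements are packed contiguously with no per-row padding.

```go
package main

import "fmt"

// sizeForDimensions is a hypothetical stand-in for the SizeForDimensions()
// named in the changelog; the real signature in package dtypes may differ.
// It returns the number of bytes needed to store a tensor with the given
// dimensions when each element occupies bitsPerElement bits, assuming the
// elements are packed contiguously and the total is rounded up to whole bytes.
func sizeForDimensions(bitsPerElement int, dimensions ...int) int {
	numElements := 1 // a scalar (no dimensions) holds one element
	for _, dim := range dimensions {
		numElements *= dim
	}
	totalBits := numElements * bitsPerElement
	return (totalBits + 7) / 8 // integer ceiling of totalBits / 8
}

func main() {
	// 3x3 tensor of 4-bit values: 9 * 4 = 36 bits, rounded up to 5 bytes.
	fmt.Println(sizeForDimensions(4, 3, 3))
	// 2x3 tensor of 32-bit values: 6 * 32 = 192 bits = 24 bytes.
	fmt.Println(sizeForDimensions(32, 2, 3))
}
```

A per-element byte size would report 0 or 9 bytes for the 4-bit case depending on rounding, which is why the computation has to happen on the total bit count.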

# v0.4.9 - 2024-11-25
