Releases: JuliaGPU/KernelAbstractions.jl
v0.9.38
KernelAbstractions v0.9.38
Feature changes
- Add API support for unified memory allocations
Merged pull requests:
- [0.9] Unified memory allocations (#632) (@christiangnrd)
v0.9.37
KernelAbstractions v0.9.37
Feature changes
- Support @kernel definition inside functions
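
As an illustration of this change, the following is a minimal sketch of a kernel defined inside a function body. The names `scale!` and `scale_kernel!` are hypothetical and chosen only for this example; the sketch assumes KernelAbstractions v0.9.37 or later and runs on the CPU backend because `A` is a plain `Array`.

```julia
using KernelAbstractions

# Hypothetical helper: defines and launches a kernel entirely inside a function.
function scale!(A, factor)
    # Kernel defined in local scope; all inputs are passed as explicit arguments.
    @kernel function scale_kernel!(A, factor)
        I = @index(Global)
        @inbounds A[I] *= factor
    end

    backend = get_backend(A)            # CPU() for a plain Array
    kernel! = scale_kernel!(backend)    # instantiate for this backend
    kernel!(A, factor; ndrange = size(A))
    KernelAbstractions.synchronize(backend)
    return A
end

A = ones(Float32, 16)
scale!(A, 2.0f0)
```

Passing `factor` as an argument rather than capturing it keeps the sketch independent of how captured variables are handled.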
Merged pull requests:
- Use stacked method tables (#615) (@vchuravy)
- avoid boxing when @kernel is used as a closure (#625) (@simeonschaub)
v0.9.36
KernelAbstractions v0.9.36
Feature changes
- get_backend support for StaticArrays
Merged pull requests:
- Use Printf to report errors from POCL (#592) (@vchuravy)
- use unsafe_indices for a few examples (#612) (@vchuravy)
- Switch to SPIRVIntrinsics 0.3 and the new backend (#614) (@vchuravy)
- KA.__synchronize, add GLOBAL_MEM_FENCE semantic (#618) (@vchuravy)
- add get_backend for StaticArrays (#621) (@vchuravy)
Closed issues:
- How to improve CPU performance? (#357)
v0.9.35
KernelAbstractions v0.9.35
Merged pull requests:
- Implement a CPU backend using POCL (#556) (@vchuravy)
- [0.10] Forbid divergent execution of work-group barriers (#558) (@vchuravy)
- Bump julia-actions/setup-julia from 1 to 2 (#561) (@dependabot[bot])
- Switch Format.yml to CUDA.jl style (#568) (@vchuravy)
- Test pocl#main on CI (#569) (@vchuravy)
- CompatHelper: add new compat entry for SPIRVIntrinsics at version 0.2, (keep existing compat) (#571) (@github-actions[bot])
- CompatHelper: add new compat entry for GPUCompiler at version 1, (keep existing compat) (#572) (@github-actions[bot])
- CompatHelper: add new compat entry for LLVM at version 9, (keep existing compat) (#573) (@github-actions[bot])
- Check that malformed allocations throw and don't stackoverflow (#576) (@vchuravy)
- Check that malformed allocations throw and don't stackoverflow (#576) (#577) (@vchuravy)
- Avoid callgraph recursion due to exception branch in get_global_id (#579) (@vchuravy)
- Remove CPU(static=true) test (#580) (@vchuravy)
- Set SPIR-V to 1.2 (#582) (@vchuravy)
- use POCL with fixes (#589) (@vchuravy)
- use barrier with LOCAL_MEM_FENCE (#591) (@vchuravy)
- Test correct backend in examples test (#597) (@christiangnrd)
- Switch to pocl_jll@v7 (#599) (@vchuravy)
- prevent get_backend from overflowing the stack (#602) (@nsajko)
- [NFC] Ignore formatting PRs in blame (#604) (@christiangnrd)
- Enable downstream CI for 0.10 (#608) (@vchuravy)
- Disable Float16 on the CPU backend (#609) (@vchuravy)
Closed issues:
v0.9.34
KernelAbstractions v0.9.34
Merged pull requests:
- Bump googleapis/code-suggester from 2 to 4 (#560) (@dependabot[bot])
- Allow opt-out of implicit bounds-checking (#563) (@vchuravy)
- [0.9] Forbid divergent execution of work-group barriers (#564) (@vchuravy)
- Update Changelog in docs (#565) (@vchuravy)
- Fix docs and test for unsafe_indicies=true (#566) (@vchuravy)
- Fix indicies->indices typo everywhere (#567) (@vchuravy)
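
The opt-out of implicit bounds-checking (#563) can be sketched roughly as follows. With `unsafe_indices=true` the kernel author computes the global index by hand and guards the access explicitly. The kernel name `copy_kernel!`, the array sizes, and the workgroup size of 32 are assumptions made for this example, not taken from the release notes.

```julia
using KernelAbstractions

# With unsafe_indices=true the implicit validity check on the work-item index
# is not inserted, so the index is built manually and checked explicitly.
@kernel unsafe_indices=true function copy_kernel!(dst, @Const(src))
    N = prod(@groupsize())          # number of work-items per workgroup
    gI = @index(Group, Linear)      # linear index of this workgroup
    lI = @index(Local, Linear)      # linear index within the workgroup
    I = (gI - 1) * N + lI           # hand-computed global linear index
    if I <= length(dst)             # explicit bounds check
        @inbounds dst[I] = src[I]
    end
end

src = collect(Float32, 1:100)
dst = zeros(Float32, 100)
backend = get_backend(dst)

# 100 is not a multiple of 32, so the last workgroup is only partially in range.
copy_kernel!(backend, 32)(dst, src; ndrange = length(dst))
KernelAbstractions.synchronize(backend)
```

The explicit `if` is what makes the partially filled last workgroup safe here; without it, out-of-range work-items could index past the end of the arrays.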
v0.9.33
KernelAbstractions v0.9.33
Merged pull requests:
v0.9.32
KernelAbstractions v0.9.32
- Clarify the semantics of KernelAbstractions.copyto! and add KernelAbstractions.pagelock!
- Add support for multiple devices per backend
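
A rough sketch of how the multi-device support might be used is shown below. It assumes the query and selection functions are `KernelAbstractions.ndevices`, `KernelAbstractions.device`, and `KernelAbstractions.device!`, with device ids starting at one; check the package documentation for the exact semantics on a given backend. The CPU backend is used only so the sketch runs without a GPU.

```julia
using KernelAbstractions

backend = CPU()

# Assumed API: number of devices and the currently active device id.
n = KernelAbstractions.ndevices(backend)
current = KernelAbstractions.device(backend)

for id in 1:n
    # Make device `id` active; allocations and kernel launches that follow
    # would target that device.
    KernelAbstractions.device!(backend, id)
end

# Restore the device that was active at the start.
KernelAbstractions.device!(backend, current)
```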
Merged pull requests:
- Run Runic after explicit return rule addition (#516) (@fredrikekre)
- Avoid the exception branch in expand (#518) (@vchuravy)
- Allow for ndims query (#551) (@vchuravy)
- Switch Runic CI (#552) (@vchuravy)
- Update quickstart.md (#553) (@Dale-Black)
- support multiple devices per backend (#554) (@vchuravy)
- Document the semantics of copyto! and add pagelock! (#555) (@vchuravy)
Closed issues:
- Add Feature to Select Devices to Execute Kernels On (#458)
v0.9.31
KernelAbstractions v0.9.31
Merged pull requests: