Tags: miteshs/iree
Integrate LLVM to llvm/llvm-project@28d6673e21f7 (iree-org#24530) Signed-off-by: Stefan Schuermans <[email protected]>
[Codegen][Tuner] Add TileAndFuse constraints for conv (iree-org#24526) The TileAndFuse (TF) constraints for conv emit one smt.constraints op for each IGEMM conv and Direct conv. Each separate constraint set was tested to yield the same number of SMT solutions as the old tuner constraints. See the SMT-LIB string [comparison](nod-ai/amd-shark-ai@5d51750). Issue: iree-org#23535
[Codegen] Add VectorDistribute constraints for attention (iree-org#24528) This PR ports the constraint generation phase for the attention ops from the Python tuner to the compiler-side SMT-emission. Assisted-by: [Claude Code](https://claude.ai/code) --------- Signed-off-by: Bangtian Liu <[email protected]> Co-authored-by: Jakub Kuderski <[email protected]>
[ROCM][Codegen] Add experimental amdgcn SPIR-V path (iree-org#24499)

Adds an experimental ROCm amdgcn SPIR-V path that emits HIP-loadable SPIR-V instead of native HSACO, enabling the HIP runtime to JIT compile device code for the target GPU. The main changes are:

- `--iree-rocm-use-spirv` flag to select `rocm-spirv-fb` for HIP targets.
- ROCDL prepare for SPIR-V: Adjust address spaces, calling conventions, remove AMDGPU-specific function attributes.
- ROCMTarget: Serialize LLVM SPIR-V output into a HIP-loadable offload bundle.
- Focused lit coverage for lowering, serialization, command-line handling.

---------

Signed-off-by: Austin Lu <[email protected]> Co-authored-by: Jakub Kuderski <[email protected]>
[HAL] Refactor HAL executable exports to function IDs (iree-org#24507)

Refactor the durable HAL runtime API from export ordinals to executable-local function ids while leaving compiler dialect churn minimal. Function ids are carried as 64-bit C and VM tokens so future backends can use executable-local encodings; current dense-table implementations validate and decode them as indexes.

Public rename map:

- iree_hal_executable_export_count -> iree_hal_executable_function_count
- iree_hal_executable_export_info -> iree_hal_executable_function_info
- iree_hal_executable_export_parameters -> iree_hal_executable_function_parameters
- iree_hal_executable_lookup_export_by_name -> iree_hal_executable_lookup_function_by_name
- iree_hal_executable_export_info_t -> iree_hal_executable_function_info_t
- iree_hal_executable_export_parameter_t -> iree_hal_executable_function_parameter_t
- iree_hal_executable_export_flags_t -> iree_hal_executable_function_flags_t
- IREE_HAL_EXECUTABLE_EXPORT_FLAG_* -> IREE_HAL_EXECUTABLE_FUNCTION_FLAG_*
- IREE_HAL_EXECUTABLE_EXPORT_PARAMETER_* -> IREE_HAL_EXECUTABLE_FUNCTION_PARAMETER_*
- iree_hal_executable_vtable_t export_count/export_info/export_parameters -> function_count/function_info/function_parameters
- iree_hal_executable_vtable_t lookup_export_by_name -> lookup_function_by_name
- HAL dispatch export_ordinal/entry_point arguments -> function/function_id

Compiler HAL ops and attributes intentionally keep their existing export names in this change. Runtime VM imports now use function_id, and executable.lookup.function is exposed from the HAL and hal_loader modules so callers can resolve stable function names to runtime ids before dispatch.
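The "resolve a name to a function id, then validate and decode the id as a dense-table index" flow described in the HAL commit above can be illustrated with a small sketch. This is not IREE code: `DenseExecutable`, `FunctionInfo`, and the method names are hypothetical Python stand-ins that only mirror the shape of the renamed C API, and the assumption that an id equals its table index reflects only the "current dense-table implementations" wording.

```python
from dataclasses import dataclass

UINT64_MAX = (1 << 64) - 1  # function ids are carried as 64-bit tokens


@dataclass(frozen=True)
class FunctionInfo:
    """Hypothetical stand-in for per-function metadata."""
    name: str


class DenseExecutable:
    """Hypothetical executable whose function ids are plain table indexes."""

    def __init__(self, names):
        self._functions = [FunctionInfo(n) for n in names]
        self._by_name = {n: i for i, n in enumerate(names)}

    def function_count(self):
        return len(self._functions)

    def lookup_function_by_name(self, name):
        # Callers treat the result as an opaque executable-local id.
        if name not in self._by_name:
            raise KeyError(name)
        return self._by_name[name]

    def function_info(self, function_id):
        return self._functions[self._decode(function_id)]

    def _decode(self, function_id):
        # Validate the token first, then decode it as a dense index.
        if not isinstance(function_id, int) or not 0 <= function_id <= UINT64_MAX:
            raise ValueError("function id is not a 64-bit unsigned value")
        if function_id >= len(self._functions):
            raise IndexError("function id out of range for this executable")
        return function_id


exe = DenseExecutable(["main", "fill"])
fid = exe.lookup_function_by_name("fill")
print(exe.function_info(fid).name)
```

Keeping the id opaque to callers is what would let a different backend swap in a non-index encoding without changing call sites; only `_decode` would need to change in this sketch.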
[WebGPU] Add WebGPU HAL driver, WGSL compiler target, CTS, and sample. (iree-org#24463)

This adds the first end-to-end WebGPU target path for IREE: a compiler backend that emits WGSL executables and a JavaScript-hosted HAL driver that can submit those executables through the browser/Node WebGPU API from a freestanding wasm32 runtime. Most gaps now exist in infrastructure and hosting applications, with the HAL being largely complete.

The important product boundary is that this is a WebGPU driver for the Web platform, not an Emscripten port and not a native Dawn HAL. The C runtime owns IREE's HAL object model, synchronization contracts, command recording, executable metadata, and queue ordering. JavaScript owns the ambient WebGPU objects, Promise completion delivery, and the import module that maps integer wasm handles to real GPUAdapter/GPUDevice/GPUBuffer/GPUQueue objects.

That split keeps the ABI narrow. All values crossing the wasm boundary are integers or pointers into wasm linear memory. WebGPU objects are represented as uint32 handles in a JS-side table, handle 0 is null, and async WebGPU APIs complete through the JS proactor token ring introduced by the wasm runtime commit. The C side never gets a raw JS object and the JS side does not need to understand HAL resources beyond the declared import ABI.

The driver uses an instruction-stream bridge instead of one wasm import per HAL command. HAL command buffers and one-shot queue operations compile into compact uint32 instruction blocks. JavaScript walks those blocks in one bridge call, resolves dynamic bindings from a binding table, reuses static bindings for cached recordings, batches encoder commands, and submits pending GPUCommandBuffers at explicit queue-surface boundaries. This makes the wasm/JS boundary a command-stream boundary instead of a per-command overhead cliff.

The runtime queue contract follows WebGPU's actual execution model. CPU-only operations can signal after their wait completes.
GPU-submit operations wait, encode/submit work, register queue.onSubmittedWorkDone(), and signal HAL semaphores only when WebGPU reports that submitted work is complete. Queue epochs and async frontiers preserve causal ordering for downstream waits, while submitted-provenance tracking keeps FIFO waits from adding unnecessary host-side round trips.

WebGPU does not provide every primitive that IREE's HAL exposes directly. The driver internalizes those gaps instead of pushing them onto callers: fill uses a builtin WGSL compute shader, unaligned copy/update paths fall back to a copy shader, executable loading creates compute pipelines and bind group layouts from WGSL, and command execution presents the usual HAL fill/copy/update/dispatch surface even though WebGPU splits those operations across queue, encoder, and compute-pass APIs.

The compiler side lowers through the existing SPIR-V path and translates SPIR-V to WGSL with Tint/Dawn. The serialized executable format is `webgpu-wgsl-fb`: a FlatBuffer containing WGSL shader modules plus per-export metadata such as entry point names, workgroup sizes, binding flags, constant counts, source/debug data, and the information the runtime needs to create pipelines and bind groups. The target is registered as the `webgpu` device and `webgpu-spirv` executable backend.

The initial runtime support contract is intentionally narrow. WebGPU exposes one queue per device, so the driver currently routes through a single queue while keeping queue state isolated enough for future queue[N] shaping. The JavaScript inline host can validate WGSL and run CTS-style entry points, but blocking C code cannot make JavaScript Promises settle while the same wasm thread is waiting. CTS expected failures document those blocking-completion cases instead of pretending they are implemented.
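The handle scheme described earlier in this commit message (WebGPU objects held in a host-side table, referenced across the wasm boundary by uint32 handles, with handle 0 meaning null) can be sketched as follows. This is an illustration in Python rather than the driver's JavaScript: the `HandleTable` class, its method names, and the free-list slot recycling are all assumptions, not details taken from the commit.

```python
class HandleTable:
    """Hypothetical host-side table mapping uint32 handles to objects."""

    NULL_HANDLE = 0
    MAX_HANDLE = 0xFFFFFFFF

    def __init__(self):
        self._slots = [None]  # slot 0 is reserved so that handle 0 is null
        self._free = []       # released slots available for reuse (assumption)

    def insert(self, obj):
        if obj is None:
            raise ValueError("cannot store a null object")
        if self._free:
            handle = self._free.pop()
            self._slots[handle] = obj
            return handle
        handle = len(self._slots)
        if handle > self.MAX_HANDLE:
            raise OverflowError("handle space exhausted")
        self._slots.append(obj)
        return handle

    def get(self, handle):
        if handle == self.NULL_HANDLE:
            return None
        if not 0 < handle < len(self._slots) or self._slots[handle] is None:
            raise KeyError(f"stale or invalid handle {handle}")
        return self._slots[handle]

    def release(self, handle):
        if handle == self.NULL_HANDLE:
            return  # releasing null is a no-op
        self.get(handle)  # raises on stale or invalid handles
        self._slots[handle] = None
        self._free.append(handle)


table = HandleTable()
device = table.insert({"kind": "GPUDevice"})
print(device, table.get(device)["kind"])
```

Because only integers cross the boundary, the non-host side never needs to hold a reference to a host object; it passes the handle back and the host resolves it.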
This commit includes:

* A `webgpu` HAL driver with driver/device/allocator/buffer/semaphore/executable objects, executable cache, FD-backed file helpers, registration module, and public driver creation API.
* A C import ABI and JavaScript companion module for WebGPU object handles, adapter/device requests, buffer mapping, command encoding, pipeline creation, bind group creation, command-stream execution, cached recordings, and queue.onSubmittedWorkDone() completion delivery.
* A compact WebGPU command ISA and builder that records HAL commands into block-backed uint32 streams with dynamic/static binding slots and automatic encoder begin/end insertion.
* Builtin WGSL fill/copy shaders used to provide HAL semantics where WebGPU has no native command or requires stricter alignment than HAL callers expose.
* A `webgpu-spirv` compiler plugin that reuses SPIR-V codegen, prepares SPIR-V for WebGPU constraints, translates with Tint/Dawn, and packages WGSL plus executable metadata into `webgpu-wgsl-fb`.
* HAL CTS wiring for the wasm32-wasi WebGPU path, including the Node `webgpu` package loader, WASI preopen/output setup, and expected failures for blocking-completion cases that the inline host cannot yet satisfy.
* A WebGPU hello-world sample that builds a VMFB, dumps generated WGSL, and validates that WGSL through Dawn's WebGPU implementation.
* Build-system integration for Bazel and CMake, including generated CMake targets and explicit selection of the WebGPU SPIR-V compiler target.

The CTS coverage exercises the runtime side under wasm32-wasi with the JS WebGPU bridge. The passing coverage includes buffer, command buffer, core, file, and queue CTS groups, with expected failures kept to the operations that require a blocking C wait while JavaScript Promise completions are still pending on the same inline host.
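The instruction-stream idea listed above (HAL commands recorded into uint32 streams and walked in a single bridge call, with buffers resolved from a binding table) might look roughly like the sketch below. The opcodes, the `[opcode, operand_count, operands...]` word layout, and the names `StreamBuilder` and `execute` are invented for illustration; the commit does not document the actual command ISA.

```python
# Hypothetical opcodes; the real command ISA is not described in the commit.
OP_FILL = 1
OP_COPY = 2


class StreamBuilder:
    """Records commands as a flat list of uint32 words."""

    def __init__(self):
        self.words = []

    def emit(self, opcode, *operands):
        record = (opcode, len(operands), *operands)
        for word in record:
            if not 0 <= word <= 0xFFFFFFFF:
                raise ValueError("operand does not fit in uint32")
        self.words.extend(record)


def execute(words, binding_table, sink):
    """Walks the whole stream in one call and returns the command count."""
    pc = 0
    executed = 0
    while pc < len(words):
        if pc + 1 >= len(words):
            raise ValueError("truncated command header")
        opcode, count = words[pc], words[pc + 1]
        operands = words[pc + 2 : pc + 2 + count]
        if len(operands) != count:
            raise ValueError("truncated command operands")
        if opcode == OP_FILL:
            slot, offset, length, pattern = operands
            sink.append(("fill", binding_table[slot], offset, length, pattern))
        elif opcode == OP_COPY:
            src, dst, length = operands
            sink.append(("copy", binding_table[src], binding_table[dst], length))
        else:
            raise ValueError(f"unknown opcode {opcode}")
        pc += 2 + count
        executed += 1
    return executed


builder = StreamBuilder()
builder.emit(OP_FILL, 0, 0, 64, 0xFF)  # fill binding slot 0
builder.emit(OP_COPY, 0, 1, 64)        # copy slot 0 -> slot 1

log = []
print(execute(builder.words, ["bufA", "bufB"], log), log)
```

Since commands name binding slots rather than buffers, the same recorded stream can be replayed against a different binding table, which is the property a cached recording with dynamic bindings would rely on.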
Together with the wasm runtime commit below it, this establishes the first coherent WebGPU bring-up slice: IREE can generate WGSL for WebGPU, package it in a HAL executable format, create WebGPU pipelines from that executable, validate a hello-world shader end to end, and run meaningful HAL CTS coverage through the same wasm/JS bridge that applications will use. Future changes will build tooling and samples that run VMFB programs.
[Codegen] Remove deprecated transform.iree.match_callback tests (iree-org#24500)

Part of iree-org#24466 (sub-task: remove deprecated tests under `compiler/src/iree/compiler/Codegen/Common/test/`). Removes 10 lit tests that depend on the retired `transform.iree.match_callback` op and its supporting machinery (`register_match_callbacks`, `take_first`, `emit_remark`) from `llvm-external-projects/iree-dialects`.

- 7 files use the deprecated op directly
- 3 driver tests (`batch_matmuls.mlir`, `convolutions.mlir`, `reductions.mlir`) have RUN lines that only invoke the deleted `*_spec.mlir` files
- `BUILD.bazel` and `CMakeLists.txt` updated to drop matching srcs/exclude/data entries
- Net diff: 12 files changed, 933 deletions(-), 0 insertions(+)
- No remaining source-tree references to the deleted files

Signed-off-by: Alex-Wengg <[email protected]>