Tags: pytorch/executorch
Merge branch 'main' into dont_build_ref_model_tests
Arm backend: Add FP16 tests of models (mv3, ic3)

Add testing of the following models executed in FP16:
- MobileNetV3
- InceptionV3

This patch verifies that the Arm backend is able to lower full models in FP16 to valid TOSA and execute them with acceptable numerical accuracy.

Signed-off-by: Martin Lindström <[email protected]>
Change-Id: Ice3c6913598d540f7c7a52e403260943a7c8c597
Add MaxPool1D decomposition pass support (#17022)

Summary:
Pull Request resolved: #17022

Implement DecomposeMaxPool1dPass to enable MaxPool1D support on the Arm backend by decomposing max_pool1d to view_copy → max_pool2d → view_copy.

## Implementation Strategy

### Decomposition Approach (Optimal for TOSA/Vela)

The pass decomposes max_pool1d into max_pool2d via view_copy operations:
1. view_copy: (N, C, L) → (N, C, 1, L) - add a height dimension
2. max_pool2d: with adapted params [k] → [1, k], [s] → [1, s], [p] → [0, p]
3. view_copy: (N, C, 1, L_out) → (N, C, L_out) - remove the height dimension

### Why This Approach is Optimal

1. **view_copy maps to TOSA RESHAPE**, which is zero-cost in Vela:
   - Classified as memory_only_ops (Reshape, Squeeze, ExpandDims, Identity)
   - Bypassed entirely when conditions are met (NPU-produced, single consumer)
   - Tensor equivalence enables memory aliasing (same address)
2. **TFA Pipeline Placement (before quantization)**:
   - view_copy is in _one_to_one_shared_input_qspec (line 407)
   - max_pool2d is in _one_to_one_shared_input_or_input_act_qspec (line 455)
   - Both automatically get a proper SharedQuantizationSpec from the annotator
3. **Quantization Handling**:
   - Clear qparams on intermediate view_copy ops (let the annotator fill them in)
   - Preserve the original meta on max_pool2d for proper tracing
   - MAX_POOL2D does not need zero-point handling (unlike AVG_POOL2D)

### TOSA/Vela Constraints Validated

- U55: Stride ≤ 3 ✓, Kernel ≤ 256x256 ✓
- U85: Extended stride support via accumulator save/restore
- Dilation: Handled by the separate DecomposeMaxPool2dPass if needed

Reviewed By: 3l1
Differential Revision: D91760459
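The three-step decomposition above can be checked against `max_pool1d` directly. The following is a minimal sketch in plain PyTorch (the function name `max_pool1d_via_2d` is illustrative, not the actual pass implementation):

```python
import torch
import torch.nn.functional as F

def max_pool1d_via_2d(x, kernel, stride, padding):
    # Step 1 (view_copy): (N, C, L) -> (N, C, 1, L), add a height dimension.
    n, c, l = x.shape
    x4d = x.view(n, c, 1, l)
    # Step 2 (max_pool2d): adapt params [k]->[1,k], [s]->[1,s], [p]->[0,p].
    y4d = F.max_pool2d(x4d, kernel_size=(1, kernel),
                       stride=(1, stride), padding=(0, padding))
    # Step 3 (view_copy): (N, C, 1, L_out) -> (N, C, L_out), drop the height dimension.
    return y4d.view(n, c, -1)

x = torch.randn(2, 3, 16)
ref = F.max_pool1d(x, kernel_size=4, stride=2, padding=1)
out = max_pool1d_via_2d(x, kernel=4, stride=2, padding=1)
assert torch.equal(ref, out)
```

Because the two reshapes only add and remove a unit height dimension, the pooling windows of the 2D op cover exactly the same elements as the 1D op, so the results match bit-for-bit.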
3/x: Wire LoadBackendOptionsMap through Program and Method (#17531)

Summary:
This diff wires the `LoadBackendOptionsMap` through the executor layer, connecting `Program::load_method()` and `Method` so that they accept and route backend options to delegates.

Key changes:
- `Program::load_method()` now accepts an optional `LoadBackendOptionsMap*` parameter
- `Method` stores a reference to the options map and looks up options by backend ID
- When initializing delegates, the runtime queries the map for backend-specific options and passes them to `BackendInitContext`

This enables the end-to-end flow:
```
Module::load(method_name, options_map)
  → Program::load_method(..., options_map)
  → Method initialization
  → Backend delegate init with runtime_specs from options_map
```

Reviewed By: larryliu0820
Differential Revision: D92461088
Add STABLE softmax decomposition config for Ethos-U (#17109)

Summary:
The current behavior for U55 defaults to UNSTABLE. This appears to be because masked fill is not supported on U55, but there is no inherent need to couple the two, and defaulting to UNSTABLE negatively impacts quantization performance.

This PR adds a new `STABLE` option to `SoftmaxDecompositionConfig` that provides numerically stable softmax decomposition without masked fill decomposition. The three softmax configs now behave as follows:
- `MASKED`: Stable softmax (with amax subtraction) + masked fill decomposition
- `UNSTABLE`: Unstable softmax (no amax subtraction), no masked fill decomposition
- `STABLE`: Stable softmax (with amax subtraction), no masked fill decomposition

For Ethos-U55 targets, `disable_masked_softmax()` now sets the config to `STABLE` instead of `UNSTABLE`, providing numerically stable softmax while avoiding the masked fill decomposition, which is not needed for these targets.

Reviewed By: Ninja91
Differential Revision: D92058235
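The numerical difference between the stable (amax-subtracted) and unstable decompositions can be illustrated in plain Python. This is a sketch of the underlying numerics only, not the backend's actual decomposition:

```python
import math

def softmax_unstable(xs):
    # UNSTABLE: exponentiate directly; exp overflows for large inputs.
    es = [math.exp(x) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def softmax_stable(xs):
    # STABLE: subtract the max (amax) first, so every exponent is <= 0.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

# For moderate inputs the two forms agree...
a = softmax_unstable([1.0, 2.0, 3.0])
b = softmax_stable([1.0, 2.0, 3.0])
assert all(abs(x - y) < 1e-12 for x, y in zip(a, b))

# ...but large inputs overflow the unstable form, while the stable
# form still produces a valid probability distribution.
try:
    softmax_unstable([1000.0, 1001.0])
    overflowed = False
except OverflowError:
    overflowed = True
assert overflowed
out = softmax_stable([1000.0, 1001.0])
assert abs(sum(out) - 1.0) < 1e-12
```

Subtracting the maximum shifts every input so the largest exponent is exactly 0, which bounds the intermediate values without changing the result, since the shift cancels in the numerator and denominator.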
Arm backend: Consolidate simple operator visitors

Signed-off-by: Sebastian Larsson <[email protected]>
Change-Id: I23339f808f1074adea1fafddf90110c04fc5695f