ronlieb · 2026-06-04T20:20:11Z

No description provided.

…lvm#185929) Loop Strength Reduce can give different (and worse) results for a loop when it is followed by uses of variables used inside the loop. This is because the uses outside the loop increase the size of the search space, which can lead to using NarrowSearchSpaceByPickingWinnerRegs which often discards the best solution. Solve this by narrowing the search space by merging uses outside the loop with uses inside the loop. This ignores the Kind and AccessTy of the use which can mean that the cost may be inaccurate, but it will give the same cost as if we had just ignored the uses outside of the loop.

…on by `dataFill` (llvm#200202) `omp_target_memset` was initially implemented before the existence of `offload`. Because of this, a slow path was chosen to implement `omp_target_memset`: first allocating memory on the host, calling `memset` on that memory, and then transferring this to the device. Aside from being an inefficient way of setting device memory, this also causes a data transfer event for the OpenMP Tools Interface, interfering with the memset event added in OpenMP v6.0. Since offload now implements setting data via `dataFill`, replace the slow path by just calling `dataFill` instead. This both resolves the inefficiency and removes the superfluous event dispatched to a tool. Signed-off-by: Jan André Reuter <[email protected]>

…storage. (llvm#200886) Value::setRawBits had inconsistent units: the default value and the size assert treated the parameter as bytes (sizeof(Storage)), while the memcpy treated it as bits (NBits / 8). A caller passing the natural byte count (e.g. sizeof(long long)) ended up copying only sizeof(T)/8 bytes -- one byte for an 8-byte payload, leaving the rest stale. The one in-tree caller compensated by multiplying by 8, hiding the bug. Rename the parameter to NBytes and drop the / 8 so the API name, default, assert, and memcpy all agree on bytes. Update the caller in InterpreterValuePrinter.cpp to pass ElemSize directly. Right-size the Storage::m_RawBits array while we are here: it was sizeof(long double) * 8 bytes, which reads like a bit/byte confusion since the widest typed member of the union is long double itself. The oversized array made sizeof(Value) ~144 bytes on x86_64 instead of ~40, bloating every copy/move of a Value. Add a regression test exercising setRawBits with both an explicit byte count and the default argument. Pre-fix the test fails for both: the explicit-count branch copies 1 byte instead of 8, and the default branch copies sizeof(Storage)/8 bytes instead of the full union width.
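The unit mismatch described above can be sketched in isolation. This is only an illustration under stated assumptions: `Storage`, `setRawBitsBuggy`, `setRawBitsFixed`, and `copiedBytes` are hypothetical stand-ins written for this example, not the actual `Value` class or its members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the union inside Value; not the real clang type.
union Storage {
  long long LL;
  long double LD;
  unsigned char Raw[sizeof(long double)];
};

// Pre-fix shape: the default and the assert treat the count as bytes,
// but the copy divides by 8 as if the count were bits.
void setRawBitsBuggy(Storage &S, const void *Src,
                     std::size_t NBits = sizeof(Storage)) {
  assert(NBits <= sizeof(Storage) && "Greater than the total size");
  std::memcpy(S.Raw, Src, NBits / 8);
}

// Post-fix shape: the name, default, assert, and copy all agree on bytes.
void setRawBitsFixed(Storage &S, const void *Src,
                     std::size_t NBytes = sizeof(Storage)) {
  assert(NBytes <= sizeof(Storage) && "Greater than the total size");
  std::memcpy(S.Raw, Src, NBytes);
}

// Returns how many leading bytes of an 8-byte payload reached the storage.
std::size_t copiedBytes(bool UseFixed) {
  const unsigned char Payload[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  Storage S;
  std::memset(S.Raw, 0, sizeof(S.Raw));
  if (UseFixed)
    setRawBitsFixed(S, Payload, sizeof(Payload));
  else
    setRawBitsBuggy(S, Payload, sizeof(Payload));
  std::size_t N = 0;
  while (N < sizeof(Payload) && S.Raw[N] == Payload[N])
    ++N;
  return N;
}
```

With a caller passing the natural byte count, the buggy variant copies a single byte while the fixed variant copies all eight, which mirrors the explicit-count branch of the regression test described in the commit message.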

This PR modifies regex in error message to match on z/OS:
```
[Errno 129] EDC5129I No such file or directory.: 'temp1.txt'
wc: file "missing-file": EDC5129I No such file or directory.
cat: does-not-exist: EDC5129I No such file or directory.
```

Implement the functionality to read and parse a pre-parsed perf-script profile generated by perf2bolt's '--profile-format=perfscript' option. The '-ps' option defines the perfscript input profile format. It requires specifying the aggregation type ('--spe', '--ba') if it differs from the default one ('brstack'). Note that the profile has to also be generated using the exact same aggregation type.
Examples:
For ARM SPE:
1) $ perf2bolt BINARY -p perf.data -o test.text --spe --profile-format=perfscript
2) $ perf2bolt BINARY -o test.fdata -p test.text --spe -ps
For Brstack aggregation:
1) $ perf2bolt BINARY -p perf.data -o test.text --profile-format=perfscript
2) $ perf2bolt BINARY -o test.fdata -p test.text -ps

This PR tweaks the clang/test/DebugInfo/line.cpp test to pass on z/OS. The test was failing because the RUN lines that specify -triple %itanium_abi_triple expand to s390x-ibm-zos when run on z/OS. The IR emitted for this triple does not match the patterns expected by the test. This PR tweaks the patterns in the CHECK lines so that the test also passes on z/OS.

`TestDAP_restart_console` is already failing on Windows. It reliably crashes (UNRESOLVED) on some Windows versions, including inside Docker containers. This is preventing us from enabling pre-merge CI testing for lldb on Windows in llvm#198906. This patch skips the test entirely. See llvm#200840 for more details.

`OnCreateThread` runs from the `DebuggerThread` loop after a `CREATE_THREAD_DEBUG_EVENT`. Each iteration of that loop ends with a `ContinueDebugEvent`, which on Windows resumes every thread in the debuggee that *isn't* individually suspended with `SuspendThread`. If a thread is created while the debuggee is stopped, all the existing threads are suspended except the new one. After the next `ContinueDebugEvent` it just runs, while lldb's StateType still reads eStateStopped. This patch suspends the new thread when the debuggee is stopped. This fixes `TestTwoHitsOneActual.py` and `TestBreakOnLambdaCapture.py` when running the test suite with `LLDB_USE_LLDB_SERVER=1`. rdar://178718627

…sions (llvm#198583) Update FuncToEmitC to bail out before creating invalid EmitC ops for unsupported cases. FuncToEmitC now rejects functions, calls, and returns whose converted result type is `emitc.array`, instead of relying on later `emitc.func`, `emitc.call`, or `emitc.return` verifier failures. This does not add support for returning memrefs from functions. It only makes the existing limitation explicit at the conversion boundary.
## Tests
Added negative tests for the standalone conversion pass. This pass marks their source ops illegal, so when a pattern bails out the pass reports a legalization failure. This is the expected behavior and documents the unsupported cases directly. `convert-to-emitc` is more permissive because it allows partial conversion and does not mark the same source ops illegal, so it can leave unsupported ops unconverted without reporting the same failures.
Assisted-by: Codex (refine description). I reviewed all text before submission.

…lvm#201596) The recently-added structured script feature currently relies on DAP-based debuggers, of which the only one currently supported by Dexter is LLDB. In order to prevent the tests that depend on this feature from running for other debuggers, we require LLDB for the script test directory.

For compile time/memory reasons, dag-maps-huge-region is the number of memory instructions at which we create a barrier and reset maps. Previously we'd get to dag-maps-huge-region number of instructions, then add a barrier in the middle of the current set of instructions, and continue processing the second half of remaining instructions. With this change, now we simply add a barrier every time we reach dag-maps-huge-region number of memory instructions, and blow away all previous instructions. So now instead of waiting until we get to 1000 memory operations before creating a barrier for 500 of them, we do it at 500 and do it for all 500. With this change, -dag-maps-huge-region=500 still has addChainDependencies() taking up over half of the codegen pipeline in some cases I looked at, but it's much better than the previous 90%.
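The new policy can be modeled with a small counter. This is a simplified sketch and not the scheduler's actual code: `barrierPoints`, `NumMemOps`, and `HugeRegion` are hypothetical names chosen for illustration, and the model assumes a barrier is placed each time the tracked count reaches the threshold, after which everything tracked so far is discarded.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the described behavior: returns the 1-based
// memory-operation counts at which a barrier would be inserted.
std::vector<std::size_t> barrierPoints(std::size_t NumMemOps,
                                       std::size_t HugeRegion) {
  std::vector<std::size_t> Points;
  std::size_t Tracked = 0; // memory operations currently held in the maps
  for (std::size_t I = 1; I <= NumMemOps; ++I) {
    ++Tracked;
    if (Tracked >= HugeRegion) {
      Points.push_back(I); // the barrier covers every tracked operation
      Tracked = 0;         // all previous operations are dropped
    }
  }
  return Points;
}
```

In this model, 1200 memory operations with a threshold of 500 yield barriers after the 500th and 1000th operations, and a region with fewer operations than the threshold gets none.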

…#197316) The majority of these dependencies are available in the [Bazel-Central-Registry](https://github.com/bazelbuild/bazel-central-registry) (BCR) and to improve build performance for bzlmod users, llvm-project should pull from the BCR to consolidate targets.

Part of llvm#185382
Move the test cases to [intrinsics.c](https://github.com/llvm/llvmproject/pull/clang/test/CodeGen/AArch64/neon/intrinsics.c)
Removed the test cases from [neon-intrinsics.c](https://github.com/llvm/llvmproject/pull/clang/test/CodeGen/AArch64/neon/intrinsics.c)
Removed [neon-across.c](clang/test/CodeGen/AArch64/neon-across.c)
---------
Co-authored-by: Andrzej Warzyński <[email protected]>

We can implement these using combinations of rev, rev8, and ppairoe.*. Rename REV16->REV16_RV64. A hypothetical REV16 on RV32 would have a different encoding like REV and REV8. Long term we should probably custom lower these instead of having complex isel patterns. That would allow additional optimizations. But I think the isel patterns are fine as a starting point.

…lvm#201546) Previously, attempting to select the intrinsic @llvm.aarch64.neon.scalar.uqxtn would cause GlobalISel to fall back to SDAG. This had two causes: 1. RegBankSelect placed the operands on GPR banks. 2. There were no instruction selection patterns for the intrinsic. Add a pattern, and fix RegBankSelect to place operands on the correct banks.

… with invalid iterator types (llvm#201461) Previously, diagnostic notes issued for errors encountered due to invalid iterator types in C++11 range-based for statements reported the range type as the iterator type instead of the invalid iterator type. Now fixed.

…0918) This patch introduces ISA under BHI_CTRL CPUID. The following tech paper was published in May 2025: [intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#ibhf](https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#ibhf) As shown in the paper, the encoding is F3 48 0F 1E F8. It does not need a C intrinsic. --------- Co-authored-by: mattarde <[email protected]>

This change creates new FP-specific binary operations and updates the existing binary operations that previously accepted any arithmetic type to only allow integer and vector-of-integer types. This change is being done to prepare for extended floating-point handling such as strict FP semantics and fast-math handling. It also simplifies the handling of integer overflow flags. Assisted-by: Cursor / claude-opus-4.8

Problem: LLVM generates `umov w8, v0.h[0]` + `strh w8, [x0]` instead of `str h0, [x0]` when storing vector lane 0 to memory, specifically when SimplifyCFG merges stores across branches -- splitting the extractelement and store into different basic blocks and preventing the existing DAG combine from firing. https://godbolt.org/z/v5G9ohMPa
Root cause: SimplifyCFG creates a PHI + merged store in a successor block. SelectionDAG ISel processes each block independently, so it lowers the extract to `UMOV` (GPR) in the predecessor and the store sees only a GPR value via the PHI. Late tail duplication puts the store back in the same block, but the `UMOV` is already baked in.
Fix: Added a post-RA peephole in `AArch64LoadStoreOptimizer` (step 6 in `optimizeBlock`) that recognizes `UMOVvi*_idx0` + GPR store patterns and replaces them with direct FPR sub-register stores. The peephole:
- Handles all element sizes: i8 (`bsub`), i16 (`hsub`), i32 (`ssub`), i64 (`dsub`)
- Correctly updates liveness by clearing intervening kill flags on the vector register
- Bails out if the GPR value has other uses, the vector register is clobbered, or the store doesn't kill the GPR
Assisted-by: Claude
Fixes: llvm#137086
---------
Co-authored-by: Kunal Pathak <[email protected]>

Convert the ten user-facing RST docs under lldb/docs/use/ to MyST Markdown. This is the third batch of an incremental RST -> Markdown migration; PR1 covered the small leaf pages and PR2 covered the contributor-facing docs under resources/.
Files: formatting, intel_pt, map, remote, symbolfilejson, symbolication, symbols, troubleshooting, tutorial, variable.
Verified by building the docs on origin/main and on this branch with identical sphinx flags and diffing both the warnings and the rendered HTML. After file extension and line numbers are normalized, the warning sets match exactly. Seven of the ten pages are byte-identical. The three that differ (symbolication, tutorial, variable) differ only in CommonMark collapsing two-spaces-after-period to one and MyST renaming auto-numbered footnote IDs (`id6` -> `footnote-1`) plus adding an `<hr>` separator before footnote sections.
The diff also surfaced three semantic regressions in the conversion, fixed here:
- variable.md lost cross-reference behavior on single-backtick refs to `SBValue` and `SBData`. RST's default role is `any`, so single backticks attempted xrefs; in MyST single backticks are plain code spans. Converted these occurrences to explicit `{any}`...``.
- map.md emitted bare `[Section Name]` for the page TOC, which CommonMark treats as an undefined reference shortcut and falls through to literal text. Converted to `[Section Name](#slug)`.
- variable.md emitted `[format name][format name]` as a similar undefined reference shortcut. Converted to `[format name](#format-name)` to match the new `(format-name)=` anchor.
Context: https://discourse.llvm.org/t/rfc-make-myst-markdown-the-llvm-docs-format-rip-rest/
Assisted-by: Claude

Calling `FileManager::GetUniqueIDMapping()` during modular builds gets very expensive if the `FileManager` has seen lots of files. This function is used in two places in the `ASTWriter` to look up `HeaderFileInfo` in `HeaderSearch`. This PR changes the storage of `HeaderFileInfo` from `FileEntry::getUID()`-indexed `std::vector<T>` to `llvm::DenseMap<FileEntryRef, T>`, improving scanning performance by ~2.5%.

…llvm#201509) All of i64, f64, v2i32, v4i16, v8i8 are assigned to the DoubleRegs register class (64-bit register pairs). A bitcast between any two of these types is a machine-level no-op (i.e., the same physical register is reinterpreted with a different type). HexagonPatterns.td had NopCast_pat entries for all int-to-int bitcasts within DoubleRegs, and explicit patterns for f64 <-> i64, but was missing patterns for f64 <-> v2i32, f64 <-> v4i16, and f64 <-> v8i8. The same gap existed in IntRegs for f32 <-> v2i16 and f32 <-> v4i8. Without a TableGen pattern for the "f64 = bitcast v2i32" node, the instruction selector crashed with:
LLVM ERROR: Cannot select: t26: f64 = bitcast t6
  t6: v2i32,ch = CopyFromReg t0, Register:v2i32 %2
Fix by adding the five missing NopCast_pat entries. Fixes: llvm#195495
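As a source-level analogy for why such a bitcast is a no-op, the sketch below reinterprets the 64 bits of a `double` as two 32-bit lanes and back without changing any bit. It is only an illustration: `toLanes` and `fromLanes` are hypothetical helpers, and the code says nothing about how the Hexagon backend itself is implemented.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Reinterpret the 64 bits of a double as two 32-bit lanes, analogous to
// viewing an f64 value as v2i32. No bits are changed, only the type.
std::array<std::uint32_t, 2> toLanes(double D) {
  static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
  std::array<std::uint32_t, 2> Lanes;
  std::memcpy(Lanes.data(), &D, sizeof(D));
  return Lanes;
}

// The reverse direction, analogous to viewing v2i32 as f64.
double fromLanes(const std::array<std::uint32_t, 2> &Lanes) {
  double D;
  std::memcpy(&D, Lanes.data(), sizeof(D));
  return D;
}
```

A round trip through both helpers returns the original value exactly, which is the property that lets the backend select no instruction at all for these casts.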

After llvm#199152, CMake failed for me with:
```
CMake Error at cmake/modules/AddLLVM.cmake:2805 (get_target_property):
  get_target_property() called with non-existent target "llvm-nm".
Call Stack (most recent call first):
  F:/Dev/llvm-project/lldb/source/API/CMakeLists.txt:205 (get_host_tool_path)
```
I'm not sure why it didn't fail in CI or on the buildbots. The fix here is to add llvm-nm before lldb like we do with other projects.

Relands llvm#199528. This implements a new strategy for collecting the template arguments, by relying on the qualifiers and template parameter lists to navigate the template context of out-of-line definitions. This greatly simplifies the signature of that function, by removing a bunch of workarounds, and simplifying a couple that weren't removed yet. Since this now relies on qualifiers and template parameter lists, this patch expends most of its effort making sure these are placed, transformed and propagated to template instantiations. Also makes the explicit specialization AST nodes stop abusing the template parameter lists by storing their own template parameter list in a dedicated field, similar to partial specializations.

tbaederr and others added 30 commits June 4, 2026 14:13

[clang][bytecode] Fix a diagnostic difference with extern variables (l…

03127a0

…lvm#201369) If the extern variable is constexpr of non-array type, we should diagnose it as missing an initializer. Otherwise, we diagnose a read of a non-constexpr variable.

[FunctionAttrs] Regenerate test checks (NFC) (llvm#201576)

4d2a670

[X86] Add test coverage for llvm#199445 (llvm#201564)

5e0b3c9

Revert "[LSR] Narrow search space by merging users outside and inside…

ba57a01

… loop" (llvm#201581) This is causing buildbot failures. Reverts llvm#185929

[DirectX] Move IR printing to DXILPrettyPrinter (llvm#198318)

011fab8

By doing the IR printing inside DXILPrettyPrinter, we have the option to customise what we print and include the info that we collect and generate in DXILDebugInfo.

[LV][NFC] Fix force-scalable-vectorization-always.ll (llvm#201580)

39dc841

Fixes a buildbot failure related to FP rounding error in LV debug output.

[clang][driver] Rename ClangExecutable and getClangProgramPath (NFC) (l…

2e9f45a

…lvm#200814) This patch is to rename ClangExecutable to DriverExecutable and getClangProgramPath to getDriverProgramPath. This makes the name more neutral and less confusing when used in flang.

[X86] Fix MachineBlockInfo hash for machine-block-hash.mir (llvm#201039)

9d3f50a

I looked at llvm/include/llvm/CodeGen/MachineBlockHashInfo.h, BlendedBlockHash function and rewrote failing test. --------- Co-authored-by: mattarde <[email protected]>

[GlobalISel] Add bitcast chain combine (llvm#200694)

1fe66fc

[Clang][Docs] Documented sentinel attribute (llvm#196088)

8858ddd

The documentation of the sentinel attribute was missing, this PR documents the behavior of the sentinel attribute.
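For readers unfamiliar with the attribute, here is a minimal sketch of how `sentinel` is typically used with GCC and Clang. `countStrings` is a hypothetical helper written for this illustration and does not come from the commit; the attribute asks the compiler to warn when a call to the variadic function does not end with a null pointer.

```cpp
#include <cstdarg>

// Hypothetical helper: the sentinel attribute asks the compiler to warn if a
// call does not end with a null pointer argument.
int countStrings(const char *First, ...) __attribute__((sentinel));

// Counts the strings passed before the terminating null pointer.
int countStrings(const char *First, ...) {
  int N = 0;
  va_list Args;
  va_start(Args, First);
  for (const char *S = First; S != nullptr; S = va_arg(Args, const char *))
    ++N;
  va_end(Args);
  return N;
}
```

A call such as `countStrings("a", "b", static_cast<const char *>(nullptr))` is well formed and counts two strings, whereas omitting the trailing null pointer is the mistake the attribute is meant to flag.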

[bazel][NFC] Run buildifier on libc/BUILD.bazel (llvm#201616)

a314c10

Out of order deps

Include AArch64 SME builtins to compiler-rt for Bazel. (llvm#196607)

1e87cdf

Include the AArch64 SME (Scalable Matrix Extension) source files in the compiler-rt builtins library when targeting aarch64. Added a selection based on OS platform to use either Apple or Non-Apple sources.

[MCP] Early exit if no copies (NFC) (llvm#201602)

48f50e8

These two functions do expensive per-regunit work, but are no-ops if there are no Copies, so short-circuit this case.

charles-zablit and others added 19 commits June 4, 2026 18:01

[lldb][windows] enable CI tests (llvm#198906)

e492f11

[lldb] xfail tests for arm64e caused by compiler bugs (llvm#201454)

ff25d31

These test failures are caused by bugs in clang where arm64e support is not yet complete.

[gn build] Port commits (llvm#201639)

131ca5c

3e44733 a7a53bf

[flang][docs] Documented c_float128 and c_float128_complex extens…

c193b2d

…ion (llvm#201614) flang has supported this for a long time, but it wasn't documented as an extension

Revert "[SCEV] Fix ScalarEvolution::getBackedgeTakenInfo when L not f…

7d27cf7

…ound" (llvm#201640) Reverts llvm#201502 due to buildbot breakage: https://lab.llvm.org/buildbot/#/builders/187/builds/20579

[flang][acc] Attach FortranObjectViewOpInterface to acc.unwrap_private (

2851820

llvm#201646) Since this operation is simply a zero-offset view, attach the FortranObjectViewOpInterface to allow FIR AA to walk this if needed.

[clang] Add -fcrash-diagnostics-tar for tarball of crash reproducer f…

9908117

…iles (llvm#201643) Makes it easier to move around crash diagnostics. Reland of llvm#198838 with crash-diagnostics-tar.c and crash-report-crashfile.m fixed.

[X86] combineConcatVectorOps - add handling for X86ISD::VSHLDQ\VSRLDQ…

1b85dfd

… byte shift instructions (llvm#201641)

[bazel][lldb] Port a7a53bf (llvm#201660)

9f6b3b3

Adds a new dep

Add missing REQUIRES: asserts to test case which needs it (llvm#201626)

07c318f

[scudo] Log if randomness degrades. (llvm#201482)

1970332

merge main into amd-staging

2cf7f0a

ronlieb requested review from a team, dpalermo, kirthana14m and skganesan008 June 4, 2026 20:20

ronlieb requested review from lamb-j and nicolasvasilache as code owners June 4, 2026 20:20

ronlieb removed request for lamb-j and nicolasvasilache June 4, 2026 20:20

dpalermo approved these changes Jun 4, 2026

ronlieb merged commit 16c790e into amd-staging Jun 5, 2026
150 of 161 checks passed

ronlieb deleted the amd/merge/upstream_merge_20260604142231 branch June 5, 2026 02:34

merge main into amd-staging #2793

ronlieb merged 63 commits into amd-staging from amd/merge/upstream_merge_20260604142231

20 participants
