Found in this Fortran test case. We get incorrect answers after loop-vectorize, and only on aarch64. This is the LLVM IR before and after loop-vectorize, with the dependency on the Fortran runtime removed:
- Before loop vectorize (good output)
- After loop vectorize (bad output)
- Opt LV debug output
This is the expected output of the program alongside the output after LV:
Good:
v[90][90] = 40.000000 + 0.000000i
y[90][90] = 21.000000 + 0.000000i
Bad:
v[90][90] = 40.000000 + 40.000000i ;; << real component sum is written to both
y[90][90] = 21.000000 + 0.000000i
You can see that the real component is written to both the real and imaginary components of v[90][90]. If we replace the instructions responsible for computing the imaginary component, the test yields the expected output again:
139,141c139,140
< %broadcast.splatinsert13 = insertelement <2 x double> poison, double %47, i64 0, !dbg !8
< %broadcast.splat14 = shufflevector <2 x double> %broadcast.splatinsert13, <2 x double> poison, <2 x i32> zeroinitializer, !dbg !8
< %48 = fadd contract <2 x double> %broadcast.splat12, %broadcast.splat14, !dbg !8
---
> %wide.load.x = load <2 x double>, ptr %46, align 8, !dbg !8
> %48 = fadd contract <2 x double> %wide.load, %wide.load.x, !dbg !8
This godbolt link has the corrected IR:
; this looks wrong
%47 = load double, ptr %46, align 8, !dbg !8
%broadcast.splatinsert13 = insertelement <2 x double> poison, double %47, i64 0, !dbg !8
%broadcast.splat14 = shufflevector <2 x double> %broadcast.splatinsert13, <2 x double> poison, <2 x i32> zeroinitializer, !dbg !8
%48 = fadd contract <2 x double> %broadcast.splat12, %broadcast.splat14, !dbg !8
; Hand-edits which give expected results
;%wide.load.x = load <2 x double>, ptr %46, align 8, !dbg !8
;%48 = fadd contract <2 x double> %wide.load, %wide.load.x, !dbg !8
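To make the difference between the two variants concrete, here is a minimal lane-level sketch in Python. It is not the vectorizer's logic, just a model of what the instructions compute. It assumes the two lanes of each `<2 x double>` hold the real and imaginary parts of one complex element, that `%broadcast.splat12` is likewise a splat of a scalar load, and that the operand on the other side is 19.0 + 0.0i (a made-up value chosen so the sum matches the printed 40.0). The helper names `splat`, `wide_load`, and `fadd` are hypothetical.

```python
# Lane-level model of the two IR variants. Operand values are assumed,
# not taken from the test case (only y = 21.0 + 0.0i is from the output).

def splat(scalar, lanes=2):
    """Model insertelement + shufflevector with a zeroinitializer mask:
    every lane receives the same scalar."""
    return [scalar] * lanes

def wide_load(memory, offset, lanes=2):
    """Model `load <2 x double>`: read `lanes` consecutive doubles."""
    return memory[offset:offset + lanes]

def fadd(a, b):
    """Model a lane-wise vector fadd."""
    return [x + y for x, y in zip(a, b)]

# Interleaved (re, im) storage for one complex element each.
x_mem = [19.0, 0.0]  # assumed operand
y_mem = [21.0, 0.0]  # matches the printed y[90][90]

# Vectorized IR as emitted: both operands are splats of the real part,
# so the real sum lands in both lanes.
bad = fadd(splat(x_mem[0]), splat(y_mem[0]))

# Hand-edited IR: both operands are wide loads, so lane 1 carries
# the imaginary parts.
good = fadd(wide_load(x_mem, 0), wide_load(y_mem, 0))

print("bad: ", bad)
print("good:", good)
```

Under these assumptions the splat variant yields 40.0 in both lanes, which matches the bad `v[90][90] = 40.000000 + 40.000000i`, while the wide-load variant keeps the imaginary lane at 0.0.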
Note that if we compile the exact same pre-LV IR for a generic x86 architecture, we do not see the same behavior:
> opt /home/amancinelli/tmp.iJQ3cjnSSc-before-pass-316.ll -passes=loop-vectorize -S -o /tmp/t.ll && clang /tmp/t.ll && ./a.out
opt: WARNING: failed to create target machine for 'aarch64-unknown-linux-gnu': unable to get target for 'aarch64-unknown-linux-gnu', see --version and --triple.
warning: overriding the module target triple with x86_64-unknown-linux-gnu
[-Woverride-module]
1 warning generated.
v[90][90] = 40.000000 + 0.000000i
y[90][90] = 21.000000 + 0.000000i