-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[msan] Unpoison indirect outputs for userspace using memset for large operands #79924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[msan] Unpoison indirect outputs for userspace using memset for large operands #79924
Conversation
Created using spr 1.3.4
@llvm/pr-subscribers-compiler-rt-sanitizer @llvm/pr-subscribers-llvm-transforms Author: Fangrui Song (MaskRay) ChangesModify #77393 to clear shadow memory using The instruction selector lowers the
Full diff: https://github.com/llvm/llvm-project/pull/79924.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index 2b697557d8a92c..0806d7a5b14527 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -4559,9 +4559,12 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
} else {
// ElemTy, derived from elementtype(), does not encode the alignment of
// the pointer. Conservatively assume that the shadow memory is unaligned.
+ // Avoid StoreInst as SizeVal may be large, expanding to many
+ // instructions.
auto [ShadowPtr, _] =
getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
- IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
+ IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
+ SizeVal, Align(1));
}
}
diff --git a/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll b/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll
index 894f76b9b8d32a..86ca697ed9a4c6 100644
--- a/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll
+++ b/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll
@@ -177,8 +177,8 @@ entry:
}
; CHECK-LABEL: @f_2i_2o_mem
-; USER-CONS: store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), align 1
-; USER-CONS: store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
+; USER-CONS: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@id1{{.*}}, i64 4)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@id2{{.*}}, i64 4)
; CHECK: call void asm "", "=*m,=*m,*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(i32) @id1, ptr elementtype(i32) @id2, ptr elementtype(i32) @is1, ptr elementtype(i32) @is2)
@@ -196,7 +196,7 @@ entry:
; CHECK-LABEL: @f_1i_1o_memreg
; CHECK: [[IS1_F7:%.*]] = load i32, ptr @is1, align 4
-; USER-CONS: store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@id1{{.*}}, i64 4)
; CHECK: call void @__msan_warning
; CHECK: call i32 asm "", "=r,=*m,r,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(i32) @id1, i32 [[IS1_F7]], ptr elementtype(i32) @is1)
@@ -215,7 +215,7 @@ entry:
}
; CHECK-LABEL: @f_3o_reg_mem_reg
-; USER-CONS: store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
; CHECK-CONS: call void @__msan_instrument_asm_store(ptr @id2, i64 4)
; CHECK: call { i32, i32 } asm "", "=r,=*m,=r,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(i32) @id2)
@@ -240,7 +240,7 @@ entry:
; CHECK: [[PAIR1_F9:%.*]] = load {{.*}} @pair1
; CHECK: [[C1_F9:%.*]] = load {{.*}} @c1
; CHECK: [[MEMCPY_S1_F9:%.*]] = load {{.*}} @memcpy_s1
-; USER-CONS: store { i32, i32 } zeroinitializer, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), i8 0, i64 8, i1 false)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@pair2{{.*}}, i64 8)
; CHECK: call void @__msan_warning
; KMSAN: call void @__msan_warning
@@ -257,9 +257,9 @@ entry:
}
; CHECK-LABEL: @f_3i_3o_complex_mem
-; USER-CONS: store { i32, i32 } zeroinitializer, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), align 1
-; USER-CONS-NEXT: store i8 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @c2 to i64), i64 87960930222080) to ptr), align 1
-; USER-CONS-NEXT: store i64 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @memcpy_d1 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), i8 0, i64 8, i1 false)
+; USER-CONS-NEXT: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @c2 to i64), i64 87960930222080) to ptr), i8 0, i64 1, i1 false)
+; USER-CONS-NEXT: call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @memcpy_d1 to i64), i64 87960930222080) to ptr), i8 0, i64 8, i1 false)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@pair2{{.*}}, i64 8)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@c2{{.*}}, i64 1)
; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@memcpy_d1{{.*}}, i64 8)
|
LGTM in general. But the downside here is that the compiler won't be able to optimize away repeated stores if they are transformed to memset calls. |
Created using spr 1.3.4
TIL. Changed to keep using StoreInst when size <= 32, similar to |
Created using spr 1.3.4
if (Size <= 32) | ||
IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1)); | ||
else | ||
IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will hit interceptor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, we do this all the time here.
auto [ShadowPtr, _] = | ||
getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1)); | ||
IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1)); | ||
if (Size <= 32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe some comment wouldn't hurt here to clarify this constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit arbitrary. Perhaps
// The size threshold matches shouldUseBZeroPlusStoresToInitialize for -ftrivial-auto-var-init=zero
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memset will go into interceptor and do a lot of check, see __msan_memset
, so it's more expensive than a regular memset. I can't tell what can be optimal constant here, but I would expect something larger. But I doubt it will make a meaningful difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your previous comment about the interceptor. The committed patch does contain this description:
"The intrinsic, if lowered to libcall, will use the msan interceptor."
Inline asm isn't commonly used:) This patch is for =m
in extended asm, which I believe is more uncommon, and if used, usually with small objects. I guess 32 and 64 won't make a difference.
Modify #77393 to clear shadow memory using
llvm.memset.*
when the sizeis large, similar to
shouldUseBZeroPlusStoresToInitialize
in clang for-ftrivial-auto-var-init=
. The intrinsic, if lowered to libcall, will use the msan interceptor.The instruction selector lowers a
StoreInst
to multiple stores, notutilizing
memset
. When the size is large (e.g.store { [100 x i32] } zeroinitializer, ptr %12, align 1
), thegenerated code will be long (and
CodeGenPrepare::optimizeInst
willeven crash for a huge size).