[clang-repl] Fix Value::setRawBits unit confusion and right-size raw storage.#200886
Conversation
…storage Value::setRawBits had inconsistent units: the default value and the size assert treated the parameter as bytes (sizeof(Storage)), while the memcpy treated it as bits (NBits / 8). A caller passing the natural byte count (e.g. sizeof(long long)) ended up copying only sizeof(T)/8 bytes -- one byte for an 8-byte payload, leaving the rest stale. The one in-tree caller compensated by multiplying by 8, hiding the bug. Rename the parameter to NBytes and drop the / 8 so the API name, default, assert, and memcpy all agree on bytes. Update the caller in InterpreterValuePrinter.cpp to pass ElemSize directly. Right-size the Storage::m_RawBits array while we are here: it was sizeof(long double) * 8 bytes, which reads like a bit/byte confusion since the widest typed member of the union is long double itself. The oversized array made sizeof(Value) ~144 bytes on x86_64 instead of ~40, bloating every copy/move of a Value. Add a regression test exercising setRawBits with both an explicit byte count and the default argument. Pre-fix the test fails for both: the explicit-count branch copies 1 byte instead of 8, and the default branch copies sizeof(Storage)/8 bytes instead of the full union width.
89caae4 to
926d06e
Compare
|
@llvm/pr-subscribers-clang Author: Vassil Vassilev (vgvassilev) ChangesValue::setRawBits had inconsistent units: the default value and the size assert treated the parameter as bytes (sizeof(Storage)), while the memcpy treated it as bits (NBits / 8). A caller passing the natural byte count (e.g. sizeof(long long)) ended up copying only sizeof(T)/8 bytes -- one byte for an 8-byte payload, leaving the rest stale. The one in-tree caller compensated by multiplying by 8, hiding the bug. Rename the parameter to NBytes and drop the / 8 so the API name, default, assert, and memcpy all agree on bytes. Update the caller in InterpreterValuePrinter.cpp to pass ElemSize directly. Right-size the Storage::m_RawBits array while we are here: it was sizeof(long double) * 8 bytes, which reads like a bit/byte confusion since the widest typed member of the union is long double itself. The oversized array made sizeof(Value) ~144 bytes on x86_64 instead of ~40, bloating every copy/move of a Value. Add a regression test exercising setRawBits with both an explicit byte count and the default argument. Pre-fix the test fails for both: the explicit-count branch copies 1 byte instead of 8, and the default branch copies sizeof(Storage)/8 bytes instead of the full union width. Full diff: https://github.com/llvm/llvm-project/pull/200886.diff 4 Files Affected:
diff --git a/clang/include/clang/Interpreter/Value.h b/clang/include/clang/Interpreter/Value.h
index b91301e6096eb..23ef123ded8ee 100644
--- a/clang/include/clang/Interpreter/Value.h
+++ b/clang/include/clang/Interpreter/Value.h
@@ -98,7 +98,7 @@ class REPL_EXTERNAL_VISIBILITY Value {
REPL_BUILTIN_TYPES
#undef X
void *m_Ptr;
- unsigned char m_RawBits[sizeof(long double) * 8]; // widest type
+ unsigned char m_RawBits[sizeof(long double)]; // widest typed member
};
public:
@@ -140,7 +140,10 @@ class REPL_EXTERNAL_VISIBILITY Value {
void *getPtr() const;
void setPtr(void *Ptr) { Data.m_Ptr = Ptr; }
- void setRawBits(void *Ptr, unsigned NBits = sizeof(Storage));
+ /// Copy `NBytes` bytes from `Ptr` into the raw storage. Default copies
+ /// the full Storage width. Used by the value printer to read a single
+ /// array element through a typed lens without an extra heap allocation.
+ void setRawBits(void *Ptr, unsigned NBytes = sizeof(Storage));
#define X(type, name) \
void set##name(type Val) { Data.m_##name = Val; } \
diff --git a/clang/lib/Interpreter/InterpreterValuePrinter.cpp b/clang/lib/Interpreter/InterpreterValuePrinter.cpp
index 1754e7812469a..79f1e2b6571c6 100644
--- a/clang/lib/Interpreter/InterpreterValuePrinter.cpp
+++ b/clang/lib/Interpreter/InterpreterValuePrinter.cpp
@@ -204,7 +204,7 @@ std::string Interpreter::ValueDataToString(const Value &V) const {
if (ElemTy->isBuiltinType()) {
// Single dim arrays, advancing.
uintptr_t Offset = (uintptr_t)V.getPtr() + Idx * ElemSize;
- InnerV.setRawBits((void *)Offset, ElemSize * 8);
+ InnerV.setRawBits((void *)Offset, ElemSize);
} else {
// Multi dim arrays, position to the next dimension.
size_t Stride = ElemCount / N;
diff --git a/clang/lib/Interpreter/Value.cpp b/clang/lib/Interpreter/Value.cpp
index d4c9d51ffcb61..b985361ed748a 100644
--- a/clang/lib/Interpreter/Value.cpp
+++ b/clang/lib/Interpreter/Value.cpp
@@ -229,9 +229,9 @@ void *Value::getPtr() const {
return Data.m_Ptr;
}
-void Value::setRawBits(void *Ptr, unsigned NBits /*= sizeof(Storage)*/) {
- assert(NBits <= sizeof(Storage) && "Greater than the total size");
- memcpy(/*dest=*/Data.m_RawBits, /*src=*/Ptr, /*nbytes=*/NBits / 8);
+void Value::setRawBits(void *Ptr, unsigned NBytes /*= sizeof(Storage)*/) {
+ assert(NBytes <= sizeof(Storage) && "Greater than the total size");
+ memcpy(/*dest=*/Data.m_RawBits, /*src=*/Ptr, /*nbytes=*/NBytes);
}
QualType Value::getType() const {
diff --git a/clang/unittests/Interpreter/InterpreterTest.cpp b/clang/unittests/Interpreter/InterpreterTest.cpp
index 9ff9092524d21..2df29b3d5def5 100644
--- a/clang/unittests/Interpreter/InterpreterTest.cpp
+++ b/clang/unittests/Interpreter/InterpreterTest.cpp
@@ -421,6 +421,37 @@ TEST_F(InterpreterTest, Value) {
EXPECT_STREQ(prettyPrint.c_str(), "(D) (One) : unsigned int 1\n");
}
+// Regression: Value::setRawBits's NBytes parameter must be interpreted as a
+// byte count end-to-end. Before this was fixed, the parameter was named
+// NBits and the memcpy divided by 8, so a caller passing sizeof(T) (the
+// natural byte count) ended up copying only sizeof(T)/8 bytes -- leaving
+// the upper bytes uninitialised. The only in-tree caller compensated by
+// multiplying by 8, hiding the bug.
+TEST_F(InterpreterTest, ValueSetRawBitsCopiesByteCount) {
+ std::vector<const char *> Args;
+ std::unique_ptr<Interpreter> Interp = createInterpreter(Args);
+
+ // Explicit byte count: writing sizeof(long long) bytes must round-trip
+ // every byte. Pre-fix this copied 1 byte (8 / 8) and left the upper 7
+ // bytes stale.
+ Value V;
+ llvm::cantFail(Interp->ParseAndExecute("long long x = 0; x", &V));
+ ASSERT_EQ(V.getKind(), Value::K_LongLong);
+ long long Src = 0x0123456789ABCDEFLL;
+ V.setRawBits(&Src, sizeof(Src));
+ EXPECT_EQ(V.getLongLong(), Src);
+
+ // Default NBytes argument copies sizeof(Storage). Pre-fix this copied
+ // sizeof(Storage) / 8 bytes, dropping the high half of an 8-byte payload.
+ Value V2;
+ llvm::cantFail(Interp->ParseAndExecute("long long y = 0; y", &V2));
+ ASSERT_EQ(V2.getKind(), Value::K_LongLong);
+ unsigned char Buf[sizeof(long double)] = {};
+ std::memcpy(Buf, &Src, sizeof(Src));
+ V2.setRawBits(Buf);
+ EXPECT_EQ(V2.getLongLong(), Src);
+}
+
TEST_F(InterpreterTest, TranslationUnit_CanonicalDecl) {
std::vector<const char *> Args;
std::unique_ptr<Interpreter> Interp = createInterpreter(Args);
|
AaronBallman
left a comment
There was a problem hiding this comment.
Should this come with a release note?
| #undef X | ||
| void *m_Ptr; | ||
| unsigned char m_RawBits[sizeof(long double) * 8]; // widest type | ||
| unsigned char m_RawBits[sizeof(long double)]; // widest typed member |
There was a problem hiding this comment.
I think this is still a problem, but a different problem -- this doesn't handle _BitInt properly, nor __int128, etc. and the size of a long double is quite platform-specific.
There was a problem hiding this comment.
Yes, you are correct, but I will address that in a separate PR.
This is more like a memory optimization I don't think we need to put anything in the readme. |
Value::setRawBits had inconsistent units: the default value and the size assert treated the parameter as bytes (sizeof(Storage)), while the memcpy treated it as bits (NBits / 8). A caller passing the natural byte count (e.g. sizeof(long long)) ended up copying only sizeof(T)/8 bytes -- one byte for an 8-byte payload, leaving the rest stale. The one in-tree caller compensated by multiplying by 8, hiding the bug.
Rename the parameter to NBytes and drop the / 8 so the API name, default, assert, and memcpy all agree on bytes. Update the caller in InterpreterValuePrinter.cpp to pass ElemSize directly.
Right-size the Storage::m_RawBits array while we are here: it was sizeof(long double) * 8 bytes, which reads like a bit/byte confusion since the widest typed member of the union is long double itself. The oversized array made sizeof(Value) ~144 bytes on x86_64 instead of ~40, bloating every copy/move of a Value.
Add a regression test exercising setRawBits with both an explicit byte count and the default argument. Pre-fix the test fails for both: the explicit-count branch copies 1 byte instead of 8, and the default branch copies sizeof(Storage)/8 bytes instead of the full union width.