[DO NOT MERGE] Feature/supervised aql value #22034
base: devel
…ervised-aql-value
…COMPILE, just pushed to have the current changes)
…stead of first byte of payload
…angodb into feature/supervised-aql-value
…(), fix memoryUsage
…k object in test
Bug: ID Extraction Lacks Resource Monitoring
In the get function with vector of names, when s.isCustom() is true and the path ends with _id, the returned AqlValue from extractIdString doesn't receive the rm (ResourceMonitor) parameter. This causes memory accounting inconsistency when the source is a VPACK_SUPERVISED_SLICE, similar to the bug in getIdAttribute at line 436.
arangod/Aql/AqlValue.cpp#L591-L598
arangodb/arangod/Aql/AqlValue.cpp
Lines 591 to 598 in 789fd96
    return AqlValue{AqlValueHintNull{}};
  } else if (s.isCustom()) {
    // _id needs special treatment
    if (i + 1 == n) {
      // x.y._id
      mustDestroy = true;
      return AqlValue(
          transaction::helpers::extractIdString(&resolver, s, prev));
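The accounting gap described in this report can be illustrated with a small self-contained sketch. Everything here is hypothetical: `MockResourceMonitor`, `TrackedString`, and `makeId` are stand-ins invented for illustration and are not ArangoDB types. The point is only that a value derived from a monitored source should carry the same monitor, otherwise its memory goes unaccounted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a resource monitor; it only keeps a byte counter.
struct MockResourceMonitor {
  std::uint64_t current = 0;
};

// Hypothetical value type: accounts its payload only when a monitor is given.
class TrackedString {
 public:
  explicit TrackedString(std::string s, MockResourceMonitor* rm = nullptr)
      : _data(std::move(s)), _rm(rm) {
    if (_rm != nullptr) {
      _rm->current += _data.size();
    }
  }
  // Moving transfers both the payload and the accounting responsibility.
  TrackedString(TrackedString&& other) noexcept
      : _data(std::move(other._data)),
        _rm(std::exchange(other._rm, nullptr)) {}
  TrackedString(TrackedString const&) = delete;
  TrackedString& operator=(TrackedString const&) = delete;
  ~TrackedString() {
    if (_rm != nullptr) {
      _rm->current -= _data.size();
    }
  }

  MockResourceMonitor* monitor() const noexcept { return _rm; }
  std::string const& str() const noexcept { return _data; }

 private:
  std::string _data;
  MockResourceMonitor* _rm;
};

// Builds "collection/key" and forwards the source's monitor, so the derived
// string is accounted in the same place as the value it was extracted from.
inline TrackedString makeId(TrackedString const& key,
                            std::string const& collection) {
  return TrackedString(collection + "/" + key.str(), key.monitor());
}
```

If `makeId` dropped the second constructor argument, the derived string would still be allocated but the monitor's counter would not change, which is the kind of inconsistency the report points at.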
  static inline uint8_t* allocateSupervised(
      arangodb::ResourceMonitor& rm, std::uint64_t len,
      MemoryOriginType mot = MemoryOriginType::New) {
    std::size_t total = kPrefix + static_cast<std::size_t>(len);
    void* base = nullptr;

    // choose allocator based on MemoryOriginType
    if (mot == MemoryOriginType::Malloc) {
      base = std::malloc(total);
    } else {
      base = ::operator new(total);  // default (New)
    }

    if (ADB_UNLIKELY(base == nullptr)) {
      THROW_ARANGO_EXCEPTION(TRI_ERROR_OUT_OF_MEMORY);
    }

    *reinterpret_cast<arangodb::ResourceMonitor**>(base) = &rm;
    rm.increaseMemoryUsage(total);
    return reinterpret_cast<uint8_t*>(base);
  }

  static inline void deallocateSupervised(
      uint8_t* base, std::uint64_t len,
      MemoryOriginType mot = MemoryOriginType::New) noexcept {
    if (base == nullptr) {
      return;
    }
    auto* rm = *reinterpret_cast<arangodb::ResourceMonitor**>(base);
    rm->decreaseMemoryUsage(len + static_cast<std::uint64_t>(kPrefix));

    if (mot == MemoryOriginType::Malloc) {
      std::free(base);
    } else {  // MemoryOriginType::New
      ::operator delete(static_cast<void*>(base));
    }
  }
};
I think these definitions should be moved to the cpp file.
    if (mot == MemoryOriginType::Malloc) {
      base = std::malloc(total);
    } else {
      base = ::operator new(total);  // default (New)
    }
I don't think we need this - we should always just allocate with new. The memory origin is only necessary for compatibility with allocations from the velocypack library (which AFAIR uses malloc), so we can release them correctly. But our own allocations should always be done with new.
    if (mot == MemoryOriginType::Malloc) {
      std::free(base);
    } else {  // MemoryOriginType::New
      ::operator delete(static_cast<void*>(base));
    }
Should not be necessary (see above).
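A minimal sketch of what the two helpers could look like once the `MemoryOriginType` branch is dropped and allocation always goes through `operator new`. This is not the PR's code: `MockResourceMonitor` is a hypothetical stand-in for `arangodb::ResourceMonitor`, and `kPrefix` is assumed here to be the size of one monitor pointer. The sketch also accounts the memory before allocating and rolls back on failure, which is a design choice of this example rather than something the review asked for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical stand-in for arangodb::ResourceMonitor; only tracks a counter.
struct MockResourceMonitor {
  std::uint64_t current = 0;
  void increaseMemoryUsage(std::uint64_t n) { current += n; }
  void decreaseMemoryUsage(std::uint64_t n) noexcept { current -= n; }
};

// Assumed header size: one monitor pointer stored in front of the payload.
constexpr std::size_t kPrefix = sizeof(MockResourceMonitor*);

// Allocates kPrefix + len bytes with operator new and stores the monitor
// pointer in the prefix. operator new reports failure by throwing
// std::bad_alloc, so there is no nullptr check.
inline std::uint8_t* allocateSupervised(MockResourceMonitor& rm,
                                        std::uint64_t len) {
  std::size_t total = kPrefix + static_cast<std::size_t>(len);
  rm.increaseMemoryUsage(total);  // account first, so a throw here leaks nothing
  void* base = nullptr;
  try {
    base = ::operator new(total);
  } catch (...) {
    rm.decreaseMemoryUsage(total);  // undo the accounting if allocation fails
    throw;
  }
  MockResourceMonitor* owner = &rm;
  std::memcpy(base, &owner, sizeof(owner));
  return static_cast<std::uint8_t*>(base);
}

// Reads the monitor back from the prefix, releases the accounting and frees
// the block with the matching operator delete.
inline void deallocateSupervised(std::uint8_t* base,
                                 std::uint64_t len) noexcept {
  if (base == nullptr) {
    return;
  }
  MockResourceMonitor* owner = nullptr;
  std::memcpy(&owner, base, sizeof(owner));
  owner->decreaseMemoryUsage(kPrefix + len);
  ::operator delete(static_cast<void*>(base));
}
```

The payload would start at `base + kPrefix`. Using `std::memcpy` for the prefix avoids writing through a `reinterpret_cast` pointer into raw storage.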
  ResourceMonitor* rm = nullptr;
  if (this->type() == VPACK_SUPERVISED_SLICE) {
    rm = this->_data.supervisedSliceMeta.getResourceMonitor();
  }
This pattern is repeated several times - can we just add a function getResourceMonitor?
ResourceMonitor* getResourceMonitor() {
  if (type() == VPACK_SUPERVISED_SLICE) {
    return _data.supervisedSliceMeta.getResourceMonitor();
  }
  return nullptr;
}
    case VPACK_MANAGED_STRING: {
      auto s = slice(t);
      builder.add(s);
    } break;
    case VPACK_SUPERVISED_SLICE: {
      builder.add(VPackSlice{_data.supervisedSliceMeta.getPayloadPtr()});
    } break;
Can't we just add this to the cases above?
    case VPACK_MANAGED_STRING:
    case VPACK_SUPERVISED_SLICE: {
      auto s = slice(t);
      builder.add(s);
    } break;
AqlValue::AqlValue() noexcept { erase(); }

AqlValue::AqlValue(DocumentData& data) noexcept {
AqlValue::AqlValue(DocumentData& data, arangodb::ResourceMonitor* rm) noexcept {
I think we should remove the ResourceMonitor parameter here since we don't use it.
  if (_data.aqlValueType == VPACK_INLINE &&
      _data.inlineSliceMeta.slice[0] == '\x00') {
    return true;
  }
  if (_data.aqlValueType == VPACK_MANAGED_STRING &&
      _data.managedStringMeta.pointer == nullptr) {
    return true;
  }
  return false;
Why the new managed string special handling?
      case T::RANGE:
        return a._data.rangeMeta.range == b._data.rangeMeta.range;
    }
    return false;
We should never get here - I would add a TRI_ASSERT(false) before the return.
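As a rough illustration of this suggestion, the sketch below uses the standard `assert` as a stand-in for `TRI_ASSERT` and a hypothetical `Kind` enum with a made-up helper `sameKindEquals`. The idea is that a switch covering every enumerator returns from each case, so the trailing return is reachable only if an invalid value sneaks in, and the assertion makes that loud in debug builds.

```cpp
#include <cassert>

// Hypothetical enum standing in for AqlValue::AqlValueType.
enum class Kind { Inline, Pointer, Range };

// Every enumerator is handled and returns, so falling out of the switch
// indicates a corrupted or newly added, unhandled value.
inline bool sameKindEquals(Kind k, int a, int b) {
  switch (k) {
    case Kind::Inline:
    case Kind::Pointer:
    case Kind::Range:
      return a == b;
  }
  assert(false && "unhandled Kind in sameKindEquals");
  return false;  // keeps release builds well-defined
}
```

Omitting a `default:` label also lets compilers that warn on unhandled enumerators flag the switch when a new kind is added.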
  using T = AqlValue::AqlValueType;
  auto ta = a.type();
  auto tb = b.type();

  if (ta == tb) {
    switch (ta) {
      case T::VPACK_INLINE:
        return VPackSlice(a._data.inlineSliceMeta.slice)
            .binaryEquals(VPackSlice(b._data.inlineSliceMeta.slice));
      case T::VPACK_INLINE_INT64:
      case T::VPACK_INLINE_UINT64:
      case T::VPACK_INLINE_DOUBLE:
        return a._data.longNumberMeta.data.intLittleEndian.val ==
               b._data.longNumberMeta.data.intLittleEndian.val;
      case T::VPACK_SLICE_POINTER:
        return a._data.slicePointerMeta.pointer ==
               b._data.slicePointerMeta.pointer;
      case T::VPACK_MANAGED_SLICE:
        return a._data.managedSliceMeta.pointer ==
               b._data.managedSliceMeta.pointer;
      case T::VPACK_MANAGED_STRING:
        return a._data.managedStringMeta.pointer ==
               b._data.managedStringMeta.pointer;
      case T::VPACK_SUPERVISED_SLICE: {
        auto as = VPackSlice(a._data.supervisedSliceMeta.getPayloadPtr());
        auto bs = VPackSlice(b._data.supervisedSliceMeta.getPayloadPtr());
        return as.binaryEquals(bs);  // ignore monitor*
      }
      case T::RANGE:
        return a._data.rangeMeta.range == b._data.rangeMeta.range;
    }
    return false;
  }
  switch (t) {
    case AqlValue::VPACK_INLINE:
      return VPackSlice(a._data.inlineSliceMeta.slice)
          .binaryEquals(VPackSlice(b._data.inlineSliceMeta.slice));
    case AqlValue::VPACK_INLINE_INT64:
    case AqlValue::VPACK_INLINE_UINT64:
    case AqlValue::VPACK_INLINE_DOUBLE:
      // equal is equal. sign/endianess does not matter
      return a._data.longNumberMeta.data.intLittleEndian.val ==
             b._data.longNumberMeta.data.intLittleEndian.val;
    case AqlValue::VPACK_SLICE_POINTER:
      return a._data.slicePointerMeta.pointer ==
             b._data.slicePointerMeta.pointer;
    case AqlValue::VPACK_MANAGED_SLICE:
      return a._data.managedSliceMeta.pointer ==
             b._data.managedSliceMeta.pointer;
    case AqlValue::VPACK_MANAGED_STRING:
      return a._data.managedStringMeta.pointer ==
             b._data.managedStringMeta.pointer;
    case AqlValue::RANGE:
      return a._data.rangeMeta.range == b._data.rangeMeta.range;

  // different types: allow supervised vs managed content-equality
  auto isSup = [](T t) { return t == T::VPACK_SUPERVISED_SLICE; };
  auto isMan = [](T t) {
    return t == T::VPACK_MANAGED_SLICE || t == T::VPACK_MANAGED_STRING;
  };
  if ((isSup(ta) && isMan(tb)) || (isSup(tb) && isMan(ta))) {
    return a.slice(ta).binaryEquals(b.slice(tb));
This whole equality implementation looks very fishy! It seems we are only comparing pointers instead of the values, which is very weird, but at least it is consistent with the hashing (which also only operates on pointers).
But the new implementation now breaks a fundamental requirement - if two values are equivalent they should have the same hash. Previously this was the case (we only compared pointers), but now a managed and a supervised slice could be equivalent (based on binaryEquals), but have different pointers and thus produce different hashes.
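The requirement can be made concrete with a small sketch. The types below are hypothetical miniatures invented for illustration, not the real `AqlValue`, `std::hash<AqlValue>` or `std::equal_to<AqlValue>`. They show that pointer-based equality is consistent with a pointer-based hash, that content-based equality is not, and that a content-based hash would be one way to restore the contract.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical miniature of a value that refers to a heap payload.
struct Value {
  enum class Kind { Managed, Supervised };
  Kind kind;
  std::string const* payload;
};

// Hashes the payload address only, analogous to the hashing described above.
struct PointerHash {
  std::size_t operator()(Value const& v) const noexcept {
    return std::hash<void const*>{}(v.payload);
  }
};

// Compares addresses: two values that compare equal always hash the same.
struct PointerEqual {
  bool operator()(Value const& a, Value const& b) const noexcept {
    return a.payload == b.payload;
  }
};

// Compares contents across kinds: can report equality for two different
// addresses, which PointerHash will generally not hash to the same bucket.
struct ContentEqual {
  bool operator()(Value const& a, Value const& b) const noexcept {
    return *a.payload == *b.payload;
  }
};

// Hashes the payload bytes, so it is consistent with ContentEqual.
struct ContentHash {
  std::size_t operator()(Value const& v) const noexcept {
    return std::hash<std::string>{}(*v.payload);
  }
};
```

An `std::unordered_set<Value, PointerHash, ContentEqual>` would be an inconsistent pairing: lookups for an equal value stored at a different address could miss, because the hash sends them to different buckets before equality is ever consulted.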
Scope & Purpose
(Please describe the changes in this PR for reviewers, motivation, rationale - mandatory)
Checklist
Related Information
(Please reference tickets / specification / other PRs etc)
Note
Introduce a new AqlValue kind that stores VPack with ResourceMonitor accounting, extend constructors/ops to use it, and add comprehensive unit tests.
- New AqlValueType VPACK_SUPERVISED_SLICE with ResourceMonitor-aware allocation, cloning, and destruction.
- Constructors (string_view, Buffer, Slice, DocumentData, AqlValueHintSliceCopy) extended to accept an optional ResourceMonitor* and create supervised slices.
- Type checks (is*), slice(), getTypeString(), length(), toVelocyPack(), destroy(), memoryUsage(), and requiresDestruction() updated to handle supervised slices.
- Accessors (at(), get(), getKeyAttribute(), getIdAttribute(), getFromAttribute(), getToAttribute(), hasKey()) changed to optionally copy under supervision and return slice-pointers when not copying.
- New helpers setSupervisedData(), allocateSupervised(), deallocateSupervised(); data() adjusted to return the payload; support in Compare, hash, and std::equal_to (content-equality across managed/supervised).
- operator==/!= for AqlValue.
- tests/Aql/AqlValueSupervisedTest.cpp covering construction, memory accounting, accessors, conversions, cloning, comparison, and destruction for supervised slices.
- tests/Basics/SupervisedBufferTest.cpp updated to use the new ctor signatures; new test included in tests/CMakeLists.txt.

Written by Cursor Bugbot for commit 676a00c. This will update automatically on new commits.