Add docstring and warning for disparity between will_item_be_pickled
and is_symbol_pickle
#2548
result = True
return norm_meta.WhichOneof("input_type") == "msg_pack_frame"
result |= norm_meta.WhichOneof("input_type") == "msg_pack_frame"
Is this correct if the item to be normalized is a POD dict like {"hello": "there"}? Then I think the norm meta will still say msgpack, but nothing will be pickled?
In [11]: dict_data = {"hello": "there"}
In [12]: lib._nvs.write("test", dict_data, recursive_normalizers=True)
Out[12]: VersionedItem(symbol='test', library='test', data=n/a, version=2, metadata=None, host='S3(endpoint=s3.eu-west-1.amazonaws.com, bucket=arcticdb-ci-test-bucket-02)', timestamp=1754503211644392787)
In [13]: lib._nvs.is_symbol_pickled("test")
Out[13]: True
In [14]: lib._nvs.will_item_be_pickled(dict_data, recursive_normalizers=True)
Out[14]: True
Yes, it'll be msgpack-normalized. Both APIs return True, but for different reasons:
is_symbol_pickled: because the type of the data is not native
will_item_be_pickled: because it considers msgpack-normalized data "pickled"
if is_recursive_normalize_preferred:
    log.warning("As the library setting recursive_normalizers is enabled, the item "
                "will be recursively normalized in `write`. "
                "However, for backward compatibility, this API will still return True.")
It isn't necessarily the library setting; it could be the env var or the argument to this method.
I would also combine this with the other call to log.warning, as there is no guarantee these logs will appear near each other in a busy logfile.
I would also add more detail as to why this method has historically returned True even if pickling is not involved (not date_range searchable, queryable, etc.).
I think we also discussed being able to disable this log with an env var?
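A minimal sketch of the "single combined warning" suggestion, using only the standard library. It is not the project's code: the logger name, the function names, and the message wording are assumptions made for illustration, and the explanation text only paraphrases the reasons mentioned in this comment.

```python
import logging
from typing import Optional

# Stand-in logger; the real module uses its own `log` object.
log = logging.getLogger("will_item_be_pickled_sketch")


def build_warning(result: bool, recursive_normalize_preferred: bool) -> Optional[str]:
    """Build one message covering every reason, or None if there is nothing to say.

    `result` is the value the API is about to return. Both parameters are
    hypothetical inputs for this sketch.
    """
    if not result:
        return None
    parts = [
        "will_item_be_pickled returns True for this item because the stored "
        "data would not be date_range searchable or queryable, even if "
        "pickling itself is not involved."
    ]
    if recursive_normalize_preferred:
        # The preference may come from the library setting, an environment
        # variable, or the method argument, so the text names none of them
        # as the single cause.
        parts.append(
            "Recursive normalization is enabled, so the item will be "
            "recursively normalized in `write`; for backward compatibility "
            "this API still returns True."
        )
    return " ".join(parts)


def warn_if_needed(result: bool, recursive_normalize_preferred: bool) -> None:
    """Emit at most one log record so the explanation cannot be split apart."""
    message = build_warning(result, recursive_normalize_preferred)
    if message is not None:
        log.warning(message)
```

Building the string first and logging once keeps the whole explanation in a single record, which is the property asked for here.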
The warning can be disabled by setting the log level to INFO.
Log level is a global setting though; we want to be able to disable just this message.
        {"a": [1, 2, 3], "b": {"c": np.arange(24)}, "d": [TestCustomNormalizer()]}  # A random item that will be pickled
    ]
)
def test_will_item_be_pickled_recursive_normalizer(lmdb_version_store_v1, data, capfd):
We stopped using capfd as all the tests it was used for were flaky. If all these tests passed for you locally we can live with the warning messages not being tested
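If the warning does need test coverage later, one possible alternative to `capfd` is to patch the logger's `warning` method and assert on the call. The sketch below uses a standard-library logger and a hypothetical stand-in function; whether the project's own `log` object can be patched this way is an assumption that would need checking.

```python
import logging
from unittest import mock

# Stand-in logger for the sketch; not the project's logger.
log = logging.getLogger("will_item_be_pickled_test_sketch")


def will_item_be_pickled_sketch(item) -> bool:
    """Hypothetical stand-in that always warns and returns True."""
    log.warning(
        "Item will be recursively normalized; "
        "returning True for backward compatibility."
    )
    return True


def test_warning_emitted_once():
    # Patch the method on the logger object so nothing depends on what
    # reaches stdout or stderr.
    with mock.patch.object(log, "warning") as fake_warning:
        assert will_item_be_pickled_sketch({"hello": "there"}) is True
    fake_warning.assert_called_once()
    assert "backward compatibility" in fake_warning.call_args.args[0]
```

This checks that the call happened and what message was passed, without relying on captured file descriptors.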
return norm_meta.WhichOneof("input_type") == "msg_pack_frame"
result |= norm_meta.WhichOneof("input_type") == "msg_pack_frame"
log_warning_message = strtobool(os.getenv("VersionStore.WillItemBePickledWarningMsg", "1")) and log.is_active(_LogLevel.WARN)
Env var should follow our naming convention
It should be a config instead of env var
…led` and `is_symbol_pickle
efd68d9
to
fee7b04
Reference Issues/PRs
https://man312219.monday.com/boards/7852509418/pulses/9122332360
What does this implement or fix?
The PR that tried to align will_item_be_pickled and is_symbol_pickled has been scrapped, as the change is likely to break users' logic. Instead, better docstrings are added for both. A better warning is added for will_item_be_pickled only, as doing so for is_symbol_pickled requires substantial effort.
Any other comments?
Checklist
Checklist for code changes...