[METRICS SDK] Fix hash collision in MetricAttributes #3322

Merged

Conversation

ThomsonTan
Contributor

@ThomsonTan ThomsonTan commented Mar 25, 2025

Fixes #3060

Changes

attributes_hashmap_benchmark shows flat result before and after this change.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

netlify bot commented Mar 25, 2025

Deploy Preview for opentelemetry-cpp-api-docs canceled.

Name                   Link
🔨 Latest commit       5e2778c
🔍 Latest deploy log   https://app.netlify.com/sites/opentelemetry-cpp-api-docs/deploys/67ec47c3ab6ade0008608790

codecov bot commented Mar 25, 2025

Codecov Report

Attention: Patch coverage is 95.08197% with 3 lines in your changes missing coverage. Please review.

Project coverage is 89.55%. Comparing base (83ac2ae) to head (5e2778c).
Report is 1 commits behind head on main.

Files with missing lines                                Patch %   Lines
...entelemetry/sdk/metrics/state/attributes_hashmap.h   96.43%    1 Missing ⚠️
...sdk/metrics/state/filtered_ordered_attribute_map.h   92.31%    1 Missing ⚠️
sdk/src/metrics/state/temporal_metric_storage.cc        83.34%    1 Missing ⚠️
@@            Coverage Diff             @@
##             main    #3322      +/-   ##
==========================================
+ Coverage   89.52%   89.55%   +0.03%     
==========================================
  Files         210      210              
  Lines        6526     6502      -24     
==========================================
- Hits         5842     5822      -20     
+ Misses        684      680       -4     
Files with missing lines                                Coverage Δ
...clude/opentelemetry/sdk/common/attributemap_hash.h   80.00% <ø> (-8.23%) ⬇️
...telemetry/sdk/metrics/state/async_metric_storage.h   94.45% <100.00%> (-0.15%) ⬇️
...ntelemetry/sdk/metrics/state/sync_metric_storage.h   88.24% <100.00%> (+1.88%) ⬆️
...ntelemetry/sdk/metrics/view/attributes_processor.h   85.72% <100.00%> (+0.72%) ⬆️
...rc/metrics/state/filtered_ordered_attribute_map.cc   89.48% <100.00%> (+1.24%) ⬆️
sdk/src/metrics/state/observable_registry.cc            95.35% <ø> (ø)
...entelemetry/sdk/metrics/state/attributes_hashmap.h   96.37% <96.43%> (+4.06%) ⬆️
...sdk/metrics/state/filtered_ordered_attribute_map.h   82.36% <92.31%> (+25.22%) ⬆️
sdk/src/metrics/state/temporal_metric_storage.cc        97.30% <83.34%> (-0.07%) ⬇️

... and 1 file with indirect coverage changes

@@ -74,27 +74,6 @@ inline size_t GetHashForAttributeMap(const OrderedAttributeMap &attribute_map)
return seed;
}

// Calculate hash of keys and values of KeyValueIterable, filtered using callback.
inline size_t GetHashForAttributeMap(
Contributor Author

Hashing a KeyValueIterable by traversing it is not the expected approach, because the order of the elements should not affect the hash. MetricAttributes needs to be constructed from it before hashing, so this function is removed.
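
The ordering concern above can be illustrated with a minimal sketch. It is not the SDK's code: `KeyValueList`, `HashCombine`, `HashInIterationOrder`, and `HashAfterSorting` are hypothetical names, and a `std::vector` of string pairs stands in for KeyValueIterable. Hashing in iteration order depends on how the caller happened to list the attributes; copying into a sorted container first removes that dependency.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an unordered key/value source such as KeyValueIterable.
using KeyValueList = std::vector<std::pair<std::string, std::string>>;

// Boost-style hash combine; a common idiom, assumed here for illustration.
inline void HashCombine(std::size_t &seed, const std::string &value)
{
  seed ^= std::hash<std::string>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Order-sensitive: the result depends on the order in which attributes are listed.
inline std::size_t HashInIterationOrder(const KeyValueList &attributes)
{
  std::size_t seed = 0;
  for (const auto &kv : attributes)
  {
    HashCombine(seed, kv.first);
    HashCombine(seed, kv.second);
  }
  return seed;
}

// Order-insensitive: copy into a key-sorted map first, then hash in key order.
inline std::size_t HashAfterSorting(const KeyValueList &attributes)
{
  std::map<std::string, std::string> sorted(attributes.begin(), attributes.end());
  std::size_t seed = 0;
  for (const auto &kv : sorted)
  {
    HashCombine(seed, kv.first);
    HashCombine(seed, kv.second);
  }
  return seed;
}
```

With this sketch, `{host=a, region=us}` and `{region=us, host=a}` always produce the same value from `HashAfterSorting`, while `HashInIterationOrder` gives no such guarantee.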

@ThomsonTan ThomsonTan marked this pull request as ready for review March 25, 2025 22:16
@ThomsonTan ThomsonTan requested a review from a team as a code owner March 25, 2025 22:16
@lalitb lalitb self-assigned this Mar 25, 2025
@lalitb
Member

lalitb commented Mar 26, 2025

@ThomsonTan - Could you also include the benchmark results from before and after in the comments. A slight regression is acceptable, as it's a tradeoff for fixing collision.

@ThomsonTan
Contributor Author

@ThomsonTan - Could you also include the benchmark results from before and after in the comments. A slight regression is acceptable, as it's a tradeoff for fixing collision.

Here are the before/after benchmark results. It looks like the wall time roughly doubled (2044809 ns to 4287068 ns) while the CPU time decreased (148990 ns to 95297 ns).

  • Before
./measurements_benchmark
2025-03-26T00:43:08+00:00
Running measurements_benchmark
Run on (16 X 2445.43 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 32768 KiB (x1)
Load Average: 1.85, 3.23, 1.73
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
BM_MeasurementsTest    2044809 ns       148990 ns         3856
  • After
measurements_benchmark
2025-03-26T00:44:26+00:00
Running measurements_benchmark
Run on (16 X 2445.43 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 32768 KiB (x1)
Load Average: 1.02, 2.64, 1.65
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
BM_MeasurementsTest    4287068 ns        95297 ns         1000

@ThomsonTan
Contributor Author

@lalitb it seems the major perf gap comes from the original code below. It does not sort the incoming KeyValueIterable before hashing and lookup, whereas the new code sorts before the hash lookup.

  Aggregation *GetOrSetDefault(const opentelemetry::common::KeyValueIterable &attributes,
                               const AttributesProcessor *attributes_processor,
                               std::function<std::unique_ptr<Aggregation>()> aggregation_callback,
                               size_t hash)
  {
    auto it = hash_map_.find(hash);
    if (it != hash_map_.end())
    {
      return it->second.second.get();
    }

 private:
-  std::unordered_map<size_t, std::pair<MetricAttributes, std::unique_ptr<Aggregation>>> hash_map_;
+  std::unordered_map<MetricAttributes, std::unique_ptr<Aggregation>, CustomHash> hash_map_;
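
Why the key type matters can be shown with a small sketch under simplifying assumptions: `Attributes`, `WeakHash`, `HashKeyedStore`, and `AttributeKeyedStore` are hypothetical names, an `int` counter stands in for the Aggregation, and the hash is deliberately weak so that a collision is guaranteed. When the map is keyed by the hash value alone, two different attribute sets with the same hash share one slot. When the map is keyed by the attributes themselves, the hash only selects the bucket and `operator==` tells the colliding keys apart.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for MetricAttributes (a sorted attribute map).
using Attributes = std::map<std::string, std::string>;

// Deliberately weak hash so distinct attribute sets collide; illustration only.
struct WeakHash
{
  std::size_t operator()(const Attributes &attributes) const { return attributes.size(); }
};

// Old shape: keyed by the hash value, so colliding attribute sets share a slot.
class HashKeyedStore
{
public:
  int &GetOrSetDefault(const Attributes &attributes)
  {
    return slots_[WeakHash{}(attributes)];  // value-initialized to 0 on first use
  }
  std::size_t Size() const { return slots_.size(); }

private:
  std::unordered_map<std::size_t, int> slots_;
};

// New shape: keyed by the attributes; equality comparison resolves collisions.
class AttributeKeyedStore
{
public:
  int &GetOrSetDefault(const Attributes &attributes) { return slots_[attributes]; }
  std::size_t Size() const { return slots_.size(); }

private:
  std::unordered_map<Attributes, int, WeakHash> slots_;
};
```

Recording one measurement each for `{host=a}` and `{host=b}` leaves a single merged slot in `HashKeyedStore` but two separate slots in `AttributeKeyedStore`.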

Can we store MetricAttributes as a pointer? There is going to be a lot of copying on rehashing...

Contributor Author

Do you mean making the key a std::shared_ptr<MetricAttributes>? That may work, with a more complicated custom hash. But I think MetricAttributes should be created and moved into the std::unordered_map, so copying and rehashing should be rare.

I remember that the exporting process does make a copy of MetricAttributes, but there is an optimization for the common scenario. @lalitb any thoughts on this?
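
For reference, the pointer-key idea discussed above would need both a hash and an equality functor that look through the pointer, since the defaults for std::shared_ptr compare addresses rather than contents. The following is only a sketch of that alternative, not something this PR adopts; `Attributes`, `AttributesPtr`, `PointeeHash`, `PointeeEqual`, and `PointerKeyedMap` are hypothetical names, and keys are assumed to be non-null.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for MetricAttributes.
using Attributes    = std::map<std::string, std::string>;
using AttributesPtr = std::shared_ptr<const Attributes>;

// Hashes the pointed-to attributes, not the pointer value. Assumes ptr is non-null.
struct PointeeHash
{
  std::size_t operator()(const AttributesPtr &ptr) const
  {
    std::size_t seed = 0;
    for (const auto &kv : *ptr)
    {
      seed ^= std::hash<std::string>{}(kv.first) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
      seed ^= std::hash<std::string>{}(kv.second) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
    return seed;
  }
};

// Compares the pointed-to attributes, so equal contents map to the same entry.
struct PointeeEqual
{
  bool operator()(const AttributesPtr &lhs, const AttributesPtr &rhs) const
  {
    return *lhs == *rhs;
  }
};

using PointerKeyedMap = std::unordered_map<AttributesPtr, int, PointeeHash, PointeeEqual>;
```

The extra functor is the "more complicated custom hash" cost: without `PointeeEqual`, two separately allocated but identical attribute sets would land in different entries.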

Member

I believe we are already avoiding a deep copy by using std::move when inserting the MetricAttributes. Also, during rehashing, the hashmap will move (not copy) the MetricAttributes objects to their new locations. In general, I do see options for optimization in other places, but we can revisit them separately from this PR.
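
The no-copy behavior can be checked with a small instrumented sketch. `CountingKey`, `CountingKeyHash`, and `FillByMove` are hypothetical names used only for illustration. For std::unordered_map specifically, the C++ standard guarantees that pointers and references to elements stay valid across a rehash, so a rehash relinks the existing nodes and the stored keys are neither copied nor moved at that point. The only per-element cost is the single move when the key is inserted.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical key type that counts how often it is copied or moved.
struct CountingKey
{
  static int copies;
  static int moves;
  std::string name;

  explicit CountingKey(std::string n) : name(std::move(n)) {}
  CountingKey(const CountingKey &other) : name(other.name) { ++copies; }
  CountingKey(CountingKey &&other) noexcept : name(std::move(other.name)) { ++moves; }
  CountingKey &operator=(const CountingKey &) = delete;
  CountingKey &operator=(CountingKey &&)      = delete;

  bool operator==(const CountingKey &other) const { return name == other.name; }
};
int CountingKey::copies = 0;
int CountingKey::moves  = 0;

struct CountingKeyHash
{
  std::size_t operator()(const CountingKey &key) const
  {
    return std::hash<std::string>{}(key.name);
  }
};

using CountingMap = std::unordered_map<CountingKey, int, CountingKeyHash>;

// Inserts `count` keys by move; the table rehashes several times as it grows.
inline CountingMap FillByMove(int count)
{
  CountingMap table;
  for (int i = 0; i < count; ++i)
  {
    CountingKey key("attr-" + std::to_string(i));
    table.emplace(std::move(key), i);
  }
  return table;
}
```

After `FillByMove(1000)`, `CountingKey::copies` stays at zero, and an explicit `rehash` leaves both the move counter and the address of an existing key unchanged.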

@marcalff
Member

Found no issues on this patch (LGTM), but I don't know the metric aggregation code well enough to approve.
Deferring code review to @lalitb , will merge once approved.

@ThomsonTan ThomsonTan added the pr:please-review This PR is ready for review label Mar 28, 2025
Member

@lalitb lalitb left a comment

LGTM. Thanks @ThomsonTan

@marcalff
Member

marcalff commented Apr 1, 2025

@lalitb @ThomsonTan Assuming CI is clean, ok to merge ?

@ThomsonTan
Contributor Author

@lalitb @ThomsonTan Assuming CI is clean, ok to merge ?

Yes, it is OK to merge once CI is clean again.

@marcalff marcalff added ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve) and removed pr:please-review This PR is ready for review labels Apr 1, 2025
@marcalff marcalff merged commit 3bd8de9 into open-telemetry:main Apr 1, 2025
56 checks passed
@ThomsonTan ThomsonTan deleted the fix_metrics_hashmap_collision branch April 2, 2025 18:41
Labels
ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve)
Development

Successfully merging this pull request may close these issues.

Hash collision risk of metric data aggregation
4 participants