[libcxx] Avoid hash key in __hash_table::find() if it is empty. #126837

Open
wants to merge 5 commits into
base: main

Conversation

@xbcnn xbcnn commented Feb 12, 2025

If the hash table has no buckets yet or is empty, find() returns end() immediately. Computing the hash of the key is then wasted work and can be skipped, which matters because hashing can be expensive for some key types, such as long strings.

This is a small optimization, but it is useful in cases such as a checklist (implemented as an unordered_set/map) that is mostly empty.
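
To make the idea concrete, here is a minimal sketch that uses a toy separate-chaining set, not libc++'s `__hash_table`. The names `ToySet`, `CountingHash`, `g_hash_calls`, and `hash_calls_for_empty_find` are hypothetical and exist only for this illustration. The counter shows that a lookup on a set with no buckets never invokes the hasher.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <vector>

// Hypothetical instrumentation: counts how often the hasher runs.
std::size_t g_hash_calls = 0;

struct CountingHash {
  std::size_t operator()(const std::string& s) const {
    ++g_hash_calls;
    return std::hash<std::string>{}(s);
  }
};

// Toy separate-chaining set for illustration only; not libc++'s __hash_table.
class ToySet {
public:
  void insert(const std::string& key) {
    if (buckets_.empty())
      buckets_.resize(8); // fixed bucket count keeps the sketch short
    std::list<std::string>& chain = buckets_[hash_(key) % buckets_.size()];
    for (const std::string& e : chain)
      if (e == key)
        return; // already present
    chain.push_back(key);
  }

  // Returns a pointer to the stored key, or nullptr if it is absent.
  const std::string* find(const std::string& key) const {
    if (buckets_.empty())
      return nullptr; // no buckets: return before paying for the hash
    const std::list<std::string>& chain = buckets_[hash_(key) % buckets_.size()];
    for (const std::string& e : chain)
      if (e == key)
        return &e;
    return nullptr;
  }

private:
  std::vector<std::list<std::string>> buckets_;
  CountingHash hash_;
};

// Number of hasher calls made by one find() on a never-populated set.
std::size_t hash_calls_for_empty_find() {
  g_hash_calls = 0;
  ToySet set;
  const std::string* hit = set.find(std::string(128, 'x'));
  assert(hit == nullptr);
  (void)hit;
  return g_hash_calls;
}
```

Calling `hash_calls_for_empty_find()` yields zero in this sketch, while a lookup after an `insert` hashes the key once.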

@xbcnn xbcnn requested a review from a team as a code owner February 12, 2025 02:26

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Feb 12, 2025
@llvmbot
Member

llvmbot commented Feb 12, 2025

@llvm/pr-subscribers-libcxx

Author: None (xbcnn)

Changes

If the hash table has no buckets yet, it is empty and find() returns end() immediately. Computing the hash of the key is then wasted work and can be skipped, which matters because hashing can be expensive for some key types, such as long strings.

This is a small optimization, but it is useful in cases such as a checklist (implemented as an unordered_set) that is mostly empty.


Full diff: https://github.com/llvm/llvm-project/pull/126837.diff

1 Files Affected:

  • (modified) libcxx/include/__hash_table (+2-2)
diff --git a/libcxx/include/__hash_table b/libcxx/include/__hash_table
index d7b312f8774fc..a1d06d07f7c8d 100644
--- a/libcxx/include/__hash_table
+++ b/libcxx/include/__hash_table
@@ -1771,9 +1771,9 @@ template <class _Tp, class _Hash, class _Equal, class _Alloc>
 template <class _Key>
 typename __hash_table<_Tp, _Hash, _Equal, _Alloc>::iterator
 __hash_table<_Tp, _Hash, _Equal, _Alloc>::find(const _Key& __k) {
-  size_t __hash  = hash_function()(__k);
   size_type __bc = bucket_count();
   if (__bc != 0) {
+    size_t __hash       = hash_function()(__k);
     size_t __chash      = std::__constrain_hash(__hash, __bc);
     __next_pointer __nd = __bucket_list_[__chash];
     if (__nd != nullptr) {
@@ -1792,9 +1792,9 @@ template <class _Tp, class _Hash, class _Equal, class _Alloc>
 template <class _Key>
 typename __hash_table<_Tp, _Hash, _Equal, _Alloc>::const_iterator
 __hash_table<_Tp, _Hash, _Equal, _Alloc>::find(const _Key& __k) const {
-  size_t __hash  = hash_function()(__k);
   size_type __bc = bucket_count();
   if (__bc != 0) {
+    size_t __hash       = hash_function()(__k);
     size_t __chash      = std::__constrain_hash(__hash, __bc);
     __next_pointer __nd = __bucket_list_[__chash];
     if (__nd != nullptr) {

@xbcnn
Author

xbcnn commented Feb 12, 2025

This is my first PR to the LLVM project. Please review it and comment if you see any problems.
Thanks.

Contributor

@frederick-vs-ja frederick-vs-ja left a comment

Looks great to me!

Contributor

@philnik777 philnik777 left a comment

Could you provide some benchmarks for this change?

@xbcnn
Author

xbcnn commented Feb 12, 2025

First, thanks for the reviews.

> Could you provide some benchmarks for this change?

Sure, but I will probably need some guidance, such as where to put the benchmark source and how to build and run it.

@philnik777
Contributor

> First thanks for reviews.
>
> > Could you provide some benchmarks for this change?
>
> Sure, but I probably need some guides such as where to put the benchmark src and how to build/run it.

Benchmarks are in libcxx/test/benchmarks. If there are none for this you'll have to write your own. You can run benchmarks just like you run tests with lit, except that you want to pass --param=optimize=speed --show-all to optimize the benchmark and show the output.

@xbcnn
Author

xbcnn commented Feb 13, 2025

I added a separate benchmark, libcxx/test/benchmarks/containers/associative/hash_table_find.bench.cpp, since I need to measure find on an empty unordered_set.

I pre-generate 32K random strings (each 32-128 characters long) and run find with iteration counts ranging from 1024 to 32768.
lit command: build/bin/llvm-lit --param=optimize=speed --show-all -sv build/runtimes/runtimes-bins/libcxx/test/benchmarks/containers/associative/hash_table_find.bench.cpp
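
The benchmark source itself is not shown in this thread, so the following is only a rough sketch of the setup described above, assuming a fixed-seed `std::mt19937` and a plain `std::chrono` timer as a stand-in for a real benchmark harness. The helper names `make_random_strings` and `time_empty_finds_ns` are hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <random>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Generates `count` random lowercase strings with lengths in [min_len, max_len].
// The fixed seed makes runs repeatable; it is an assumption of this sketch.
std::vector<std::string> make_random_strings(std::size_t count, std::size_t min_len,
                                             std::size_t max_len, unsigned seed) {
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> len_dist(min_len, max_len);
  std::uniform_int_distribution<int> char_dist('a', 'z');
  std::vector<std::string> keys;
  keys.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    std::string s(len_dist(rng), 'a');
    for (char& c : s)
      c = static_cast<char>(char_dist(rng));
    keys.push_back(std::move(s));
  }
  return keys;
}

// Runs `lookups` find() calls against an empty set and returns the elapsed
// time in nanoseconds. Precondition: `keys` is not empty.
long long time_empty_finds_ns(const std::vector<std::string>& keys, std::size_t lookups) {
  std::unordered_set<std::string> empty_set;
  std::size_t hits = 0;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < lookups; ++i)
    hits += (empty_set.find(keys[i % keys.size()]) != empty_set.end()) ? 1 : 0;
  const auto stop = std::chrono::steady_clock::now();
  volatile std::size_t sink = hits; // keeps the lookup result observable
  (void)sink;
  return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

For example, `time_empty_finds_ns(make_random_strings(32768, 32, 128, 42u), 1024)` mirrors the smallest case described above. A hand-rolled timer like this is noisier than a dedicated benchmark framework, so absolute numbers from it should not be compared with the tables in this thread.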

With the opt:

# | ------------------------------------------------------------------------------------------
# | Benchmark                                                Time             CPU   Iterations
# | ------------------------------------------------------------------------------------------
# | BM_UnorderedSet_Find_EmptySet/long_string/1024       12276 ns        12276 ns        56892
# | BM_UnorderedSet_Find_EmptySet/long_string/2048       24622 ns        24622 ns        28436
# | BM_UnorderedSet_Find_EmptySet/long_string/4096       48973 ns        48972 ns        14249
# | BM_UnorderedSet_Find_EmptySet/long_string/8192       98178 ns        98175 ns         7116
# | BM_UnorderedSet_Find_EmptySet/long_string/16384     195976 ns       195965 ns         3569
# | BM_UnorderedSet_Find_EmptySet/long_string/32768     391325 ns       391318 ns         1782
# | BM_UnorderedSet_Find/long_string/1024               190627 ns       190621 ns         3672
# | BM_UnorderedSet_Find/long_string/2048               379007 ns       379003 ns         1847
# | BM_UnorderedSet_Find/long_string/4096               755937 ns       755924 ns          926
# | BM_UnorderedSet_Find/long_string/8192              1507339 ns      1507314 ns          464
# | BM_UnorderedSet_Find/long_string/16384             3019197 ns      3019036 ns          231
# | BM_UnorderedSet_Find/long_string/32768             6041986 ns      6041574 ns          116

Without the opt:

# | ------------------------------------------------------------------------------------------
# | Benchmark                                                Time             CPU   Iterations
# | ------------------------------------------------------------------------------------------
# | BM_UnorderedSet_Find_EmptySet/long_string/1024      163052 ns       163048 ns         4292
# | BM_UnorderedSet_Find_EmptySet/long_string/2048      324969 ns       324960 ns         2151
# | BM_UnorderedSet_Find_EmptySet/long_string/4096      647079 ns       647064 ns         1081
# | BM_UnorderedSet_Find_EmptySet/long_string/8192     1291289 ns      1291263 ns          542
# | BM_UnorderedSet_Find_EmptySet/long_string/16384    2592642 ns      2592585 ns          270
# | BM_UnorderedSet_Find_EmptySet/long_string/32768    5181394 ns      5181302 ns          135
# | BM_UnorderedSet_Find/long_string/1024               187192 ns       187189 ns         3740
# | BM_UnorderedSet_Find/long_string/2048               371351 ns       371346 ns         1883
# | BM_UnorderedSet_Find/long_string/4096               740937 ns       740902 ns          945
# | BM_UnorderedSet_Find/long_string/8192              1479457 ns      1479432 ns          473
# | BM_UnorderedSet_Find/long_string/16384             2962292 ns      2962147 ns          236
# | BM_UnorderedSet_Find/long_string/32768             5923977 ns      5923911 ns          118

On an empty set, find is roughly 13x faster (for example, 12276 ns vs. 163052 ns for 1024 lookups) because no hash is computed.
On a non-empty set, the two versions are within about 2% of each other.

@philnik777
Contributor

I think we could update associative_container_benchmarks.h instead to also run with zero elements. It looks like that's a useful metric. @ldionne any thoughts?


github-actions bot commented Feb 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@xbcnn
Author

xbcnn commented Feb 14, 2025

We could go one step further and also check size() != 0 before doing the actual lookup. A hash table can already have buckets and still be empty.
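
The "buckets but no elements" state can be reproduced with the public `std::unordered_set` interface, for instance by calling `reserve()` on a fresh container. The sketch below is only an illustration under that assumption. `CountingStringHash`, `g_hasher_calls`, `find_unless_empty`, and `make_empty_set_with_buckets` are hypothetical names, and the guard is written on the caller side rather than inside the library.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical instrumentation: counts hasher invocations.
std::size_t g_hasher_calls = 0;

struct CountingStringHash {
  std::size_t operator()(const std::string& s) const {
    ++g_hasher_calls;
    return std::hash<std::string>{}(s);
  }
};

using CountedSet = std::unordered_set<std::string, CountingStringHash>;

// Caller-side guard (hypothetical helper): skip find() entirely when the
// container holds no elements, so the key is never hashed in that case.
CountedSet::const_iterator find_unless_empty(const CountedSet& set, const std::string& key) {
  if (set.empty())
    return set.end();
  return set.find(key);
}

// Builds a set that owns buckets but holds no elements.
CountedSet make_empty_set_with_buckets() {
  CountedSet set;
  set.reserve(64); // requests room for 64 elements; nothing is stored yet
  return set;
}
```

A check on `bucket_count()` alone would not take the early return for such a set, since `reserve(64)` leaves at least 64 buckets allocated while `size()` is still zero.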

@xbcnn
Author

xbcnn commented Feb 17, 2025

Updated benchmarks:

Opt with size() check

# | --------------------------------------------------------------------------------------------------
# | Benchmark                                                        Time             CPU   Iterations
# | --------------------------------------------------------------------------------------------------
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/1024         12121 ns        12120 ns        57902
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/2048         24171 ns        24170 ns        28959
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/4096         48279 ns        48277 ns        14452
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/8192         96207 ns        96204 ns         7221
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/16384       193415 ns       193404 ns         3611
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/32768       384792 ns       384784 ns         1820
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/1024       12823 ns        12822 ns        54793
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/2048       25686 ns        25683 ns        27212
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/4096       51277 ns        51275 ns        13573
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/8192      102201 ns       102200 ns         6820
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/16384     204915 ns       204908 ns         3405
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/32768     410902 ns       410897 ns         1707
# | BM_UnorderedSet_Find_NonEmpty/long_string/1024              191005 ns       190999 ns         3666
# | BM_UnorderedSet_Find_NonEmpty/long_string/2048              379653 ns       379640 ns         1844
# | BM_UnorderedSet_Find_NonEmpty/long_string/4096              756648 ns       756610 ns          925
# | BM_UnorderedSet_Find_NonEmpty/long_string/8192             1510045 ns      1509964 ns          464
# | BM_UnorderedSet_Find_NonEmpty/long_string/16384            3026213 ns      3026010 ns          231
# | BM_UnorderedSet_Find_NonEmpty/long_string/32768            6045257 ns      6044974 ns          116

No opt

# | --------------------------------------------------------------------------------------------------
# | Benchmark                                                        Time             CPU   Iterations
# | --------------------------------------------------------------------------------------------------
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/1024        162638 ns       162636 ns         4304
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/2048        323928 ns       323921 ns         2159
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/4096        645449 ns       645442 ns         1083
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/8192       1288309 ns      1288282 ns          544
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/16384      2585671 ns      2585642 ns          271
# | BM_UnorderedSet_Find_EmptyNoBuckets/long_string/32768      5299328 ns      5299122 ns          135
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/1024      169348 ns       169346 ns         4134
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/2048      337295 ns       337280 ns         2075
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/4096      671712 ns       671700 ns         1043
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/8192     1340185 ns      1340169 ns          523
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/16384    2689863 ns      2689757 ns          260
# | BM_UnorderedSet_Find_EmptyWithBuckets/long_string/32768    5378439 ns      5378379 ns          130
# | BM_UnorderedSet_Find_NonEmpty/long_string/1024              186529 ns       186524 ns         3751
# | BM_UnorderedSet_Find_NonEmpty/long_string/2048              370669 ns       370665 ns         1888
# | BM_UnorderedSet_Find_NonEmpty/long_string/4096              739087 ns       739044 ns          947
# | BM_UnorderedSet_Find_NonEmpty/long_string/8192             1476207 ns      1476178 ns          474
# | BM_UnorderedSet_Find_NonEmpty/long_string/16384            2953631 ns      2953560 ns          237
# | BM_UnorderedSet_Find_NonEmpty/long_string/32768            5902677 ns      5902474 ns          118

@xbcnn xbcnn requested review from philnik777 February 17, 2025 09:58
@xbcnn
Author

xbcnn commented Feb 20, 2025

Hi @philnik777 @ldionne @frederick-vs-ja

I've added a size() check alongside the bucket check and updated the benchmarks.
Please help review. Thanks!

@xbcnn xbcnn changed the title [libcxx] Avoid hash key in __hash_table::find() if no buckets yet. [libcxx] Avoid hash key in __hash_table::find() if it is empty. Mar 6, 2025
@xbcnn xbcnn requested a review from frederick-vs-ja March 6, 2025 09:10
@xbcnn
Author

xbcnn commented Apr 30, 2025

Hi @philnik777 @frederick-vs-ja
Just checking in to see if there’s anything I can do to help move this PR forward. It’s been open for some time, and I’d love to get your feedback. Let me know if changes are needed! Thanks for your time! 🙏

Contributor

@frederick-vs-ja frederick-vs-ja left a comment

I think this is fine. @philnik777

@@ -0,0 +1,97 @@
//===----------------------------------------------------------------------===//
Contributor

Please update associative_container_benchmarks.h to include zero-sized benchmarks (where appropriate) instead.

@@ -1771,9 +1771,9 @@ template <class _Tp, class _Hash, class _Equal, class _Alloc>
template <class _Key>
typename __hash_table<_Tp, _Hash, _Equal, _Alloc>::iterator
Contributor

Not attached to this file: Please add the before/after benchmarks in the commit message.
