Thanks to visit codestin.com
Credit goes to github.com

Skip to content
This repository was archived by the owner on Mar 21, 2024. It is now read-only.

Conversation

gevtushenko
Copy link
Collaborator

This PR addresses the following issue by masking inactive threads in match any. The code generation hasn't changed for thread blocks with (BlockThreads % 32) = 0.

@gevtushenko gevtushenko added this to the 2.1.0 milestone Aug 31, 2022
@gevtushenko gevtushenko added the type: bug: functional Does not work as intended. label Aug 31, 2022
gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Aug 31, 2022
@gevtushenko gevtushenko added the testing: gpuCI in progress Started gpuCI testing. label Aug 31, 2022
Copy link
Collaborator

@alliepiper alliepiper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some suggestions inline.

RankAlgorithm == cub::RadixRankAlgorithm::RADIX_RANK_MATCH_EARLY_COUNTS_ANY ||
RankAlgorithm == cub::RadixRankAlgorithm::RADIX_RANK_MATCH_EARLY_COUNTS_ATOMIC_OR;

if (uses_warp_striped_arrangement)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CUB_IF_CONSTEXPR

(in <cub/detail/cpp_compatibility.cuh>)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's a test that's probably optimized by nvcc in any way, I don't see a reason to deteriorate readability. Is there anything I'm missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to merge the PR. If there's a reason to change this, please let me know and I'll update the code in a follow up PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some compilers will issue a warning when a constant conditional is evaluated in a regular if (...). We should be using the if constexpr macros in public headers for these cases. Since this is just a test, we can fix it later if it becomes an issue.

cub::BFEDigitExtractor<Key> extractor(0, RadixBits);
block_radix_rank(temp_storage).RankKeys(keys, ranks, extractor);

if (uses_warp_striped_arrangement)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@gevtushenko gevtushenko force-pushed the fix-main/github/block_radix_rank branch from 4735117 to 1b19a77 Compare September 1, 2022 13:51
gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Sep 1, 2022
@gevtushenko gevtushenko merged commit eee0ca9 into NVIDIA:main Sep 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

testing: gpuCI in progress Started gpuCI testing. type: bug: functional Does not work as intended.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

BlockRadixRankMatch produces invalid results when warp size does not divide block size

3 participants