Replace fastutil with HPPC #2945
Merged
This PR addresses issue #2937. I ran a JMH microbenchmark on the deletes.contains() line, since it runs for each tweet processed.
Speed Comparison:
Not included here: in a JVM benchmark, Java's HashSet had much higher memory consumption and was inconsistent.
Size comparison:
Since the com.carrotsearch.hppc dependency is small and performs comparably to fastutil, it is a good replacement. Also, com.carrotsearch.hppc is more stable than Lucene's org.apache.lucene.internal.hppc.LongHashSet, since it is a public library rather than an internal package.
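
For context on the HashSet memory observation, here is a minimal sketch of the general idea behind primitive long sets: keys live directly in a `long[]` with open addressing, so there is no boxed `Long` or per-entry node as in `HashSet<Long>`. The class `TinyLongSet` and its hash-mixing constant are hypothetical and for illustration only; this is not the HPPC or fastutil implementation, and it omits removal and iteration.

```java
// Illustrative only: a tiny open-addressing set of primitive longs.
// NOT the HPPC or fastutil code; the name TinyLongSet is hypothetical.
final class TinyLongSet {
    private long[] keys;      // slot contents; 0 marks an empty slot
    private boolean hasZero;  // the key 0 is tracked separately
    private int size;         // number of non-zero keys stored

    TinyLongSet(int expected) {
        int cap = 8;
        while (cap < expected * 2) cap <<= 1; // power of two, load factor <= 0.5
        keys = new long[cap];
    }

    /** Adds k; returns true if it was not already present. */
    boolean add(long k) {
        if (k == 0) {
            boolean added = !hasZero;
            hasZero = true;
            return added;
        }
        if ((size + 1) * 2 > keys.length) grow();
        int mask = keys.length - 1;
        int i = mix(k) & mask;
        while (keys[i] != 0) {            // linear probing
            if (keys[i] == k) return false;
            i = (i + 1) & mask;
        }
        keys[i] = k;
        size++;
        return true;
    }

    /** Membership test without allocating or boxing. */
    boolean contains(long k) {
        if (k == 0) return hasZero;
        int mask = keys.length - 1;
        int i = mix(k) & mask;
        while (keys[i] != 0) {
            if (keys[i] == k) return true;
            i = (i + 1) & mask;
        }
        return false;
    }

    int size() {
        return size + (hasZero ? 1 : 0);
    }

    // Doubles the table and reinserts every non-zero key.
    private void grow() {
        long[] old = keys;
        keys = new long[old.length * 2];
        size = 0;
        for (long k : old) {
            if (k != 0) add(k);
        }
    }

    // Spreads the key's bits so the low bits used by the mask vary well.
    private static int mix(long k) {
        long h = k * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32));
    }
}
```

Each stored key costs roughly one array slot here, whereas `HashSet<Long>` also allocates a boxed `Long` and a map entry per element, which is the usual reason for the gap in memory use.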