Shard Merge tool #2776
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #2776 +/- ##
============================================
- Coverage 70.12% 70.02% -0.10%
Complexity 1316 1316
============================================
Files 186 187 +1
Lines 11922 11939 +17
Branches 1414 1415 +1
============================================
Hits 8360 8360
- Misses 3035 3052 +17
Partials 527 527
@JMMackenzie I will take a look and test this so we can get it merged soon. @b8zhong has added some tests, and I will correct and update them as necessary. Thanks again for this 👍
I usually use Anserini to build CIFF indexes that I import to other tools like PISA, but I almost always forget to use You could try it on MSMARCO-v1 and MSMARCO-v2 (passages) if you wanted something large, not sure how long it would take for v2 though. Thanks for your help!
Haus stylin
Actually, I approved before noticing broken tests. Should probably fix before merging...
I'm not sure why these tests fail - I can't seem to generate any meaningful output. It seems the VM is crashing (I get "The forked VM terminated without properly saying goodbye" from surefire). Someone with more experience may be able to find a solution.
This adds a tool for merging an index with multiple shards into an index with a single shard.
Help is required to test (write tests) and verify the tool.
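
For anyone thinking about what a test should check, here is a minimal conceptual sketch of merging several shards into one. It is not the code in this PR and does not use Anserini or Lucene; the toy in-memory index, the `merge_shards` function, and the `doc_counts` argument are all hypothetical names made up for illustration. It assumes each shard numbers its documents locally from 0, so the merged index has to shift document IDs by the number of documents in the preceding shards.

```python
from typing import Dict, List

# Hypothetical toy representation (an assumption, not Anserini's format):
# each shard maps a term to a sorted list of shard-local document IDs.
Shard = Dict[str, List[int]]


def merge_shards(shards: List[Shard], doc_counts: List[int]) -> Shard:
    """Merge shards into one postings map with globally renumbered doc IDs.

    doc_counts[i] is the number of documents in shards[i]; it determines
    how far the document IDs of every later shard are shifted.
    """
    merged: Shard = {}
    offset = 0
    for shard, count in zip(shards, doc_counts):
        for term, postings in shard.items():
            # Shift shard-local IDs into the global ID space. Offsets only
            # grow, so each merged postings list stays sorted.
            merged.setdefault(term, []).extend(doc_id + offset for doc_id in postings)
        offset += count
    return merged


shard_a = {"index": [0, 1], "merge": [1]}
shard_b = {"index": [0], "shard": [0, 1]}
merged = merge_shards([shard_a, shard_b], [2, 2])
print(merged)
```

A test for the real tool could follow the same shape: build a small multi-shard index, merge it, and check that the total document count and per-term postings match what the separate shards contained.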