@FarmersWrap
Contributor

We now get better results almost everywhere. A few nDCG@10 scores are slightly worse, but only for signal1m (average delta=0.002), arguana (average delta=0.002), and quora (average delta=0.001).

I don't know when the change happened, but I didn't change the logic for averaging or RRF.

@FarmersWrap marked this pull request as ready for review on December 12, 2025, 04:56
@FarmersWrap
Contributor Author

Results: 331/348 values match

❌ Found 17 mismatches:

  • bioasq/average/ndcg@10: log=0.5315 vs md=0.5308 (diff=0.0007)
  • bioasq/interpolation/ndcg@10: log=0.5315 vs md=0.5308 (diff=0.0007)
  • bioasq/normalize/ndcg@10: log=0.5427 vs md=0.5428 (diff=-0.0001)
  • nfcorpus/normalize/ndcg@10: log=0.3782 vs md=0.3657 (diff=0.0125)
  • nfcorpus/normalize/r@100: log=0.3382 vs md=0.3288 (diff=0.0094)
  • signal1m/average/ndcg@10: log=0.3464 vs md=0.3467 (diff=-0.0003)
  • signal1m/interpolation/ndcg@10: log=0.3464 vs md=0.3467 (diff=-0.0003)
  • signal1m/normalize/ndcg@10: log=0.3626 vs md=0.3624 (diff=0.0002)
  • arguana/average/ndcg@10: log=0.3984 vs md=0.3986 (diff=-0.0002)
  • arguana/interpolation/ndcg@10: log=0.3984 vs md=0.3986 (diff=-0.0002)
  • arguana/normalize/ndcg@10: log=0.5738 vs md=0.5694 (diff=0.0044)
  • arguana/normalize/r@100: log=0.9908 vs md=0.9879 (diff=0.0029)
  • cqadupstack-english/normalize/ndcg@10: log=0.4678 vs md=0.4671 (diff=0.0007)
  • cqadupstack-english/normalize/r@100: log=0.7436 vs md=0.7429 (diff=0.0007)
  • cqadupstack-tex/average/ndcg@10: log=0.2333 vs md=0.2332 (diff=0.0001)
  • cqadupstack-tex/interpolation/ndcg@10: log=0.2333 vs md=0.2332 (diff=0.0001)
  • quora/normalize/ndcg@10: log=0.8858 vs md=0.8859 (diff=-0.0001)
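
For anyone who wants to reproduce a check like the one above, here is a minimal sketch of how such a log-vs-md comparison could be done. It is not the script that produced this report: the function name `find_mismatches`, the `(dataset, method, metric)` key layout, and the tolerance are assumptions made for illustration. Two of the sample entries are copied from the list above; the `example` entry is a made-up matching value, included only so the match count is not zero.

```python
from typing import Dict, List, Tuple

# Hypothetical key layout: (dataset, fusion method, metric).
Key = Tuple[str, str, str]

# Scores parsed from the run logs (two values taken from the report above).
LOG_SCORES: Dict[Key, float] = {
    ("bioasq", "average", "ndcg@10"): 0.5315,
    ("nfcorpus", "normalize", "ndcg@10"): 0.3782,
    ("example", "average", "ndcg@10"): 0.5000,  # made-up matching value
}

# Scores currently written in the markdown docs.
MD_SCORES: Dict[Key, float] = {
    ("bioasq", "average", "ndcg@10"): 0.5308,
    ("nfcorpus", "normalize", "ndcg@10"): 0.3657,
    ("example", "average", "ndcg@10"): 0.5000,  # made-up matching value
}


def find_mismatches(
    log_scores: Dict[Key, float],
    md_scores: Dict[Key, float],
    tol: float = 5e-5,
) -> List[Tuple[Key, float, float, float]]:
    """Return (key, log, md, diff) for every shared key whose scores differ.

    diff is log minus md, rounded to four decimals to match the precision
    the scores are reported at. tol is an assumed threshold below which two
    four-decimal scores are treated as equal.
    """
    mismatches = []
    for key in sorted(log_scores.keys() & md_scores.keys()):
        diff = round(log_scores[key] - md_scores[key], 4)
        if abs(diff) > tol:
            mismatches.append((key, log_scores[key], md_scores[key], diff))
    return mismatches


if __name__ == "__main__":
    shared = LOG_SCORES.keys() & MD_SCORES.keys()
    found = find_mismatches(LOG_SCORES, MD_SCORES)
    print(f"Results: {len(shared) - len(found)}/{len(shared)} values match")
    for (dataset, method, metric), log, md, diff in found:
        print(f"  • {dataset}/{method}/{metric}: "
              f"log={log:.4f} vs md={md:.4f} (diff={diff:.4f})")
```

Only keys present on both sides are compared here, so a score that exists in the logs but is missing from the markdown (or the reverse) would need a separate check.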

@lintool merged commit 68311a1 into castorini:master on Dec 12, 2025
1 check passed