
@FarmersWrap
Contributor

Reproduced the results.
NDCG@10 results: all values match.
Recall@100 results: robust04: 0.4465 vs 0.4474
Recall@1000 results: robust04: 0.7219 vs 0.7237

@lintool lintool self-requested a review October 7, 2025 16:53
@lintool lintool merged commit fa746f7 into castorini:master Oct 7, 2025
1 check passed
@FarmersWrap
Contributor Author

FarmersWrap commented Dec 12, 2025

I might have made a mistake here: I checked out the commit before mine and reran the test.
The comparison below is between the test output (log) and the README file (md).
❌ Found 18 mismatches:

  • bioasq/average/ndcg@10: log=0.5315 vs md=0.5308 (diff=0.0007)
  • bioasq/interpolation/ndcg@10: log=0.5315 vs md=0.5308 (diff=0.0007)
  • bioasq/normalize/ndcg@10: log=0.5427 vs md=0.5428 (diff=-0.0001)
  • signal1m/average/ndcg@10: log=0.3464 vs md=0.3467 (diff=-0.0003)
  • signal1m/interpolation/ndcg@10: log=0.3464 vs md=0.3467 (diff=-0.0003)
  • signal1m/normalize/ndcg@10: log=0.3626 vs md=0.3624 (diff=0.0002)
  • robust04/rrf/ndcg@10: log=0.5070 vs md=0.5087 (diff=-0.0017)
  • robust04/rrf/r@100: log=0.4465 vs md=0.4474 (diff=-0.0009)
  • robust04/rrf/r@1000: log=0.7219 vs md=0.7237 (diff=-0.0018)
  • arguana/rrf/ndcg@10: log=0.5626 vs md=0.5586 (diff=0.0040)
  • arguana/rrf/r@100: log=0.9893 vs md=0.9879 (diff=0.0014)
  • arguana/average/ndcg@10: log=0.3984 vs md=0.3986 (diff=-0.0002)
  • arguana/interpolation/ndcg@10: log=0.3984 vs md=0.3986 (diff=-0.0002)
  • arguana/normalize/ndcg@10: log=0.5738 vs md=0.5694 (diff=0.0044)
  • arguana/normalize/r@100: log=0.9908 vs md=0.9879 (diff=0.0029)
  • cqadupstack-tex/average/ndcg@10: log=0.2333 vs md=0.2332 (diff=0.0001)
  • cqadupstack-tex/interpolation/ndcg@10: log=0.2333 vs md=0.2332 (diff=0.0001)
  • quora/normalize/ndcg@10: log=0.8858 vs md=0.8859 (diff=-0.0001)
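
For reference, a comparison like the one above could be sketched roughly as follows. This is a minimal illustration, not the script that produced the report: `find_mismatches`, `format_report`, and the flat `dataset/method/metric` key layout are hypothetical, and it assumes the log and README scores have already been parsed into dictionaries. The sample values are the three robust04/rrf rows from the list above.

```python
def find_mismatches(log_scores, md_scores, places=4):
    """Return (key, log, md, diff) for each shared key whose values differ.

    Values are compared after rounding the difference to `places` decimals,
    matching the four-decimal precision used in the report.
    """
    mismatches = []
    for key, log_val in log_scores.items():
        if key not in md_scores:
            continue  # only compare entries present on both sides
        md_val = md_scores[key]
        diff = round(log_val - md_val, places)
        if diff != 0:
            mismatches.append((key, log_val, md_val, diff))
    return mismatches


def format_report(mismatches):
    """Render mismatches in a layout similar to the list above."""
    lines = [f"Found {len(mismatches)} mismatches:", ""]
    for key, log_val, md_val, diff in mismatches:
        lines.append(
            f"  • {key}: log={log_val:.4f} vs md={md_val:.4f} (diff={diff:.4f})"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical pre-parsed inputs; values copied from the report above.
    log_scores = {
        "robust04/rrf/ndcg@10": 0.5070,
        "robust04/rrf/r@100": 0.4465,
        "robust04/rrf/r@1000": 0.7219,
    }
    md_scores = {
        "robust04/rrf/ndcg@10": 0.5087,
        "robust04/rrf/r@100": 0.4474,
        "robust04/rrf/r@1000": 0.7237,
    }
    print(format_report(find_mismatches(log_scores, md_scores)))
```

Rounding the difference before testing it against zero avoids flagging entries that differ only by floating-point noise.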

@FarmersWrap
Contributor Author

I also checked out Lily's commit e7e5f57.
The results are identical to mine.

For both of my tests, I did not rerun the BM25 and BGE runs; I used recent BM25 and BGE results as the inputs for fusion.
