Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Project for an advanced lab investigating LLM benchmarks from an IR perspective. Instead of focusing on model performance, we evaluated benchmark robustness, identifying which questions truly differentiate models and whether leaderboard rankings reflect real differences or are dominated by easy, high-hubness items.

Notifications You must be signed in to change notification settings

g4ix/advLab1-HITS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

About

Project for an advanced lab investigating LLM benchmarks from an IR perspective. Instead of focusing on model performance, we evaluated benchmark robustness, identifying which questions truly differentiate models and whether leaderboard rankings reflect real differences or are dominated by easy, high-hubness items.

Topics

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published