Beyond One World — A benchmark for testing how well LLMs role-play version-specific characters (e.g., superheroes across universes). Covers 30 heroes and 90 canon variants through two tasks: Canon …