Pinned
SWE-agent (Public, forked from SWE-agent/SWE-agent)
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
Python
terminal-bench (Public, forked from harbor-framework/terminal-bench)
A benchmark for LLMs on complicated tasks in the terminal
Python



