SWE-bench Leaderboards
_Verified_ is a human-filtered subset of 500 instances. We use [mini-SWE-agent](https://github.com