Aider
Aider 0.86.2 — open-source AI pair-programming CLI.
Engineering leads adopting a Coding agent inherit a production-critical dependency. The agent edits real files, lands real commits, and can shape code that ships to customers. A polished demo proves little about behavior under load, retry paths, dependency changes, or what happens when the model is wrong. Hlido tests Coding agents in sandboxed git repos against fixed prompt batteries, captures terminal sessions and diffs as signed evidence, and scores each agent on four dimensions: Strategic Alpha, Execution Grit, Craft & Soul, and Value Signal.
We run install, repo-edit, refactor, test, and failure-recovery prompts in disposable git repos when the product allows it. The evidence package preserves terminal output, diffs, commits, screenshots, and any blocked steps so teams can inspect exactly what happened.
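The disposable-repo workflow described above can be sketched roughly as follows. This is a minimal illustration, not Hlido's actual harness: the function names `git` and `run_in_disposable_repo`, the evidence dictionary keys, and the placeholder committer identity are all assumptions made for the example. It covers only terminal output, diffs, and commits; screenshots, signing, and scoring are left out.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout (raises on failure)."""
    result = subprocess.run(
        ["git", *args], cwd=repo, capture_output=True, text=True, check=True
    )
    return result.stdout


def run_in_disposable_repo(agent_cmd, seed_files):
    """Run `agent_cmd` in a throwaway git repo and collect evidence.

    `agent_cmd` is an argv list for whatever CLI is under test (hypothetical
    here); `seed_files` maps file names to their baseline contents.
    """
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        # Placeholder identity so the baseline commit works on any machine.
        git(repo, "config", "user.email", "harness@example.invalid")
        git(repo, "config", "user.name", "harness")
        for name, content in seed_files.items():
            (repo / name).write_text(content)
        git(repo, "add", "-A")
        git(repo, "-c", "commit.gpgsign=false", "commit", "-q", "-m", "baseline")
        baseline = git(repo, "rev-parse", "HEAD").strip()

        # Run the tool under test; a non-zero exit is recorded, not raised.
        proc = subprocess.run(agent_cmd, cwd=repo, capture_output=True, text=True)

        # Stage everything so new and modified files both appear in the diff.
        git(repo, "add", "-A")
        return {
            "exit_code": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            # Everything that changed since the baseline, committed or not.
            "diff": git(repo, "diff", "--cached", baseline),
            # Commits the tool landed on top of the baseline.
            "commits": git(repo, "log", "--oneline", f"{baseline}..HEAD").splitlines(),
        }


if __name__ == "__main__":
    # Stand-in for a real agent: a one-line script that appends to a file.
    fake_agent = [
        sys.executable, "-c",
        "open('app.py', 'a').write('print(1)\\n'); print('done')",
    ]
    evidence = run_in_disposable_repo(fake_agent, {"app.py": "x = 1\n"})
    print(evidence["diff"])
```

In a real run, the stand-in command would be replaced by the argv of the agent under test. Because the temporary directory is deleted when the function returns, the evidence is copied out as plain strings rather than left as files in the repo.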
Current top 3 by Laddoo Score across the Coding corpus.
Sortable, filterable list with tier and last-tested date.
| Agent | Score | Tier | Finding | Last tested |
|---|---|---|---|---|
| Aider | 90/100 | VITAL | Aider 0.86.2 — open-source AI pair-programming CLI. Live tested in a sandboxed git repo: it edits files when given a natural-language --message, | 2026-04-26 |
| GitHub Copilot | 90/100 | VITAL | Public-surface review of GitHub Copilot | 2026-05-01 |
| Replit Agent | 90/100 | VITAL | Public-surface review of Replit Agent | 2026-05-01 |
| Sourcegraph Cody | 90/100 | VITAL | Public-surface review of Sourcegraph Cody | 2026-05-01 |
| Tabnine | 90/100 | VITAL | Public-surface review of Tabnine | 2026-05-01 |
| OpenHands | 90/100 | VITAL | Public-surface review of OpenHands | 2026-05-01 |
| Sweep | 90/100 | VITAL | Public-surface review of Sweep | 2026-05-01 |
| Zed AI | 90/100 | VITAL | Public-surface review of Zed AI | 2026-05-01 |
| Augment Code (Intent) | 78/100 | STEADY | Public-surface review of Augment Code (Intent) | 2026-05-01 |
| Cursor | 78/100 | STEADY | Public-surface review of Cursor | 2026-05-01 |
| Cline | 78/100 | STEADY | Public-surface review of Cline | 2026-05-01 |
| Continue | 78/100 | STEADY | Public-surface review of Continue | 2026-05-01 |
| Lovable | 78/100 | STEADY | Public-surface review of Lovable | 2026-05-01 |
| Open Interpreter | 78/100 | STEADY | Public-surface review of Open Interpreter | 2026-05-01 |
| GPT Engineer | 78/100 | STEADY | Public-surface review of GPT Engineer | 2026-05-01 |
| AutoGen Studio | 78/100 | STEADY | Public-surface review of AutoGen Studio | 2026-05-01 |
| SuperAGI | 78/100 | STEADY | Public-surface review of SuperAGI | 2026-05-01 |
| Warp AI | 78/100 | STEADY | Public-surface review of Warp AI | 2026-05-01 |
| Windsurf | 65/100 | FADING | Public-surface review of Windsurf | 2026-05-01 |
| ClaimCheck | 65/100 | FADING | Public-surface review of ClaimCheck | 2026-05-01 |
| Bolt.new | 65/100 | FADING | Public-surface review of Bolt.new | 2026-05-01 |
| Codeium | 65/100 | FADING | Public-surface review of Codeium | 2026-05-01 |
| v0 | 65/100 | FADING | Public-surface review of v0 | 2026-05-01 |
| Devin (Cognition) | 65/100 | FADING | Public-surface review of Devin (Cognition) | 2026-05-01 |
| MetaGPT | 65/100 | FADING | Public-surface review of MetaGPT | 2026-05-01 |
| Plandex | 65/100 | FADING | Public-surface review of Plandex | 2026-05-01 |
| Poolside | 65/100 | FADING | Public-surface review of Poolside | 2026-05-01 |
| AgentGPT | 53/100 | FADING | Public-surface review of AgentGPT | 2026-05-01 |
| Magic | 53/100 | FADING | Public-surface review of Magic | 2026-05-01 |
| Smol Developer | 40/100 | FADING | Public-surface review of Smol Developer | 2026-05-01 |