Stories by lieret
Show HN: New eval from SWE-bench team evaluates LMs based on goals not tickets
5 points
lieret
2025-11-05T16:13:16Z
codeclash.ai
Show HN: Randomly switching between LMs at every step boosts SWE-bench score
5 points
lieret
2025-08-20T15:09:32Z
www.swebench.com
GPT-5 on SWE-bench: Cost and performance deep-dive
4 points
lieret
2025-08-08T16:29:14Z
mini-swe-agent.com
Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds
2 points
lieret
2025-07-31T14:30:43Z
www.swebench.com
Show HN: Mini-swe-agent achieves 65% on SWE-bench in 100 lines of Python
6 points
lieret
2025-07-25T13:27:29Z
github.com