HN
Paper
Stories by
bisonbear
The Opus 4.7 reasoning curve - Medium is the best default?
1 point
bisonbear
2026-05-13T15:17:55Z
www.stet.sh
GPT-5.5 low vs. medium vs. high vs. xhigh: the reasoning curve on 26 real tasks
2 points
bisonbear
2026-05-08T16:58:30Z
www.stet.sh
GPT-5.5 vs. GPT-5.4 vs. Opus 4.7 on 56 real coding tasks from 2 open source repo
4 points
bisonbear
2026-05-01T16:06:35Z
www.stet.sh
I ran Opus 4.7 vs. Old Opus 4.6 vs. New Opus 4.6 on 28 Zod tasks
2 points
bisonbear
2026-04-17T18:39:23Z
www.stet.sh
Coding evals are broken. CI is green while AI code quality goes unmeasured
1 point
bisonbear
2026-04-15T15:33:47Z
www.stet.sh
Agents.md is the highest-leverage code you're not testing
1 point
bisonbear
2026-04-10T15:01:52Z
www.stet.sh
Your AI coding benchmark is hiding a 2x quality gap
3 points
bisonbear
2026-03-13T17:42:07Z
www.stet.sh
Things I Learned at the Claude Code NYC Meetup
2 points
bisonbear
2026-01-20T19:30:38Z
benr.build
Claude vs. Codex in the Messy Middle
1 point
bisonbear
2026-01-07T15:48:12Z
benr.build
Spacetime as a Neural Network
11 points
bisonbear
2025-12-29T16:55:40Z
benr.build
One agent isn't enough
18 points
bisonbear
2025-12-13T16:31:45Z
benr.build
Context Engineering: The New Skill for Working with AI Agents
1 point
bisonbear
2025-11-05T13:28:34Z
benr.build
The New Math of Building with AI
2 points
bisonbear
2025-10-17T12:50:56Z
benr.build