HN
Paper
Stories by
mustaphah
The cost YAGNI was never about
5 points
mustaphah
2026-06-27T22:14:37Z
newsletter.kentbeck.com
Writing code vs. shipping code [pdf]
3 points
mustaphah
2026-06-11T20:46:06Z
www.nber.org
Trust Factory
7 points
mustaphah
2026-06-11T13:51:58Z
newsletter.kentbeck.com
LLMs pass a standard three-party Turing test
3 points
mustaphah
2026-05-20T12:13:04Z
www.pnas.org
The small sample trap in A/B testing
4 points
mustaphah
2026-05-19T09:49:23Z
hadid.dev
Tell HN: Claude two rate limits don't know about each other
2 points
mustaphah
2026-03-13T13:23:21Z
news.ycombinator.com
Enhancing gut-brain communication reversed cognitive decline in aging mice
386 points
mustaphah
2026-03-12T16:38:51Z
med.stanford.edu
Many SWE-bench-Passing PRs would not be merged
278 points
mustaphah
2026-03-11T20:56:52Z
metr.org
AGI is an unscientific myth
4 points
mustaphah
2026-02-26T21:22:54Z
www.tandfonline.com
Web Verbs
1 point
mustaphah
2026-02-22T15:09:22Z
github.com
OpenAI's 5-month experiment: building a product with no human-written code
2 points
mustaphah
2026-02-18T17:02:46Z
openai.com
SkillsBench: Benchmarking how well agent skills work across diverse tasks
364 points
mustaphah
2026-02-16T21:15:56Z
arxiv.org
Evaluating AGENTS.md: are they helpful for coding agents?
232 points
mustaphah
2026-02-16T12:15:39Z
arxiv.org
Cursor: Expanding our long-running agents research preview
3 points
mustaphah
2026-02-14T19:58:30Z
cursor.com
Measuring Time Horizon Using Claude Code and Codex
1 point
mustaphah
2026-02-14T19:23:49Z
metr.org
SWE-ContextBench: context learning benchmark in coding
1 point
mustaphah
2026-02-13T14:50:43Z
arxiv.org
SWE-AGI: benchmarking spec-driven software construction
1 point
mustaphah
2026-02-12T20:31:57Z
arxiv.org
Code Formatting Silently Consumes Your LLM Budget
1 point
mustaphah
2026-02-09T14:59:05Z
arxiv.org
Agent Trace by Cursor: open spec for tracking AI-generated code
1 point
mustaphah
2026-01-31T12:34:59Z
agent-trace.dev
METR releases Time Horizon 1.1 with 34% more tasks
1 point
mustaphah
2026-01-31T10:29:37Z
metr.org