HN
Paper
Stories by shahules
Cloning Bench: Evaluating AI Agents on Visual Website Cloning
2 points
shahules
2026-04-02T19:46:45Z
github.com
PA Bench: Evaluating web agents on real-world personal assistant workflows
38 points
shahules
2026-02-25T20:11:37Z
vibrantlabs.com
PA Bench: Evaluating Frontier Models on Multi-Tab PA Tasks
7 points
shahules
2026-02-19T18:23:48Z
vibrantlabs.com
Show HN: Ragas – Open-source library for evaluating RAG pipelines
121 points
shahules
2024-03-21T15:48:16Z
github.com
Show HN: Ragas – Open-source library for evals and testing RAG systems
15 points
shahules
2024-03-20T15:02:36Z
github.com
1 point
shahules
2023-07-10T17:40:15Z
news.ycombinator.com
1 point
shahules
2023-05-11T17:52:08Z
news.ycombinator.com
1 point
shahules
2023-05-09T03:29:43Z
news.ycombinator.com
1 point
shahules
2023-05-08T16:57:01Z
news.ycombinator.com
1 point
shahules
2023-04-18T14:11:12Z
news.ycombinator.com
Show HN: The rise of open source large language models
5 points
shahules
2023-04-13T15:20:29Z
explodinggradients.com
Show HN: GPT4 vs. GPT3: What you should know
2 points
shahules
2023-03-28T14:52:32Z
explodinggradients.com