HN
Paper
Stories by aranguri
Predicting Rare LLM Failures with 30× Fewer Rollouts
2 points
aranguri
2026-05-13T18:10:38Z
www.lesswrong.com
Safety benchmarks are inflated because models know they're being tested
3 points
aranguri
2026-05-04T20:04:39Z
www.lesswrong.com
Probes trace an emergent jailbreak in OLMo 2 to mislabeled training data
2 points
aranguri
2026-04-29T19:46:49Z
www.lesswrong.com
Seeking mentees: new techniques for model diffing and data attribution
1 point
aranguri
2026-01-10T01:43:42Z
sparai.org
Seeking mentees: richer evals to address reward hacking and eval awareness
1 point
aranguri
2026-01-10T01:42:31Z
sparai.org
Tied Crosscoders: Tracing How Chat LLM Behavior Emerges from Base Model
7 points
aranguri
2025-03-23T16:58:13Z
www.lesswrong.com
Hacker group house in Palo Alto
2 points
aranguri
2021-07-03T01:53:55Z
www.notion.so
Learning community zoom to teach each other new things 1-on-1 (sign in!)
1 point
aranguri
2020-08-25T18:31:05Z
docs.google.com