HN
Paper
Stories by bearseascape
Model Spec Midtraining: Improving How Alignment Training Generalizes
2 points
bearseascape
2026-05-06T20:17:55Z
alignment.anthropic.com
Following the Text Gradient at Scale (2025)
9 points
bearseascape
2026-05-05T01:00:58Z
ai.stanford.edu
Transformers Are Inherently Succinct (2025)
62 points
bearseascape
2026-05-04T20:03:08Z
arxiv.org
Slople – Can you tell real ML papers from AI-generated ones?
3 points
bearseascape
2026-04-07T15:36:47Z
ml5885.github.io
Benchmarking Culture
1 point
bearseascape
2026-03-10T17:29:04Z
www.argmin.net
Why one small American town won't stop stoning its residents to death
2 points
bearseascape
2026-01-15T16:52:57Z
archiveofourown.org
The most complex model we understand [video]
2 points
bearseascape
2025-12-23T03:21:58Z
www.youtube.com
Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
1 point
bearseascape
2025-12-11T05:07:40Z
arxiv.org
MooseAgent: A LLM Based Multi-Agent Framework for Automating Moose Simulation
13 points
bearseascape
2025-04-14T18:22:48Z
arxiv.org
Automated Researchers Can Subtly Sandbag
2 points
bearseascape
2025-03-27T15:06:16Z
alignment.anthropic.com
Auditing Language Models for Hidden Objectives
1 point
bearseascape
2025-03-27T04:11:03Z
www.anthropic.com
Policy for LLM Writing on LessWrong
2 points
bearseascape
2025-03-27T03:58:18Z
www.lesswrong.com
Towards Understanding Distilled Reasoning Models: A Representational Approach
3 points
bearseascape
2025-03-06T07:23:23Z
arxiv.org
Transformers Learn to Implement Multistep Gradient Descent with Chain of Thought
1 point
bearseascape
2025-03-03T16:35:40Z
arxiv.org
(Mis)Fitting: A Survey of Scaling Laws
2 points
bearseascape
2025-02-27T16:02:16Z
arxiv.org
Resurrecting saturated LLM benchmarks with adversarial encoding
1 point
bearseascape
2025-02-11T15:39:14Z
arxiv.org
Deep Double Descent: Where Bigger Models and More Data Hurt
2 points
bearseascape
2025-02-08T18:18:15Z
openai.com
Value-Based Deep RL Scales Predictably
68 points
bearseascape
2025-02-08T02:36:38Z
arxiv.org