Stories by gpjt
Writing an LLM from scratch, part 32d – Interventions: adding attention bias
5 points
gpjt
2026-02-07T00:12:18Z
www.gilesthomas.com
Writing an LLM from scratch, part 32c – Interventions: removing dropout
1 point
gpjt
2026-02-05T23:39:19Z
www.gilesthomas.com
Writing an LLM from scratch, part 32b – Interventions: gradient clipping
2 points
gpjt
2026-02-05T01:22:01Z
www.gilesthomas.com
Writing an LLM from scratch, part 32a – Interventions: training a baseline model
1 point
gpjt
2026-02-04T02:09:18Z
www.gilesthomas.com
Getting a Custom PyTorch LLM onto the Hugging Face Hub
1 point
gpjt
2026-01-28T23:00:22Z
www.gilesthomas.com
Writing an LLM from scratch, part 31 – the models are now on Hugging Face
2 points
gpjt
2026-01-17T19:58:32Z
www.gilesthomas.com
Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results
1 point
gpjt
2026-01-09T01:17:33Z
www.gilesthomas.com
LLM from scratch, part 29 – using DDP to train a base model in the cloud
2 points
gpjt
2026-01-07T20:45:18Z
www.gilesthomas.com
LLM from scratch, part 28 – training a base model from scratch on an RTX 3090
540 points
gpjt
2025-12-02T18:17:48Z
www.gilesthomas.com
Writing an LLM from scratch, part 27 – what's left, and what's next?
1 point
gpjt
2025-11-04T00:51:38Z
www.gilesthomas.com
Writing an LLM from scratch, part 26 – evaluating the fine-tuned model
4 points
gpjt
2025-11-03T19:41:38Z
www.gilesthomas.com
Writing an LLM from scratch, part 25 – instruction fine-tuning
2 points
gpjt
2025-10-29T21:05:43Z
www.gilesthomas.com
Writing an LLM from scratch, part 24 – the transcript hack
1 point
gpjt
2025-10-28T20:17:41Z
www.gilesthomas.com
Retro Language Models: Rebuilding Karpathy's RNN in PyTorch
3 points
gpjt
2025-10-24T18:55:35Z
www.gilesthomas.com
Writing an LLM from scratch, part 23 – fine-tuning for classification
1 point
gpjt
2025-10-22T23:04:33Z
www.gilesthomas.com
Writing an LLM from scratch, part 22 – training our LLM
254 points
gpjt
2025-10-15T23:42:12Z
www.gilesthomas.com
Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks'
2 points
gpjt
2025-10-11T01:02:25Z
www.gilesthomas.com
Writing an LLM from scratch, part 21 – perplexed by perplexity
1 point
gpjt
2025-10-07T19:05:23Z
www.gilesthomas.com
Writing an LLM from scratch, part 20 – starting training, and cross entropy loss
41 points
gpjt
2025-10-02T21:14:27Z
www.gilesthomas.com
How Do LLMs Work?
2 points
gpjt
2025-09-17T14:31:13Z
www.gilesthomas.com