Stories by gpjt
LLM from scratch, part 28 – training a base model from scratch on an RTX 3090
78 points
gpjt
2025-12-02T18:17:48Z
www.gilesthomas.com
Writing an LLM from scratch, part 27 – what's left, and what's next?
1 point
gpjt
2025-11-04T00:51:38Z
www.gilesthomas.com
Writing an LLM from scratch, part 26 – evaluating the fine-tuned model
4 points
gpjt
2025-11-03T19:41:38Z
www.gilesthomas.com
Writing an LLM from scratch, part 25 – instruction fine-tuning
2 points
gpjt
2025-10-29T21:05:43Z
www.gilesthomas.com
Writing an LLM from scratch, part 24 – the transcript hack
1 point
gpjt
2025-10-28T20:17:41Z
www.gilesthomas.com
Retro Language Models: Rebuilding Karpathy's RNN in PyTorch
3 points
gpjt
2025-10-24T18:55:35Z
www.gilesthomas.com
Writing an LLM from scratch, part 23 – fine-tuning for classification
1 point
gpjt
2025-10-22T23:04:33Z
www.gilesthomas.com
Writing an LLM from scratch, part 22 – training our LLM
254 points
gpjt
2025-10-15T23:42:12Z
www.gilesthomas.com
Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks'
2 points
gpjt
2025-10-11T01:02:25Z
www.gilesthomas.com
Writing an LLM from scratch, part 21 – perplexed by perplexity
1 point
gpjt
2025-10-07T19:05:23Z
www.gilesthomas.com
Writing an LLM from scratch, part 20 – starting training, and cross entropy loss
41 points
gpjt
2025-10-02T21:14:27Z
www.gilesthomas.com
How Do LLMs Work?
2 points
gpjt
2025-09-17T14:31:13Z
www.gilesthomas.com
The maths you need to start understanding LLMs
603 points
gpjt
2025-09-02T23:10:47Z
www.gilesthomas.com
What AI chatbots are doing under the hood
2 points
gpjt
2025-08-29T19:06:20Z
www.gilesthomas.com
LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud
2 points
gpjt
2025-08-18T19:25:06Z
www.gilesthomas.com
The fixed length bottleneck and the feed forward network
1 point
gpjt
2025-08-14T22:46:05Z
www.gilesthomas.com
Writing an LLM from scratch, part 17 – the feed-forward network
8 points
gpjt
2025-08-12T22:08:49Z
www.gilesthomas.com
Writing an LLM from scratch, part 16 – layer normalisation
1 point
gpjt
2025-07-08T19:17:00Z
www.gilesthomas.com
Leaving PythonAnywhere
3 points
gpjt
2025-06-05T18:15:07Z
www.gilesthomas.com
Writing an LLM from scratch, part 15 – from context vectors to logits
6 points
gpjt
2025-05-31T23:25:36Z
www.gilesthomas.com