Stories by ModelForge
A Researcher's Field Guide to Non-Standard LLM Architectures
2 points
ModelForge
2025-11-04T15:00:57Z
magazine.sebastianraschka.com
Explanation of Gated DeltaNet (Qwen3-Next and Kimi Linear)
3 points
ModelForge
2025-11-03T16:59:49Z
github.com
The Core Components of Modern LLMs and the Models Beyond Transformers [video]
3 points
ModelForge
2025-10-27T15:40:25Z
www.youtube.com
Popular Attention Alternatives: GQA, MLA, SWA
4 points
ModelForge
2025-10-15T14:17:52Z
sebastianraschka.com
Multi-Head Latent Attention
4 points
ModelForge
2025-10-13T18:24:28Z
sebastianraschka.com
Thinking Machines Lab Co-Founder Departs for Meta
7 points
ModelForge
2025-10-11T19:57:45Z
www.wsj.com
OpenAI's internal Slack messages could cost it billions in copyright suit
8 points
ModelForge
2025-10-10T20:41:08Z
sherwood.news
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge
4 points
ModelForge
2025-10-05T15:55:26Z
magazine.sebastianraschka.com
Gemma 3 270M re-implemented in pure PyTorch for local tinkering
415 points
ModelForge
2025-08-20T14:01:26Z
github.com
GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2
486 points
ModelForge
2025-08-10T15:06:07Z
magazine.sebastianraschka.com
LLM Research Papers: The 2024 List
5 points
ModelForge
2024-12-18T14:17:50Z
magazine.sebastianraschka.com
Scaling Test-Time Compute with Open LLM Models
3 points
ModelForge
2024-12-18T13:37:42Z
huggingface.co