Stories by skidrow
DeepSeek-R1 and FP8 Mixed-Precision Training
2 points
skidrow
2025-04-19T14:42:57Z
research.colfax-intl.com
How to Write a Fast Matrix Multiplication from Scratch with Tensor Cores (2024)
147 points
skidrow
2025-04-19T14:42:48Z
alexarmbr.github.io
DeepSeek-R1 and FP8 Mixed-Precision Training
2 points
skidrow
2025-04-18T10:08:47Z
research.colfax-intl.com
Implementing a Fast Tensor Core Matmul on the Ada Architecture
1 point
skidrow
2025-04-18T10:06:19Z
www.spatters.ca
How to Write a Fast Matrix Multiplication from Scratch with Tensor Cores
2 points
skidrow
2025-04-18T10:04:55Z
alexarmbr.github.io
1 point
skidrow
2025-04-02T17:10:55Z
news.ycombinator.com
Understanding Peak, Max-Achievable and Delivered FLOPs
1 point
skidrow
2025-04-01T16:49:59Z
rocm.blogs.amd.com
DeepSeek-R1 and FP8 Mixed-Precision Training
1 point
skidrow
2025-04-01T16:47:32Z
research.colfax-intl.com
Outperforming cuBLAS on H100: A Worklog
3 points
skidrow
2025-04-01T16:44:17Z
cudaforfun.substack.com
Optimizing Matrix Multiplication on RDNA3
118 points
skidrow
2025-03-25T09:55:21Z
seb-v.github.io
Outperforming cuBLAS on H100: A Worklog
1 point
skidrow
2025-03-25T09:55:05Z
cudaforfun.substack.com
Mastering LLM Techniques: Inference Optimization
2 points
skidrow
2025-03-24T19:03:58Z
developer.nvidia.com
Optimizing Matrix Multiplication on RDNA3
2 points
skidrow
2025-03-24T19:02:27Z
seb-v.github.io
Outperforming cuBLAS on H100: A Worklog
4 points
skidrow
2025-03-24T19:00:33Z
cudaforfun.substack.com
Understanding Latency Hiding on GPUs [pdf]
2 points
skidrow
2025-03-17T10:55:03Z
www2.eecs.berkeley.edu
AMD Radeon RX 9070 Series Linux GPU Compute Performance
2 points
skidrow
2025-03-17T10:50:51Z
www.phoronix.com
Outperforming cuBLAS on H100: A Worklog
3 points
skidrow
2025-03-17T10:48:43Z
cudaforfun.substack.com
GPU Gems
2 points
skidrow
2025-03-16T19:06:54Z
developer.nvidia.com
Understanding Latency Hiding on GPUs [pdf]
2 points
skidrow
2025-03-16T15:51:00Z
www2.eecs.berkeley.edu
A guide to LLM inference and performance
1 point
skidrow
2025-02-16T20:55:44Z
www.baseten.co