HN
Paper
Stories by traceopt
Ask HN: Should training bottleneck detection be a product or just a feature?
1 point
traceopt-ai
2026-03-12T10:31:06Z
news.ycombinator.com
Ask HN: Why does single-node DDP sometimes get slower with more GPUs?
2 points
traceopt-ai
2026-02-17T13:39:52Z
news.ycombinator.com
Show HN: Finding stragglers in multi-GPU PyTorch (DDP) training
1 point
traceopt-ai
2026-02-09T14:17:48Z
github.com
Show HN: "htop" for PyTorch training, see stalls, memory and step time live
3 points
traceopt
2026-01-19T16:25:39Z
news.ycombinator.com
Ask HN: What's still missing in live observability for ML training?
2 points
traceopt-ai
2025-10-28T17:11:36Z
news.ycombinator.com
Show HN: TraceML, a tool to trace live memory usage in PyTorch training
1 point
traceopt-ai
2025-10-01T19:33:31Z
github.com