HN
Paper
Stories by kanacki
Show HN: I reduced LLM inference GPU calls by 94% using semantic routing
2 points
kanacki
2026-06-01T22:30:52Z
icomnewtechnologies.com