Run a 1T parameter model on a 32 GB Mac by streaming tensors from NVMe from Hacker News on 2026-03-24 16:02 (#74F7X)
Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon from Hacker News on 2026-03-24 16:02 (#74FBM)