[Paper on Hummingbird+: low-cost FPGAs for LLM inference] Qwen3-30B-A3B Q4 at 18 t/s token-gen, 24GB, expected $150 mass production cost
LocalLlama