Multiverse Says It Compresses Llama Models by 80%
by staff from High-Performance Computing News Analysis | insideHPC

Multiverse Computing today released two new AI models compressed with CompactifAI, the company's AI compressor: 80 percent compressed versions of Llama 3.1-8B and Llama 3.3-70B. According to Multiverse, both models have 60 percent fewer parameters than the originals, are 84 percent more energy efficient, run inference 40 percent faster, and yield a 50 percent cost reduction ....
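
To put the reported 60 percent parameter reduction in concrete terms, the short Python sketch below works out roughly how many parameters each compressed model would retain. It assumes the nominal counts implied by the model names (exactly 8 billion and 70 billion, which only approximates the real figures); the helper name `remaining_params` is hypothetical and is not part of CompactifAI or any Multiverse tooling.

```python
# Nominal parameter counts read off the model names. Assumption: "8B" and
# "70B" are treated as exactly 8 and 70 billion; the true counts differ slightly.
NOMINAL_PARAMS = {
    "Llama 3.1-8B": 8_000_000_000,
    "Llama 3.3-70B": 70_000_000_000,
}

PCT_FEWER_PARAMS = 60  # reduction reported by Multiverse


def remaining_params(original: int, pct_fewer: int) -> int:
    """Return the parameter count left after removing pct_fewer percent."""
    # Integer arithmetic avoids floating-point rounding on large counts.
    return original * (100 - pct_fewer) // 100


for name, count in NOMINAL_PARAMS.items():
    kept = remaining_params(count, PCT_FEWER_PARAMS)
    print(f"{name}: about {kept / 1e9:.1f}B parameters remain")
```

Under these assumptions, the compressed models would hold roughly 3.2 billion and 28 billion parameters respectively.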