Multiverse Says It Compresses Llama Models by 80%

Posted by staff on 2025-04-08 13:00 (#6WFPQ)

Multiverse Computing today released two new AI models compressed with CompactifAI, Multiverse's AI compressor: 80 percent compressed versions of Llama 3.1-8B and Llama 3.3-70B. The company says both models have 60 percent fewer parameters than the originals, are 84 percent more energy efficient, run inference 40 percent faster, and cut costs by 50 percent ....

The post Multiverse Says It Compresses Llama Models by 80% appeared first on High-Performance Computing News Analysis | insideHPC.
