Article 6WFPQ Multiverse Says It Compresses Llama Models by 80%

by staff, from High-Performance Computing News Analysis | insideHPC (#6WFPQ)

Multiverse Computing today released two new AI models compressed with CompactifAI, the company's AI compressor: 80 percent compressed versions of Llama 3.1-8B and Llama 3.3-70B. According to the company, both models have 60 percent fewer parameters than the originals, are 84 percent more energy efficient, run inference 40 percent faster, and yield a 50 percent cost reduction ....
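
To put the "60 percent fewer parameters" figure in concrete terms, the arithmetic can be sketched as below. This is only an illustration: it assumes the nominal sizes implied by the model names (exactly 8 billion and 70 billion parameters), and the function and variable names are hypothetical, not part of CompactifAI or any Multiverse tooling.

```python
# Nominal parameter counts read off the model names (an assumption for
# illustration; the real checkpoints are not exactly round numbers).
NOMINAL_PARAMS = {
    "Llama 3.1-8B": 8_000_000_000,
    "Llama 3.3-70B": 70_000_000_000,
}

# "60 percent fewer parameters", as claimed in the announcement.
PARAM_REDUCTION_PCT = 60


def remaining_params(original: int, reduction_pct: int) -> int:
    """Return the parameter count left after removing reduction_pct percent."""
    # Integer arithmetic avoids floating-point rounding in the result.
    return original * (100 - reduction_pct) // 100


if __name__ == "__main__":
    for name, count in NOMINAL_PARAMS.items():
        left = remaining_params(count, PARAM_REDUCTION_PCT)
        print(f"{name}: {count / 1e9:.1f}B -> {left / 1e9:.1f}B parameters")
```

Under these assumptions the compressed models would hold roughly 3.2 billion and 28 billion parameters. The sketch covers only the parameter claim; it does not model the separate 80 percent compression, energy, speed, or cost figures.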

