Multiverse Says It Compresses Llama Models by 80%

by staff, from Inside HPC & AI News | High-Performance Computing & Artificial Intelligence

Multiverse Computing today released two new AI models compressed with CompactifAI, the company's AI compressor: 80 percent compressed versions of Llama 3.1-8B and Llama 3.3-70B. According to the company, both models have 60 percent fewer parameters than the originals, are 84 percent more energy efficient, run inference 40 percent faster, and yield a 50 percent cost reduction ...

The post Multiverse Says It Compresses Llama Models by 80% appeared first on High-Performance Computing News Analysis | insideHPC.
