Mistral Says Mixtral, Its New Open Source LLM, Matches or Outperforms Llama 2 70B and GPT3.5 on Most Benchmarks
Open source model startup Mistral AI released a new LLM last week with nothing but a torrent link. It has now offered some details about the model, Mixtral. From a report:

Mistral AI continues its mission to deliver the best open models to the developer community. Moving forward in AI requires taking new technological turns beyond reusing well-known architectures and training paradigms. Most importantly, it requires making the community benefit from original models to foster new inventions and usages. Today, the team is proud to release Mixtral 8x7B, a high-quality sparse mixture-of-experts (SMoE) model with open weights, licensed under Apache 2.0. Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference. It is the strongest open-weight model with a permissive license and the best model overall regarding cost/performance trade-offs. In particular, it matches or outperforms GPT3.5 on most standard benchmarks.

Mixtral has the following capabilities:
1. It gracefully handles a context of 32k tokens.
2. It handles English, French, Italian, German and Spanish.
3. It has strong performance in code generation.
4. It can be finetuned into an instruction-following model that achieves a score of 8.3 on MT-Bench.