Stability AI Launches StableLM, an Open Source ChatGPT Alternative
An anonymous reader quotes a report from Ars Technica: On Wednesday, Stability AI released a new family of open source AI language models called StableLM. Stability hopes to repeat the catalyzing effects of its Stable Diffusion open source image synthesis model, launched in 2022. With refinement, StableLM could be used to build an open source alternative to ChatGPT. StableLM is currently available in alpha form on GitHub in 3 billion and 7 billion parameter model sizes, with 15 billion and 65 billion parameter models to follow, according to Stability. The company is releasing the models under the Creative Commons BY-SA-4.0 license, which requires that adaptations credit the original creator and share the same license.

Stability AI Ltd. is a London-based firm that has positioned itself as an open source rival to OpenAI, which, despite its "open" name, rarely releases open source models and keeps its neural network weights -- the mass of numbers that defines the core functionality of an AI model -- proprietary. "Language models will form the backbone of our digital economy, and we want everyone to have a voice in their design," writes Stability in an introductory blog post. "Models like StableLM demonstrate our commitment to AI technology that is transparent, accessible, and supportive."

Like GPT-4 -- the large language model (LLM) that powers the most powerful version of ChatGPT -- StableLM generates text by predicting the next token (word fragment) in a sequence. That sequence starts with information provided by a human in the form of a "prompt." As a result, StableLM can compose human-like text and write programs. Like other recent "small" LLMs such as Meta's LLaMA, Stanford Alpaca, Cerebras-GPT, and Dolly 2.0, StableLM purports to achieve similar performance to OpenAI's benchmark GPT-3 model while using far fewer parameters -- 7 billion for StableLM versus 175 billion for GPT-3. Parameters are variables that a language model uses to learn from training data.
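The next-token generation loop described above can be illustrated with a toy sketch. This is not StableLM's actual implementation -- a real LLM computes next-token probabilities with a neural network conditioned on the whole sequence -- but the sampling loop has the same shape. The lookup table of probabilities here is entirely hypothetical:

```python
import random

# Hypothetical toy "model": a lookup table mapping the previous token
# to a probability distribution over possible next tokens. A real LLM
# like StableLM produces these probabilities from billions of learned
# parameters, conditioned on the entire prompt plus generated text.
NEXT_TOKEN_PROBS = {
    "<start>": {"The": 0.6, "A": 0.4},
    "The": {"model": 0.7, "cat": 0.3},
    "A": {"model": 0.5, "cat": 0.5},
    "model": {"writes": 0.5, "predicts": 0.5},
    "cat": {"sleeps": 1.0},
    "writes": {"text": 1.0},
    "predicts": {"tokens": 1.0},
}

def generate(prompt_token="<start>", max_tokens=4, seed=0):
    """Extend a sequence one token at a time by sampling from the
    predicted next-token distribution, stopping when no continuation
    is known or the token budget runs out."""
    rng = random.Random(seed)
    tokens = []
    current = prompt_token
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS.get(current)
        if dist is None:  # no known continuation: stop generating
            break
        choices, weights = zip(*dist.items())
        current = rng.choices(choices, weights=weights)[0]
        tokens.append(current)
    return " ".join(tokens)

print(generate())
```

Each pass through the loop appends one sampled token and feeds it back in as context for the next prediction, which is the same autoregressive pattern GPT-3, GPT-4, and StableLM all use.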
Having fewer parameters makes a language model smaller and more efficient, which can make it easier to run on local devices like smartphones and laptops. However, achieving high performance with fewer parameters requires careful engineering, which is a significant challenge in the field of AI.

According to Stability AI, StableLM has been trained on "a new experimental data set" based on an open source data set called The Pile, but three times larger. Stability claims that the "richness" of this data set, the details of which it promises to release later, accounts for the model's "surprisingly high performance" on conversational and coding tasks at smaller parameter sizes. In informal experiments, Ars found StableLM's 7B model "to perform better (in terms of outputs you would expect given the prompt) than Meta's raw 7B parameter LLaMA model, but not at the level of GPT-3," adding: "Larger-parameter versions of StableLM may prove more flexible and capable."
Read more of this story at Slashdot.