Argonne flexes spare supercompute to build private AI inference service

from www.theregister.com - Articles on (#75XN2)
Boffins at the Department of Energy's (DoE) Argonne National Laboratory near Chicago on Tuesday unveiled a new AI inference service cobbled together from spare supercomputing capacity. The hope is that the service can help researchers across the US, including DoE labs and those working on the Genesis Mission, advance scientific discovery across a range of fields.

Argonne is home to some of the world's largest supercomputing clusters, including the No. 3-ranked Aurora supercomputer. But its compute capacity also includes several smaller, AI-optimized systems.

As of writing, the lab's inference service is running atop two clusters. The first is the Sophia system, comprising 192 Nvidia A100 GPUs, most with 40 GB of memory. The second, dubbed Metis, is arguably the more interesting. That system features 32 of SambaNova's SN40L AI accelerators.

Moving forward, Argonne says that the inference service will also be extended to the Nvidia GH200-based Tara and B200-based Minerva systems.

The inference service provides researchers with access to a range of large language models (LLMs) through a chatbot-like portal. Models include OpenAI's GPT-OSS, Google's Gemma family, Meta's Llama herd, and a variety of domain-specific and custom models, like AuroraGPT.

And at least for some of its services, Argonne appears to be using Open WebUI, a popular self-hosted chatbot service we've explored on numerous occasions.

Argonne envisions researchers harnessing these models to securely analyze large datasets and experiment with integrating generative AI into their workflows.

"By making AI inference available as a shared resource, we are enabling researchers to apply AI at scale to their data, their simulations and their experiments without having to build and maintain their own infrastructure," ALCF director Michael Papka said in a statement.
Critically, the service enables DoE researchers to experiment with chatbots in a secure manner that doesn't expose data to public services like ChatGPT.

According to Argonne, researchers are already using the service to analyze experimental data in real time to predict things like plasma disruptions in fusion energy research. Boffins are also using the tech to sift through large quantities of data generated by particle accelerators and telescopes to narrow the search to the most likely candidates. By doing so, researchers can make better use of available supercomputing capacity, rather than wasting cycles brute forcing the problem.

While LLMs and other generative AI models still struggle with hallucinations and other erroneous behavior, there's a growing corpus of research suggesting that the technology can be used to automate research or supplement traditional climate or physics models.

For example, before it was air-gapped, the eggheads at Lawrence Livermore National Laboratory tasked El Capitan, the world's most powerful publicly known supe, with developing a new tsunami forecasting model. Meanwhile, Nvidia has demonstrated that AI climate models can identify storm cells faster and more accurately than existing models. ®