
Explainer: Why a Decades-Old Architecture Decision is Impeding the Power of AI Computing

by jelizondo from SoylentNews on (#70F3N)

hubie writes:

How the von Neumann bottleneck is impeding AI computing:

Most computers are based on the von Neumann architecture, which separates compute and memory. This arrangement has been perfect for conventional computing, but it creates a data traffic jam in AI computing.

AI computing has a reputation for consuming epic quantities of energy. This is partly because of the sheer volume of data being handled: training often requires billions or trillions of pieces of information to create a model with billions of parameters. But that's not the whole reason; it also comes down to how most computer chips are built.

Modern computer processors are quite efficient at performing the discrete computations they're usually tasked with, but their efficiency nosedives when they must wait for data to move back and forth between memory and compute. They're designed to cope by quickly switching over to some unrelated task, but in AI computing almost all the tasks are interrelated, so there often isn't much other work that can be done while the processor is stuck waiting, said IBM Research scientist Geoffrey Burr.

In that scenario, processors hit what is called the von Neumann bottleneck, the lag that happens when data moves slower than computation. It's the result of von Neumann architecture, found in almost every processor over the last six decades, wherein a processor's memory and computing units are separate, connected by a bus. This setup has advantages, including flexibility, adaptability to varying workloads, and the ability to easily scale systems and upgrade components. That makes this architecture great for conventional computing, and it won't be going away any time soon.
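
To see why the bus, rather than the arithmetic units, becomes the limit, consider a rough roofline-style estimate for a matrix-vector multiply, the core operation in applying a neural-network layer. The short Python sketch below is illustrative only: the peak-compute and bandwidth figures and the matvec_roofline helper are assumptions for the sake of the arithmetic, not the specs of any particular processor.

# Rough roofline-style estimate: is a matrix-vector multiply limited by
# compute or by data movement over the bus? The hardware numbers are
# illustrative assumptions, not measurements of any specific chip.

PEAK_FLOPS = 100e12       # assumed peak compute: 100 teraFLOP/s
MEM_BANDWIDTH = 2e12      # assumed memory bandwidth: 2 TB/s

def matvec_roofline(rows, cols, bytes_per_weight=2):
    """Estimate compute vs. data-movement time for y = W @ x."""
    flops = 2 * rows * cols                   # one multiply and one add per weight
    traffic = rows * cols * bytes_per_weight  # each weight crosses the bus once
    return flops / PEAK_FLOPS, traffic / MEM_BANDWIDTH

t_compute, t_memory = matvec_roofline(4096, 4096)
print(f"compute: {t_compute * 1e6:.2f} us, memory: {t_memory * 1e6:.2f} us")
# Under these assumptions, moving the weights takes about 50 times longer
# than computing with them, so the processor mostly sits idle on the bus.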

But for AI computing, whose operations are simple, numerous, and highly predictable, a conventional processor ends up working below its full capacity while it waits for model weights to be shuttled to and from memory. Scientists and engineers at IBM Research are working on new processors, like the AIU family, which use various strategies to break down the von Neumann bottleneck and supercharge AI computing.
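
At inference time, that weight shuttling sets a hard floor on speed. As a back-of-the-envelope illustration (the parameter count and bandwidth below are assumed values chosen only to show the order of magnitude), generating a single token with a large model means streaming essentially every weight past the compute units once:

# Illustrative lower bound on inference latency set by weight movement
# alone. Model size and bandwidth are assumptions, not the specs of any
# IBM or other chip.

PARAMS = 70e9             # assumed model size: 70 billion parameters
BYTES_PER_PARAM = 2       # 16-bit weights
MEM_BANDWIDTH = 2e12      # assumed memory bandwidth: 2 TB/s

bytes_moved = PARAMS * BYTES_PER_PARAM            # whole model read once per token
seconds_per_token = bytes_moved / MEM_BANDWIDTH   # time spent just moving weights
print(f"{bytes_moved / 1e9:.0f} GB per token, "
      f"{seconds_per_token * 1e3:.0f} ms floor from bandwidth alone")
# About 140 GB and a ~70 ms floor per token, no matter how fast the
# arithmetic units are. Keeping weights stationary, next to or inside
# the compute units, removes this traffic from the critical path.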

The von Neumann bottleneck is named for mathematician and physicist John von Neumann, who first circulated a draft of his idea for a stored-program computer in 1945. In that paper, he described a computer with a processing unit, a control unit, memory that stored data and instructions, external storage, and input/output mechanisms. His description didn't name any specific hardware, likely to avoid security clearance issues with the US Army, for whom he was consulting. Almost no scientific discovery is made by one individual, though, and von Neumann architecture is no exception. Von Neumann's paper built on the work of J. Presper Eckert and John Mauchly, who invented the Electronic Numerical Integrator and Computer (ENIAC), the first programmable, general-purpose electronic digital computer. In the time since that paper was written, von Neumann architecture has become the norm.

"The von Neumann architecture is quite flexible, that's the main benefit," said IBM Research scientist Manuel Le Gallo-Bourdeau. "That's why it was first adopted, and that's why it's still the prominent architecture today."

Read more of this story at SoylentNews.
