Intel Unveils Big Processor Architecture Changes
This week Intel unveiled what senior vice president and general manager Raja Koduri called the company's biggest processor advances in a decade. They included two new x86 CPU core architectures, the straightforwardly named Performance-core (P-core) and Efficient-core (E-core). The cores are integrated into Alder Lake, a "performance hybrid" family of processors that includes new tech to let the upcoming Windows 11 OS run CPUs more efficiently.
With traditional ways of increasing the density of logic on a chip losing steam, processor architecture (basically, how a computer actually goes about its business) will have to carry more of the load. "This is an awesome time to be a computer architect," says Koduri. The new architectures and SoCs Intel unveiled "demonstrate how architecture will satisfy the crushing demand for more compute performance as workloads from the desktop to the data center become larger, more complex, and more diverse than ever," he says.
The two new x86 cores are aimed at different roles, but they carry out the same set of instructions and are meant to be combined, as they will be in the upcoming Alder Lake line of CPUs. On its face this combination looks similar to Arm's battery-saving big.LITTLE architecture, where low-priority work is handled by small low-power processor cores while demanding computation is taken by a higher-performing core. But Intel says that the way Alder Lake uses the mix of cores is oriented more toward boosting performance by using all the cores for workloads with lots of threads. (A thread is the smallest bit of a program that can be assigned to a resource in a processor.) The Efficient-cores can process one thread at a time, while the Performance-cores can perform multithreading.
The desktop, mobile, and ultramobile SoCs [left to right] have different mixes of Performance-cores [dark blue] and Efficient-cores [light blue]. Image: Intel
Many of the innovations in both cores had to do with speeding up the part of the core that deals with instructions. They decode more instructions per cycle, keep frequently used ones close by to save time, and better predict which instructions will come next in a program.
These, along with a host of other technologies, lead to an Efficient-core that's 40 percent more efficient at the same frequency, or 40 percent better performing at the same power consumption, than Intel's current Skylake core for a single thread. Those figures grow to 80 percent when four Efficient-cores working on four threads are compared with two Skylake cores working on four threads.
The new Performance-core architecture is designed to uncover more opportunities for parallelization while reducing latency. It leads to an average of 19 percent better performance on a suite of benchmark tests versus a Cypress Cove core when both are clocked at 3.3 gigahertz.
With Alder Lake, the new cores were put together in three different configurations meant to span desktop through "ultramobile" applications, consuming from 125 watts down to 9 watts. The desktop SoC has up to 8 P-cores and 8 E-cores, handles up to 24 threads at once, and contains up to 30 megabytes of cache memory. The mobile version has up to 6 P-cores and 8 E-cores, and the ultramobile has 2 P-cores and 8 E-cores. The SoCs are made using the Intel 7 process technology.
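The 24-thread figure for the desktop part follows from the core mix. Here is a minimal sketch of that arithmetic, assuming each P-core runs two threads at once (the article says only that P-cores can multithread) and each E-core runs one. The function name and the dictionary of configurations are illustrative, and the mobile and ultramobile totals are derived from that assumption rather than quoted from Intel.

```python
# Threads per core. One thread per E-core comes from the article;
# two threads per P-core is an assumption about its multithreading.
THREADS_PER_P_CORE = 2
THREADS_PER_E_CORE = 1


def max_threads(p_cores: int, e_cores: int) -> int:
    """Return the number of threads a given core mix can run at once."""
    return p_cores * THREADS_PER_P_CORE + e_cores * THREADS_PER_E_CORE


# Maximum (P-cores, E-cores) per configuration, as listed in the article.
configs = {
    "desktop": (8, 8),
    "mobile": (6, 8),
    "ultramobile": (2, 8),
}

for name, (p, e) in configs.items():
    print(f"{name}: up to {max_threads(p, e)} threads")
```

Under that assumption, the desktop configuration works out to 8 × 2 + 8 × 1 = 24 threads, which matches the number Intel gave.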
To make this combination of cores operate together best, work needs to be assigned to each in a way that maximizes performance under whatever conditions the CPU is experiencing. The thread scheduler in the operating system's kernel is usually charged with that task, but today it does this with little information about the state of the cores, and it works at a fairly simple level, such as whether a task is in the foreground, like a game, or in the background, like checking for new email. The scheduler's decisions "have a huge impact on user-perceived performance and power consumption," says Mehmet Iyigun, partner development manager at Microsoft.
Intel got together with Microsoft to design a hardware-based scheduler that would give Windows 11, due out later this year, much more granular and dynamic control. The result, Intel Thread Director, monitors the mix of instructions in each thread and the state of each core at the nanosecond level, provides the OS with feedback while programs are running, and adapts the guidance it gives the OS according to thermal and power limits, explains Rajshree Chabukswar, client architect at Intel.
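To make the idea of hint-driven scheduling concrete, here is a toy sketch of how a scheduler might combine the foreground/background distinction with feedback about a thread's instruction mix and the chip's power state. It is not Intel Thread Director's algorithm or any Windows API; the `Core` class, the `pick_core` function, and its three boolean inputs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    kind: str           # "P" for Performance-core, "E" for Efficient-core
    busy: bool = False


def pick_core(cores, foreground, heavy_instructions, power_limited):
    """Return the idle core this toy policy prefers, or None if all are busy.

    Hypothetical rule: demanding foreground work goes to a P-core unless
    the chip is power- or thermally limited; everything else goes to an
    E-core. If the preferred kind has no idle core, take any idle core.
    """
    idle = [c for c in cores if not c.busy]
    if not idle:
        return None
    wants_p = foreground and heavy_instructions and not power_limited
    wanted_kind = "P" if wants_p else "E"
    preferred = [c for c in idle if c.kind == wanted_kind]
    return (preferred or idle)[0]


cores = [Core("p0", "P"), Core("e0", "E")]
# A demanding foreground thread with power headroom lands on the P-core.
print(pick_core(cores, True, True, False).name)
# The same thread under a power limit is steered to the E-core instead.
print(pick_core(cores, True, True, True).name)
```

The point of the sketch is only that the same thread can be placed differently as conditions change, which is the kind of dynamic guidance the article describes the hardware feeding to the OS.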
In addition to the x86 developments, at Architecture Day this week, Intel also detailed Sapphire Rapids, the next generation of Intel's Xeon data center CPUs; Alchemist, Intel's first standalone GPU; and Ponte Vecchio, a monster system-in-package designed for the Aurora supercomputer and heavily reliant on Intel's advanced packaging technologies.