by ashilov@gmail.com (Anton Shilov) from Latest from Tom's Hardware on (#70DJC)
Nvidia has introduced Rubin CPX, a specialized GPU designed to accelerate the compute-heavy context phase of long-context inference in large AI models. By offloading that phase from 'big' GPUs with HBM to smaller GPUs with GDDR7 memory, it is meant to handle million-token workloads more efficiently.