Traditionally, GPUs are monolithic; manufacturers create the entire GPU as one large chip. It’s becoming harder to create smaller transistors, so the days of massive GPU chips may soon end. Instead, the future might be with the Multi-Chip Module (MCM).
What Is an MCM GPU?
The concept of an MCM GPU is simple. Instead of one large GPU chip containing all the processing elements, you have multiple smaller GPU units connected to each other using an extremely high-bandwidth connection system, sometimes referred to as a “fabric.” This allows the modules to speak to each other as if they were part of a monolithic GPU.
Making smaller GPU modules and binding them together is advantageous over the monolithic approach. For starters, you’d expect to get better yields from every silicon wafer since a flaw would only ruin one module rather than an entire GPU. This could lead to cheaper GPUs and make performance scaling much easier. If you want a faster graphics card, just add more modules!
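To put rough numbers on the yield argument, here is a minimal sketch using the simple Poisson yield model, in which the chance that a die has no defects is exp(-area × defect density). The defect density, die area, and module count below are hypothetical values picked only for illustration; they don't describe any real GPU or manufacturing process.

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies expected to have zero defects (simple Poisson model)."""
    return math.exp(-area_mm2 * defects_per_mm2)

# Hypothetical figures, assumed purely for illustration.
DEFECT_DENSITY = 0.001   # defects per mm^2
MONOLITHIC_AREA = 600.0  # mm^2 for one large monolithic die
MODULES = 4              # the same silicon area split into four modules

mono_yield = poisson_yield(MONOLITHIC_AREA, DEFECT_DENSITY)
module_yield = poisson_yield(MONOLITHIC_AREA / MODULES, DEFECT_DENSITY)

print(f"Monolithic die yield: {mono_yield:.1%}")
print(f"Single module yield:  {module_yield:.1%}")
```

With these assumed inputs, roughly 55% of the large dies come out defect-free compared with about 86% of the smaller modules, so far less silicon is thrown away. The sketch ignores packaging costs, interconnect overhead, and the practice of selling partially defective chips as lower-tier parts, all of which affect the real economics.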
Does Anyone Remember SLI and Crossfire?
The idea of using multiple chips to boost performance isn’t new. You may remember a time when the fastest gaming PCs used multiple graphics cards connected to each other. NVIDIA’s solution was known as SLI (Scalable Link Interface), and AMD had Crossfire.
The performance scaling was never perfect, with the second card adding perhaps 50-70% of performance on average. The main issue was finding a way to split the rendering load between two or more GPUs. This is a complex task, and both SLI and Crossfire were bandwidth-constrained.
There were also various graphical glitches and performance issues that resulted from this approach. Micro-stutters were rampant in the heyday of SLI. Nowadays, you won’t find this feature on most consumer GPUs, and thanks to how render pipelines work in games, SLI isn’t feasible anymore. Top-end cards like the RTX 3090 still have NVLink to connect multiple cards together, but this is for special GPGPU workloads rather than real-time rendering.
MCM GPUs present themselves to games and other graphics software as a single monolithic GPU. All of the load-balancing and coordination is handled at the hardware level, so the bad old days of SLI should not make a comeback.
It’s Already Happened to CPUs
If the idea of an MCM sounds familiar, it’s because this type of technology is already common in CPUs. Specifically, AMD is known for pioneering “chiplet” designs where its CPUs are made from multiple modules connected by “Infinity Fabric.” Intel has also been creating chiplet-based products since 2016.
Does Apple Silicon Have An MCM GPU?
The latest Apple Silicon chips contain multiple independent GPUs, so it’s not incorrect to think of them as an example of MCM GPU technology. Consider the Apple M1 Ultra, which is literally two M1 Max chips glued together by a high-bandwidth interconnect. Although the Ultra contains two M1 Max GPUs, they present a single GPU to any software that runs on your Mac.
This approach of making bigger, better chips by gluing multiple SoC (System on a Chip) modules together has proven quite effective for Apple!
MCM Technology Is Upon Us
At the time of writing, the upcoming GPU generations are RDNA 3 from AMD and the RTX 40-series from NVIDIA. Leaks and rumors indicate a strong chance that RDNA 3 will be an MCM GPU, and similar rumors abound about NVIDIA’s future GPUs.
This means that we may be on the cusp of a major leap in GPU performance, coupled with a potential drop in the price of flagship and other high-end cards as yields improve.