Xeon Platinum 9200 at Scale: Penguin Computing’s New 7616 Cores-Per-Rack Solution
by Dr. Ian Cutress on August 3, 2020 11:00 AM EST
Some computing workloads depend on density, and need to pack as many processing elements as possible into the smallest space. Intel’s Xeon Platinum 9200 range was created to solve this problem, but uptake seems to be limited by its high power consumption, which makes it suitable only for customers with deep pockets and the ability to deploy it. Penguin Computing has introduced a new Xeon Platinum 9200 platform, called TundraAP, to enable better power efficiency and higher compute density.
Advanced Performance = Rack Density
Ever since Intel announced its ‘Advanced Performance’ computing platform, combining two high-powered silicon dies in a single package for up to 56 cores and 400 W in a single socket, a lot of users have been skeptical of the benefits. Aside from the lack of pricing, the servers were only available from Intel’s partners, in a few configurations designed by Intel. Our discussions with system integrators at Supercomputing 2019 indicated that there was little interest in the product, and aside from a pair of supercomputers in the TOP500 list, volumes seem to be somewhat small.
The chip is very big, and BGA only, but in the right environment it allows for 224 cores in a 1U server. According to Penguin Computing, power efficiency and density issues mean that in most 1U configurations those cores are underutilized: either the excess idle power goes to waste, or the cores are thermally limited. The new TundraAP platform from Penguin Computing, and its first server, the Relion XO1122eAP, fits two of Intel’s S9200WK nodes into a 1U system, but does so with a power disaggregation design.
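For readers who want to check how the headline figure relates to the per-server numbers, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation from the figures quoted in this article (56 cores per socket, two nodes and 224 cores per 1U server, 7616 cores per rack). The function names are ours, and the resulting sockets-per-node and servers-per-rack values are derived rather than taken from a Penguin Computing specification sheet.

```python
# Figures quoted in the article.
CORES_PER_SOCKET = 56    # top Xeon Platinum 9200 part
NODES_PER_SERVER = 2     # S9200WK nodes per 1U server
CORES_PER_SERVER = 224   # cores per 1U server
CORES_PER_RACK = 7616    # headline cores-per-rack figure


def sockets_per_node(cores_per_server, nodes_per_server, cores_per_socket):
    """Derive the sockets in each node; fail if the figures do not divide evenly."""
    sockets, remainder = divmod(cores_per_server, nodes_per_server * cores_per_socket)
    if remainder:
        raise ValueError("core counts are not consistent with whole sockets")
    return sockets


def servers_per_rack(cores_per_rack, cores_per_server):
    """Derive the implied number of 1U servers per rack (an inference, not a spec)."""
    servers, remainder = divmod(cores_per_rack, cores_per_server)
    if remainder:
        raise ValueError("rack core count is not a whole number of servers")
    return servers


if __name__ == "__main__":
    print("sockets per node:", sockets_per_node(CORES_PER_SERVER, NODES_PER_SERVER, CORES_PER_SOCKET))
    print("1U servers per rack:", servers_per_rack(CORES_PER_RACK, CORES_PER_SERVER))
```

On these assumptions, the quoted numbers are consistent with dual-socket nodes and a rack populated with 34 of the 1U servers, assuming every server uses the 56-core part.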
This power disaggregation design removes the standard PSU location inside the server and moves it to dedicated power shelves and centralized DC busbars, such that the server receives power as soon as it is installed into the rack, with simplified power distribution that works like a backplane, but for power. With the power supplies moved elsewhere, they are no longer dumping extra heat into each of the two blades in the 1U form factor, allowing for optimized power delivery and better thermal management. It also allows power to be managed at the rack level rather than per node.
The design also benefits the types of cooling used on these chips, with Penguin Computing providing a custom direct-to-chip liquid system that enables better thermal coverage of the components that put out the most heat. Cooling can likewise be distributed and monitored at the rack level. The TundraAP platform makes use of the Open Compute Project form factor, with which Penguin Computing claims it can provide 15% more nodes per rack due to its better power efficiency.
These systems are set to be deployed from September. As always with Xeon Platinum 9200, the questions are which customers are buying them, and exactly how much they cost. Intel still refuses to put a comparative price on the Xeon 9200 series parts, instead stating that it’s a solution-level product.