NVIDIA announces the DGX-2

The DGX-2, the supercomputer NVIDIA has presented, builds on the earlier DGX-1 in several ways, offering twice the performance at a very steep price. First, it introduces NVIDIA's new NVSwitch, which enables 300GB/s chip-to-chip communication, 12 times the speed of a PCIe connection. Together with NVLink2, this allows sixteen GPUs to be grouped into a single system, bringing the total bandwidth to over 14TB/s. Add a pair of Xeon CPUs, 1.5TB of RAM and 30TB of NVMe storage, and the result is a system that consumes 10 kW and weighs 350 lbs but, according to NVIDIA, easily offers twice the performance of the DGX-1.

DGX-2 is 2 times more powerful than DGX-1

NVIDIA is also boasting of 2 PFLOPs of performance when the tensor cores are used.

NVIDIA has used a double-stacked design. The concept photo indicates that there are actually 12 NVSwitches (216 ports) in the system, to maximize the bandwidth available between GPUs. With 6 ports per Tesla V100 GPU, each of which carries 32GB of HBM2 memory, the Teslas alone would take up 96 of those ports if NVIDIA wired them fully to maximize each GPU's bandwidth.
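The port arithmetic above can be checked with a short sketch. It uses only the figures quoted in the article. The even split of ports across switches is an assumption inferred from the 216-port total, and the variable names are illustrative.

```python
# Port budget implied by the figures quoted in the article.
NVSWITCH_COUNT = 12         # switches visible in the concept photo
TOTAL_SWITCH_PORTS = 216    # total ports across all switches
GPU_COUNT = 16              # Tesla V100 GPUs in the system
NVLINK_PORTS_PER_GPU = 6    # NVLink ports on each V100

# Assumption: the ports are spread evenly across the switches.
ports_per_switch = TOTAL_SWITCH_PORTS // NVSWITCH_COUNT

# Ports consumed if every GPU link is wired to a switch.
gpu_ports_needed = GPU_COUNT * NVLINK_PORTS_PER_GPU

# Ports not attached to a GPU (for example, switch-to-switch links).
ports_left_over = TOTAL_SWITCH_PORTS - gpu_ports_needed

print(ports_per_switch, gpu_ports_needed, ports_left_over)
```

The 96 ports taken by the GPUs are well under half of the total, which fits the idea that the remaining ports are there to carry traffic between the switches themselves.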

The DGX-2's design means that all 16 GPUs can share memory in a unified way, albeit with the usual trade-offs of going off-chip. As with the Tesla V100's increased memory capacity, one of NVIDIA's goals here is to build a system that can hold in-memory workloads that would be too large for an 8 GPU cluster.
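To make the capacity argument concrete, here is a minimal sketch that compares the pooled HBM2 of a 16 GPU and an 8 GPU system. It assumes 32GB per GPU in both cases and ignores any memory overhead. The helper names and the 400GB workload are hypothetical and chosen only for illustration.

```python
HBM2_PER_GPU_GB = 32  # per-GPU HBM2 capacity quoted in the article


def pooled_memory_gb(gpu_count: int) -> int:
    """Total HBM2 across all GPUs, ignoring any overhead (assumption)."""
    return gpu_count * HBM2_PER_GPU_GB


def fits_in_memory(workload_gb: float, gpu_count: int) -> bool:
    """True if the workload fits in the pooled HBM2 of gpu_count GPUs."""
    return workload_gb <= pooled_memory_gb(gpu_count)


# Hypothetical 400GB in-memory workload.
workload_gb = 400
print(pooled_memory_gb(16), fits_in_memory(workload_gb, 16))
print(pooled_memory_gb(8), fits_in_memory(workload_gb, 8))
```

Under these assumptions, the hypothetical workload fits in the 16 GPU pool but not in the 8 GPU one.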

The DGX-2 is being launched for companies that are focused on deep learning and can afford a very large investment. The system is priced at $400,000, compared with $150,000 for the original DGX-1.

Source: AnandTech
