Introducing the Cloud-Native Supercomputing Architecture

Historically, supercomputers were designed to run a single application and were confined to a small set of well-controlled users. With AI and HPC becoming primary compute environments for wide commercial use, supercomputers now need to serve a broad population of users and to host a more diverse software ecosystem, delivering non-stop services dynamically. New supercomputers must be architected to deliver bare-metal performance in a multi-tenancy environment.

The design of a supercomputer focuses on its most important mission: maximum performance with the lowest overhead. The goal of the cloud-native supercomputer architecture is to maintain these performance characteristics while meeting cloud services requirements: least-privilege security policies and isolation, data protection, and instant, on-demand AI and HPC services.

The data processing unit, or DPU, is an infrastructure platform that is architected to deliver infrastructure services for supercomputing applications while maintaining their native performance. The DPU handles all provisioning and management of hardware and the virtualization of services: computing, networking, storage, and security. It improves the overall performance of multi-user supercomputers by optimizing the placement of applications and by optimizing network traffic and storage performance, while assuring quality of service.

DPUs also support protected data computing, making it possible to use supercomputing services to process highly confidential data. The DPU architecture securely transfers data between client storage and the cloud supercomputer, executing data encryption on behalf of the user.
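To make that flow concrete, here is a minimal sketch of client-side symmetric encryption in Python, assuming the third-party cryptography package. It illustrates the concept only: on BlueField the equivalent work is done by dedicated crypto engines on the data path, not in application software.

    # Conceptual illustration only: encrypting data before it is staged
    # to shared storage, then decrypting it on behalf of the user.
    # On BlueField this work is done by hardware crypto engines.
    from cryptography.fernet import Fernet  # pip install cryptography

    # Hypothetical key handling: in a real tenant workflow the key would
    # come from the tenant's own key-management service, never from the
    # shared infrastructure.
    key = Fernet.generate_key()
    cipher = Fernet(key)

    plaintext = b"confidential simulation input"
    ciphertext = cipher.encrypt(plaintext)   # what shared storage sees
    restored = cipher.decrypt(ciphertext)    # decrypted for the user
    assert restored == plaintext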

The NVIDIA BlueField DPU consists of the industry-leading NVIDIA ConnectX network adapter, combined with an array of Arm cores; purpose-built, high-performance-computing hardware acceleration engines with full data-center-infrastructure-on-a-chip programmability; and a PCIe subsystem. The combination of the acceleration engines and the programmable cores makes it possible to migrate complex infrastructure management, user isolation, and protection from the host to the DPU, simplifying them and eliminating their associated overheads, as well as to accelerate high-performance communication and storage frameworks.

By migrating infrastructure management, user isolation and security, and the communication and storage frameworks from the untrusted host to the trusted infrastructure control plane that the DPU is part of, truly cloud-native supercomputing becomes possible for the first time. The CPUs and GPUs can devote more of their compute cycles to applications and operate more synchronously, yielding higher overall performance and scalability.

The BlueField DPU enables a zero-trust supercomputing domain at the edge of every node, providing bare-metal performance with full isolation and protection in a multi-tenancy supercomputing infrastructure.

The BlueField DPU can host untrusted multi-node tenants and ensure that supercomputing resources used by one tenant will be handed over clean to a new tenant. As part of this process, the BlueField DPU protects the integrity of the nodes, reprovisions resources as needed, clears states left behind, provides a clean boot image for a newly scheduled tenant, and more.
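In pseudocode, that handover sequence might look like the sketch below. Every function here is hypothetical and stands in for a step that BlueField performs in firmware and through the infrastructure control plane, out of band from the host:

    # Hypothetical tenant-handover sequence. None of these functions are
    # real NVIDIA APIs; each stands in for an out-of-band step the DPU
    # and control plane perform between tenants.

    def attest_node(node):
        print(f"verifying firmware and boot integrity of {node}")

    def clear_tenant_state(node):
        print(f"scrubbing memory, storage, and network state on {node}")

    def provision(node, tenant):
        print(f"reprovisioning {node} for {tenant}")

    def boot_clean_image(node, tenant):
        print(f"booting {node} from a clean image for {tenant}")

    def hand_over(node, new_tenant):
        attest_node(node)            # protect the integrity of the node
        clear_tenant_state(node)     # clear states left behind
        provision(node, new_tenant)  # reprovision resources as needed
        boot_clean_image(node, new_tenant)

    hand_over("node-042", "tenant-b")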

HPC and AI communication frameworks such as Unified Communication X (UCX), Unified Collective Communications (UCC), the Message Passing Interface (MPI), and Symmetric Hierarchical Memory (SHMEM) provide programming models for exchanging data between cooperating parallel processes. These libraries include point-to-point and collective communication semantics (with or without data) for synchronization, data collection, or reduction purposes. They are latency and bandwidth sensitive and play a critical role in determining application performance. Offloading the communication libraries from the host to the DPU lets communication and computation progress in parallel (that is, overlap) and reduces the negative effect of system noise.
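For example, with MPI's non-blocking collectives a process can start an all-to-all exchange, compute on independent data while it progresses, and complete the exchange afterward. The minimal sketch below uses the mpi4py Python bindings (an assumption for illustration; production HPC codes typically call MPI from C or Fortran):

    # Overlap pattern with a non-blocking MPI all-to-all.
    # Run with, for example: mpirun -np 4 python overlap.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    n = 1 << 16                              # elements sent to each rank

    sendbuf = np.full(size * n, comm.Get_rank(), dtype='d')
    recvbuf = np.empty(size * n, dtype='d')

    req = comm.Ialltoall(sendbuf, recvbuf)   # start the exchange

    local = np.sin(np.ones(n)) * 2.0         # compute on independent data
                                             # while the exchange progresses

    req.Wait()                               # complete the exchange

With a host-based MPI, the exchange may only advance inside MPI calls; offloading progress to the DPU is what makes the overlap real in practice.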

BlueField DPUs include dedicated hardware acceleration engines (for example, NVIDIA In-Network Computing engines) that accelerate parts of the communication frameworks, such as data-reduction-based collective communications and tag matching. The remaining parts of the communication frameworks can be offloaded to the DPU Arm cores, enabling asynchronous progress of the communication semantics. One example is leveraging BlueField for MPI non-blocking All-to-All collective communication. The MVAPICH team at Ohio State University (OSU) and the X-ScaleSolutions team have migrated this MPI collective operation onto the DPU Arm cores with the OSU MVAPICH MPI and have demonstrated 100 percent overlap of communication and computation, 99 percent higher than using the host CPU for this operation.
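Overlap can be quantified in the spirit of the OSU micro-benchmarks: time the communication alone, the computation alone, and the two together, then see how much of the shorter phase was hidden. The sketch below (mpi4py again) uses a common simplification of that metric, not the exact OSU definition:

    # Rough estimate of communication/computation overlap for Ialltoall.
    import time
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    n = 1 << 16
    sendbuf = np.full(size * n, comm.Get_rank(), dtype='d')
    recvbuf = np.empty(size * n, dtype='d')

    def compute():                           # stand-in for application work
        x = np.ones(1 << 20)
        for _ in range(10):
            x = np.sin(x)

    t0 = time.perf_counter()
    comm.Ialltoall(sendbuf, recvbuf).Wait()
    t_comm = time.perf_counter() - t0        # communication alone

    t0 = time.perf_counter()
    compute()
    t_comp = time.perf_counter() - t0        # computation alone

    t0 = time.perf_counter()
    req = comm.Ialltoall(sendbuf, recvbuf)
    compute()
    req.Wait()
    t_both = time.perf_counter() - t0        # the two together

    # 1.0 means fully overlapped, 0.0 means fully serialized.
    overlap = max(0.0, (t_comm + t_comp - t_both) / min(t_comm, t_comp))
    if comm.Get_rank() == 0:
        print(f"estimated overlap: {overlap:.0%}")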

Parallel Three-Dimensional Fast Fourier Transforms (P3DFFT) is a library used for large-scale computer simulations in a wide range of fields, including studies of turbulence, climatology, astrophysics, and material science. P3DFFT is written in Fortran 90 and is optimized for parallel performance. It uses MPI for interprocessor communication and depends heavily on the performance of MPI All-to-All. Leveraging the OSU MVAPICH MPI over BlueField, the OSU and X-ScaleSolutions teams have demonstrated a 1.4X performance acceleration for P3DFFT.[1]
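The all-to-all dependence comes from the transpose steps of a distributed FFT: each rank transforms the axes it holds locally, then redistributes the data so the remaining axis becomes local. The toy 2-D analogue below (numpy plus mpi4py; the real P3DFFT is a 3-D Fortran library) shows why that transpose is exactly an MPI all-to-all:

    # 2-D distributed FFT via slab decomposition and an all-to-all
    # transpose -- a simplified analogue of what P3DFFT does in 3-D.
    # Run with a process count p that divides N, e.g. mpirun -np 4 ...
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    p, rank = comm.Get_size(), comm.Get_rank()
    N = 64                                   # global N x N grid
    rows = N // p

    slab = np.random.rand(rows, N)           # this rank's rows of the grid
    a = np.fft.fft(slab, axis=1)             # FFT along the local axis

    # Transpose across ranks: split the columns into p blocks and send
    # block c to rank c -- this is the all-to-all the text refers to.
    blocks = np.ascontiguousarray(a.reshape(rows, p, rows).swapaxes(0, 1))
    recv = np.empty_like(blocks)
    comm.Alltoall(blocks, recv)

    cols = recv.reshape(N, rows)             # this rank's columns, assembled
    result = np.fft.fft(cols, axis=0)        # FFT along the remaining axis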

[1] The performance tests were conducted by Ohio State University on the HPC-AI Advisory Council's Cluster Center, with the following system configuration: 32 servers with dual-socket, 16-core Intel Xeon E5-2697A V4 CPUs @ 2.60GHz (32 cores per node), 256GB of DDR4-2400 RDIMM memory, and a 1TB 7.2K RPM SATA 2.5-inch hard drive per node. The servers were connected with NVIDIA BlueField-2 InfiniBand HDR100 DPUs and an NVIDIA Quantum QM8700 40-port HDR 200Gb/s InfiniBand switch.

Extracting the highest possible performance from supercomputing systems while achieving efficient utilization has traditionally been incompatible with the secured, multi-tenant architecture of modern cloud computing. A cloud-native supercomputing platform provides the best of both worlds for the first time, combining peak performance and cluster efficiency with a modern zero-trust model for security isolation and multi-tenancy.

Learn more about the NVIDIA Cloud-Native Supercomputing Platform.

© 2021 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, BlueField, ConnectX, DOCA, and Magnum IO are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated. All other trademarks are property of their respective owners.

ARM, AMBA and ARM Powered are registered trademarks of ARM Limited. Cortex, MPCore and Mali are trademarks of ARM Limited. ARM is used to represent ARM Holdings plc; its operating company ARM Limited; and the regional subsidiaries ARM Inc.; ARM KK; ARM Korea Limited.; ARM Taiwan Limited; ARM France SAS; ARM Consulting (Shanghai) Co. Ltd.; ARM Germany GmbH; ARM Embedded Technologies Pvt. Ltd.; ARM Norway, AS and ARM Sweden AB.
