
Cluster Computing System: HPE Apollo 2000 and DL380


The Cluster Computing System comprises 104 CPU nodes of HPE Apollo 2000 and 9 GPU nodes of HPE DL380.

Hardware (CPU Nodes)

The CPU nodes are HPE Apollo 2000 systems of three types that differ only in installed memory (104 nodes in total): 44 nodes with 256GB, 40 with 512GB, and 20 with 1,024GB. Each compute node has two sockets of the 64-bit Intel Xeon Gold 6348 processor (2.6GHz, 28 cores each), for a total of 56 cores per node.

The system architecture of the cluster minimizes the impact of node failures on the overall system and offers good maintainability.

CPU node: Apollo 2000
CPU: Intel Xeon Gold 6348 (Ice Lake)
Clock frequency: 2.6GHz
L3 cache: 42MB
Cores per node: 56
CPU peak performance per node: 4.65TFLOPS
Memory per node: DDR4-3200, 256GB / 512GB / 1,024GB
Storage per node: 480GB SATA RI SSD
OS: Red Hat Enterprise Linux 8.7
Interconnect (between nodes): InfiniBand HDR200 (200Gbps)
Number of nodes: 44 (256GB) / 40 (512GB) / 20 (1,024GB)
CPU peak performance (aggregate): 205TFLOPS / 186TFLOPS / 93.2TFLOPS
Batch queue: SMALL, APC
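The peak-performance figures above follow from simple arithmetic. As a sketch, assuming 32 FP64 FLOPs per core per cycle (two AVX-512 FMA units per core on Ice Lake, an assumption not stated on this page):

```python
# Reproduce the quoted CPU peak-performance figures.
CORES_PER_NODE = 56      # 2 sockets x 28 cores
CLOCK_GHZ = 2.6          # base clock of Xeon Gold 6348
FLOPS_PER_CYCLE = 32     # assumed: 2 AVX-512 FMA units x 16 FP64 FLOPs each

peak_per_node_tflops = CORES_PER_NODE * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000
print(f"Peak per node: {peak_per_node_tflops:.2f} TFLOPS")  # ~4.66 TFLOPS

# Aggregate peak for each memory configuration.
for nodes in (44, 40, 20):
    print(f"{nodes} nodes: {nodes * peak_per_node_tflops:.1f} TFLOPS")
```

The per-configuration totals (205.0, 186.4, and 93.2 TFLOPS) match the table up to rounding.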

Hardware (GPU Nodes)

The GPU nodes are nine HPE DL380 Gen11 systems.

Each DL380 Gen11 node has two sockets of the 64-bit Intel Xeon Platinum 8462Y+ processor (2.8GHz, 32 cores each), for a total of 64 cores, along with 1,024GB of memory and two NVIDIA H100 PCIe GPU boards (80GB memory each).

When using GPU nodes, a dedicated batch queue (APG queue) must be used.

Note that the login nodes also have GPUs, but their boards are NVIDIA A100 PCIe (80GB memory), not H100.
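Since login and GPU nodes carry different boards, it can be useful to check which GPU model is visible before running. A minimal sketch using the standard `nvidia-smi` query interface (the helper names here are illustrative, not part of the system's documentation):

```python
import subprocess

def gpu_names():
    """Return the model names of all GPUs visible on this node."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]

def is_compute_gpu(name):
    # GPU nodes carry H100 boards; login nodes carry A100 boards.
    return "H100" in name
```

On a GPU node, `gpu_names()` would report two H100 entries; on a login node, A100 entries.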

GPU node: DL380 Gen11
CPU: Intel Xeon Platinum 8462Y+ (Sapphire Rapids)
Base clock frequency: 2.8GHz
L3 cache: 60MB
Cores per node: 64
CPU peak performance per node: 5.73TFLOPS
Memory per node: DDR5-4800 1,024GB
Internal disk per node: 480GB SATA RI SSD x2 (RAID1)
GPUs per node: NVIDIA H100 PCIe 80GB x2
OS: Red Hat Enterprise Linux 8.7
Interconnect (between nodes): InfiniBand HDR200 (200Gbps)
Number of nodes: 9
CPU peak performance (aggregate): 51.6TFLOPS
Batch queue: APG

GPU: NVIDIA H100 (PCIe)
Compute capability: 9.0
Microarchitecture: NVIDIA Hopper
TF32 Tensor Core performance: 378TFLOPS (756TFLOPS with sparsity)
FP64 Tensor Core performance: 51TFLOPS
FP32 performance: 51TFLOPS
FP64 performance: 26TFLOPS
Number of Tensor Cores: 456
Number of FP32 cores: 14,592
Host interface: PCI Express Gen5
Host interface bandwidth: 128GB/s
GPU memory: 80GB HBM2e
GPU memory bandwidth: 2,000GB/s
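The aggregate CPU figure for the GPU nodes can be reproduced with the same back-of-the-envelope arithmetic as for the CPU nodes (a sketch, again assuming 32 FP64 FLOPs per core per cycle, which this page does not state):

```python
# Reproduce the quoted CPU figures for the DL380 Gen11 GPU nodes.
CORES_PER_NODE = 64      # 2 sockets x 32 cores
CLOCK_GHZ = 2.8          # base clock of Xeon Platinum 8462Y+
FLOPS_PER_CYCLE = 32     # assumed: 2 AVX-512 FMA units x 16 FP64 FLOPs each

node_tflops = CORES_PER_NODE * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000
print(f"Per node:  {node_tflops:.2f} TFLOPS")      # ~5.73 TFLOPS
print(f"All nodes: {9 * node_tflops:.1f} TFLOPS")  # ~51.6 TFLOPS

# The "with sparsity" Tensor Core figure is simply twice the dense rate:
print(378 * 2)  # 756
```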

Software

The operating system on both Apollo 2000 and DL380 is Red Hat Enterprise Linux (RHEL), the same as on the HPE Superdome Flex (hereafter SDF).
In addition, the GPU nodes can run GPU-enabled applications.