HPC Cluster Hardware Resources: Difference between revisions


Revision as of 16:16, 6 September 2024

Networking/Interconnects

All nodes listed in the table below have multiple network connections. All nodes except gpu-node006 through gpu-node008 have 25 Gbps Ethernet interfaces; gpu-node006 through gpu-node008 have 10 Gbps Ethernet interfaces.

Network Storage

The ELSA cluster uses network-based storage that is shared among all nodes for storing personal files, project/research files, course files and the HPC applications. There are two pairs of identical storage servers: one pair is located in the STEM cluster room and the other in the Green Hall datacenter. Data from the STEM storage servers is replicated regularly (nightly) to the servers in Green Hall. Approximately 6.3PB of raw storage is available in total. The storage servers are Linux-based, using the ZFS on Linux filesystem and NFS for sharing with the cluster nodes.

Data Transfer Node

A data transfer node (DTN) is used for transferring large files in and out of a cluster. It is designed to handle high-speed, high-volume transfers. The ELSA DTN contains 122.9TB (raw) of SSD storage for temporarily holding large file transfers. It also has a 100Gbps Ethernet interface to maximize throughput.
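As a rough illustration of what the interface speeds quoted on this page mean in practice, the sketch below estimates the best-case time to move a dataset over each of them. It assumes the full line rate is sustained with no protocol, disk or storage overhead, which real transfers will not reach. The function name and the 1 TB example size are illustrative assumptions, not part of the cluster tooling.

```python
def ideal_transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Best-case seconds to move size_bytes over a link of link_gbps (1 Gbps = 10^9 bits/s)."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


# Interface speeds quoted on this page, in Gbps.
LINKS = {
    "gpu-node006 through gpu-node008": 10,
    "all other nodes": 25,
    "DTN": 100,
}

if __name__ == "__main__":
    one_tb = 1e12  # 1 TB (decimal); an arbitrary example size
    for name, gbps in LINKS.items():
        seconds = ideal_transfer_seconds(one_tb, gbps)
        print(f"{name}: {seconds:.0f} s at {gbps} Gbps")
```

Treat these figures as lower bounds; measured throughput depends on the whole path between source and destination.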

PerfSONAR

PerfSONAR is a network performance testing and monitoring system. It regularly runs bandwidth and latency tests and, if issues arise, helps pinpoint the location in the network path causing the issue.

Node Configurations

The following describes the contents of columns in the tables below.

  • Node Name = name of the node server. Login nodes (login001 & login002) are accessible via the elsa.hpc.tcnj.edu load-balancer (e.g. using SSH) from the campus network (wired or wireless) or via the TCNJ VPN. Other nodes are not meant to be directly accessed.
  • Processor Family = the generation of processor in the node. Among the Intel families: Skylake Gold > Skylake Silver > Broadwell
  • Available Cores = these are the processing cores that compute jobs can use
  • Reserved Cores = these cores are reserved for system use and not available to user jobs
  • RAM Memory = how much memory the node contains
  • GPU Count = number of GPU accelerators in the node
  • NVIDIA GPU Type = the model of the GPU accelerators in the node
  • Queue Membership = which queues (SLURM partitions) this node is a member of. A node can be a member of multiple queues. Note that some queues are used for internal purposes (e.g. remoteviz, interactive) and should not be used for submitting your jobs except under certain conditions. Please see the SLURM Partitions page for more information on the specifications of these queues/partitions.
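
The way the Available Cores, Reserved Cores, GPU Count and Queue Membership columns constrain a single-node job can be sketched as follows. This is a minimal illustration, not a site-provided tool: `NodeSpec`, `EXAMPLE_NODES` and `sbatch_args` are hypothetical names, the table excerpt covers only three queues, and `--gres=gpu:N` assumes the cluster names its GPU resource `gpu`. Check the SLURM Partitions page before relying on any of it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeSpec:
    available_cores: int  # cores that user jobs may use
    reserved_cores: int   # cores held back for system use
    gpu_count: int

    @property
    def physical_cores(self) -> int:
        # Available plus reserved cores gives the node's total core count.
        return self.available_cores + self.reserved_cores


# Excerpt of the node table on this page, keyed by queue (SLURM partition).
# Each of these queues contains a single node type, so one entry suffices.
EXAMPLE_NODES = {
    "normal": NodeSpec(available_cores=30, reserved_cores=2, gpu_count=0),
    "shortgpu": NodeSpec(available_cores=19, reserved_cores=1, gpu_count=4),
    "amd": NodeSpec(available_cores=62, reserved_cores=2, gpu_count=0),
}


def sbatch_args(partition: str, cores: int, gpus: int = 0) -> List[str]:
    """Build sbatch arguments for a one-node job, rejecting requests no node can satisfy."""
    node = EXAMPLE_NODES[partition]  # raises KeyError for queues not in the excerpt
    if cores > node.available_cores:
        raise ValueError(
            f"{partition}: {cores} cores requested, only {node.available_cores} available per node"
        )
    if gpus > node.gpu_count:
        raise ValueError(
            f"{partition}: {gpus} GPUs requested, only {node.gpu_count} per node"
        )
    args = ["sbatch", f"--partition={partition}", "--nodes=1", f"--cpus-per-task={cores}"]
    if gpus:
        args.append(f"--gres=gpu:{gpus}")  # assumption: the GPU GRES is named "gpu"
    return args
```

For example, `sbatch_args("shortgpu", 8, 2)` returns the argument list for an 8-core, 2-GPU job, while `sbatch_args("normal", 31)` raises `ValueError` because two of the 32 cores on those nodes are reserved and one more is not available to jobs.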

Note: New nodes from the NSF MRI 2018 grant are not yet listed in this table.

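{|
! Node Name !! Processor Family !! Available Cores !! Reserved Cores !! RAM Memory !! GPU Count !! NVIDIA GPU Type !! Queue Membership(s) !! Notes
|-
| login001 || virtual || 8 || 0 || 8G || 0 || n/a || n/a || Public hostname: elsa.hpc.tcnj.edu
|-
| login002 || virtual || 8 || 0 || 8G || 0 || n/a || n/a || Public hostname: elsa.hpc.tcnj.edu
|-
| osg-login || virtual || 4 || 0 || 4G || 0 || n/a || n/a || Dedicated for '''Open Science Grid''' job submissions
|-
| dev-node001 || Intel Broadwell || 20 || 0 || 256G || 1 || GTX 1080 || dev || '''GTX1080 (low-end)''' remote visualization option
|-
| node001 || AMD Epyc Genoa || 190 || 2 || 1,536G || 0 || n/a || amdtest || &nbsp;
|-
| node002 || AMD Epyc Genoa || 190 || 2 || 1,536G || 0 || n/a || amdtest || &nbsp;
|-
| node003 || AMD Epyc Genoa || 190 || 2 || 1,536G || 0 || n/a || amdtest || &nbsp;
|-
| node004 || AMD Epyc Genoa || 190 || 2 || 1,536G || 0 || n/a || amdtest || &nbsp;
|-
| node005 || AMD Epyc Genoa || 190 || 2 || 1,536G || 0 || n/a || amdtest || &nbsp;
|-
| node006 || AMD Epyc Genoa || 190 || 2 || 1,536G || 0 || n/a || amdtest || &nbsp;
|-
| node061 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node062 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node063 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node064 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node065 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node066 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node067 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node068 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node069 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node070 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node071 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node072 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node073 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node074 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node075 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node076 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node077 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node078 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node079 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node080 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node081 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node082 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node083 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node084 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || short, normal, long, interactive, nolimit || &nbsp;
|-
| node085 || Intel Skylake Gold || 30 || 2 || 192G || 0 || n/a || interactive || &nbsp;
|-
| node086 || AMD EPYC2 || 62 || 2 || 512G || 0 || n/a || amd || &nbsp;
|-
| node087 || AMD EPYC2 || 62 || 2 || 512G || 0 || n/a || amd || &nbsp;
|-
| gpu-node001 || Intel Broadwell || 19 || 1 || 256G || 4 || GTX 1080 || gpu || &nbsp;
|-
| gpu-node002 || Intel Broadwell || 19 || 1 || 256G || 4 || GTX 1080 || gpu || &nbsp;
|-
| gpu-node003 || Intel Broadwell || 19 || 1 || 256G || 4 || GTX 1080 || gpu || &nbsp;
|-
| gpu-node004 || Intel Broadwell || 19 || 1 || 256G || 4 || GTX 1080 || gpu || &nbsp;
|-
| gpu-node005 || Intel Broadwell || 19 || 1 || 256G || 4 || GTX 1080 || shortgpu || &nbsp;
|-
| gpu-node006 || Intel Skylake Silver || 19 || 1 || 192G || 8 || GTX 1080Ti || gpu || &nbsp;
|-
| gpu-node007 || Intel Skylake Silver || 19 || 1 || 192G || 8 || GTX 1080Ti || gpu || &nbsp;
|-
| gpu-node008 || Intel Skylake Silver || 19 || 1 || 192G || 8 || GTX 1080Ti || gpu || &nbsp;
|-
| gpu-node009 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node010 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node011 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node012 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node013 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node014 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node015 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node016 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node017 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node018 || Intel Skylake Gold || 31 || 1 || 384G || 4 || RTX 2080 || gpu || &nbsp;
|-
| gpu-node019 || AMD Epyc Genoa || 30 || 2 || 512G || 4 || L40S || gputest || &nbsp;
|-
| gpu-node020 || AMD Epyc Genoa || 30 || 2 || 512G || 4 || L40S || gputest || &nbsp;
|-
| gpu-node021 || AMD Epyc Genoa || 30 || 2 || 512G || 4 || L40S || gputest || &nbsp;
|-
| viz-node001 || Intel Skylake Gold || 38 || 2 || 768G || 1 || Tesla V100 || remoteviz || '''Tesla V100 (high-end)''' remote visualization option
|-
| viz-node002 || Intel Skylake Gold || 38 || 2 || 768G || 1 || Tesla V100 || remoteviz || '''Tesla V100 (high-end)''' remote visualization option
|}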
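
To get a feel for how much capacity each queue offers, the table above can be condensed into groups of identical nodes and summed per queue. The sketch below does this in Python. `NodeGroup`, `GROUPS` and `queue_totals` are hypothetical names introduced for illustration, and the figures are transcribed from this revision of the table, so they will go stale if the table changes. The virtual login nodes are omitted because they belong to no queue.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class NodeGroup(NamedTuple):
    count: int               # number of identical nodes in the group
    available_cores: int     # cores per node usable by jobs
    gpus: int                # GPUs per node
    queues: Tuple[str, ...]  # queues (SLURM partitions) the nodes belong to


GENERAL = ("short", "normal", "long", "interactive", "nolimit")

GROUPS = [
    NodeGroup(1, 20, 1, ("dev",)),          # dev-node001
    NodeGroup(6, 190, 0, ("amdtest",)),     # node001 to node006
    NodeGroup(24, 30, 0, GENERAL),          # node061 to node084
    NodeGroup(1, 30, 0, ("interactive",)),  # node085
    NodeGroup(2, 62, 0, ("amd",)),          # node086 to node087
    NodeGroup(4, 19, 4, ("gpu",)),          # gpu-node001 to gpu-node004
    NodeGroup(1, 19, 4, ("shortgpu",)),     # gpu-node005
    NodeGroup(3, 19, 8, ("gpu",)),          # gpu-node006 to gpu-node008
    NodeGroup(10, 31, 4, ("gpu",)),         # gpu-node009 to gpu-node018
    NodeGroup(3, 30, 4, ("gputest",)),      # gpu-node019 to gpu-node021
    NodeGroup(2, 38, 1, ("remoteviz",)),    # viz-node001 to viz-node002
]


def queue_totals(groups: Iterable[NodeGroup]) -> Dict[str, Tuple[int, int, int]]:
    """Return {queue: (nodes, available cores, GPUs)} summed over all groups."""
    totals = defaultdict(lambda: [0, 0, 0])
    for group in groups:
        for queue in group.queues:
            entry = totals[queue]
            entry[0] += group.count
            entry[1] += group.count * group.available_cores
            entry[2] += group.count * group.gpus
    return {queue: tuple(entry) for queue, entry in totals.items()}


if __name__ == "__main__":
    for queue, (nodes, cores, gpus) in sorted(queue_totals(GROUPS).items()):
        print(f"{queue}: {nodes} nodes, {cores} cores, {gpus} GPUs")
```

With the figures above, the gpu queue comes to 17 nodes, 443 available cores and 80 GPUs. A node that belongs to several queues is counted once in each, so the per-queue totals must not be added together to get a cluster-wide figure.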