Cmts1

Cmts1 is a cluster of the COMTACTS project.

Hardware Specifications

1 frontend node: 2x AMD EPYC Rome 16C 2.8GHz, 2 SSD 3.84TB (software RAID 1), 5 SSD 7.8GB (software RAID 5), 512GB DDR4-3200, 2x10Gb/s, 1x Alveo U280, InfiniBand dual port 2xHDR100

9 backend nodes, each with: 2x AMD EPYC Rome 16C 2.8GHz, 2 SSD 3.84TB (software RAID 1), 512GB DDR4-3200, 2x10Gb/s, 1x Alveo U280, 1x NVIDIA A100 40G, InfiniBand dual port 2xHDR100

Interconnection: 1 port 10Gbit/s (data network), 2xHDR100 (compute network), 1 port 1Gbit/s (IPMI)

Software

Compilers: gcc/g++/gfortran, icc

MPI distributions: intelmpi, mpich, openmpi

AI: horovod, tensorflow, keras, pytorch (with python3.7 and openmpi)

CUDA

NVIDIA HPC SDK

Intel OneAPI

Vitis and XRT

Scheduler

To launch jobs you must use the SLURM scheduler. Some example scripts are available here:

https://help.rc.ufl.edu/doc/Sample_SLURM_Scripts
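
As a rough starting point, a minimal batch script might look like the sketch below. The job name, task count, time limit and output file name are illustrative placeholders, not values prescribed for Cmts1. The GPU line is left commented out because it assumes the A100 cards are exposed as a generic resource named "gpu", which this page does not confirm.

```shell
#!/bin/bash
# Illustrative values only: adjust the name, task count and time limit to your job.
#SBATCH --job-name=cmts1-test
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=00:10:00
# %j in the output file name is replaced by the job ID.
#SBATCH --output=cmts1-test-%j.out
# Assumption: GPUs are configured as a GRES named "gpu". Remove one '#' to enable.
##SBATCH --gres=gpu:1

# Report where the job is running, using variables SLURM sets for each job.
describe_job() {
    echo "Node: $(uname -n)"
    echo "Job ID: ${SLURM_JOB_ID:-none}"
    echo "Tasks: ${SLURM_NTASKS:-unset}"
}

describe_job

# Inside a SLURM allocation, start one copy of the command per requested task.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun uname -n
fi
```

Saved under a name such as job.sh (a hypothetical file name), the script would be submitted with `sbatch job.sh`, and the queue can be checked with `squeue -u $USER`.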

Procedure

If you are a user of the COMTACTS project and want to reserve a machine, send an email to:

support@gap.upv.es

indicating the days you need.