3. High Performance Computing (HPC)

Researchers at MPSD have access to multiple HPC compute resources, which are hosted at the Max Planck Computing and Data Facility (MPCDF), at the GWDG, and at MPSD itself.

  1. Raven (since 2020)

    Based on Intel Xeon IceLake-SP processors and Nvidia A100 GPUs. 1592 CPU compute nodes, 114,624 CPU-cores, 375 TB RAM (DDR4), 7.5 PFlop/s theoretical peak performance (FP64), 192 GPU-accelerated nodes providing 768 Nvidia A100 GPUs, 30 TB GPU RAM (HBM2). Shared MPG resource.

    See Supercomputing services for details.

  2. Cobra (since 2018)

    Based on Intel Xeon Skylake-SP processors and Nvidia GPUs (V100, RTX5000): 3424 compute nodes, 136,960 CPU-cores, 128 Tesla V100-32 GPUs, 240 Quadro RTX 5000 GPUs, 529 TB CPU RAM (DDR4), 7.9 TB GPU RAM HBM2, 11.4 PFlop/s peak (FP64) + 2.64 PFlop/s peak (FP32). Shared MPG resource.

    See Supercomputing services for details.

  3. Ada

    Dedicated GPU-based HPC machine for PKS and MPSD with 72 GPU nodes. Each GPU node hosts 4 A100 GPUs, two Intel Xeon IceLake-SP 8360Y CPUs (72 cores in total) and 1 TB RAM.

    The machine is expected to become operational during spring 2022.

  4. MPSD HPC

    Hardware resources located at MPSD. The remainder of this section refers to this installation.

3.1. Login nodes

The following documentation is for the MPSD HPC hardware. (There is dedicated documentation elsewhere for Raven, Cobra and Ada.)

Login nodes are mpsd-hpc-login1.desy.de and mpsd-hpc-login2.desy.de.
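
For example, from within the MPSD/DESY network (prepend <username>@ if your local user name differs from your DESY account name):

$ ssh mpsd-hpc-login1.desy.de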

3.2. Job submission

Job submission is via Slurm.

The following partitions are available to all (partial output from sinfo):

PARTITION     TIMELIMIT  NODES  NODELIST
express         6:00:00      2  mpsd-hpc-ibm-[019-020]
interactive     8:00:00      1  mpsd-hpc-ibm-021
bigmem       28-00:00:0      8  mpsd-hpc-hp-[001-008]
gpu          28-00:00:0      2  mpsd-hpc-gpu-[001-002]
mpsd         28-00:00:0     22  mpsd-hpc-ibm-[022-030,035-036,043-049,053-062]
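
A comparable overview can be generated on a login node; the columns shown above roughly correspond to an sinfo format string like the following (the column selection is only a suggestion):

$ sinfo -o "%P %l %D %N"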

Please use the two GPU nodes only if your code supports NVIDIA CUDA.
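
As a starting point, a minimal batch script for the mpsd partition might look like the sketch below; the job name, resource values and executable are placeholders that you need to adapt to your job:

#!/bin/bash
#SBATCH --partition=mpsd      # express, interactive, bigmem, gpu or mpsd
#SBATCH --nodes=1
#SBATCH --ntasks=8            # see the per-node resources listed below
#SBATCH --time=02:00:00       # must stay below the partition's time limit
#SBATCH --job-name=myjob      # placeholder name

srun ./my_program             # placeholder executable

Save it as, for example, myjob.sh and submit it with:

$ sbatch myjob.sh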

Resources per node (an example interactive request follows this list):

  • express / interactive / mpsd partitions

    • 8 physical cores (16 with hyperthreading)

    • 64 GB RAM

    • only 10 Gbit/s Ethernet for MPI communication, so avoid running multi-node jobs in these partitions

  • bigmem

    • 96 physical cores (192 with hyperthreading)

    • 2 TB RAM

    • fast FDR InfiniBand for MPI communication

  • gpu

    • 16 physical cores (32 with hyperthreading)

    • 1.5 TB RAM

    • fast FDR InfiniBand for MPI communication

    • 8 Tesla V100 GPUs
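
For example, an interactive session on one of the GPU nodes could be requested along these lines (the --gres=gpu:<n> syntax is an assumption about the cluster's GRES configuration; adjust task counts and time limits to your needs):

$ srun -p gpu --gres=gpu:1 -n 8 -t 240 --pty bash        # one GPU, 8 tasks, 240 minutes
$ srun -p bigmem -n 96 -t 1-00:00:00 --pty bash          # 96 tasks, i.e. one full bigmem node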

3.3. Software

We are working on an upgrade of the software stack and documentation for the local HPC cluster. In the meantime, the following may be sufficient, or at least a starting point, before you contact us for further support (see below).

Software is provided via environment modules: check the available modules via module avail, and load the modules of your choice via module add <modulename>.

More modules become available with this command:

$ module use /opt/easybuild/modules/all/
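
Afterwards, for example (the module name below is only an illustration, borrowed from the Jupyter section further down; pick one from the output of module avail):

$ module avail
$ module add Anaconda3/5.1.0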

For additional software, you can try to install it yourself using Spack, or ask for help (see below).
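
A minimal Spack workflow could look like the following sketch (the package name is a placeholder, and this assumes Spack is set up in your shell environment):

$ spack install <package>
$ spack load <package>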

Todo

More material and details to be added.

3.4. Jupyter notebooks

You can use a Jupyter notebook on a dedicated HPC node as follows:

  1. Ensure you are at MPSD or have the DESY VPN set up.

  2. Log in to a login node (for example ssh mpsd-hpc-login1.desy.de).

  3. Request a node for interactive use. For example, 1 node for 600 minutes from the mpsd partition:

    srun -t 600 -n 1 -p mpsd --pty bash

  4. If this was successful, you are now in a terminal (bash) session on your dedicated node. Before you can start Jupyter, set the following environment variable:

    export XDG_RUNTIME_DIR=/tmp/${USER}

  5. You can install Jupyter yourself, or you can activate an already installed version with the following commands:

    module use /opt/easybuild/modules

    module load all/Anaconda3/5.1.0

  6. Start the Jupyter notebook server on that node with

    jupyter-notebook --ip=${HOSTNAME}.desy.de

    Watch the output displayed in your terminal. There is a line similar to this one:

    http://mpsd-hpc-ibm-055.desy.de:8888/?token=8814fea339b8fe7d3a52e7d03c2ce942a3f35c8c263ff0b8

    which you can paste as a URL into your browser (on your laptop/desktop); you should then be connected to the notebook server on the compute node.

3.5. Usage questions and support

For questions, including the installation of software, please contact Henning and Hans at ssu-cs@mpsd.mpg.de or via the #computing stream in Zulip.