Platform for Scientific Computing

WR Cluster Usage


Content


Accounts

Users have NIS accounts that are valid on all cluster nodes. Passwords can be changed with the passwd command on wr0. It takes some time (up to several minutes) until such a change is visible on all nodes. Please be aware that the account names are the same as your university account names, but the accounts themselves are different, including the passwords.
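For example, to change your cluster password (the login host name wr0.wr.inf.h-brs.de is taken from the X11 section below):

user@another_host: ssh user@wr0.wr.inf.h-brs.de
user@wr0: passwd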


File Systems

Each node (server as well as cluster nodes) has its own operating system on a local disc. Certain shared directory subtrees are exported via NFS to all cluster nodes. This includes user data (e.g. $HOME = /home/username) as well as commonly used application software (e.g. /usr/local).

The /tmp directory is guaranteed to be located on a fast node-local filesystem on all nodes. Within a batch job, the environment variable $TMPDIR contains the name of a job-private fast local directory (somewhere in /tmp on a node). This directory is created with a job-specific name at job start and removed on job termination (see also the section Temporary Files below). If possible, use this dynamically set environment variable for temporary files that are used only within one job run (a usage sketch follows the table below). The /scratch directory can/should be used for larger amounts of temporary data that needs to be available longer than one batch job run. The /scratch directory is shared between all nodes and access to it is slow. Please be aware that data on /tmp filesystems may be deleted without notice, and that there is no backup for the scratch filesystem!
mount point   located         purpose                     shared on all nodes   daily backup   default soft quota
/             local           operating system            no                    no             -
/home         remote server   user data                   yes                   yes            80GB
/usr/local    remote server   application software        yes                   yes            40GB
/tmp          local           node-local temporary data   no                    no             10GB
/scratch      remote server   shared temporary data       yes                   no             5TB
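As an illustration of this advice, the following job-script fragment is a minimal sketch (program and file names are placeholders; a per-user directory under /scratch is assumed): it writes temporary files to the fast job-private $TMPDIR and copies results that must outlive the job to the shared /scratch filesystem.

#!/bin/bash
#SBATCH --partition=any
#SBATCH --ntasks=1
#SBATCH --time=10:00

# work in the fast, job-private temporary directory (deleted at job end)
cd $TMPDIR
# run the program from the submit directory; its scratch files are written here
$SLURM_SUBMIT_DIR/my_program.exe
# keep results that must survive the job on the shared scratch filesystem
mkdir -p /scratch/$USER
cp result.dat /scratch/$USER/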

We have established quotas on file systems. Users can ask for their own quota with the command quota -s. The output shows the internal device names which can be translated to user filesystems as follows:
internal name        filesystem
/dev/md127           / (including /tmp)
/dev/md0             /home
/dev/md1             /usr/local
wr1:/raid1/scratch   /scratch
The maximum number of files per filesystem is by default restricted to 1 million / 2 million files (soft / hard limit). For the /scratch filesystem, the limits are 5 million / 6 million. If you have special demands beyond the default quota, please contact the system administrator.

File and Directory Names

Don't use spaces or umlauts in file or directory names, e.g. when copying from an MS Windows system. Otherwise you may run into trouble retrieving result files from the batch system (result code 2).


Software Packages

Besides a set of standard software packages, users can extend their package list with additional software packages or package versions. This is done by the user with the module command and its several subcommands. A software environment handled this way is called a module. Loading a module usually means that the search paths for commands, libraries, etc. are extended internally.

Usage

A module may exist in several versions, and the user can work with one specific version of choice. If no version is specified during the load, a default version is used. It is good practice to always use the default version of a module, even if the concrete version behind the default may change over time. Most modules are backward compatible, so this should cause no problems, and you always get the most recent and most reliable version of a module. A short overview of frequently used subcommands follows.
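The following overview is a sketch of commonly used subcommands of the module command (see man module for the complete list; <name> stands for a module name):

module avail                  # list all available modules
module list                   # list the currently loaded modules
module load <name>            # load the default version of a module
module load <name>/<version>  # load a specific version
module unload <name>          # unload a module
module whatis <name>          # show a short module description
module purge                  # unload all loaded modules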

Example: Instead of


user@wr0: module load gcc/8.1.0
just use

user@wr0: module load gcc

Examples


user@wr0: module avail

----------------------------------------------------------- /usr/local/modules/modulesfiles -----------------------------------------------------------
cmake/3.11.1           hwloc/1.11.10          java/10.0.1            octave/default         python3/3.6.5
dinero4/4.7            hwloc/2.0.1            java/default           ompp/0.8.5             python3/default
dinero4/default        hwloc/default          libFHBRS/3.1           ompp/default           sage/8.2
ffmpeg/4.0             intel-compiler/2018    libFHBRS/default       openmpi/gnu            sage/default
ffmpeg/default         intel-compiler/default likwid/4.3.2           openmpi/test           slurm/17.11.6
gcc/8.1.0              intel-mpi/2018         likwid/default         papi/5.6.0             texlive/2018
gcc/default            intel-mpi/default      metis/5.1.0-32         papi/default           texlive/default
gnuplot/5.2.3          intel-tools/2018       metis/5.1.0-64         python/2.7.15          valgrind/3.13.0
gnuplot/default        intel-tools/default    octave/4.4.0           python/default         valgrind/default

----------------------------------------------------------- /usr/share/Modules/modulefiles ------------------------------------------------------------
dot         module-git  module-info modules     null        use.own

------------------------------------------------------------------ /etc/modulefiles -------------------------------------------------------------------
mpi/compat-openmpi16-x86_64  mpi/mpich-x86_64             mpi/mvapich2-2.2-psm2-x86_64 mpi/mvapich2-psm-x86_64
mpi/mpich-3.0-x86_64         mpi/mvapich2-2.0-psm-x86_64  mpi/mvapich2-2.2-psm-x86_64  mpi/mvapich2-x86_64
mpi/mpich-3.2-x86_64         mpi/mvapich2-2.0-x86_64      mpi/mvapich2-2.2-x86_64      mpi/openmpi-x86_64

user@wr0: module whatis gcc
gcc                  : GNU compiler suite version 8.1.0

# check current compiler version (system default without loading a module)
user@wr0: gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)

# load default version
user@wr0: module load gcc
user@wr0: gcc --version
gcc (GCC) 8.1.0

# unload default version
user@wr0: module unload gcc
user@wr0: gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)

Available Modules

A list of selected software packages (possibly with several versions) that are handled using the module command:
name purpose
cmake CMake system
cuda CUDA development and runtime environment
gcc GNU compiler suite
gnuplot plot program
hwloc detect hardware properties
intel-compiler Intel Compiler environment
intel-mpi Intel MPI
intel-tools Intel development tools
java Oracle Java environment
likwid development tools
matlab Matlab mathematical software with toolboxes
metis graph partitioning package
octave GNU octave
ompp OpenMP tool
opencl OpenCL
openmpi OpenMPI environment
papi Papi performance counter library
pgi PGI compiler suite
python Python 2
python3 Python 3
sage Mathematical software system sage
slurm batch system
texlive TeX distribution
valgrind Valgrind software analysis tool

Initial Module Environment Setup

If you always need the same modules, you may include the load commands in your .bash_profile (executed once per session) or .bashrc (executed once per shell) file in your home directory. Example $HOME/.bashrc file:

module load intel-compiler openmpi/intel


Using the Batch System

A batch system is used on HPC systems to manage the work of many users. Users submit requests for computational work together with the (hardware) requirements necessary for their execution. The batch system then looks for resources that fulfill the requirements and starts the job as soon as such resources become available, which might be immediately or later. We use Slurm as the batch system and ask you to use it for all your work on all cluster nodes other than wr0. Slurm has a command line interface and additionally an X11-based graphical interface to display the batch system state. To work with batch jobs, a user usually performs the sequence of steps described below.

Usual Steps

1) Specify What Should Be Done

The first thing to do is to specify the work that the job has to do. This specification is done with a shell script (a file). Such a batch job script is a shell script that is submitted to and started by the batch system. In the batch script you specify all actions that should be done in your job, either sequentially or in parallel. The execution of the script later starts in the same directory from which you submitted the job.

Sequential Job

An example of such a batch script /home/user/job_sequential.sh is:

#!/bin/sh
# start sequential program
./test_sequential.exe
# change directory and execute another sequential program
cd subdir
./another_program.exe

OpenMP Job

An example of such a batch script /home/user/job_openmp.sh is:

#!/bin/sh
# set the number of threads
export OMP_NUM_THREADS=16
# start OpenMP program
./test_openmp.exe

MPI Job

An example of such a batch script /home/user/job_mpi.sh is:

#!/bin/sh
# load the OpenMPI environment
module load openmpi/gnu

# start here your MPI program
mpirun ./test_mpi.exe

2) Specify Which Resources You Need

Additionally, at the beginning of a job script, a description is given of which resources are needed for the execution. The syntax for that is a sequence of lines starting with #SBATCH (a special form of shell comment). Each such line specifies a certain part of the request. See the documentation of Slurm's sbatch for a list of all available options; here only an example is given, and more options are summarized later. An example of such a resource request is:
 
#!/bin/bash
#SBATCH --partition=any          # partition (queue)
#SBATCH --nodes=4                # number of nodes
#SBATCH --ntasks-per-node=32     # number of cores per node
#SBATCH --mem=4G                 # memory per node in MB (different units with suffix K|M|G|T)
#SBATCH --time=2:00              # total runtime of job allocation (format D-HH:MM:SS; first parts optional)
#SBATCH --output=slurm.%j.out    # filename for STDOUT (%N: nodename, %j: job-ID)
#SBATCH --error=slurm.%j.err     # filename for STDERR

# here comes the part with the description of the computational work, for example:
# load the OpenMPI environment
module load openmpi/gnu

# start here your MPI program
mpirun ./test_mpi.exe
The meaning of the individual lines is explained in the comments of this example. A job can only be started if all requested resources can be provided; for this example: 4 nodes in the partition any, each with 32 cores and 4 GB of memory, available for 2 minutes.

Instead of requesting 4 nodes with 32 cores each, it is also possible to request a certain number of cores / hardware threads that may be spread arbitrarily over several nodes. The example given above, adapted accordingly, is:
 
#!/bin/bash
#SBATCH --partition=any          # partition (queue)
#SBATCH --ntasks=80              # number of tasks     <---------- this is different from above
#SBATCH --mem=4G                 # memory per node in MB (different units with suffix K|M|G|T)
#SBATCH --time=2:00              # total runtime of job allocation (format D-HH:MM:SS; first parts optional)
#SBATCH --output=slurm.%j.out    # filename for STDOUT (%N: nodename, %j: job-ID)
#SBATCH --error=slurm.%j.err     # filename for STDERR

# here comes the part with the description of the computational work, for example:
# load the OpenMPI environment
module load openmpi/gnu

# start here your MPI program
mpirun ./test_mpi.exe
In this example, 80 parallel execution units are requested. This can be fulfilled by 4 nodes with 20 cores each, but also by one node with 80 cores, or by 80 nodes with one core used on each (the remaining cores of a node being left for other jobs). This specification gives the batch system more freedom to find resources, but the programming model is (usually) restricted to MPI, as a program run may be spread over several nodes.

3) Submit the Job

After you have specified the requested resources and the work to be done in a file, you submit this job script to the batch system. This is done with the sbatch command, using the job script filename as an argument. Example:

user@wr0: sbatch jobscript.sh
If the system accepts the request (i.e., there is no syntax error in the script, etc.), the batch system prints a job ID that can be used to refer to this job.
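For illustration, a successful submission prints the new job ID (the ID shown here is just an example):

user@wr0: sbatch jobscript.sh
Submitted batch job 4711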

4) Check Job Status

After submission you can check the status of your jobs with several commands, depending on the amount of information you want.
  1. You can view the status of all batch jobs in a web browser (link). The page is updated periodically.
  2. You can show the status of all of your non-finished jobs in a shell window with the squeue command.
    
    user@wr0: squeue
                 JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                    55       any test2.sh user     PD       0:00      2 (Resources)
                    56       any test3.sh user     PD       0:00      2 (Priority)
                    54       any test1.sh user      R       0:08      2 wr[50,51]
    
    In the example the user has submitted 3 jobs that are either running or still waiting. The column ST shows the job state (R=running, PD=pending/waiting).

5) Get Results

Output to stdout / stderr of your program is redirected to two files that you will find, after the job has finished, in the directory from which you submitted the job. The file names can be specified in the job script with the options shown above. It is a good idea to include at least the job ID in the filename.
Example:

user@wr0: ls -l
-rw------- 1 user fb02    316 Mar  9 07:27 slurm.51.out
-rw------- 1 user fb02  11484 Mar  9 07:27 slurm.52.err

Selected Batch System Commands

A summary of useful commands is given in the following table; a short usage example follows the table. See the appropriate man pages or the Slurm documentation for all available options and a full description.
command meaning
sbatch <shell-script> submit the shell-script to the batch system
scancel <jobid> delete a job with the given job ID, that may be either in running or waiting state
squeue show the state of own jobs in queues
sinfo [options] show the state of partitions or nodes
scontrol show job <jobid> show more details for the job
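A short usage example for these commands (the job IDs are illustrative):

user@wr0: sinfo                   # list partitions and node states
user@wr0: scontrol show job 54    # show details for job 54
user@wr0: scancel 55              # cancel the waiting or running job 55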

Partitions and Limits

We have established several partitions with different behaviour and restrictions. See the output of the sinfo command for a list of available partitions. Each partition has certain associated policies (hardware properties, maximum number of jobs in the queue, maximum runtime per job, scheduling priority, maximum physical memory, special hardware features).

Resource Limits

As part of a job submission, you can request main memory above the default of 1 GB. Be aware that not all of the main memory given in the hardware overview table can be allocated for your job on a node: the operating system needs some memory for itself, memory is pinned for efficient communication with a GPU, etc. For example, it might happen that on a system with 128 GB main memory only 120 GB are available for a job. Therefore the advice is to specify resource requests that fit your job's needs and not to request the maximum available resources of a node.

A list of the most important queues is:

queue name   maximum time per job   usable memory         default virt. memory/process   nodes used
any          72 hours               (dependent on node)   1 GB                            any node
hpc          72 hours               (dependent on node)   1 GB                            wr20-wr42, wr50-wr99
hpc3         72 hours               185 GB                1 GB                            wr50-wr99
hpc2         72 hours               120 GB                1 GB                            wr28-wr42
hpc1         72 hours               120 GB                1 GB                            wr20-wr27
gpu          72 hours               185 GB                1 GB                            wr15-wr19
gpu4         72 hours               185 GB                1 GB                            wr15
wr14         72 hours               120 GB                1 GB                            wr14
wr13         72 hours               85 GB                 1 GB                            wr13
wr43         72 hours               750 GB                1 GB                            wr43

Environment Variables and Modules

The batch system defines certain environment variables that you may use in your batch job script; a small example follows the table.

variable name              purpose                                         example
$SLURM_SUBMIT_DIR          working directory where the job was submitted   /home/user/testdir
$SLURM_JOB_ID              job ID given to the job                         65
$SLURM_JOB_NAME            job name given to the job                       testjob
$SLURM_JOB_NUM_NODES       number of nodes assigned to this job            2
$SLURM_JOB_CPUS_PER_NODE   number of cores per node assigned to this job   32(x5) (32 cores, on 5 nodes)
$SLURM_JOB_NODELIST        node names of assigned nodes                    wr[50,51]
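A small job-script fragment illustrating these variables is sketched below; the echo output ends up in the job's STDOUT file.

#!/bin/bash
#SBATCH --partition=any
#SBATCH --nodes=2
#SBATCH --time=2:00

# the submit directory is already the working directory at job start
cd $SLURM_SUBMIT_DIR
echo "job $SLURM_JOB_ID ($SLURM_JOB_NAME) uses $SLURM_JOB_NUM_NODES node(s): $SLURM_JOB_NODELIST"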

Special Requests

Hyperthreading
If you do not want to use Hyperthreads (i.e. you want to use only real cores), additionally specify in your job request: #SBATCH --ntasks-per-core=1
Example: one hpc3 node is requested that has 32 cores / 64 hardware threads. The program starts with 32 OpenMP (software) threads spread over all cores, not using Hyperthreading.
 
#!/bin/bash
#SBATCH --partition=hpc3         # partition
#SBATCH --nodes=1                # number of nodes
#SBATCH --ntasks-per-core=1      # use only real cores
#SBATCH --time=2:00              # total runtime of job allocation

export OMP_NUM_THREADS=32
./test_openmp.exe
Hybrid Programming Models
If you want to use hybrid programming models (e.g. MPI+OpenMP), you can influence the mapping of MPI processes to the requested hardware in several ways, e.g. through Slurm's task and CPU options and through mpirun's mapping and binding options; a sketch is given below.
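The following job script is a minimal sketch (not an official template) of an MPI+OpenMP hybrid run, assuming the hpc3 partition and the openmpi/gnu module used elsewhere in this document; the program name test_hybrid.exe is a placeholder. It requests 2 nodes with 4 MPI processes per node and 8 cores per process, and sets the OpenMP thread count from Slurm's per-task CPU count.

#!/bin/bash
#SBATCH --partition=hpc3         # partition
#SBATCH --nodes=2                # number of nodes
#SBATCH --ntasks-per-node=4      # MPI processes per node
#SBATCH --cpus-per-task=8        # cores per MPI process (used for OpenMP threads)
#SBATCH --time=10:00             # total runtime of job allocation

module load openmpi/gnu

# one OpenMP thread per core assigned to each MPI process
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# depending on the OpenMPI version, explicit mapping/binding options
# (e.g. --map-by socket --bind-to core) may further improve placement
mpirun ./test_hybrid.exe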

Temporary Files

For a batch job, an environment variable $TMPDIR is defined with the name of a temporary directory (with fast access) that should be used for fast temporary file storage within the job's scope. The directory is created at job start and deleted when the job finishes. Example of how to use the environment variable within a program:

#include <stdio.h>
#include <stdlib.h>

/* open a temporary file in the fast, job-private directory */
char *basedir = getenv("TMPDIR");
if (basedir != NULL)
  {
    char *filename = "test.dat";
    char allname[1024];
    snprintf(allname, sizeof(allname), "%s/%s", basedir, filename);
    FILE *f = fopen(allname, "w");
  }


Interactive Development

To speed up development cycles, you can use some nodes interactively. Additionally, you can use the srun command: srun --x11 --pty /bin/bash, with further options as needed; see the example below.
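For example, the following command requests an interactive shell on a compute node (partition, task count and time limit are illustrative; --x11 can be added for graphical programs if X11 forwarding is set up as described in the section X11 applications below):

user@wr0: srun --partition=any --ntasks=1 --time=1:00:00 --pty /bin/bash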


Compiler

All main development tools are available, among them compilers (C, C++, Java, Fortran) and parallel programming environments (OpenMP, MPI, CUDA, OpenCL, OpenACC). Application software is the responsibility of the users.
compiler        name        module command               documentation   safe optimization   debug option   compiler feedback             version
GNU C           cc / gcc    -                            man gcc         -O2                 -g             -ftree-vectorizer-verbose=2   --version
Intel C         icc         module load intel-compiler   man icc         -O2                 -g             -vec-report=2 (or higher)     --version
PGI C           pgcc        module load pgi              man pgcc        -O2                 -g             -Minfo=vec                    --version
GNU C++         g++         -                            man g++         -O2                 -g             -ftree-vectorizer-verbose=2   --version
Intel C++       icpc        module load intel-compiler   man icpc        -O2                 -g             -vec-report=2 (or higher)     --version
PGI C++         pgc++       module load pgi              man pgc++       -O2                 -g             -Minfo=vec                    --version
GNU Fortran     gfortran    -                            man gfortran    -O2                 -g             -ftree-vectorizer-verbose=2   --version
Intel Fortran   ifort       module load intel-compiler   man ifort       -O2                 -g             -vec-report=2 (or higher)     --version
PGI Fortran     pgfortran   module load pgi              man pgfortran   -O2                 -g             -Minfo=vec                    --version
Oracle Java     javac       module load java             -               -O                  -g             n.a.                          -version

Examples:
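For illustration, compiling a C file with safe optimization and vectorization feedback, using the flags from the table above (the file name t.c is a placeholder):

# GNU compiler (system default, no module needed)
gcc -O2 -ftree-vectorizer-verbose=2 t.c

# Intel compiler
module load intel-compiler
icc -O2 -vec-report=2 t.c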

On wr14 there is additionally the complete PGI compiler infrastructure with compilers and the profiler pgprof installed. Documentation is available under /usr/local/PGI/. The tool infrastructure can be used only on wr14; the generated code may be executed on all nodes. Exception: if you use the accelerator functionality of the PGI compiler, the code can be executed only on nodes with a GPU.


Base Software

The following base software is installed:

Intel MKL

The Intel Math Kernel Library (MKL) is installed. You can use this software after a module load intel-compiler, which expands the include file and library search paths accordingly. It should preferably be used on Intel-based systems, but it also works on AMD systems. The library contains basic mathematical functions (BLAS, LAPACK, FFT, ...). If you use any of the Intel compilers, just add the flag -mkl as a compiler and linker flag. Otherwise, check this page for the appropriate version and corresponding flags. Example for a Makefile:

CC      = icc
CFLAGS  = -mkl
LDLIBS  = -mkl
By default MKL uses all available cores. You can restrict this number with the environment variable MKL_NUM_THREADS, e.g.

export MKL_NUM_THREADS=1
before you start an MKL-based program.
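A complete sketch for compiling and running an MKL-based program (the file name test_mkl.c is a placeholder):

module load intel-compiler
icc -O2 -mkl test_mkl.c -o test_mkl.exe
export MKL_NUM_THREADS=4
./test_mkl.exe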


Parallel Programming

There are different approaches to parallel programming today: shared memory parallel programming based on OpenMP, distributed memory programming based on MPI, and GPGPU computing based on CUDA, OpenCL, OpenACC, or OpenMP (version 4 and later).

OpenMP

compiler               name                module command               documentation     version
GNU OpenMP C/C++       gcc/g++ -fopenmp    -                            man gcc           --version
Intel OpenMP C/C++     icc/icpc -qopenmp   module load intel-compiler   man icc / icpc    --version
PGI OpenMP C/C++       pgcc/pgCC -mp       module load pgi              man pgcc / pgCC   --version
Intel OpenMP Fortran   ifort -qopenmp      module load intel-compiler   man ifort         --version
GNU OpenMP Fortran     gfortran -fopenmp   -                            man gfortran      --version
PGI Fortran            pgfortran -mp       module load pgi              man pgfortran     --version

Example: Compile and run an OpenMP C file:


module load intel-compiler
icc -qopenmp -O2 t.c
export OMP_NUM_THREADS=8
./a.out

MPI

compiler                          name       module command              documentation   version
MPI C (based on gcc)              mpicc      module load openmpi/gnu     see gcc         --version
MPI C++ (based on gcc)            mpic++     module load openmpi/gnu     see g++         --version
MPI Fortran (based on gfortran)   mpif90     module load openmpi/gnu     see gfortran    --version
MPI C (based on icc)              mpiicc     module load openmpi/intel   see icc         --version
MPI C++ (based on icpc)           mpiicpc    module load openmpi/intel   see icpc        --version
MPI Fortran (based on ifort)      mpiifort   module load openmpi/intel   see ifort       --version

Which MPI compilers are used can be influenced through the module command: with module load openmpi/gnu you use the GNU compiler environment (gcc, g++, gfortran), and with module load openmpi/intel you use the Intel compiler environment (icc, icpc, ifort). Be aware that with module load openmpi/intel the MPI compiler names mpicc etc. are still mapped to the GNU compilers. To use an Intel compiler you need to use Intel's own wrapper names, i.e., mpiicc, mpiicpc, mpiifort.

All options discussed in the compiler section also apply here, e.g. optimization.

Example: Compile an MPI C file and generate optimised code:


module load openmpi/intel
mpiicc -O2 t.c

The MPI implementation we use (OpenMPI) has options to influence the communication medium. Within one node, MPI processes can communicate through shared memory, Omni-Path, or Ethernet with TCP/IP. Between nodes, Omni-Path or Ethernet with TCP/IP is possible. OpenMPI usually chooses the most appropriate medium, which means you don't need to specify anything. But if you want to choose a specific (and applicable) medium, you may specify this in the call to mpirun through the --mca btl option: mpirun --mca btl communication-channels ..., where communication-channels is a comma-separated list of communication media. Possible values are: sm for shared memory, openib for Omni-Path / InfiniBand, and tcp for Ethernet. The last specifier must be self.

Example:

mpirun --mca btl tcp,self -np 4 mpi.exe

OpenCL and CUDA

The nodes wr14-wr27 have an NVIDIA Tesla card installed (V100, K80, K20m). Program development can be done interactively on wr14 (i.e. ssh wr14), as all necessary drivers are installed locally on that system. Production runs on any Tesla card should be done using the appropriate batch queues. Use module load cuda to load the CUDA environment. Use module load opencl/nvidia or module load opencl/intel to load the OpenCL environment for Nvidia GPUs or Intel processors, respectively. With both modules, the standard environment variables CPATH for include files and LIBRARY_PATH for libraries are set accordingly, to be used e.g. in a makefile. Be aware that the CUDA environment requires certain gcc versions; usually the latest gcc version is not supported by Nvidia. In that case, for example, do a module load gcc/7.3.0 to load an older version (in this example 7.3.0).

To compile an OpenCL program on a node with the appropriate software environment installed proceed as follows:

module load opencl
cc opencltest.c -lOpenCL
./a.out
To compile a CUDA project use the following Makefile template:

# defines
CC              = cc
CUDA_CC         = nvcc
LDLIBS          = -lcudart

# default rules based on suffices
#       C
%.o: %.c
        $(CC) -c $(CFLAGS) -o $@ $<

#       CUDA
%.o: %.cu
        $(CUDA_CC) -c $(CUDA_CFLAGS) -o $@ $<

myprogram.exe: myprogram.o kernel.o
        $(CC) -o $@ $^ $(LDLIBS)
Here the CUDA kernel and host part is in a file kernel.cu and the non-CUDA part of your program is in a file myprogram.c.
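A sketch of how this Makefile could be used (the required gcc version depends on the installed CUDA release, see above; job_cuda.sh stands for a batch script as described in the batch system section):

# compile interactively on wr14
module load cuda
module load gcc/7.3.0        # an older gcc supported by CUDA, see above
make myprogram.exe

# run on a node with a GPU via the batch system, e.g. in the gpu partition
sbatch job_cuda.sh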

OpenACC

Directive-based GPU programming is available through the PGI compiler. See /usr/local/PGI/doc for documentation. Use wr14 (interactively) to compile such programs; the generated code can be executed on wr14-wr27. You can specify the compute capability as a compiler option. Important: by default the PGI compiler generates debug code that in general is very slow. If you want fast code, add the nodebug option. Example:

module load pgi
pgcc -acc -ta=nvidia,cc3.5,nodebug openacctest.c
./a.out
where 3.5 corresponds to the compute capability of the target GPU.


Tools

See this document.


Resource Requirements

If you want to find out the memory requirements of a non-MPI job, use:

/usr/bin/time -f "%M KB" command
which prints out the peak memory consumption in kilobytes of the command execution.
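For example, embedded in a batch script (the peak memory line then ends up in the job's STDERR file; the program name is taken from the usage examples below):

#!/bin/bash
#SBATCH --partition=any
#SBATCH --ntasks=1
#SBATCH --time=2:00

/usr/bin/time -f "%M KB" ./test_sequential.exe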


Usage Examples

Sequential C program

C-program named test.c


#include <stdio.h>
int main(int argc, char **argv) {
    printf("Hello world\n");
    return 0;
}

Makefile


CC     = cc
CFLAGS = -O

#default rules
%.o: %.c
        $(CC) $(CFLAGS) -c $<
%.exe: %.o
        $(CC) -o $@ $< $(LDLIBS)

default:: test_sequential.exe

Batch script


#!/bin/bash
#SBATCH --output=slurm.%j.out    # STDOUT
#SBATCH --error=slurm.%j.err     # STDERR
#SBATCH --partition=any          # partition (queue)
#SBATCH --ntasks=1               # use 1 task
#SBATCH --mem=100                # memory per node in MB (different units with suffix K|M|G|T)
#SBATCH --time=2:00              # total runtime of job allocation (format D-HH:MM:SS; first parts optional)

# start program
./test_sequential.exe

OpenMP C program

C-program named test_openmp.c


#include <stdio.h>
#include <omp.h>
int main(int argc, char **argv) {
#pragma omp parallel
    printf("I am thread %d of %d threads\n", omp_get_thread_num(), omp_get_num_threads());
    return 0;
}

Makefile


CC     = gcc -fopenmp
CFLAGS = -O

#default rules
%.o: %.c
        $(CC) $(CFLAGS) -c $<
%.exe: %.o
        $(CC) -o $@ $< $(LDLIBS)

default:: test_openmp.exe

Batch script


#!/bin/bash
#SBATCH --output=slurm.%j.out    # STDOUT (%N: nodename, %j: job-ID)
#SBATCH --error=slurm.%j.err     # STDERR
#SBATCH --partition=any          # partition (queue)
#SBATCH --nodes=1                # number of nodes
#SBATCH --ntasks-per-node=32     # number of cores per node
#SBATCH --mem=1G                 # memory per node in MB (different units with suffix K|M|G|T)
#SBATCH --time=2:00              # total runtime of job allocation (format D-HH:MM:SS; first parts optional)

# start program (with 24 threads, in total 32 threads were requested by the job)
export OMP_NUM_THREADS=24
./test_openmp.exe

MPI C program

C-program named test_mpi.c :


#include <stdio.h>
#include <unistd.h>
#include <mpi.h>
int main(int argc, char **argv) {
  int size, rank;
  char hostname[80];

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  gethostname(hostname, 80);
  printf("Hello world from %d (on node %s) of size %d!\n", rank, hostname, size);
  MPI_Finalize();
  return 0;
}

Makefile


CC     = mpicc
CFLAGS = -O

#default rules
%.o: %.c
        $(CC) $(CFLAGS) -c $<
%.exe: %.o
        $(CC) -o $@ $< $(LDLIBS)

default:: test_mpi.exe

Batch script


#!/bin/bash
#SBATCH --output=slurm.%j.out    # STDOUT (%N: nodename, %j: job-ID)
#SBATCH --error=slurm.%j.err     # STDERR
#SBATCH --partition=any          # partition (queue)
#SBATCH --nodes=5                # number of nodes
#SBATCH --ntasks-per-node=32     # number of cores per node
#SBATCH --mem=4G                 # memory per node in MB (different units with suffix K|M|G|T)
#SBATCH --time=2:00              # total runtime of job allocation (format D-HH:MM:SS; first parts optional)

module load gcc openmpi/gnu

# start program (with maximum parallelism as specified in job request, for this example 5*32=160)
mpirun ./test_mpi.exe


Applications

For some of the installed application programs, a brief description of how to use them is given here.

Matlab

Beside the basic Matlab program there are several Matlab toolboxes installed.

Using Matlab interactively

To run Matlab interactively on wr0, do the following:
user@wr0: module load matlab
user@wr0: matlab
This starts the Matlab shell. If you logged in from an X-server capable computer and used ssh -Y username@wr0.wr.inf.h-brs.de to log in to wr0, the graphical interface appears on your computer instead of the text interface (see the section X11 applications below for details of X-server usage).

Using Matlab with the Batch System

Inside your batch job start Matlab without display:
    ...
    module load matlab
    matlab -nodisplay -nosplash -nodesktop -r "m-file"
where m-file is the name of your Matlab script (a file with the suffix .m).
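A complete batch-script sketch (script name and resource requests are placeholders; the Matlab script should end with exit or quit, otherwise Matlab waits for input until the job's time limit is reached):

#!/bin/bash
#SBATCH --partition=any
#SBATCH --ntasks=1
#SBATCH --mem=4G
#SBATCH --time=1:00:00

module load matlab
# runs my_analysis.m from the submit directory
matlab -nodisplay -nosplash -nodesktop -r "my_analysis"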

Pitfalls Using Matlab

Matlab is very sensitive with regard to memory allocation / administration.

OpenFOAM

As there are several groups of OpenFOAM users, we try to bring them together to coordinate the installation of one (or several) OpenFOAM versions. Please contact us if you are interested.


X11 applications

X11 applications are possible only on wr0. To use X11 applications that open a display on your local X-server (e.g. xterm, ...) you need to redirect the X11 output to your local X11 server and to allow another computer to open a window on your computer.
  1. The easiest way to enable this is to log in to the WR cluster with ssh using the option -Y (with older ssh versions also -X), which enables X11 tunneling through your ssh connection. If your login path goes over multiple computers, please be sure to use the -Y option for every intermediate host on the path.
    Example:
    user@another_host:  ssh -Y user@wr0.wr.inf.h-brs.de
    
    On your local computer (i.e. where the X-server is running) you must allow wr0 to open a window. Execute on your local computer in a shell: xhost +
  2. Another possibility is to set the DISPLAY variable on the cluster and to allow other computers (i.e. the WR cluster) to open a window on your local X-server.
    Please be aware that newer X-server versions by default do not accept connections on IP ports but only on Unix sockets, and therefore this second approach may not work.
You can test your X11 setup by executing xterm in an ssh shell window on wr0; a window with a shell on wr0 must pop up on your local computer.
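For example (using the login from above):

user@another_host: ssh -Y user@wr0.wr.inf.h-brs.de
user@wr0: xterm &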


FAQ