Frequently Asked Questions
General
- How do I get an account on the Infiniband
Cluster?
- How do I reserve compute time on the
cluster?
- What is the username and password
for the reservation system?
- How do I login to the Cluster?
- How do I avoid entering a password when
I login to a node?
- How do I transfer files to and from the
cluster?
- Is there a backup system for user data?
- Do I have to subscribe and read the
cluster mailing list ibcusers?
Software
- What Software is available on the Cluster?
- How can I use the PGI Compiler Suite?
- How can I use the Intel Compiler Suite?
MPI
- How do I compile and run MPI
applications?
- How do I avoid entering my password on
each node when I run MPI
programs?
- How can I kill a certain process on all nodes?
How do I get an account on the Infiniband
Cluster?
How do I reserve compute time on the
cluster?
Please follow the instructions on this page.
What is the username and password
for the reservation system?
Log in to the gateway node and take a close look at the login message.
How do I login to the Cluster?
In order to connect to the gateway use the following command:
ssh infinicluster.informatik.tu-muenchen.de
From there, use "ssh" to log in to a node of the cluster, following the
usage model.
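The two hops can also be combined into one command. The following is only a minimal sketch, assuming your username is the same on your own machine, the gateway, and the nodes. The function name cluster_login is made up for illustration, and the usage model still decides which node you may log in to.

```shell
# Gateway host name as given in this FAQ.
GATEWAY=infinicluster.informatik.tu-muenchen.de

# cluster_login NODE: open an interactive shell on NODE via the gateway.
# -t forces a terminal on the gateway so the inner ssh can run a login shell.
cluster_login() {
    ssh -t "$GATEWAY" ssh "$1"
}
```

For example, "cluster_login opt01" would first connect to the gateway and from there to the node opt01.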
How do I avoid entering a password when
I login to a node?
You can create an ssh key pair to avoid entering your password:
First, login to any node of the Cluster,
then enter the following commands:
$ ssh-keygen -t rsa
[when prompted, just press enter, i.e. use default values and empty passphrase]
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
$ chmod 600 authorized_keys
IMPORTANT: Do not use empty passphrases to access the login node!
You should only use this method for ssh connections within the cluster.
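Running the "cat ... >> authorized_keys" step more than once adds the same key again each time. The sketch below shows one possible way to make that step repeatable. The function name authorize_key is hypothetical, and the sketch assumes a POSIX shell and a grep that supports the -q, -x, -F and -f options.

```shell
# authorize_key PUBKEY_FILE SSH_DIR
# Append the public key to SSH_DIR/authorized_keys unless that exact line
# is already present, then restrict the permissions.
authorize_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    if ! grep -q -x -F -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}
```

After "ssh-keygen -t rsa" has created the key pair, the call would be "authorize_key ~/.ssh/id_rsa.pub ~/.ssh".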
How do I transfer files to and from the
cluster?
Use "scp" or "sftp" from any computer that can reach
"infinicluster.informatik.tu-muenchen.de".
Is there a backup system for user data?
No, there is no backup for user data,
i.e. all data in your home directory can be lost in case of a system
failure and files cannot be recovered if you accidentally delete them!
We urge you to keep a copy of all your important data on a different
computer.
Do I have to subscribe and read the
cluster mailing list ibcusers?
Yes! Once your account is created, you are automatically subscribed to
the ibcusers mailing list. Important information regarding the cluster
is announced here (downtimes, changing of usage model, hardware
problems, etc.). If you unsubscribe from the mailing list, it is
assumed that you no longer need your account.
What Software is available on the Cluster?
Take a look here.
How can I use the PGI Compiler Suite?
You have to set some environment variables:
Opteron nodes:
export PGI=/sw/compiler/pgi
export PATH=/sw/compiler/pgi/linux86-64/8.0/bin:$PATH
export MANPATH=$MANPATH:/sw/compiler/pgi/linux86-64/8.0/man
Now you can simply use the
different compilers:
# pgcc hello.c
# pgCC hello.cpp
# pgf77 hello.f
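If the exports above are placed in a shell startup file, PATH grows by one more copy of the PGI directory every time the file is sourced. The following sketch avoids that. The helper name prepend_dir is an assumption made for illustration and is not part of the PGI installation.

```shell
# prepend_dir VALUE DIR: print DIR prepended to the colon-separated VALUE,
# or VALUE unchanged if DIR is already one of its entries.
prepend_dir() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "$2${1:+:$1}" ;;
    esac
}

# Same PGI settings as above, but safe to source repeatedly.
export PGI=/sw/compiler/pgi
PATH=$(prepend_dir "$PATH" /sw/compiler/pgi/linux86-64/8.0/bin)
export PATH
```

The same helper could be used for the MPI directories under /sw/mpi.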
How can I use the Intel Compiler Suite?
You have to set some environment variables. Intel provides shell
scripts that can be sourced.
On Itanium nodes:
. /sw/compiler/intel/cc/9.1/bin/iccvars.sh
. /sw/compiler/intel/fc/9.1/bin/ifortvars.sh
. /sw/compiler/intel/idb/9.1/bin/idbvars.sh
On Opteron nodes:
Intel Compiler Version 10:
. /sw/compiler/intel/fce/10.1.008/bin/ifortvars.sh
. /sw/compiler/intel/cce/10.1.008/bin/iccvars.sh
. /sw/compiler/intel/idbe/10.1.008/bin/idbvars.sh
Intel Compiler Version 11:
. /sw/compiler/intel/fce/11.0.074/bin/ifortvars.sh intel64
. /sw/compiler/intel/cce/11.0.074/bin/iccvars.sh intel64
Now these compilers and tools are available:
# icc hello.c
# icpc hello.cpp
# ifort hello.f
# idb a.out (Debugger)
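To check whether sourcing the scripts worked in the current shell, you can ask the shell where it finds each tool. This is a small sketch. The function name check_tools is hypothetical, and it relies only on the POSIX "command -v" builtin.

```shell
# check_tools NAME...: report for each command whether the shell can find it.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if path=$(command -v "$tool"); then
            printf 'found:   %s (%s)\n' "$tool" "$path"
        else
            printf 'missing: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For the Intel tools listed above, the call would be "check_tools icc icpc ifort idb".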
How do I compile and run MPI
applications?
MPI is supported on the Opteron nodes. There are two MPI
implementations available, OpenMPI and MVAPICH.
OpenMPI
- Prepend the OpenMPI binaries to your path and the libraries to your library path:
PATH=/sw/mpi/openmpi/bin:$PATH
LD_LIBRARY_PATH=/sw/mpi/openmpi/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
export PATH
/sw/mpi/openmpi is only a symbolic link. It always points to an OpenMPI
version that is linked with gcc.
Please have a look at /sw/mpi for other versions built with different
compilers. Don't forget to make the other compilers available in your
PATH environment variable before using mpicc or mpif77 (see FAQ Software 2 and 3).
- Compile using "mpicc" or "mpif77"
# mpicc -o appl appl.c
- Use "mpirun"
to run mpi programs:
mpirun -np N -host h1,h2,...,hN a.out args
mpirun -np 4 -host opt01,opt02,opt03,opt04 ./appl
Alternatively, you can use a hostfile in which the nodes you want to
use are listed:
mpirun -np 16 -hostfile ./hostfile ./appl
A list of nodes is in hostfile, one per line. If you want to start
multiple processes on a node, use slots:
opt18 slots=4
opt19 slots=4
opt20 slots=4
opt21 slots=4
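A hostfile like the one above can be generated instead of typed by hand. The sketch below assumes that the nodes are named opt01, opt02, and so on, as in the examples of this FAQ. The function name make_hostfile is made up for illustration.

```shell
# make_hostfile FIRST LAST SLOTS: print one "optNN slots=SLOTS" line per node
# for the nodes FIRST..LAST. Pass plain decimal numbers (8, not 08).
make_hostfile() {
    for i in $(seq "$1" "$2"); do
        # %02d pads the node number to two digits: 8 becomes opt08
        printf 'opt%02d slots=%d\n' "$i" "$3"
    done
}
```

"make_hostfile 18 21 4 > hostfile" writes the four lines shown above, which fits the "mpirun -np 16 -hostfile ./hostfile ./appl" example (4 nodes with 4 slots each).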
MVAPICH
At the moment MVAPICH is not stable; please use OpenMPI instead.
- Prepend the MVAPICH binary directory to your path:
PATH=/sw/mpi/mvapich/bin:$PATH
export PATH
/sw/mpi/mvapich is only a symbolic link. It always points to an MVAPICH
version that is linked with gcc.
Please have a look at /sw/mpi for other versions built with different
compilers. Don't forget to make the other compilers available in your
PATH environment variable before using mpicc or mpif77 (see FAQ Software 2 and 3).
- Compile using "mpicc" or "mpif77"
# mpicc -o appl appl.c
- Use "mpirun_rsh"
to run mpi programs:
mpirun_rsh -np N h1 h2 ... hN a.out args
mpirun_rsh -np 4 opt01 opt02 opt03 opt04 ./appl
Alternatively, you can use a hostfile in which the nodes you want to
use are listed:
mpirun_rsh -np 4 -hostfile ./hostfile ./appl
A list of nodes is in hostfile, one per line. For example,
opt01
opt02
opt03
opt04
Listing a single node multiple times will cause multiple processes to
run on that node.
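A hostfile with repeated entries can be produced with a short loop. This is only a sketch. The function name repeat_hosts is hypothetical, and the grouping of the repeated names (all copies of one node together) is a choice made here, not something this FAQ prescribes.

```shell
# repeat_hosts COUNT HOST...: print every HOST COUNT times, one name per line.
repeat_hosts() {
    count=$1
    shift
    for host in "$@"; do
        n=0
        while [ "$n" -lt "$count" ]; do
            printf '%s\n' "$host"
            n=$((n + 1))
        done
    done
}
```

For example, "repeat_hosts 2 opt01 opt02 > hostfile" lists each of the two nodes twice, which matches a run with "mpirun_rsh -np 4 -hostfile ./hostfile ./appl".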
How do I avoid entering my password on
each node when I run MPI
programs?
Please have a look here.
How can I kill a certain process on all nodes?
Axel Rimanek provided a script doing that. Thanks!
location: /sw/tools/killer
Quick and dirty solution to kill a.out on all Opteron nodes:
for i in `seq -w 36`; do ssh opt$i killall -9 a.out; done
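The one-liner can be wrapped in a function that takes the process name as an argument. This is a sketch and not the script in /sw/tools/killer. The function name kill_on_nodes is made up, the node range opt01 to opt36 is taken from the loop above, and the sketch assumes that password-less ssh within the cluster is set up.

```shell
# kill_on_nodes NAME: run "killall -9 NAME" on opt01 .. opt36, one node at a time.
# BatchMode makes ssh fail instead of prompting for a password, and
# ConnectTimeout keeps an unreachable node from stalling the loop.
kill_on_nodes() {
    for i in $(seq -w 36); do
        ssh -o BatchMode=yes -o ConnectTimeout=5 "opt$i" killall -9 "$1" \
            || echo "opt$i: nothing killed or node unreachable" >&2
    done
}
```

"kill_on_nodes a.out" then does the same as the loop above, and reports on standard error every node where the command failed.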