An introduction to PORTABLE BATCH SYSTEM (PBS): Submitting MPI parallel jobs
Saved from http://hpc.sissa.it/pbs/pbs-3.html on Sat, 12 Nov 2005
The PBS batch system can be used to manage node allocation in a cluster of hosts. For example, a suitable job script can pass to the MPI launcher program (mpirun) the number and the list of nodes that PBS has allocated to the job, as requested by the user. The PBS server will not run further jobs on the busy nodes until the current job ends. Here is an example script that does this over a Myrinet network using an implementation of MPICH over GM (a proprietary protocol developed by Myricom); in such a script you normally only have to change the number of nodes required, the working directory and the name of the MPI executable.
#!/bin/sh
#! example of job file to submit parallel MPI applications
#! lines starting with #PBS are options for the qsub command

#! Number of nodes (in this case I require 4 nodes with 2 CPUs each)
#! The total number of processes passed to mpirun will be nodes*ppn
#PBS -l nodes=4:ppn=2

#! Names of the files for standard output and standard error;
#! if not specified, the defaults are <job-name>.o<job-number> and <job-name>.e<job-number>
#PBS -e test.err
#PBS -o test.log
#! Mail to user when the job terminates or aborts
#PBS -m ae
#! change the working directory (default is the home directory)
cd <working directory>

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This job runs on the following processors:
echo `cat $PBS_NODEFILE`

#! Count the number of processors (one line per processor in $PBS_NODEFILE)
NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS processors

#! Create a machine file for Myrinet: the processor count on the first line,
#! then one "host port" line per processor (port 6 for the first
#! occurrence of a host, port 7 for a repeated host)
echo $NPROCS >$PBS_JOBID.nodefile
awk '{if ($0 in vett) print $0 " " 7; else print $0 " " 6 ; vett[$0]="x"}' $PBS_NODEFILE >>$PBS_JOBID.nodefile

#! Run the parallel MPI executable (change the default a.out)
/usr/local/mpi-myri/bin/mpirun.ch_gm --gm-v --gm-f $PBS_JOBID.nodefile \
    --gm-kill 30 -np $NPROCS a.out

rm $PBS_JOBID.nodefile
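To see what the machine-file step of the script above produces, the same commands can be tried outside PBS on a hand-written node list. This is only a sketch: the temporary file stands in for $PBS_NODEFILE, the host names node01 and node02 are hypothetical, and the variable names NODEFILE and MACHINEFILE are introduced here for illustration.

```shell
#!/bin/sh
# Stand-in for $PBS_NODEFILE (assumption: with nodes=2:ppn=2 each host
# is listed once per CPU). node01 and node02 are hypothetical names.
NODEFILE=$(mktemp)
printf 'node01\nnode01\nnode02\nnode02\n' > "$NODEFILE"

# One line per processor, so the line count is the processor count
NPROCS=$(wc -l < "$NODEFILE" | tr -d ' ')

# Same awk logic as in the job script: the first occurrence of a host
# gets port 6, a repeated host gets port 7
MACHINEFILE=$(
    echo "$NPROCS"
    awk '{ if ($0 in seen) print $0 " " 7; else print $0 " " 6; seen[$0] = "x" }' "$NODEFILE"
)

echo "$MACHINEFILE"
rm -f "$NODEFILE"
```

For this sample list the result is the count 4 on the first line, followed by "node01 6", "node01 7", "node02 6" and "node02 7", which is the layout the job script writes to $PBS_JOBID.nodefile.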
A better solution is to replace the standard MPI launcher (mpirun), which uses the rsh mechanism to run the application on the nodes, with a new launcher program that uses the PBS task manager library to spawn copies of the executable on all the nodes. The goals of such a program are:
One implementation of this scheme for Myrinet networks is the program mpiexec, which integrates PBS with the MPICH implementation over GM. In this case the example script can be simplified:
#!/bin/sh
#! example of job file to submit with qsub
#! lines starting with #PBS are options for the qsub command

#! Number of nodes (4 nodes with 2 CPUs each, 8 processors in this case)
#PBS -l nodes=4:ppn=2

#! Names of the files for standard output and standard error;
#! if not specified, the defaults are <job-name>.o<job-number> and <job-name>.e<job-number>
#PBS -e test.err
#PBS -o test.log

#! Mail to user when the job terminates or aborts
#PBS -m ae

#! This job's working directory
echo Working directory is $PBS_O_WORKDIR
#!cd <working directory>

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This job runs on the following processors:
echo `cat $PBS_NODEFILE`

#! option to kill all the processes if one of them dies
export GMPIRUN_KILL=1      # or in csh: setenv GMPIRUN_KILL 1
export GMPIRUN_VERBOSE=1   # or in csh: setenv GMPIRUN_VERBOSE 1

#! Run the parallel MPI executable - it's possible to redirect stdin/stdout of all processes
#! using "<" and ">" - including the double quotes
/usr/local/bin/mpiexec -bg a.out
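Either job file is handed to PBS with qsub. The following is a minimal sketch of that step, assuming the script has been saved under the hypothetical name mpi_job.sh; the variables JOB_SCRIPT, JOBID and SUBMITTED are introduced here for illustration. The guard only keeps the sketch harmless on a machine where PBS is not installed.

```shell
#!/bin/sh
# JOB_SCRIPT is a hypothetical file name for a job script like the ones above
JOB_SCRIPT=mpi_job.sh

if command -v qsub >/dev/null 2>&1 && [ -f "$JOB_SCRIPT" ]; then
    # qsub queues the job and prints its identifier
    JOBID=$(qsub "$JOB_SCRIPT")
    echo "Submitted job $JOBID"
    # qstat shows the current state of the queued job
    qstat "$JOBID"
    SUBMITTED=yes
else
    echo "qsub or $JOB_SCRIPT not found; nothing submitted"
    SUBMITTED=no
fi
```

The #PBS lines inside the job file are read by qsub as options, so nothing besides the script name has to be passed on the command line.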