History of High Performance Computing in Cambridge

Cambridge-Cranfield HPCF  > Information for Users  > User Guide Pt4 Franklin

Back to Pt3 (Job Submission) Forward to Pt5 (Maxwell)

Franklin: Running Jobs

When you log into Franklin you are in fact connected to the front-end machine, franklin-1. The front-end can be used for short interactive jobs and for compilation. It has 8 UltraSPARC IIICu 900MHz processors and 16GB of memory. Batch jobs are submitted to the queuing system and run on the computation nodes; there should be no need to log into the computation nodes directly.


Interactive Usage

Small test jobs, both serial and parallel, can be run on the front-end machine. There is a five-minute limit on interactive jobs. To run an MPI job on two CPUs in the background, redirecting output to the file out.dat, use:

franklin-1> mprun -np 2 ./mycode.mpi > out.dat &
franklin-1>


Batch Jobs

Franklin uses the Sun Grid Engine batch queuing system. There are queues s<n>, t<n> and u<n>, where <n> is the number of processors and may be 4, 6, 8, 12, 24, 32, 48, 64, 84 or 96. The 's' queues have a 24-hour real-time limit, the 't' queues a 2-hour limit and the 'u' queues a 10-minute limit. (The 's' queues are for production runs; 't' and 'u' are for testing and development.)
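The prefix-to-limit mapping above can be expressed as a small shell helper (a sketch for illustration only; queue_limit is our name, not a Franklin command):

```shell
#!/bin/sh
# Map a logical queue name (e.g. s96, t8, u12) to its real-time limit.
# Hypothetical helper, not part of the Franklin toolset.
queue_limit() {
    case "$1" in
        s*) echo "24 hours"   ;;  # production runs
        t*) echo "2 hours"    ;;  # testing and development
        u*) echo "10 minutes" ;;  # short test runs
        *)  echo "unknown"    ;;
    esac
}

queue_limit s96   # 24 hours
queue_limit t8    # 2 hours
queue_limit u4    # 10 minutes
```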

Generally, the s96 logical queue is the best one for production work, but users with heavily memory- and communication-bound jobs may get better throughput out of the s64 logical queue.

The actual number of CPUs allocated is one of 6, 12, 24, 48 or 96. Logical queues of other sizes reserve the next larger of these allocations but execute on a subset of the reserved CPUs; for example, a job in the t8 queue will reserve 12 CPUs. Note that your resource usage is calculated from the number of CPUs reserved, so a 2-hour 64-CPU job will 'cost' the same as a 2-hour 96-CPU job (as it is not possible to use the remaining 32 CPUs).
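The reservation and charging rule can be sketched in shell arithmetic (reserved_cpus is our illustrative name; we assume allocations are rounded up to the next of 6, 12, 24, 48 or 96 as described above):

```shell
#!/bin/sh
# Round a logical queue size up to the actual CPU reservation,
# then compute the charged cost in CPU-hours.
# Illustrative sketch only, not part of the Franklin toolset.
reserved_cpus() {
    for alloc in 6 12 24 48 96; do
        if [ "$1" -le "$alloc" ]; then
            echo "$alloc"
            return
        fi
    done
}

queue_size=8
hours=2
reserved=$(reserved_cpus "$queue_size")
echo "t$queue_size reserves $reserved CPUs"     # t8 reserves 12 CPUs
echo "cost: $((reserved * hours)) CPU-hours"    # cost: 24 CPU-hours
```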

If your program is CPU bound, you should use the maximum number of CPUs (i.e. a queue of size 6*N). The logical queues with 4, 8, 16, 32 and 64 CPUs use only the 'main' CPUs and may provide 50% more memory bandwidth and memory per CPU; if your program is memory bound, you should use a queue of size 4*N. If in doubt, try both and use the one with the smaller wall-clock time, or ask for advice.

The x84 logical queues are mainly for people doing performance analysis, as they place exactly the same number of CPUs on each board, whereas the x96 ones do not.

All of the 64, 84 and 96 processor queues give exclusive use of a whole node.

To submit a job, type qsub -Q queue test.job on one of the front-end machines, where test.job may be a script or an executable. The job will be scheduled and run in the directory (and with most of the environment) from which it was submitted, NOT as if from a fresh login. Note that the -Q option is a local feature. To run a program using MPI on 12 processors, a suitable job submission script is:

franklin-1 [1] cat test.job
#!/bin/sh
date
mprun -np 12 myprog.mpi

and this can be submitted to the 24hr queue with

franklin [2] qsub -Q s12 test.job
your job 2352 ("test.job") has been submitted

The command qstatu can be used to monitor the status of the job:

franklin [3] qstatu
job-ID prior name       user    state  submit/start at    queue  master
-----------------------------------------------------------------------
2352    0   test.job    spqr1    r    03/15/2003 15:52:42 b-s96  MASTER

The state of a job is usually one of (r)unning, (q)ueued, (d)eleted or (t)ransferring (i.e. about to be executed).

The command qstatj job-ID gives more information about the state of a particular job. The qstat command also gives detailed information; consult its man page for more details.

To delete a job use the qdel command:

franklin [4] qdel 2352
spqr1 has deleted job 2352

Both stdout and stderr are returned in files such as test.job.o2352 and test.job.e2352.


Job Scheduling and Priority Access

Franklin runs Sun's Grid Engine queuing system. Grid Engine schedules jobs in an attempt to share the available resources between different users and projects. The command qstats provides information on job scheduling and a user's resources.

franklin-1> qstats
The pts is the percentage share of the resources the project should get
The pas is the percentage share of the resources the project has had
The usage is a time decayed past cpu usage of the user
JID   user   project  state sub/start at shares  Q pts   pas    usage
2611  qwer1   hpcf2-pay r   1/3  13:18:24 2470 s96 0.08  6.99  4154154
2583  sprq1   hpcf2     r  29/2  15:02:27    5 s96 1.42 25.96  1244804
2498  asdf2   hpcf10    qw 20/2  17:24:29  500 s96 0.19  2.42  1244804
2614  spqr1   hpcf2     qw  1/3  13:57:05  476 s96 1.42 25.96  6951406

Projects with the suffix 'pay' are granted a larger share of the computing resources under the priority access scheme (generally these are projects which have contributed towards the running costs of the CCHPCF). Users should contact their project leader for more details.

Users who belong to the priority access scheme may use the local qsub options -priority and -nopriority to select or deselect the priority mechanism for their jobs. The default will usually be 'priority'. As an example, to submit a priority job to the 96 processor production queue:

franklin [2] qsub -priority -Q s96 job.go
your job 2353 ("job.go") has been submitted

Users can check their monthly charged and uncharged usage with the command myusage:

franklin [3]  myusage
Charged         76987.2 CPUh  £10778.21
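For budgeting purposes, the charge shown above is consistent with a flat rate of about £0.14 per CPU-hour. That rate is our inference from the figures in the example, not a documented tariff, but the arithmetic can be checked as follows:

```shell
#!/bin/sh
# Sanity-check the charged amount against an assumed flat rate
# of 0.14 GBP per CPU-hour (inferred from the myusage output
# above; not a documented tariff).
cpuh=76987.2
rate=0.14
awk -v c="$cpuh" -v r="$rate" 'BEGIN { printf "GBP %.2f\n", c * r }'
# prints: GBP 10778.21
```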

The -l option to myusage reports the usage from the previous month:

franklin [3]  myusage -l
Last month
Not Charged      5719.9 CPUh

There is further information for project leaders on the CCHPCF website.


Compilation

We give a quick example of Fortran programming with MPI. Further details can be found on the CCHPCF website.

On the CCHPCF, the relevant commands are wrapped in scripts that provide a reasonable set of defaults for the CCHPCF systems and detect some potential pitfalls. This behaviour is optional and can be disabled entirely, but you are advised not to do so unless you know what you are doing.

By default the environment variable HPCF_MODE is set to 'yes'. This will set a good level of optimisation and link to the Sun Performance Libraries, which contain optimised versions of BLAS, LAPACK, FFT routines, etc.

To compile an MPI program you should set the environment variable HPCF_MPI to 'yes'. For (ba)sh:

franklin-1> export HPCF_MPI=yes

This will set up the correct paths and libraries. A complete example of compiling a Fortran MPI program is given below:

franklin-1> cat hello.f90
program hello
  implicit none
  include 'mpif.h'
  integer :: npe, mype, ierr

  call mpi_init(ierr)
  if (ierr.ne.0) stop 'MPI initialisation error'

  call mpi_comm_rank(mpi_comm_world, mype, ierr)
  call mpi_comm_size(mpi_comm_world, npe, ierr)

  write(*,101) mype,npe
101 format(' Hello parallel world, I am process ',I3,' out of ',I3)

  call mpi_finalize(ierr)
end program hello

franklin-1> export HPCF_MPI=yes
franklin-1> mpf90 hello.f90
franklin-1>
franklin-1> mprun -np 4 ./a.out
 Hello parallel world, I am process   3 out of   4
 Hello parallel world, I am process   1 out of   4
 Hello parallel world, I am process   2 out of   4
 Hello parallel world, I am process   0 out of   4
franklin-1>