Stella Job Submission Guide
This page is a guide, application by application, to the best way to run jobs on Stella.
Introduction
Stella has 6 compute nodes totalling 44 2.4GHz cores, with 2GB RAM per core:
- comp00: 8 cores
- comp01: 4 cores
- comp02: 8 cores
- comp03: 8 cores
- comp04: 8 cores
- comp05: 8 cores
comp00 also hosts the interactive queue, so its users should be ready to share its CPUs with other users. Grid Engine's queues are configured to use comp01-comp05 preferentially for non-interactive jobs.
The following queues are configured:
- serial.q: allocation of comp01-comp05's cores based on load, leading to a round-robin-like allocation. When these 5 nodes are full, we start using comp00.
- interactive.q: 8 slots available on comp00. When used, it puts on hold every serial.q job running on comp00.
- parallel.q: not fully configured yet, but it uses comp01-comp05 and has a reservation scheme that can allocate any number of cores on specific nodes in spite of serial.q's round-robin allocation.
The following commands are useful:
- qsub: Submit a batch job to Grid Engine. You can read the man pages ("man qsub") for the following useful options: -sync y, -cwd
- qstat: Show the status of Grid Engine jobs and queues
- qhost: Show the status of Grid Engine hosts, queues, jobs
- qrsh: Start an interactive session. As interactive slots are limited to the 8 cores of comp00, please only use this mode for short experiments. Interactive jobs are automatically killed after 60 minutes.
In this section you can learn how to submit:
- Java applications
- Verilog simulations
- Standard Matlab
- Matlab Distributed Computing Engine
- MPI applications
Java
Warning: It is very easy to submit a parallel job improperly and unintentionally bypass the scheduling system, with the consequence of having your job clashing with other users' jobs and running at a non-optimal speed.
The most common mistake is to submit a threaded Java program to a serial queue: Your script will be assigned one core on one compute node, but the Java virtual machine will automatically use all the cores available on that node without telling the Grid Engine system.
You might end up sharing the compute node with other users while other nodes in the cluster are unused.
The following JDKs are installed and accessible on the front-end and compute nodes (installed by Kostas):
- JDK 1.4.2_10 in /usr/bin only on the front-end
- JDK 1.5.0_09 in /usr/local/appl/jdk1.5.0
- JDK 1.6.0_rc1 in /usr/local/appl/jdk1.6.0
user@stella:~> export PATH=/usr/local/appl/jdk1.6.0/bin:${PATH}
As highlighted in the previous warning, one difficulty with the Java Virtual Machine is that it automatically uses as many processors as possible. If your Java program has 2 user threads, it will automatically use 2 processors to execute it, and there is no way to restrict it to only one processor.
Note: there might be a (complex) way to do it. We will try to configure it in January.
Note2: the command 'top' only shows one Java process, even when it is using multiple processors. The only reliable way to see the total CPU consumption is to look at the node's idle percentage: an 8-core node showing 25% idle time has the equivalent of 6 cores fully used.
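The arithmetic in the note above can be sketched as a small shell helper. The function name busy_cores is hypothetical, and integer division rounds the result down, so treat it as a rough estimate only:

```shell
# busy_cores TOTAL_CORES IDLE_PERCENT
# Estimate how many cores are fully busy from a node's idle percentage.
# (Hypothetical helper; integer arithmetic, result is rounded down.)
busy_cores() {
    echo $(( $1 * (100 - $2) / 100 ))
}

# The example from the note: an 8-core node that is 25% idle.
busy_cores 8 25   # prints 6
```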
Because the JVM's processor usage cannot be restricted, we set up 3 different queues for Java applications:
- Serial queue: reserve 1 core for Java programs known not to use any threads
- Full-node queue: reserve a full compute node (8 cores) for Java programs with an unknown number of threads
- Generic queue: if you know how many threads your Java program is using, you can specify it in this queue
Java - Serial Jobs
In order to submit a serial (i.e. with a single user thread) Java job to Grid Engine, a script similar to the following one can be used:
#!/bin/bash
export PATH=/usr/local/appl/jdk1.6.0/bin:${PATH}
export CLASSPATH=/users/scripts/examples/java
java JavaSerialExample
Submit it using qsub:
user@stella:~> qsub /users/scripts/examples/java/JavaSerialExample.sh
Your job 1200 ("JavaSerialExample.sh") has been submitted
You can use qstat to check the status of your job (if qstat doesn't show anything, it means your job is finished executing).
The outputs (stdout and stderr) of the script are redirected to files in your home directory (see Getting Started):
user@stella:~> cat ~/JavaSerialExample.sh.o1200
Iteration 0
Iteration 10000000
Iteration 20000000
Iteration 30000000
Iteration 40000000
Iteration 50000000
Iteration 60000000
Iteration 70000000
Iteration 80000000
Iteration 90000000
Note: Using qsub -cwd directs Grid Engine to execute your program in the current directory. This is useful as it generates the stdout and stderr files in the current directory instead of the home directory. Using that might also avoid having to define the CLASSPATH, therefore making your script more portable across directories.
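The output file names used in this guide follow Grid Engine's default pattern: the script name, then ".o" (stdout) or ".e" (stderr), then the job number, with the task index appended for array jobs. A minimal sketch of that pattern, assuming the default naming has not been overridden (the helper name sge_job_file is hypothetical):

```shell
# sge_job_file STREAM SCRIPT JOB_ID [TASK_ID]
# Print the default Grid Engine output file name.
#   STREAM is "o" for stdout or "e" for stderr.
#   TASK_ID is only given for array jobs (qsub -t).
# (Hypothetical helper; assumes the default naming has not been changed.)
sge_job_file() {
    echo "$2.$1$3${4:+.$4}"
}

sge_job_file o JavaSerialExample.sh 1200     # JavaSerialExample.sh.o1200
sge_job_file e JavaSerialExample.sh 1200     # JavaSerialExample.sh.e1200
sge_job_file o sge_matlab_test.sh 1500 2     # sge_matlab_test.sh.o1500.2
```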
Java - Threaded Jobs with unknown/varying number of threads
In order to properly run a threaded Java program with an unknown or varying number of threads, and to avoid conflicts with other jobs, it is necessary to reserve a full compute node. It should be noted that running a Java program with fewer than 8 threads will waste resources, as it will block a full 8-core compute node. However, as long as the cluster is not under heavy load, it won't be a problem. This is therefore an acceptable way to run Java applications for the moment.
You should use Grid Engine's Parallel Environment called java and specify the number of cores desired: 8.
You should also use the -R y option to activate the reservation system, which prevents new serial jobs from filling a particular node while you wait for your 8 cores to become available.
user@stella:~> qsub -R y -pe java 8 /users/scripts/examples/java/JavaThreadedExample.sh
Your job 1200 ("JavaThreadedExample.sh") has been submitted
user@stella:~> qstat
job-ID  prior     name        user  state  submit/start at      queue                            slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
1200    10.01000  JavaThread  user  r      12/06/2006 12:12:22  parallel.q@comp02.cs.man.ac.uk   8
Java - Threaded Jobs with known number of threads
If you know the exact maximum number of simultaneous threads used by your Java application, you can specify it to Grid Engine, which will then wait for and allocate only the resources you need. The same -R y and -pe java <number_of_threads> options as above should be used:
user@stella:~> qsub -R y -pe java 6 /users/scripts/examples/java/JavaThreadedExample.sh
Your job 1201 ("JavaThreadedExample.sh") has been submitted
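The three Java submission cases described above can be summarized in a small wrapper. This is only a sketch: the function name java_qsub_cmd and the keyword "unknown" are hypothetical, and the function prints the qsub command instead of running it, so it can be tried safely anywhere:

```shell
# java_qsub_cmd THREADS SCRIPT
# Print (do not run) the qsub command for a Java job.
#   THREADS = 1        -> plain serial submission
#   THREADS = unknown  -> reserve a full 8-core node
#   THREADS = N        -> reserve N slots in the "java" parallel environment
# (Hypothetical helper; the options are the ones shown in this guide.)
java_qsub_cmd() {
    threads=$1
    script=$2
    case "$threads" in
        1)       echo "qsub $script" ;;
        unknown) echo "qsub -R y -pe java 8 $script" ;;
        *)       echo "qsub -R y -pe java $threads $script" ;;
    esac
}

java_qsub_cmd 1 JavaSerialExample.sh
java_qsub_cmd unknown JavaThreadedExample.sh
java_qsub_cmd 6 JavaThreadedExample.sh
```

Printing the command first makes it easy to check the options before submitting.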
Verilog simulations with Synopsys VCS
Synopsys VCS is installed and accessible on the front-end and compute nodes in /usr/local/cadtools/synopsys/X-2005.06.
The setup script to configure the license server and paths is /users/scripts/internal/setup_verilog_simulators.
user@stella:~> source /users/scripts/internal/setup_verilog_simulators
This setup script also changes your PATH to use the old GCC 3.2.3 and binutils 2.14. This is to make the compilation of SystemC files compatible with VCS.
You can use the example from /users/scripts/examples/verilog (You need to copy it first, as VCS will need write permissions). This directory contains the submitted script called run_verilogtest.sh:
user@stella:~> cd ~
user@stella:~> cp -rf /users/scripts/examples/verilog .
user@stella:~> cd verilog
user@stella:~/verilog> qsub -cwd run_verilogtest.sh
Your job 1251 ("run_verilogtest.sh") has been submitted
The -cwd option tells Grid Engine to execute the script in the current directory.
Check with qstat if the job is running. After a few seconds, the output file will contain the result of the execution:
user@stella:~/verilog> cat run_verilogtest.sh.o1251
Chronologic VCS (TM)
Version X-2005.06 -- Wed Dec 6 13:17:36 2006
Copyright (c) 1991-2005 by Synopsys Inc.
ALL RIGHTS RESERVED
This program is proprietary and confidential information of Synopsys Inc.
and may be used and disclosed only as authorized in a license agreement
controlling such use and disclosure.
Parsing design file 'verilogtest.v'
Top Level Modules:
CommentaryDemo
No TimeScale specified
Starting vcs inline pass...
1 module and 0 UDP read.
recompiling module CommentaryDemo
if [ -x ../simv ]; then chmod -x ../simv; fi
gcc -o ../simv -m32 5NrI_d.o 5NrIB_d.o lMun_1_d.o SIM_l.o /usr/local/cadtools/synopsys/X-2005.06/linux/lib/libvirsim.a /usr/local/cadtools/synopsys/X-2005.06/linux/lib/libvcsnew.so -ldl -lc -lm -ldl
../simv up to date
Chronologic VCS simulator copyright 1991-2005
Contains Synopsys proprietary information.
Compiler version X-2005.06; Runtime version X-2005.06; Dec 6 13:17 2006
$finish at simulation time 110
V C S S i m u l a t i o n R e p o r t
Time: 110
CPU Time: 0.010 seconds; Data structure size: 0.0Mb
Wed Dec 6 13:17:58 2006
CPU time: .028 seconds to compile + .084 seconds to link + .068 seconds in simulation
Instead of checking regularly with qstat whether your job has finished or not, you can use the -sync y option to cause qsub to wait for the job to complete before exiting:
user@stella:~/verilog> qsub -sync y -cwd run_verilogtest.sh
Your job 1253 ("run_verilogtest.sh") has been submitted
Job 1253 exited with exit code 0.
You can also use our more generic script /users/scripts/internal/vcs_computenode followed by the name of the verilog file to compile:
janinl@stella:~/verilog> qsub -sync y -cwd /users/scripts/internal/vcs_computenode verilogtest.v
Your job 1254 ("vcs_computenode verilogtest.v") has been submitted
Job 1254 exited with exit code 0.
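With -sync y, qsub returns the job's exit code, so a wrapper script can react to success or failure. A minimal sketch, assuming that behaviour; the function name report_job is hypothetical, and the demonstration uses "true" as a stand-in for the real qsub command:

```shell
# report_job COMMAND [ARGS...]
# Run a blocking command and report whether it succeeded.
# (Hypothetical helper; intended to wrap "qsub -sync y ...".)
report_job() {
    if "$@"; then
        echo "job succeeded"
    else
        echo "job failed with exit code $?"
    fi
}

# On Stella this would be, for example:
#   report_job qsub -sync y -cwd run_verilogtest.sh
# Stand-in command so the sketch runs without Grid Engine:
report_job true
```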
Eventually, more submission techniques will be shown in the Remote Access section.
Standard Matlab: Distributing standard Matlab jobs on stella
The standard version of Matlab (with the university license and all the toolboxes) is available on Stella in /usr/local/appl/matlab_from_opt_linux/matlab6.5.
Note that you need to update your PATH in order to use it:
user@linuxbox:~> ssh stella
Password: ********
user@stella:~> export PATH=/usr/local/appl/matlab_from_opt_linux/matlab6.5:${PATH}
Here are a few steps to run a Matlab script concurrently on N computers:
1. Prepare your files so that you can execute everything from the command line, preferably with only one command.
For example, create the following Matlab script sge_matlab_test.m:
% sge_matlab_test.m
1+2
exit
2. Test it (either on stella or with your local Matlab):
user@stella:~> matlab -nodisplay -r sge_matlab_test
ans =
3
3. Prepare a shell script for the scheduler: sge_matlab_test.sh:
#!/bin/sh
/usr/local/appl/matlab_from_opt_linux/matlab6.5/bin/matlab -nodisplay -r sge_matlab_test
4. Send it to the scheduler
user@stella:~> qsub -cwd -t 1-5 sge_matlab_test.sh
The -t option defines the task index range.
The -cwd option instructs the scheduler to run the command in the current directory.
You can use qstat to check the status of your job (if qstat doesn't show anything, it means your job is finished executing).
The outputs (stdout and stderr) of the script are redirected to files in the current directory (see Getting Started):
user@stella:~> cat sge_matlab_test.sh.o1500.2
[...]
ans =
3
5. The previous script executes the same command 5 times, as 5 separate tasks. What you probably want is to execute the same command on different data sets. For this, the ${SGE_TASK_ID} variable can be added to the sge_matlab_test.sh script: it corresponds to the task index issued by qsub's -t option.
New sge_matlab_test2.sh:
#!/bin/sh
/usr/local/appl/matlab_from_opt_linux/matlab6.5/bin/matlab -nodisplay -r "sge_matlab_test2(${SGE_TASK_ID})"
The matlab script should use this new argument: New sge_matlab_test2.m:
function sge_matlab_test2(x)
1+x
exit
Running it with the same qsub command as above produces the expected results: each task returns a different result (2, 3, 4, 5 and 6).
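To process a different data set in each task, the task index can be turned into an input file name. The following is a sketch only: the function name task_input and the dataset_N.dat naming are hypothetical, and values that are not plain numbers (for example when the script is run outside an array job) fall back to task 1:

```shell
# task_input TASK_ID
# Map a task index to a (hypothetical) input file name.
# Anything that is not a plain number falls back to task 1,
# which lets the script be tested outside the scheduler.
task_input() {
    case "$1" in
        ''|*[!0-9]*) id=1 ;;
        *)           id=$1 ;;
    esac
    echo "dataset_${id}.dat"
}

# Inside a job script, SGE_TASK_ID is provided by qsub's -t option.
task_input "${SGE_TASK_ID:-}"
```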
Matlab Distributed Computing Engine
The distributed version of Matlab is available on Stella in /usr/local/appl/matlab.
This is a trial license which includes the Distributed Toolbox (the front-end) and the Distributed Computing Engine (the back-end, or "workers").
Because this is a trial license node-locked to Stella, you can only use it by running matlab on Stella itself:
user@linuxbox:~> ssh -X stella
Password: ********
user@stella:~> matlab
OR
user@stella:~> matlab -nodisplay
For the following, you also need to set up ssh to work without a password between stella and the compute nodes:
user@stella:~> mkdir .ssh
user@stella:~> cd .ssh
user@stella:~/.ssh> ssh-keygen -t dsa
user@stella:~/.ssh> ssh-keygen -t rsa
user@stella:~/.ssh> cat id_rsa.pub id_dsa.pub > authorized_keys2
user@stella:~/.ssh> ssh stella # try the new setup, and agree to any question
Matlab has three programming schemes for describing concurrent operations: distributed tasks, parallel tasks, and parallel interactive mode.
You can learn everything about distributed and parallel programming with Matlab's Distributed Computing Toolbox from their website.
Here are three examples showing how to use each programming model on Stella:
Matlab's distributed tasks
Here is how to create 3 distributed tasks, which will be running concurrently without communicating with each other:
>> job = createJob(sge_scheduler);
>> createTask(job,@sum,1,{[1 1]});
>> createTask(job,@sum,1,{[2 2]});
>> createTask(job,@sum,1,{[3 3]});
>> submit(job)
>> waitForState(job,'finished',60)
>> results = getAllOutputArguments(job)
results =
[2]
[4]
[6]
>> destroy(job)
Matlab's parallel tasks
>> job = createParallelJob(sge_scheduler,'Name','testparalleljob');
>> createTask(job, 'rand', 1, {3})
>> set(job,'MinimumNumberOfWorkers',3);
>> set(job,'MaximumNumberOfWorkers',3);
>> submit(job)
>> waitForState(job)
>> out = getAllOutputArguments(job)
out =
[3x3 double]
[3x3 double]
[3x3 double]
>> celldisp(out);
out{1} =
0.9501 0.4860 0.4565
0.2311 0.8913 0.0185
0.6068 0.7621 0.8214
out{2} =
0.6380 0.7406 0.6831
0.8886 0.8543 0.9105
0.9549 0.9518 0.0034
out{3} =
0.1172 0.0667 0.1247
0.6990 0.7878 0.2673
0.8149 0.8313 0.0756
>> destroy(job)
Matlab's interactive pmode
And the best is Matlab's pmode parallel mode: you invoke it by specifying the number of workers (also called "labs" in this case) you want to use, and an interactive session starts in matlab, where all the commands you type are executed in parallel on the workers:
>> pmode start generic 8
Submitted parallel job to the scheduler, waiting for it to start.
Connected to a parallel job with 8 labs.
P>> mat = randn(4096,4096,distributor());
P>> tic; mat2=mat*mat; toc
1: Elapsed time is 5.890824 seconds.
2: Elapsed time is 5.821183 seconds.
3: Elapsed time is 5.819019 seconds.
4: Elapsed time is 5.819174 seconds.
5: Elapsed time is 5.830432 seconds.
6: Elapsed time is 5.849642 seconds.
7: Elapsed time is 5.875286 seconds.
8: Elapsed time is 5.833961 seconds.
P>> pmode exit
The same command executed without the Distributed Toolbox takes 32.466660 seconds to complete.
Note: Using more than 8 workers will require processes to communicate between nodes. Best performance is therefore reached with 8 parallel workers.
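The note above can be checked with a quick calculation of how many nodes a given number of workers spans. This is a sketch assuming workers are packed onto nodes with a fixed number of cores each; the function name nodes_needed is hypothetical:

```shell
# nodes_needed WORKERS CORES_PER_NODE
# Number of nodes required if workers are packed onto full nodes
# (ceiling division). Hypothetical helper for illustration.
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

nodes_needed 8 8    # prints 1: all workers fit on one node
nodes_needed 9 8    # prints 2: workers must communicate between nodes
```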
MPI applications
Jobs are automatically submitted to the serial queue when using qsub without any specific queue option.
MPI applications should be sent using either qsub's parallel environment option -pe mpich <number_of_requested_slots> or the mpichsub command (see example below).
The two most common parallel jobs are MPI programs and Java programs (check pthread-ed programs such as raytracer).
Warning: It is very easy to submit a parallel job improperly and unintentionally bypass the scheduling system, with the consequence of having your job clashing with other users' jobs and running at a non-optimal speed.
You might end up sharing one compute node with other users while other nodes in the cluster are almost unused.
MPI parallel job example
Compilation of a C example using MPI:
user@stella:~> mpicc -o mpi_example /users/scripts/examples/mpi/C_example.c
Compilation of a C example using MPI and the parallel Scalapack library:
user@stella:~> mpicc -o mpi_scalapack_example /users/scripts/examples/mpi/scalapack_C_example.c /users/janinl/scalapack_package_for_stella/libscalapack.a /users/janinl/scalapack_package_for_stella/blacsF77init_MPI-LINUX-0.a /users/janinl/scalapack_package_for_stella/blacs_MPI-LINUX-0.a /users/janinl/scalapack_package_for_stella/blacsF77init_MPI-LINUX-0.a /usr/lib64/liblapack.a /usr/lib64/libblas.a /usr/local/mpich2-GF90/lib/libmpich.a -lgfortran -I/users/janinl/scalapack_package_for_stella -lg2c
Submission to the scheduler:
user@stella:~> mpichsub 2x2 mpi_example
Mpichsub's argument "2x2" can be replaced by "mxn", where m is the number of nodes to use and n is the number of cores per node.
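Assuming an "mxn" argument simply requests m times n cores in total (an assumption, since mpichsub is a site-specific script whose internals are not described here), the total can be computed as follows. The function name mpich_slots is hypothetical:

```shell
# mpich_slots MxN
# Total number of cores requested by an "mxn" argument,
# where m = number of nodes and n = cores per node.
# (Hypothetical helper; assumes the total is simply m * n.)
mpich_slots() {
    nodes=${1%%x*}    # text before the "x"
    cores=${1##*x}    # text after the "x"
    echo $(( nodes * cores ))
}

mpich_slots 2x2   # prints 4
```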
