If you want to execute a number of very simple tasks as a sequence of LoadLeveler steps, tasks that involve little or no shell scripting, you may prefer to use LoadLeveler's own multiple job steps facility. That facility is a little tricky. In particular, you should not try to mix LoadLeveler steps with your own self-submitting shell scripts, because that may easily lead to confusion. Remember also that if you do not use the LoadLeveler keyword #@executable, then, according to LoadLeveler's semantics, the LoadLeveler script itself becomes the executable. When that script is passed to, say, ksh for execution, all LoadLeveler keywords are ignored, since to the shell they are merely comments, and the whole script is executed in one go, even if you have separated portions of the script with multiple #@queue statements.
Consider the following LoadLeveler job description file:
    #
    # Common definitions for all three steps
    #
    # @ output = $(job_name).$(step_name).out
    # @ error = $(job_name).$(step_name).err
    # @ job_type = serial
    # @ class = test
    # @ notification = always
    # @ environment = COPY_ALL
    # @ job_name = hello
    #
    # The first step: compile the program.
    #
    # @ step_name = compile
    # @ executable = /afs/ovpit.indiana.edu/@sys/gnu/bin/gcc
    # @ arguments = -o hello hello.c
    # @ queue
    #
    # The second step: run the program if the compilation was successful.
    #
    # @ step_name = run
    # @ dependency = compile == 0
    # @ executable = /afs/ovpit.indiana.edu/@sys/gnu/bin/bash
    # @ arguments = -c "exec hello"
    # @ queue
    #
    # The third step: remove the binary if the run was successful.
    #
    # @ step_name = clean
    # @ dependency = run == 0
    # @ executable = /afs/ovpit.indiana.edu/@sys/gnu/bin/rm
    # @ arguments = -e hello
    # @ queue
When this script is submitted to LoadLeveler, three job steps will be placed in the queue. Initially the second and the third step will wait until the first step finishes execution. Then the second step will commence execution and the third will continue waiting. Finally, the third step will run. I should add that the second and the third step will run only if their direct predecessor has exited without any problems, leaving exit status 0 behind.
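The dependency logic just described resembles a chain of shell commands joined with &&. What follows is only a rough local sketch of that logic, not of how LoadLeveler actually schedules the steps (they are queued separately and need not run on the same machine). The helper name run_chain and the stand-in commands are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run each command string in order and stop at the
# first one that returns a non-zero exit status. This mimics the effect
# of "# @ dependency = <previous step> == 0" on a chain of job steps.
run_chain() {
    for step in "$@"; do
        echo "running: $step"
        sh -c "$step" || { echo "stopped: '$step' failed"; return 1; }
    done
    echo "all steps finished"
}

# Stand-ins for the compile, run and clean steps; substitute real commands.
run_chain "true" "true" "true"
```

If any command in the chain fails, the remaining ones are never started, which is the behaviour the two dependency lines ask LoadLeveler for.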
The script is conceptually divided into four chunks.
The first chunk is a preamble with definitions common to all three job steps.
The second chunk describes the first step: it invokes the GNU C compiler and compiles a C program, generating a binary hello if the compilation has been successful.
The third chunk describes the second step: it will run only if the first step has left exit status 0 behind. That's what the directive
    # @ dependency = compile == 0
is about. Observe a small complication. Instead of defining
    # @ executable = hello
I have defined
    # @ executable = /afs/ovpit.indiana.edu/@sys/gnu/bin/bash
    # @ arguments = -c "exec hello"
The reason for this is that when the script is originally submitted to LoadLeveler, the file hello doesn't exist yet. So if I defined here
    #@executable = hello
LoadLeveler would refuse the job and flag an error. All executables specified with the #@executable keyword must exist at the time the LoadLeveler script is submitted. The remedy is to specify my login shell as the executable instead, and then substitute (with exec) the shell with the binary produced in the first step.
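The effect of exec can be checked locally. The sketch below is only an illustration with stand-in commands (it does not use hello, and demo_exec is a made-up name). Once the shell execs a program, the shell process is replaced, so nothing after the exec runs and the exit status seen by the caller is that of the program.

```shell
#!/bin/sh
# Stand-in demonstration of the -c "exec program" idiom.
# Here sh -c "exit 3" plays the role of the compiled binary.
demo_exec() {
    status=0
    # The echo after exec is never reached: the shell has been replaced.
    sh -c 'exec sh -c "exit 3"; echo "never printed"' || status=$?
    echo "exit status seen by the caller: $status"
}

demo_exec
```

This is why a dependency on the run step still reflects the exit status of the program itself rather than that of the intermediate shell.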
The fourth chunk describes the third step: it will run only if the second step has left exit status 0 behind. That's what the directive
    # @ dependency = run == 0
achieves. It is your responsibility, as a programmer, to ensure that this is indeed the case when your program exits cleanly.
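One way to check this before submitting is to run the program by hand and look at the shell variable $?. A minimal sketch follows, in which report_status is a hypothetical helper and true and false stand in for the real program:

```shell
#!/bin/sh
# Hypothetical helper: run a command and report the exit status that a
# dependency such as "run == 0" would be compared against.
report_status() {
    status=0
    "$@" || status=$?
    echo "exit status: $status"
}

report_status true    # stands in for a program that exits cleanly
report_status false   # stands in for a program that signals failure
```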
This step removes the binary generated by the first step. The command rm is invoked with the -e option, which will leave a trace in the output:
    rm: Removing hello

Can the same be achieved with shell scripting? Although I have warned you about possible pitfalls when mixing scripting and LoadLeveler steps, it is OK to do so, as long as your script does not attempt to resubmit itself. You might even consider the latter, but in that case you must carefully scrutinise the logic of both the shell script and the overlaying LoadLeveler script. Things may easily become convoluted, but not necessarily incorrect! Also, you should remember that the first occurrence of the keyword #@executable will override the shell script for all subsequent steps. If a shell script is present in the LoadLeveler command file, all steps defined before the first occurrence of the keyword #@executable will see the same script. Consequently, the script itself must be able to recognise which particular step is being executed during its instantiation and differentiate its actions accordingly. That information can be obtained from the environment variable LOADL_STEP_NAME.
Here is an example of a 3-step LoadLeveler job, equivalent to the one discussed above, in which the actions are specified entirely using a shell script rather than three different #@executable keywords:
    # @ shell = /afs/ovpit.indiana.edu/@sys/gnu/bin/bash
    # @ output = $(job_name).$(step_name).out
    # @ error = $(job_name).$(step_name).err
    # @ job_type = serial
    # @ class = test
    # @ notification = never
    # @ environment = COPY_ALL
    # @ job_name = hello
    #
    # @ step_name = compile
    # @ queue
    #
    # @ step_name = run
    # @ dependency = compile == 0
    # @ queue
    #
    # @ step_name = clean
    # @ dependency = run == 0
    # @ queue
    #
    echo step: $LOADL_STEP_NAME
    case $LOADL_STEP_NAME in
      compile )
        gcc -v -o hello hello.c 2>&1
        ;;
      run )
        hello
        ;;
      clean )
        rm -e hello 2>&1
        ;;
    esac
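The dispatch logic of such a script can be tried out without LoadLeveler by setting LOADL_STEP_NAME by hand. The sketch below assumes nothing beyond a POSIX shell. The function name dispatch_step, the echo stand-ins for the real commands, and the catch-all branch for unknown step names are illustrative additions, not part of the job file above.

```shell
#!/bin/sh
# Illustrative stand-in for the body of the job script: choose an action
# from the step name that LoadLeveler provides in LOADL_STEP_NAME.
dispatch_step() {
    case "$LOADL_STEP_NAME" in
        compile ) echo "would compile hello.c" ;;
        run     ) echo "would run hello" ;;
        clean   ) echo "would remove hello" ;;
        # Assumption: fail loudly on a step name the script does not know.
        *       ) echo "unknown step: $LOADL_STEP_NAME" >&2; return 1 ;;
    esac
}

# Dry run: pretend to be each step in turn.
for name in compile run clean; do
    LOADL_STEP_NAME=$name
    dispatch_step
done
```

Returning a non-zero status for an unrecognised step name means that a mistyped step_name shows up as a failed step instead of a step that silently does nothing.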