This page summarizes the basic commands for submitting jobs on the Tier3 batch system. More information can be found on the PSI CMS Tier3 TWiki.
Say you have a python script in ~/test that requires a CMSSW environment and some input foo, and that you would normally execute from the command line with

python myAnalysis.py foo

to produce some output file tree_foo.root. Then the most straightforward way to submit it as a job is with a shell script of the form
#!/bin/bash

# set variables
INPUT=$1
OUTFILE="tree_${INPUT}.root"
JOBDIR=$OUTFILE
BASEDIR=$HOME/test
OUTDIR=$BASEDIR/$JOBDIR
TOPWORKDIR=/scratch/$USER
WORKDIR=$TOPWORKDIR/$JOBDIR

# set CMSSW environment for script
source /afs/cern.ch/cms/cmsset_default.sh
cd $BASEDIR/CMSSW_5_3_24/src
eval `scram runtime -sh`

# run script on working node's scratch
mkdir -p $WORKDIR
cd $WORKDIR
python $BASEDIR/myAnalysis.py $INPUT

# copy output back and clean up
mkdir -p $OUTDIR
cp $WORKDIR/$OUTFILE $OUTDIR/$OUTFILE
rm -rf $WORKDIR
exit 0
which you submit to run on the batch system with
qsub submitAnalysis.sh foo
Note for beginners: shell variables are assigned without spaces around the = sign and can be accessed with the $ sign. Arguments passed to a shell script are accessed with $1, $2, … Example of basic syntax:
VAR="Hello" echo $VAR echo "$VAR, World" echo "${VAR}_World" MY_NAME=`whoami` echo "$MY_NAME, you are here: `pwd`"
Note for beginners: You need the CMSSW environment to have ROOT. To get a release, e.g. CMSSW_5_3_24, and initialize it, do
source $VO_CMS_SW_DIR/cmsset_default.sh
cmsrel CMSSW_5_3_24
cd CMSSW_5_3_24/src
cmsenv
Depending on how long your script needs to run, you might need to choose a different queue. Specify the queue with the -q option; the default queue is all.q.
qsub -q long.q submitAnalysis.sh
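If you are not sure which queues exist on the cluster, standard Grid Engine commands can list them (these are generic Grid Engine commands, not specific to this setup):

qconf -sql      # list the names of all cluster queues
qstat -g c      # show a summary of queue usage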
The jobs can be monitored with the command qstat: r means running, qw means waiting in the queue, and E means the job is in an error state.
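A couple of common invocations (standard Grid Engine options):

qstat              # list your jobs and their states
qstat -u $USER     # list the jobs of a specific user
qstat -u '*'       # list the jobs of all users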
More details on a job can be found with qstat -j <jobid>. If your jobs are named, you can also use qstat -j <jobname>, where the name may even contain a wildcard '*'.
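For example, assuming jobs named with a common prefix (the name myAnalysis_ is purely illustrative):

qstat -j "myAnalysis_*"    # details for all jobs whose name starts with myAnalysis_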
Change the order of submission of the jobs waiting in the queue with qalter -js <jobshare> <jobid>. The default job share is 0, and any higher integer value (e.g. 100) will give the specified job a higher priority. The higher the value, the higher the priority.
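For example, with a hypothetical job id:

qalter -js 100 1234567    # give waiting job 1234567 a job share of 100, raising its priority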
Furthermore, jobs can be deleted with qdel <jobid>. To delete all your jobs at once, use qdel -u <username>.
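For example (the job id is again hypothetical):

qdel 1234567     # delete one specific job
qdel -u $USER    # delete all of your own jobs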
Protip: Quickly count the number of jobs that are running or waiting in the queue with grep:
qstat | grep " r " | wc -l qstat | grep wq | wc -l
To learn more details about one specific job, use qstat -j <jobid>. In case it is in an error state, use qstat -explain E -j <jobid>.
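For example, for a hypothetical job 1234567 stuck in state E:

qstat -j 1234567               # full details of the job
qstat -explain E -j 1234567    # print the reason for the error state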
The T3 TWiki has a page with information on debugging jobs interactively with the qlogin command:
qlogin -q debug.q -l hostname=t3wn22 -l h_vmem=400M
If you want to isolate and save the standard output and standard error streams (stdout and stderr) of your main script, which would normally be printed to the terminal, you can redirect them as usual with >> and 2>>:
python $BASEDIR/myAnalysis.py $INPUT >> myout.txt 2>> myerr.txt
and then copy the text files back to where you want them.
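In the example submission script above, that copy step could look like this (reusing its $WORKDIR and $OUTDIR variables):

cp $WORKDIR/myout.txt $WORKDIR/myerr.txt $OUTDIR/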
By default, qsub will create two log files: one with the stdout and one with the stderr of the submission script. They will be saved in your home directory ~/ with names of the form <jobname>.o<jobid> and <jobname>.e<jobid>, respectively.
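For example, a job submitted as submitAnalysis.sh that received the (made-up) job id 1234567 would produce:

~/submitAnalysis.sh.o1234567    # stdout of the job
~/submitAnalysis.sh.e1234567    # stderr of the job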
To save these log files in a custom location, use -o <path> and -e <path>, for example:
qsub -o $HOME/jobs_reports -e $HOME/jobs_reports submitAnalysis.sh foo
To keep your submission command short, you can also put the -o and -e options in your submit script instead, using the following syntax with #$:
#$ -o /shome/myusername/jobs_reports/
#$ -e /shome/myusername/jobs_reports/
Nota bene! Make sure these directories already exist before running the script with qsub. Also, use absolute paths without anything that needs to be expanded or evaluated (like ~/, $HOME or `whoami`).
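For example, create the directory once before submitting (using the path from the example above):

mkdir -p /shome/myusername/jobs_reports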
Another qsub option that is very useful for keeping track of your log files is -N <jobname>. This flag allows you to set the name of each job, which shows up in the qstat output and in the names of the log files described above. By default, the job name is the name of the shell script, submitAnalysis.sh in the example above.
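For example, to name the job after its input:

qsub -N myAnalysis_foo submitAnalysis.sh foo

The log files are then named myAnalysis_foo.o<jobid> and myAnalysis_foo.e<jobid>.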
You can also copy large output files to the storage element using the copy command lcg-cp or the recommended xrdcp from XROOTD; see the TWiki page on how to access the SE, or our page on the storage element.
USER_SE_HOME="srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user/$USER" SERESULTDIR=$USER_SE_HOME/"analysis" lcg-cp -b -D srmv2 file:$WORKDIR/$OUTFILE $SERESULTDIR/$OUTFILE
or
USER_SE_HOME="root://t3dcachedb.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/user/$USER" SERESULTDIR=$USER_SE_HOME/"analysis" xrdcp -f $WORKDIR/$OUTFILE $SERESULTDIR/$OUTFILE
Note that in these examples you might need to create the necessary parent directories (analysis in this example) on your SE home if they don't exist yet:
gfal-mkdir -p gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/$USER/analysis
or, more generally:
gfal-mkdir -p gsiftp://t3se01.psi.ch/`echo $SERESULTDIR | grep -o '/pnfs/psi.ch/.*'`
A complete example of a bash script to submit the jobs is submitExample.sh. Variables to be configured:
The bash script can be run with qsub:
qsub example_job.sh
If you want to split the events over several jobs, you can do it manually as in the example example_splitjobs.py and run it with python. In this example, the command-line inputs are maxEvents, firstEvent, inputFileNames, and the seed for the PU simulation. This works if you have first made the CMSSW python script configurable, which can be done following the link Command line option parsing.
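A minimal sketch of how such a split could be driven through the batch system, assuming a submission script that forwards a first-event and a max-events argument to the python script (these extra arguments are hypothetical and not part of the example script above):

# submit one job per chunk of 10000 events
MAXEVENTS=10000
for FIRSTEVENT in 0 10000 20000 30000; do
  qsub -N job_${FIRSTEVENT} submitAnalysis.sh foo $FIRSTEVENT $MAXEVENTS
done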
Protip: You can see how busy the batch system is with other users' jobs using this command:
qstat -u \* | tail -n +3 | awk '{if($5=="r"){r[$4]++} j[$4]++} END { for(n in j){ if(r[n]==""){ r[n]=0 } printf "%7s / %-5s - %s\n",r[n],j[n],n }}'