Running jobs at Hopper

To run jobs on NERSC's Hopper cluster, you need a job script to submit (you can download a sample from here).
It looks like this:

#PBS -q debug
#PBS -l mppwidth=24
#PBS -l walltime=00:30:00
#PBS -N Job1
#PBS -e $PBS_JOBID.err
#PBS -o $PBS_JOBID.out
#PBS -V
cd $PBS_O_WORKDIR
echo "Changing to workdir $PBS_O_WORKDIR"
echo "listing workdir contents"
ls -ltr
aprun -n 24 ./nrlmol.exe > log.txt

The #PBS options you need to set are:

  • The -q option specifies the queue in which to run the job.
  • The first -l option specifies the number of requested processors (this number must be a multiple of 24).
  • The next -l option specifies the execution time limit (HH:MM:SS).
    You can find more information on the queue names, processor limits and time limits on the NERSC webpage:
    http://www.nersc.gov/users/computational-systems/hopper/running-jobs/que...
  • The -N option is the job name (set this for your own reference).
  • Leave the other options as they are:
    -e and -o specify the error and output file names.
    The -V option exports the environment variables of your submitting shell to the job (so the job runs with the same environment you submitted it from).
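
Since mppwidth must be a multiple of 24, it can help to round the core count you actually want up to the next valid value before editing the script. The following is only a minimal sketch of that arithmetic; the function name round_mppwidth is made up for this example and is not part of PBS or any NERSC tool.

```shell
#!/bin/bash
# Hypothetical helper (not a PBS or NERSC command):
# round a desired core count up to the next multiple of 24,
# which is the granularity this page requires for mppwidth.
round_mppwidth() {
    local cores=$1
    local multiple=24
    # Integer arithmetic: add (multiple - 1), divide, multiply back.
    echo $(( (cores + multiple - 1) / multiple * multiple ))
}

# Example: 100 cores wanted, so request mppwidth=120.
round_mppwidth 100
```

You would then put the printed value in the "#PBS -l mppwidth=" line of your job script.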

The line that executes the program (the binary file) is:
aprun -n 24 ./nrlmol.exe > log.txt
Make sure that the -n number matches the number given in the mppwidth option a few lines earlier.
You may notice that the executable file name here is nrlmol.exe. It doesn't have to be; you may change it if you want.
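
If you want to double-check that the two numbers agree before submitting, a small script can compare them. This is a rough sketch, not an official tool: check_width is a hypothetical function name, and it assumes the job script contains one "#PBS -l mppwidth=N" line and one line starting with "aprun" that has a "-n N" argument, as in the sample above.

```shell
#!/bin/bash
# Hypothetical helper: compare mppwidth with the aprun -n value in a job script.
# Assumes the layout of the sample script on this page.
# Usage: check_width job_hopper.txt
check_width() {
    local script=$1
    local width nprocs

    # Take the digits that follow "mppwidth=" on the first matching #PBS line.
    width=$(awk '/^#PBS/ && /mppwidth=/ {
        sub(/.*mppwidth=/, ""); sub(/[^0-9].*/, ""); print; exit
    }' "$script")

    # Take the word that follows "-n" on the first line starting with aprun.
    nprocs=$(awk '$1 == "aprun" {
        for (i = 2; i < NF; i++) if ($i == "-n") { print $(i + 1); exit }
    }' "$script")

    if [ -z "$width" ] || [ -z "$nprocs" ]; then
        echo "Could not find mppwidth or aprun -n in $script"
        return 2
    fi

    if [ "$width" -eq "$nprocs" ]; then
        echo "OK: mppwidth and aprun -n are both $width"
    else
        echo "MISMATCH: mppwidth=$width but aprun -n $nprocs"
        return 1
    fi
}
```

The function returns a non-zero status on a mismatch, so you can use it in front of qsub, for example: check_width job_hopper.txt && qsub job_hopper.txt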

  • You submit your job with: qsub job_hopper.txt
  • You monitor your job with: qstat -u username
  • You kill jobs with: qdel jobnumber

Don't forget to include your username when issuing qstat, otherwise you get the status of everyone's jobs.
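
The submit and monitor steps can be combined in a small wrapper. The sketch below assumes qsub prints the job identifier on standard output when the submission succeeds and that $USER holds your username on the login node; submit_and_show is a hypothetical name chosen for this example.

```shell
#!/bin/bash
# Hypothetical wrapper: submit a job script, then list only your own jobs.
# Usage: submit_and_show job_hopper.txt
submit_and_show() {
    local script=$1
    local jobid

    # qsub writes the job identifier to standard output on success.
    if ! jobid=$(qsub "$script"); then
        echo "Submission of $script failed" >&2
        return 1
    fi
    echo "Submitted job: $jobid"

    # Restrict qstat to your own jobs, as recommended above.
    qstat -u "$USER"
}
```

If you need to cancel the job later, pass the identifier printed after "Submitted job:" to qdel.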