Job submission and monitoring
Most work on Lovelace is done by submitting jobs to the queueing system. A job is defined by its submission script, which specifies what programs to run and what resources will be needed for them.
An example submission script:
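#PBS -q compute
#PBS -N test_job
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=40
#PBS -A training2020
# Change to the directory the job was submitted from
cd "$PBS_O_WORKDIR"
# Load the compiler module
module load intel/compiler/64/2018/18.0.5
# Run the program
./test_script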
The #PBS lines tell the queueing system what resources are being requested for this job. In this case the job requests a single 40-core node from the compute queue for 10 minutes. The cost of the job will be billed to the training2020 project code. The job is called "test_job".
The other lines are executed when the job runs. In this example the job changes to the directory it was submitted from (PBS_O_WORKDIR), loads a module and then runs a program called test_script.
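PBS_O_WORKDIR is set by the queueing system, so it is normally empty when a script is tried out by hand. The following is a minimal sketch of a guarded directory change, assuming a POSIX-compatible shell. The function name enter_workdir is hypothetical and not part of PBS.

```shell
# Hypothetical helper: move to the submission directory.
# Falls back to the current directory when PBS_O_WORKDIR is unset,
# and reports failure if the directory cannot be entered.
enter_workdir() {
    cd "${PBS_O_WORKDIR:-$PWD}" || return 1
}

# Stop early rather than run the program in the wrong place.
enter_workdir || echo "could not enter the submission directory" >&2
```

Checking the result of cd means a missing directory is reported instead of the program silently running somewhere else.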
Submitting and Cancelling Jobs
To submit a job use the qsub command. For example, if the job's submission script is called "job.sh":
$ qsub job.sh
When the queueing system accepts a job it assigns it a job number, which qsub prints. In this example the job number is 56908.
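When jobs are submitted from a script it can be useful to keep the job number for later commands. The sketch below assumes that qsub prints an identifier of the form "number.server", which is common for PBS-style systems but should be checked on Lovelace. The function name job_number and the server name example-server are hypothetical.

```shell
# Hypothetical helper: keep only the part of the identifier
# before the first dot, i.e. the numeric job number.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# On the cluster this would be: job_id=$(qsub job.sh)
# A placeholder value stands in for the qsub output here.
job_id="56908.example-server"
job_number "$job_id"
```

If the identifier contains no dot, the function returns it unchanged.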
To cancel a job use the qdel command with the job's number.
$ qdel 56908
If the job has already completed or been cancelled then you may get an error message.
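In a script, that error can be handled rather than ignored. The following is a rough sketch, assuming that qdel returns a non-zero exit status when it cannot cancel the job. The function name cancel_job is hypothetical.

```shell
# Hypothetical helper: cancel a job and report the outcome.
# Assumes qdel exits with a non-zero status on failure.
cancel_job() {
    if qdel "$1"; then
        echo "cancelled $1"
    else
        echo "could not cancel $1 (it may have already finished)" >&2
        return 1
    fi
}
```

For example, "cancel_job 56908" would cancel the job from the example above.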
Once a job has been submitted it will sit in the queue until resources are available to run it. Several commands are available to list the running and queued jobs.
showq usually provides the best overview of what jobs are queued and running.
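On a busy system the full listing can be long. The sketch below restricts the listing to your own jobs using the -u option of showq. The fallback to qstat assumes that command is also installed, and the function name my_jobs is hypothetical.

```shell
# Hypothetical helper: list only the current user's jobs.
# Prefers showq; falls back to qstat (assumed to be installed)
# when showq is not found.
my_jobs() {
    user="${USER:-$(id -un)}"
    if command -v showq >/dev/null 2>&1; then
        showq -u "$user"
    else
        qstat -u "$user"
    fi
}
```

Running "my_jobs" on a login node would then show the queued and running jobs for the current user only.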