HPC Cluster Job Scheduler

From HPC Docs

Revision as of 18:07, 2 June 2019

This content is under construction. Check back often for updates.

Submitting Your First HPC Job

Content to be created.

Anatomy of a SLURM Sbatch Submit Script

Content to be updated.

#!/bin/bash

#SBATCH --workdir=./                     # Set the working directory
#SBATCH --mail-user=nobody@tcnj.edu      # Who to send emails to
#SBATCH --mail-type=ALL                  # Send emails on start, end and failure
#SBATCH --job-name=pi_dart               # Name to show in the job queue
#SBATCH --output=job.%j.out              # Name of stdout output file (%j expands to jobId)
#SBATCH --ntasks=4                       # Total number of mpi tasks requested
#SBATCH --nodes=1                        # Total number of nodes requested
#SBATCH --partition=test                 # Partition (a.k.a. queue) to use

# Disable selecting Infiniband
export OMPI_MCA_btl=self,tcp

# Run MPI program
echo "Starting on "`date`
mpirun pi_dartboard
echo "Finished on "`date`
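Once saved to a file (the name submit.sh below is hypothetical), a script like the one above is submitted and monitored from the login node with the standard SLURM commands; a sketch of the typical workflow:

```shell
# Submit the script; sbatch prints the assigned job ID
sbatch submit.sh

# List your pending and running jobs
squeue -u $USER

# Read the job's stdout file once it has started
# (12345 stands in for the job ID sbatch reported)
cat job.12345.out
```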

Advanced Submit Script Options

Content to be created.

Constraints

Available constraints are listed on the HPC_SLURM_Features page.

Example:

#SBATCH --constraint=skylake
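Constraints can also be combined: SLURM accepts OR (|) and AND (&) operators inside the --constraint expression. A sketch (the broadwell feature name is an assumption, following the skylake example above):

```shell
# Accept a node with either feature (hypothetical feature names)
#SBATCH --constraint="skylake|broadwell"
```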

Node Exclusivity

Requesting exclusive access allocates whole nodes to the job, so no other jobs share them. Example:

#SBATCH --exclusive

Job Arrays

Example 1 submits a 100-task array; in the output filename, %A expands to the array job ID and %a to the task index:

#SBATCH --output=job.%A_%a.out
#SBATCH --array=1-100

Example 2 adds a %5 throttle, limiting the array to at most 5 tasks running at the same time:

#SBATCH --output=job.%A_%a.out
#SBATCH --array=1-100%5
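Within each array task, SLURM sets the SLURM_ARRAY_TASK_ID environment variable to that task's index, which is the usual way to map tasks to inputs. A minimal sketch (the input filename pattern is hypothetical):

```shell
#!/bin/bash
#SBATCH --output=job.%A_%a.out
#SBATCH --array=1-100

# SLURM sets SLURM_ARRAY_TASK_ID to this task's index (1..100 here);
# use it to select one hypothetical input file per task.
INPUT="input_${SLURM_ARRAY_TASK_ID}.dat"
echo "Task ${SLURM_ARRAY_TASK_ID} processing ${INPUT}"
```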

Example Submit Scripts

Content to be created.

ELSA Job Partitions/Queues

Partition/Queue Name    Max Time Limit    Resource Type
short                   6 hours           CPU
normal                  24 hours          CPU
long                    7 days            CPU
nolimit*                none              CPU
shortgpu                6 hours           GPU
gpu                     7 days            GPU

* - Use of the nolimit partition is restricted to approved cluster users. Faculty may request access for themselves and students by emailing ssivy@tcnj.edu.
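To target one of the GPU partitions above, a job typically requests the partition together with a GPU count via the generic-resource option. A sketch (the gres name gpu and the count are assumptions about this cluster's configuration):

```shell
# Request the gpu partition and one GPU per node
# (gres name/count are hypothetical for this cluster)
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
```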