On this page

  • Common terms
  • Submit a job with arguments
  • Job steps
  • Interactive jobs
  • Modify a job
  • Cancel a job
  • Check a running job
  • Check on recent job status
  • Check on older job status
  • Check on node status

Slurm cheat sheet

Slurm is the job scheduler that we use on Unity. For an introduction to Slurm, see Introduction to Slurm: The Job Scheduler.

Common terms

The following is a list of common terms in Slurm (see also the Overview for a more complete description):

  • Node - a single computer.
  • Socket - a single CPU.
  • Core - a single compute unit inside a CPU.
  • CPU - one core, except on power9 where it is a Thread within a Core.
  • Job - a schedulable unit; an allocation of resources.
  • Job step - a set of related processes within a job. .batch is the script as submitted; .0 through .X are any srun invocations from within the script.
  • Task - a process within a job step.

Submit a job with arguments

To submit a job in Slurm, use the sbatch command. If you want to set parameters for your job, there are many arguments available for you to add to your batch file. See the Introduction to batch jobs page for examples of how to create a batch file with arguments. Note that any arguments specified on the command line when submitting your job override those in the file.

The following arguments can be used to specify parameters for your batch job. For more detailed information on submitting jobs in Slurm, see the sbatch manpage.

General
  • --time=, -t: Set the worst-case estimate of job run time, in Days-Hours:Minutes:Seconds format.
  • --time-min=: Set the minimum amount of time the job can usefully run for. This should be smaller than --time= and may allow the scheduler to start the job sooner.
  • --job-name=, -J: Set the name of the job (default: the script name). Use sacct --name=... to find it later.
  • --output=, -o: Specify the filename to place output. If an error occurs, it is placed in the output file by default.
  • --error=, -e: Specify the filename to place error output. Only use this argument if you want a separate place to store errors.
  • --exclusive: Request entire nodes. This results in better performance for jobs that can use multiple cores or most of the memory, but generally results in longer queue times. Recommended with --mem=0.
  • --mail-type=...: Send an email when the job changes state. FAIL,END,TIME_LIMIT_80 are usually the most useful values. See the sbatch manpage for a complete list.

Compute Resources
  • --nodes=<n>, -N: Specify the number of nodes to use; should be 1 unless the job supports MPI.
  • --cpus-per-task=<n>, -c: Specify the number of cores to allocate per task.
  • --mem=<n>g: Specify the number of gigabytes of memory per node.
  • --mem-per-cpu=<n>g: Specify the number of gigabytes of memory per core (alternative to --mem).
  • --ntasks=<n>, -n: Specify the number of tasks to allocate space for (for MPI, the number of processes).
  • --ntasks-per-node=<n>: Specify the number of tasks per node (treated as a maximum when used with --ntasks).
  • --constraint=mpi: When using -n without -N, ensure all the CPUs are the same model.
  • --constraint=...: Specify Constraints. See the sbatch manpage for more information on syntax.

GPU Resources
  • --gpus=<n>, --gres=gpu:<n>, --gres=gpu:<type>:<n>: Specify the number of GPUs per job. See Using GPUs.
  • --gpus-per-task=<n>, --gpus-per-task=<type>:<n>: Specify the number of GPUs per task (alternative to the above).
  • --gpus-per-node=<n>, --gpus-per-node=<type>:<n>: Specify the number of GPUs per node (alternative to the above).
  • --ntasks-per-gpu=<n>: Specify the number of tasks per GPU allocated.
  • --mem-per-gpu=<n>g: Specify the number of gigabytes of CPU memory per GPU (alternative to --mem; NOT VRAM).
  • --constraint="sm_XX&vramYY": Specify Constraints for minimum compute capability level and minimum VRAM (GB) per GPU.

Related Jobs
  • --array=<indices>: Create an array job. See the sbatch manpage for more information.
  • --dependency=...: Configure dependencies between jobs. See the sbatch manpage for more information.

Uncommon
  • --account=pi...: Use a given account (most relevant for classes and Gypsum users).
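As a rough sketch of how these arguments fit together, here is a minimal batch script (the job name, filenames, and program are hypothetical placeholders):

    #!/bin/bash
    #SBATCH --job-name=example-job       # name shown by squeue and sacct
    #SBATCH --time=0-02:00:00            # worst-case run time of 2 hours
    #SBATCH --nodes=1                    # single node; no MPI
    #SBATCH --cpus-per-task=4            # 4 cores for the task
    #SBATCH --mem=16g                    # 16 GB of memory on the node
    #SBATCH --output=example-%j.out      # %j expands to the job ID
    #SBATCH --mail-type=FAIL,END         # email on failure or completion

    # Replace with your own commands.
    ./my_program --input data.txt

Submit the script with sbatch; any arguments given on the command line override the #SBATCH lines in the file, for example:

    sbatch --time=0-04:00:00 example.sh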

Job steps

Inside your batch file, use the srun command to specify the command to run across the allocated nodes.

It’s uncommon to need to specify other arguments with this command, but srun accepts most of the arguments from the arguments table if necessary, with the exceptions of --array and --dependency. See the srun manpage for more detailed information.
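As a sketch (the program names are hypothetical), each srun line below becomes its own job step, while the script itself is recorded as the .batch step:

    #!/bin/bash
    #SBATCH --ntasks=4
    #SBATCH --time=0-01:00:00

    # Each srun invocation runs across the allocated tasks
    # and appears as its own job step in sacct.
    srun ./preprocess_data     # job step <jobid>.0
    srun ./run_simulation      # job step <jobid>.1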

Interactive jobs

To start an interactive job, use the salloc command followed by arguments that specify details about your job.

Similar to srun, salloc takes the same arguments as sbatch, except --array. The Using SALLOC page has more information, as does the official salloc manpage.
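For example, a small interactive allocation might look like the following (the resource amounts are only illustrative):

    # Request 2 cores and 8 GB of memory for one hour, then work interactively.
    salloc -c 2 --mem=8g --time=01:00:00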

Modify a job

It’s possible to change some job properties while a job is pending, and a few after it starts running, with the scontrol modify jobid= command. Use <tab> completion to see all the parameters that can be changed for pending jobs. The following is a list of the most common arguments:

  • arraytaskthrottle: Adjust the maximum number of array tasks that can run concurrently.
  • mailtype: Change the events that generate an email for this job.
  • mailuser: Set the email address to send to; uses your account email by default.
  • timelimit: Adjust the time limit for a job (while pending only).
  • partition: Adjust the list of partitions the job is submitted to.
  • qos: Set the QOS to use for this job (currently only adding short makes sense).
  • nice: Lower the priority of a pending job.

You can use the separate command scontrol top <jobid_list> to give higher priority to specific jobs compared to your other jobs in the same partition. This command accepts a comma-separated list of job IDs. Note that this command only works for jobs within a single partition.
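As a sketch using the scontrol forms given above (the job IDs are placeholders):

    # Raise the time limit of a pending job to 8 hours.
    scontrol modify jobid=1234567 timelimit=08:00:00

    # Allow at most 10 tasks of an array job to run at once.
    scontrol modify jobid=1234568 arraytaskthrottle=10

    # Move these jobs ahead of your other jobs in the same partition.
    scontrol top 1234567,1234568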

Cancel a job

To cancel running and pending jobs, use scancel jobid. To cancel a running step, use scancel jobid.step.
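For example, with a placeholder job ID:

    scancel 1234567      # cancel the whole job
    scancel 1234567.0    # cancel only job step 0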

Check a running job

There are multiple ways to check the progress and efficiency of a running job. See Monitoring a Batch job for details.

Check on recent job status

To check on the status of recent jobs, use the squeue command followed by arguments that specify what information you want to view. Note that only jobs that are currently in the queue or that finished within the last ~5 minutes are available through squeue. The following are common arguments. For more details, see the squeue manpage.

Note: Do not run this command in a quick loop, as it can take time to process.
  • --me: Show only your jobs.
  • --start: Show the most pessimistic estimate of when a job can start, if available, and the reason it is waiting. In some cases the reason may not be available or may be wrong.
  • -j <jobid>: Show the specified job.
  • --account=pi..., -A: Show only jobs from a list of PI groups.
  • --state=pd,r,f, -t: Show only jobs in the pending, running, or failed state.
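A couple of sketches combining these arguments (the job ID is a placeholder):

    # Show your pending jobs, with start estimates and wait reasons.
    squeue --me --state=pd --start

    # Show a single job.
    squeue -j 1234567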

Check on older job status

To check on the status of an older job, use the sacct command followed by arguments that specify what information you want to view. The following are common arguments. For more detailed information, see the sacct manpage.

  • --user=username: List jobs from another user (defaults to your own jobs only).
  • -A, --account=pi...: List all jobs from a given group.
  • --start=<date/time>, --end=<date/time>: Show only jobs started or running between these times. Formats can be YYYY-MM-DDThh:mm:ss (with a literal T between the date and the time), YYYY-MM-DD, MMDD, or hh:mm. --end defaults to now, and --start defaults to the previous midnight.
  • --state=...: Limit results to a list of states. You must also specify --end for this to work; requeue requires specifying --duplicates. States include completed, failed, running, pending, node_fail, requeue, and timeout.
  • --name=: Limit results to jobs with a given name or list of names.
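A couple of sketches (the job name and dates are placeholders):

    # Show only failed jobs since the previous midnight (--state needs --end).
    sacct --state=failed --end=now

    # Show jobs named example-job that ran on a particular day.
    sacct --name=example-job --start=2025-04-01 --end=2025-04-02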

Check on node status

To check on the status of Slurm nodes and partitions, use the sinfo command followed by arguments that specify what information you want to view. The following are common arguments. For more details, see the sinfo manpage.

  • --summary, -s: Show summary statistics of nodes (Allocated/Idle/Other/Total).
  • --partition=, -p: Limit the display to a list of partitions.
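For example (the partition names are placeholders):

    # Summary statistics for all nodes, grouped by partition.
    sinfo --summary

    # Limit the summary to specific partitions.
    sinfo -s --partition=cpu,gpu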
Last modified: Monday, April 14, 2025 at 1:10 PM. See the commit on GitLab.