Storage

Storage types

Unity provides access to a variety of storage methods for different use cases, including high performance storage, mid performance/large storage, and archival storage.

High performance storage

Unity’s /home, /work, and /scratch directories use high performance VAST DataStore. This storage is suitable for Job I/O (reading and writing to files during a job), which requires a fast, parallel filesystem for best job performance. You must store or stage all data for Job I/O in a high performance storage location. While it is possible to purchase additional /work space, we strongly advise using /scratch via HPC Workspace or a lower performance storage option (see below) if possible.
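
For concreteness, here is a minimal sketch of staging Job I/O data into scratch, assuming the HPC Workspace command-line tools (ws_allocate and ws_find) described in the scratch documentation; the workspace name, lifetime, and source path are examples only:

# Allocate a scratch workspace for 7 days (name and lifetime are examples)
ws_allocate my_job_inputs 7
# Copy the data your job will read into the workspace (the source path is hypothetical)
rsync -a ~/my_inputs/ "$(ws_find my_job_inputs)/inputs/"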

Warning: Snapshots
The /home, /work, and now also the /project directories are snapshotted on a 3-day rolling basis.
Other directories on Unity, including /scratch, DO NOT have snapshots! We can’t restore data lost from these directories.

Mid performance, large storage

Often, researchers need “warm” data storage that’s larger than their high performance storage group quotas. We recommend storing the bulk of your data in /project and staging the portions you need for a particular workload in /work or /scratch as needed. While the location of /project directories varies by institution, most are housed on the Northeast Storage Exchange (NESE)’s Disk Storage. Storing your data in /project is a cost-effective way to house data in a location with excellent transfer speeds to Unity’s high performance storage. To request /project space, email hpc@umass.edu. Most campuses provide a base allocation free of charge to research groups upon request.
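
As a rough sketch of that pattern (all paths below are hypothetical), you can stage a subset of a /project dataset into /work before a job and copy the results back afterwards:

# Stage only the files the job needs from /project into /work
rsync -a /project/pi_<pi-username>/dataset/subset/ /work/pi_<pi-username>/job_inputs/
# After the job finishes, copy the results back to /project for longer-term storage
rsync -a /work/pi_<pi-username>/job_outputs/ /project/pi_<pi-username>/results/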

In addition to NESE Disk, Unity researchers can request access to the Open Storage Network (OSN) S3 storage pods. UMass Amherst and URI own pods with storage available upon request (cost varies), or researchers can request an allocation of 10T to 50T through the NSF’s ACCESS program.

To request /project or OSN storage, email hpc@umass.edu.

Archival storage

Researchers who need to store data long-term (several years) can purchase archival tape storage through NESE’s Tape Storage. NESE Tape is extremely cost-effective, high-capacity storage meant to house data that’s not often used or modified. To request NESE Tape storage, email hpc@umass.edu.

Storage summary and mountpoint table

| Mountpoint | Name | Type | Base quota | Notes |
|---|---|---|---|---|
| /home | Home directories | HDD | 50 GB | Home directories should be used only for user init files. |
| /work/pi_ | Work directories | SSD | 1 TB | Work is the primary location for running cluster jobs. This is a shared folder for all users in the PI group. |
| /project | Project directories | HDD | As needed | Project directories are available to PIs upon request. A common use case is generating job output in /work and copying it to /project afterwards. Not for job I/O. |
| /scratch | Scratch space | SSD | N/A | See the HPC Workspace scratch documentation. |
| /nese | NESE mounts | HDD | Varying | DEPRECATED: Legacy location for mounts from the Northeast Storage Exchange (NESE). Not for job I/O. |
| /nas | Buy-in NAS mounts | Varying | Varying | DEPRECATED: Location for legacy buy-in NAS hardware. |
| /gypsum | Gypsum devices | HDD | Varying | DEPRECATED: Storage from the former UMass Amherst CICS Gypsum cluster. |
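
To check how much space you are using in any of these locations, standard Linux tools are a reasonable starting point (the Disk Quotas page covers quota reporting in more detail); the paths below are placeholders:

# Show the size and usage of the filesystem backing your group's work directory
df -h /work/pi_<pi-username>
# Show how much space a specific subdirectory consumes
du -sh /work/pi_<pi-username>/old_results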

I need more storage!

To request additional storage on Unity:

  1. Check out our storage management information to determine whether you can reduce your storage use without an expansion.
  2. Determine the amount, duration, and type of storage needed using our handy flowchart and our storage descriptions.
  3. If you’re requesting a storage expansion that requires payment (see the storage expansion options table), identify the appropriate institutional payment method (e.g., speedtype, Chartfield string, etc.) for your payment source, along with the name and email of the finance representative within your department. If you’re unsure what to use, contact your institution’s representative for institution-specific information.
  4. Email hpc@umass.edu. If you’re not the PI (head) of your research group, this must be done by your PI or with your PI’s consent.
Note: Storage purchasing via grants
You can also write Unity storage purchases into grants. See our grant page for grant boilerplate and information.

Storage expansion options

| Resource | Free tier threshold | Notes |
|---|---|---|
| PI group work directories | 1T | Free tier: automatically allocated on PI account creation. Purchasing: available in 1T increments on 6-month intervals, up to 3 years at a time. |
| PI group project directories | 5T (URI, UMassD threshold may vary) | Free tier: allocated upon request via the storage form. Purchasing: available in 5T increments on 1-year intervals, up to 5 years at a time. |
| Scratch space | 50T soft cap | No purchasing necessary; see our scratch documentation. |
| NESE Tape | N/A | Free tier: none available. Purchasing: available in 10T increments on 5-year intervals. |
| OpenStorageNetwork S3 (from URI and UMass Amherst) | TBD | Purchasing: TBD |

Storage expansion flowchart

The following flowchart is intended to help you decide what type of storage you need and whether your existing data is ideally placed.

flowchart TD
    start("`We need more storage!`")
    quotaCheck("`My group can't reduce space without an increase.`")
    active("`Are the data needed for active jobs?`")
    frequent("`Can you stage subsets of this data in high performance storage as needed for active jobs?`")
    longtermLimited("`Do you need to archive data for a long time without frequent access or modification?`")
    sharing("`Do you need to share this data publicly?`")
    tape("`Request/Purchase NESE Tape archival storage.`")
    osn("`Request/Purchase OSN S3 storage or NESE /project space.`")
    intermediate("`Do you need additional storage for workflows that create temporary intermediate files?`")
    inactiveWorkData("`Does your group have inactive data in /work that could be moved to other storage?`")
    scratch("`Try Unity's scratch space: HPC Workspace.`")
    publicData("`Do you need additional storage to store a public, open-access dataset?`")
    email("`Email hpc@umass.edu about /datasets.`")
    purchaseWork("`Purchase additional /work storage.`")

    start --> quotaCheck
    quotaCheck --> intermediate
    intermediate -- NO --> active
    intermediate -- YES --> scratch
    active -- YES --> publicData
    active -- NO --> longtermLimited
    frequent -- NO --> purchaseWork
    frequent -- YES --> osn
    longtermLimited -- YES --> sharing
    sharing -- YES --> osn
    sharing -- NO --> tape
    longtermLimited -- NO --> osn
    inactiveWorkData -- YES --> osn
    inactiveWorkData -- NO --> frequent
    publicData -- YES --> email
    publicData -- NO --> inactiveWorkData

    click scratch "/documentation/managing-files/hpc-workspace/" "Scratch space link"
    click email "mailto:hpc@umass.edu" "Help email"
    click quotaCheck "/documentation/managing-files/quotas/" "Space management link"
    click osn "#mid-performance-large-storage" "Mid performance storage"
    click purchaseWork "#high-performance-storage" "High performance storage"
    click tape "#archival-storage" "Tape storage"

Snapshots

Backups are not available on the Unity cluster. However, temporary snapshots are created each day at 5 AM UTC, and snapshots older than three days are deleted. You can restore files yourself by accessing the read-only snapshots (see the table below).

| Filesystem | Name | Snapshot location |
|---|---|---|
| /home/<username> | Home directory | /snapshots/home/unity_<timestamp>/<username> |
| /work/pi_<pi-username> | Work directory | /snapshots/work/unity_<timestamp>/pi_<pi-username> |
| /project/pi_<pi-username> | Project directory | /snapshots/project/<organization>-nesepool@<timestamp> |

Accessing /project Snapshots

Snapshots for /project directories are stored under /snapshots/project/, but you need to know which organization the project belongs to. Inside /snapshots/project/, go to the correct nesepool:

  • 5col = Non-UMass Five Colleges
  • corp = Corporate partners
  • uma = UMass Amherst
  • umb = UMass Boston
  • umd = UMass Dartmouth
  • uml = UMass Lowell
  • uri = University of Rhode Island
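
For example, to browse /project snapshots for a UMass Amherst group, list the entries in the uma pool (substitute your organization's code); the snapshot names follow the <organization>-nesepool@<timestamp> pattern shown in the table above:

# List the available /project snapshot timestamps for the uma pool
ls -d /snapshots/project/uma-nesepool@*
# Browse one snapshot (timestamp is an example; the layout is assumed to mirror /project)
ls /snapshots/project/uma-nesepool@<timestamp>/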

Restore files from a snapshot

The following code sample shows how to restore a specific file or directory from a snapshot. It copies the data into a separate restore directory first so that existing files aren’t overwritten.

# Create a separate restore directory so existing files aren't overwritten
mkdir ~/restore
# Copy the file or directory out of the read-only snapshot, preserving permissions and timestamps
cp -a /snapshots/home/unity_2023-02-08_05_00_00_UTC/<username>/path/to/file/or/directory ~/restore/