



Building software from scratch

Building software from scratch is the most flexible method, but also the most involved. In some cases, it requires deep knowledge of the software in question, including its dependencies and other requirements.

Download code only from trusted sources. Open source software in particular can have multiple versions (sometimes called “forks”) maintained by different people, so it’s important to find the correct version and check that it hasn’t been maliciously altered.

While any well-maintained software should provide some installation instructions, usually in a README or INSTALL file, many projects assume the user has some knowledge of common build systems. This is particularly true for code that is mostly in C/C++ or Fortran.

The following content is a general guide. Please read it completely before starting, but be aware that a single guide cannot capture all of the potential complexity.

Part 1: Obtaining the code

When fetching the code, we recommend using compressed release bundles instead of doing a git clone. Compressed release bundles:

  • Are usually single files with an extension of .tar.gz, .tar.bz2, .tar.xz, or .zip.
  • Often contain a “build ready” set of files with more content than is in the repository itself.

If a Release file is available:

  1. Download it with a command such as wget https://.... This saves the file under the name at the end of the URL.
    • In some cases this might be just v4.1.1.tar.gz, which is not descriptive. You can choose a more descriptive name with wget -O somepackage-v4.1.1.tar.gz https://....
  2. Extract .tar.*, .tbz2, or similar archives with tar xf filename.tar.gz.
    • We recommend checking the contents with tar tf filename.tar.gz to see if it’s going to extract everything to a new subdirectory, or put files in the current directory as well.
    • For .zip files, use unzip -l filename.zip to check the directory structure first.
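The inspect-then-extract steps above can be sketched as a short script. This is only an illustration: it builds a tiny stand-in archive so that it runs without network access, and the temporary directory and the "one top-level entry" check are assumptions of this sketch, not a site requirement. On the cluster you would replace the stand-in with the real wget download.

```shell
set -eu

# Package name taken from the wget -O example above; purely illustrative.
pkg=somepackage-v4.1.1
workdir=$(mktemp -d)
cd "$workdir"

# On the cluster you would download the real release here, e.g.:
#   wget -O "$pkg.tar.gz" https://....
# Stand-in archive so this sketch runs offline.
mkdir "$pkg"
echo "hello" > "$pkg/README"
tar czf "$pkg.tar.gz" "$pkg"
rm -r "$pkg"

# List the archive and count the distinct top-level entries.
count=$(tar tf "$pkg.tar.gz" | cut -d/ -f1 | sort -u | wc -l | tr -d ' ')

if [ "$count" -eq 1 ]; then
    # Everything sits under one directory, so extracting in place is tidy.
    tar xf "$pkg.tar.gz"
else
    # Files would land in the current directory; contain them in a new one.
    mkdir "$pkg"
    tar xf "$pkg.tar.gz" -C "$pkg"
fi

ls "$pkg"
```

If the archive has several top-level entries, the else branch keeps them together in a directory named after the package instead of scattering them.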

If a Git project doesn’t have a Release file:

  1. Check whether it has a “tagged” version available.
  2. Download the .zip of the “tagged” version, or check out that version with a command such as: git clone https://..../project ; cd project ; git checkout v4.1.1.
    • If there are no tags at all, then this is probably very untested software and you should scrutinize it carefully before proceeding.
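As a rough sketch of the tag workflow above, the script below creates a throwaway local repository to stand in for the upstream project (an assumption made so the example runs offline), then clones it, lists its tags, and checks out the tagged version. The VERSION file and the demo identity are hypothetical. For a real project you can also list tags before cloning with git ls-remote --tags followed by the repository URL.

```shell
set -eu

tmp=$(mktemp -d)

# Stand-in "upstream" repository with one tagged release and newer untagged work.
git init -q "$tmp/upstream"
cd "$tmp/upstream"
echo "4.1.1" > VERSION
git add VERSION
git -c user.name=demo -c user.email=demo@example.com commit -q -m "release 4.1.1"
git tag v4.1.1
echo "4.2.0-dev" > VERSION
git -c user.name=demo -c user.email=demo@example.com commit -q -am "unreleased work"

# Clone, see which tags exist, then switch to the tagged version.
cd "$tmp"
git clone -q upstream project
cd project
git tag --list
git -c advice.detachedHead=false checkout -q v4.1.1
cat VERSION
```

After the checkout, the working tree holds the tagged release rather than the newer, untagged development state.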

Part 2: Building the code

Once you have the code, the next step is to identify the build system. Look at the files at the top level of the code. Different programming languages use different tools. The following are the basic commands, though most of them require extra arguments.

Language  File              Build Command
Java      pom.xml           mvn
Java      build.xml         ant
Python    environment.yml   conda
Python    requirements.txt  pip
Python    pyproject.toml    poetry
Python    setup.py          python setup.py
C/C++     CMakeLists.txt    cmake -Bbuild && cmake --build build
C/C++     Makefile          make
C/C++     configure         ./configure && make
C/C++     autogen.sh        ./autogen.sh && ./configure && make
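The lookup in the table can be sketched as a small shell function that inspects the top level of a source tree and prints the matching basic command. The function name suggest_build_command and the order of the checks are assumptions of this sketch; real projects sometimes ship several of these files, and most commands still need extra arguments.

```shell
# Print the basic build command for the source tree in $1 (default: current directory).
# Hypothetical helper: the name and the check order are illustrative choices.
suggest_build_command() {
    dir=${1:-.}
    if   [ -f "$dir/CMakeLists.txt" ];   then echo "cmake -Bbuild && cmake --build build"
    elif [ -f "$dir/configure" ];        then echo "./configure && make"
    elif [ -f "$dir/autogen.sh" ];       then echo "./autogen.sh && ./configure && make"
    elif [ -f "$dir/Makefile" ];         then echo "make"
    elif [ -f "$dir/pom.xml" ];          then echo "mvn"
    elif [ -f "$dir/build.xml" ];        then echo "ant"
    elif [ -f "$dir/environment.yml" ];  then echo "conda"
    elif [ -f "$dir/pyproject.toml" ];   then echo "poetry"
    elif [ -f "$dir/requirements.txt" ]; then echo "pip"
    elif [ -f "$dir/setup.py" ];         then echo "python setup.py"
    else
        echo "no known build file found in $dir" >&2
        return 1
    fi
}

# Usage on a throwaway directory that contains only a Makefile.
demo=$(mktemp -d)
touch "$demo/Makefile"
suggest_build_command "$demo"
```

An existing configure script is checked before autogen.sh so that a ready-made configure is not regenerated unnecessarily.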

Part 3: Installing the program

Before building, you should decide where you want the software to end up after installing. By default, many packages may try to install to a location your user account can’t write to.

We recommend placing the software under your PI’s /work folder so that you can share it with other members of your group. Installing in /home is possible, but /home has a much smaller quota. Building in /project will be much slower, and running from there may also reduce run-time performance. Since /scratch is temporary, software installed there will be lost.
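Before configuring a build, it can help to confirm that the chosen destination exists and is writable. The following is a minimal sketch: INSTALL_ROOT is a hypothetical variable that you would point at your group's directory under /work, the somepackage/4.1.1 layout is only an illustrative convention, and a temporary directory is used as a fallback so the example runs anywhere.

```shell
set -eu

# Hypothetical: set INSTALL_ROOT to your group's directory under /work.
# Falls back to a temporary directory so this sketch runs anywhere.
install_root=${INSTALL_ROOT:-$(mktemp -d)}
prefix="$install_root/somepackage/4.1.1"

mkdir -p "$prefix"
if [ -w "$prefix" ]; then
    echo "OK: $prefix is writable"
else
    echo "Cannot write to $prefix" >&2
    exit 1
fi

# Free space on the filesystem holding the prefix.
# Note: this reports filesystem space, which may differ from your quota.
df -h "$prefix"
```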

Each build system has its own way of specifying the location.

  • For Java, you may need to edit the .xml file or at least examine it to determine how to override its default.
  • For Python, use either Conda or venv.
  • For C/C++ and Fortran projects, consult the following table to specify a destination directory:

Build file      Command
CMakeLists.txt  cmake -Bbuild -DCMAKE_INSTALL_PREFIX=/work/...
configure       ./configure --prefix=/work/...
Makefile        Edit the Makefile

After building, run the install command:

Build file          Command
CMakeLists.txt      cmake --install build
configure/Makefile  make install

Not all Makefile-based projects provide an install target, so you may just end up with a binary somewhere under the project directory, maybe at the top level, maybe not.
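One way to check whether a plain Makefile project offers an install target is to search the Makefile for it before running make install. The sketch below uses a tiny stand-in project (the hello script and its Makefile are hypothetical). It also assumes the Makefile exposes a PREFIX variable, which is a common convention but not guaranteed; where it does not, you would edit the Makefile as noted in the table above.

```shell
set -eu

src=$(mktemp -d)
prefix=$(mktemp -d)/install   # stand-in for a directory under /work/...

# Stand-in project: a shell script plus a minimal Makefile.
cd "$src"
printf '#!/bin/sh\necho hello\n' > hello
cat > Makefile <<'EOF'
PREFIX = /usr/local
all: ; chmod +x hello
install: all ; mkdir -p $(PREFIX)/bin && cp hello $(PREFIX)/bin/hello
EOF

make

# Only run "make install" if the Makefile actually defines that target.
if grep -q '^install:' Makefile; then
    # A variable given on the command line overrides the default in the Makefile.
    make install PREFIX="$prefix"
else
    echo "No install target; look for the built binary under $src" >&2
fi

"$prefix/bin/hello"
```

Overriding PREFIX on the command line leaves the Makefile untouched, which makes it easier to rebuild later from a clean copy of the source.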

Notes on certain build systems

  • With cmake, you can use ccmake for an interactive view of all of the possible parameters.
    • This is sometimes necessary to turn on or off certain features, or to specify dependency location information.
  • For “autotools” style packages (GNU Autoconf and Automake), there is a cascade of commands to run.
    • If there is an ./autogen.sh file, it can create a ./configure file from a ./configure.in file. However, only do this if ./configure doesn’t already exist.
    • If there is a ./configure.ac (or an older ./configure.in) but no ./autogen.sh, you can try aclocal && autoconf && automake && libtoolize --force.
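The cascade above can be summarized as a decision function that prints the next step instead of running it. This is a sketch under assumptions: the function name autotools_next_step is hypothetical, and it treats configure.ac or configure.in as the Autoconf input file. Many projects also accept autoreconf --install, which runs the same tools in a suitable order, though whether it works depends on the project.

```shell
# Print the next command to run for an autotools-style source tree in $1.
# Hypothetical helper; it only inspects files and never runs the tools itself.
autotools_next_step() {
    dir=${1:-.}
    if [ -f "$dir/configure" ]; then
        # configure already exists, so do not regenerate it.
        echo "./configure"
    elif [ -f "$dir/autogen.sh" ]; then
        echo "./autogen.sh"
    elif [ -f "$dir/configure.ac" ] || [ -f "$dir/configure.in" ]; then
        echo "aclocal && autoconf && automake && libtoolize --force"
    else
        echo "no autotools files found in $dir" >&2
        return 1
    fi
}

# Usage on a throwaway tree that has autogen.sh but no configure yet.
tree=$(mktemp -d)
touch "$tree/autogen.sh" "$tree/configure.ac"
autotools_next_step "$tree"
```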
Last modified: Tuesday, April 15, 2025 at 11:38 AM.
University of Massachusetts Amherst, University of Rhode Island, University of Massachusetts Dartmouth, University of Massachusetts Lowell, University of Massachusetts Boston, Mount Holyoke College, Smith College