The purpose of this documentation is to describe how to run Relion calculations on SCU resources, including some general suggestions regarding how to achieve better performance. It is NOT intended to teach you how to run Relion–there is already an excellent Relion tutorial that serves this purpose.
General Environment Setup to Work with SCU Resources
Setup Slurm
In order to use SCU resources, you'll first need to set up your environment to work properly with our implementation of the job scheduler, Slurm. This is described here: Using Slurm
Setup Spack
Once this is done, you'll need to set up your environment to use Spack, our cluster-wide package manager. This is described here: Spack
Relion-specific Environment Setup
In order to use the Relion graphical user interface (GUI), you'll need to log on to curie, the Slurm submission node (see Using Slurm for more details), and add this to your ~/.bashrc:
export RELION_QSUB_EXTRA_COUNT=2
export RELION_QSUB_EXTRA1="Number of nodes:"
export RELION_QSUB_EXTRA1_DEFAULT="1"
export RELION_QSUB_EXTRA2="Number of GPUs:"
export RELION_QSUB_EXTRA2_DEFAULT="0"
export RELION_CTFFIND_EXECUTABLE=/softlib/apps/EL7/ctffind/ctffind-4.1.8/bin/ctffind
After editing your ~/.bashrc, log out of curie.
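After logging back in, you may want to confirm that the variables were picked up. The following is a minimal sketch of such a check; the function name check_relion_env is hypothetical and is not part of Relion or of the SCU setup.

```shell
# Hypothetical helper (not part of Relion): report which of the variables
# exported in ~/.bashrc above are missing from the current environment.
check_relion_env() {
    missing=0
    for name in RELION_QSUB_EXTRA_COUNT \
                RELION_QSUB_EXTRA1 RELION_QSUB_EXTRA1_DEFAULT \
                RELION_QSUB_EXTRA2 RELION_QSUB_EXTRA2_DEFAULT \
                RELION_CTFFIND_EXECUTABLE; do
        # printenv only sees exported variables, i.e. the ones a program
        # started from this shell (such as the Relion GUI) would inherit.
        if ! printenv "$name" >/dev/null; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_relion_env prints one "missing: ..." line per absent variable and returns a non-zero status if any are absent.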
Relion Versions Installed
Log on to curie, with X11, and request an interactive node
ssh -Y pascal # if off campus, use ssh -Y pascal.med.cornell.edu
ssh -Y curie
The -Y option enables trusted X11 forwarding, which is what allows the GUI to be displayed on your machine.
In addition to pascal, you can also use the other gateway nodes, aphrodite and aristotle.
If you plan on running calculations on a desktop/workstation (i.e. a system not allocated by Slurm), then you'll need to log in there (make sure you use -Y in each of your ssh commands).
For Cluster/reserved nodes-only: to request an interactive session, use this command (for more information, see: Using Slurm)
srun -n1 --pty --x11 --partition=cryo-cpu --mem=8G bash -l
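If you request interactive sessions often, a small wrapper can save typing. The sketch below only assembles and prints the srun command shown above so that you can inspect it before running it; the function name interactive_cmd and its optional arguments are assumptions made for illustration.

```shell
# Hypothetical helper: print the srun command for an interactive X11 session.
# Usage: interactive_cmd [partition] [memory]
interactive_cmd() {
    partition="${1:-cryo-cpu}"   # default partition, taken from the command above
    memory="${2:-8G}"            # default memory, taken from the command above
    printf 'srun -n1 --pty --x11 --partition=%s --mem=%s bash -l\n' "$partition" "$memory"
}

interactive_cmd                # prints the command exactly as given on this page
interactive_cmd cryo-cpu 16G   # same session, but requesting more memory
```

The printed line can then be copied and run as-is.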
Seeing which Relion versions are available
To see what's available, use this command (for more information on the spack command, see Spack):
spack find -l -v relion
Here is output from the above command that is current as of 5/3/19:
==> 14 installed packages.
-- linux-centos7-x86_64 / gcc@4.8.5 -----------------------------
pbsqju2 relion@2.0.3 build_type=RelWithDebInfo ~cuda cuda_arch= +double~double-gpu+gui purpose=cluster
ua3zs52 relion@2.0.3 build_type=RelWithDebInfo +cuda cuda_arch=60 +double~double-gpu+gui purpose=cluster
djy46i6 relion@2.1 build_type=RelWithDebInfo +cluster+cuda cuda_arch=60 ~desktop+double~double-gpu+gui
cchnbyc relion@2.1 build_type=RelWithDebInfo ~cuda cuda_arch= +double~double-gpu+gui
xxisr7j relion@2.1 build_type=RelWithDebInfo +cuda cuda_arch=60 +desktop+double~double-gpu+gui
lzd4ktq relion@2.1 build_type=RelWithDebInfo +cuda cuda_arch=60 +double~double-gpu+gui
v6jckz3 relion@2.1 build_type=RelWithDebInfo +cuda cuda_arch=60 +double~double-gpu+gui purpose=cluster
6cpdlsc relion@2.1 build_type=RelWithDebInfo +cuda cuda_arch=60 +double~double-gpu+gui purpose=desktop
ltaf3x6 relion@2.1 build_type=RelWithDebInfo +cuda cuda_arch=70 +double~double-gpu+gui purpose=cluster
u6dzm4v relion@3.0_beta build_type=RelWithDebInfo ~cuda cuda_arch= +double~double-gpu+gui purpose=cluster
olcttts relion@3.0_beta build_type=RelWithDebInfo +cuda cuda_arch=60 +double~double-gpu+gui purpose=cluster
f2fjevh relion@3.0_beta build_type=RelWithDebInfo +cuda cuda_arch=60 +double~double-gpu+gui purpose=desktop
opdhbcs relion@3.0_beta build_type=RelWithDebInfo +cuda cuda_arch=70 +double~double-gpu+gui purpose=cluster
2ilkf4r relion@develop build_type=RelWithDebInfo +cuda+double~double-gpu+gui
There's a lot in the above output, so let's break it down!
First, in the above, we have several production-ready Relion versions:
relion@2.0.3
relion@2.1
relion@3.0_beta
In addition, these versions of Relion can run on the following platforms:
purpose=desktop # used on workstations
purpose=cluster # used on our Slurm-allocated cluster
Some of these Relion installations are intended for use on nodes/workstations with GPUs, whereas others are intended for CPU-only nodes/workstations:
+cuda # Relion installation that supports GPU use
-cuda # Relion installation that does not support GPU use (shown as ~cuda in the spack output)
Do NOT try to run a +cuda installation of Relion on a CPU-only node–this will result in errors, as these versions of Relion expect CUDA to be present (which it is not on the CPU-only nodes).
Finally, we have versions of Relion that are specific to different GPU models:
cuda_arch=60 # For use on nodes with Nvidia P100s
cuda_arch=70 # For use on nodes with Nvidia V100s
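To narrow the listing down to the installations that matter for a given node, the spack output can be filtered on the fields described above. This is a minimal sketch that assumes one installation per line, as in the listing; the function name filter_relion is hypothetical.

```shell
# Hypothetical helper: keep only the lines of `spack find -l -v relion`
# output that contain every given field (matched literally, not as regexes).
# Usage: spack find -l -v relion | filter_relion relion@3.0_beta +cuda cuda_arch=70
# Note: quote a field such as '~cuda' so the shell does not try tilde expansion.
filter_relion() {
    if [ "$#" -eq 0 ]; then
        cat                      # no fields left to check: pass lines through
        return
    fi
    field="$1"
    shift
    # -F: fixed-string match; -e: lets the field start with "+", "-" or "~"
    grep -F -e "$field" | filter_relion "$@"
}
```

With the listing above as input, the fields relion@3.0_beta, +cuda and cuda_arch=70 together select only the opdhbcs installation.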
What Relion version should I use?!?
Which version of Relion you use (2.0.3, 2.1, or 3.0_beta) is more of a scientific question than a technical one; however, in general, most users seem to be using 3.0_beta, unless they have legacy projects that were started with an older version (consult with your PI if you have questions about this).
After the specific version of Relion is selected, selecting the other installation parameters is straightforward:
To load Relion 3.0_beta on the cluster (or reserved nodes) for CPU-only nodes (i.e. no GPUs):
spack load -r relion@3.0_beta~cuda purpose=cluster
This is the command you want if you wish to launch the Relion GUI–don't worry that this version doesn't use GPUs. The GUI is simply used for creating the job and either submitting it directly to Slurm, or writing out a submission script. This Relion installation is only for running the GUI or for running jobs on CPU-only nodes.
The following spack load commands are for loading Relion with the intent of submitting jobs via a script, or of running the GUI on a desktop/workstation.
To run relion3.0_beta on the cluster (or reserved nodes) with GPUs (P100s):
spack load -r relion@3.0_beta+cuda purpose=cluster cuda_arch=60
To run relion3.0_beta on the cluster (or reserved nodes) with GPUs (V100s):
spack load -r relion@3.0_beta+cuda purpose=cluster cuda_arch=70
To run relion3.0_beta on a desktop/workstation with GPUs (P100s):
spack load -r relion@3.0_beta +cuda cuda_arch=60 purpose=desktop
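The load commands above differ only in the CUDA variant, the purpose, and the GPU architecture, so the spec can be assembled from those three choices. The following is a hedged sketch: relion_spec is a hypothetical function that only prints a spec string built the same way as the commands above, and it does not check that such an installation actually exists.

```shell
# Hypothetical helper: print a Relion spec suitable for `spack load -r`.
# Usage: relion_spec VERSION PURPOSE [GPU]
#   PURPOSE: cluster | desktop
#   GPU:     P100 | V100, or omitted for CPU-only nodes
relion_spec() {
    version="$1"
    purpose="$2"
    case "${3:-}" in
        "")   printf 'relion@%s~cuda purpose=%s\n' "$version" "$purpose" ;;
        P100) printf 'relion@%s+cuda purpose=%s cuda_arch=60\n' "$version" "$purpose" ;;
        V100) printf 'relion@%s+cuda purpose=%s cuda_arch=70\n' "$version" "$purpose" ;;
        *)    echo "unknown GPU model: $3" >&2; return 1 ;;
    esac
}

relion_spec 3.0_beta cluster        # relion@3.0_beta~cuda purpose=cluster
relion_spec 3.0_beta cluster V100   # relion@3.0_beta+cuda purpose=cluster cuda_arch=70
```

The result can be passed straight to spack, for example: spack load -r $(relion_spec 3.0_beta cluster P100). The command substitution is left unquoted on purpose so that the spec is split into separate arguments.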
Launching the Relion GUI
Note for Cluster/reserved nodes-only: This assumes you are already in an interactive session (as described in the previous section)–if not, request an interactive session!
If you haven't already, load relion@3.0_beta with this command:
spack load -r relion@3.0_beta~cuda purpose=cluster
This is the command you want if you wish to launch the Relion GUI–don't worry that this version doesn't use GPUs. The GUI is simply used for creating the job and either submitting it directly to Slurm, or writing out a submission script. This Relion installation is only for running the GUI or for running jobs on CPU-only nodes.
If you are running the GUI on a desktop/workstation, load the appropriate version of Relion, as described in the previous section.
Next, change to the directory where you wish to keep your Relion files for a given project and execute this command:
relion
This should launch the Relion GUI.
Relion generates a lot of subdirectories and metadata; to keep everything organized, we recommend that each Relion project (i.e. analysis workflow for a given set of data) is given its own directory.
Submitting Jobs from the GUI:
Should I run jobs locally or submit jobs to the Slurm queue?
If you are RUNNING jobs on a workstation (i.e. actually using your workstation to PERFORM the calculation, and not just submitting from the workstation) AND that workstation is not managed by Slurm, just run all jobs locally.
Some jobs should just be run from the interactive session, and NOT submitted to the Slurm queue. These jobs are generally lightweight in terms of computational demand and/or require the Relion GUI. Here are a few examples of jobs that should be run from the GUI (and not submitted to the Slurm queue):
- Import
- Manual picking
- Subset selection
- Join star files
Here are some jobs that should never be run directly (i.e. these jobs should always be submitted to the Slurm queue, unless running on a local workstation):
- Motion correction
- CTF estimation
- Auto-picking
- Particle Sorting
- 2D/3D classification
- 3D initial model
- 3D auto-refine
- 3D multi-body
In general, if the job takes a long time to run or requires a lot of compute resources, it should be submitted to the Slurm queue (or run on a local workstation).
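The two lists above can be summarized as a simple lookup. This sketch only restates the guidance on this page for users working on the cluster; the function name run_location and its output labels are hypothetical.

```shell
# Hypothetical helper: suggest where a Relion job type should run when
# working on the cluster (on a local workstation, everything runs locally).
run_location() {
    case "$1" in
        "Import"|"Manual picking"|"Subset selection"|"Join star files")
            echo "interactive" ;;   # lightweight and/or needs the GUI
        "Motion correction"|"CTF estimation"|"Auto-picking"|"Particle Sorting"|\
        "2D/3D classification"|"3D initial model"|"3D auto-refine"|"3D multi-body")
            echo "slurm" ;;         # compute-heavy: submit to the queue
        *)
            echo "unknown" ;;       # not listed on this page: use the rule of thumb above
    esac
}

run_location "Manual picking"   # interactive
run_location "3D auto-refine"   # slurm
```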
How to run jobs locally?
This is very straightforward. Set up your calculation in the Relion GUI (see the Relion tutorial for calculation-specific settings). Under the "Running" tab for a given calculation, just make sure the option "Submit to queue?" is set to "No". Otherwise, follow the procedure as described in the Relion tutorial.
How to submit jobs to the Slurm queue?
Set up your calculation in the Relion GUI (see the Relion Tutorial for calculation-specific settings).
- Under the "Running" tab for a given calculation, set the option "Submit to queue?" to "Yes".
- You'll need to set "Queue name:" to the Slurm partition you wish to use:
- If you want to only run on CPUs, then set this to "cryo-cpu"
- If you want to run on the shared cryoEM GPUs, then set this to "cryo-gpu"
- If you want to run on a lab-reserved node, then set this to the appropriate partition (e.g. "blanchard_reserve")
- Set "Queue submit command" to "sbatch"
- Set the number of nodes/GPUs to the desired values
Relion requires a set of template Slurm submission scripts that enables it to submit jobs. You'll need to set "Standard submission script" to the path where the needed template script is. A set of template submission scripts can be found here:
/softlib/apps/EL7/slurm_relion_submit_templates # select the template script that is appropriate for your job
- Give the "Current job" an alias if desired and click "Run!" # assumes everything else is set
Checking job status (and other Slurm functionality) is described here and here.
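To illustrate how the GUI fields end up in a submission script, the sketch below writes an example template and then mimics the placeholder substitution that Relion performs at submission time. The XXX...XXX placeholders are Relion's template variables, with XXXextra1XXX and XXXextra2XXX corresponding to the "Number of nodes:" and "Number of GPUs:" fields defined in ~/.bashrc. The layout of the template, the choice of mpirun as launcher, and all example values are assumptions; the actual templates under /softlib/apps/EL7/slurm_relion_submit_templates may differ.

```shell
# Illustrative template only; the real SCU templates may look different.
template="$(mktemp)"
cat > "$template" <<'EOF'
#!/bin/bash
#SBATCH --job-name=XXXnameXXX
#SBATCH --partition=XXXqueueXXX
#SBATCH --nodes=XXXextra1XXX
#SBATCH --ntasks=XXXmpinodesXXX
#SBATCH --cpus-per-task=XXXthreadsXXX
#SBATCH --gres=gpu:XXXextra2XXX
#SBATCH --output=XXXoutfileXXX
#SBATCH --error=XXXerrfileXXX

mpirun -n XXXmpinodesXXX XXXcommandXXX
EOF

# Mimic Relion's substitution with made-up example values (options elided).
filled="$(sed \
    -e 's|XXXnameXXX|Class2D/job001/|' \
    -e 's|XXXqueueXXX|cryo-gpu|' \
    -e 's|XXXextra1XXX|1|' \
    -e 's|XXXextra2XXX|2|' \
    -e 's|XXXmpinodesXXX|3|g' \
    -e 's|XXXthreadsXXX|4|' \
    -e 's|XXXoutfileXXX|Class2D/job001/run.out|' \
    -e 's|XXXerrfileXXX|Class2D/job001/run.err|' \
    -e 's|XXXcommandXXX|relion_refine_mpi ...|' \
    "$template")"
printf '%s\n' "$filled"
rm -f "$template"
```

A template for CPU-only jobs would presumably omit the --gres line; check the provided templates rather than relying on this sketch.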