What is Slurm?
Slurm (previously Simple Linux Utility for Resource Management) is a modern, open source job scheduler that is highly scalable and customizable; Slurm currently runs on the majority of the TOP500 supercomputers. Job schedulers enable large numbers of users to share large computational resources fairly and efficiently.
Cluster prerequisites
Before being able to take advantage of our computational resources, you must first set up your environment. This is pretty straightforward, but there are a few steps:
SSH access setup
You need to have your SSH keys set up to access cluster resources. If you haven't done this already, please set up your ssh keys.
Environment setup
While still logged in to an SCU login node, run the following:
Code Block | ||||
---|---|---|---|---|
| ||||
cat - >> ~/.bashrc <<'EOF'
if [ -n "$SLURM_JOB_ID" ]
then
source /etc/slurm/slurm_source/slurm_source.sh
fi
alias squeue_long='squeue -o "%.18i %.9P %.8j %.8u %.8T %.10M %.11l %.6b %.6D %R"'
EOF
source ~/.bashrc |
This snippet sources a Slurm environment script whenever the shell runs inside a Slurm job (that is, when SLURM_JOB_ID is set), and defines an alias, squeue_long, for a more informative squeue command.
SCU clusters and job partitions
Available SCU HPC resources
The SCU uses Slurm to manage the following resources:
General purpose cluster:
- The panda cluster (33 nodes): CPU-only cluster intended for general use
CryoEM cluster:
- The cryoEM cluster (18 nodes): 15 CPU-only nodes, 3 GPU (P100) nodes. Available only for analysis of cryoEM data
PI-specific clusters:
- The Edison cluster (9 GPU nodes): 5 k40m and 4 k80 nodes reserved for the H. Weinstein lab
- node178: GPU (P100) node reserved for the Accardi and Huang labs
- node179: GPU (P100) node reserved for the Boudker lab
- node180: GPU (P100) node reserved for the Blanchard lab
- cantley-node0[1-2] (2 nodes): GPU (V100) nodes reserved for the Cantley lab
All jobs, except those submitted to the Edison cluster, should be submitted via our Slurm submission node: curie.pbtech. Jobs submitted to the Edison cluster should be submitted from its submission node, edison-mgmt.pbtech.
...
Please see About SCU for more information about our HPC infrastructure.
Slurm partitions
Slurm groups nodes into sets referred to as 'partitions'. The above resources belong to one or more Slurm partitions, with each partition possessing its own unique job submission rules. Some nodes belong to multiple partitions because this affords the SCU the configuration flexibility needed to ensure fair allocation of managed resources.
Panda cluster partitions:
...
...
Slurm partitions - BRB Cluster
BRB
SCU cluster partitions:
- scu-cpu: 22 cpu nodes, 7-day runtime limit
- scu-gpu: 6 gpu nodes, 2-day runtime limit
CryoEM partitions:
- cryo-cpu: 14 CPU-only nodes, 7-day runtime limit, only 50 jobs allowed to run concurrently
- panda_array: 33 CPU-only nodes, 7-day runtime limit, up to 100,000 jobs allowed to run concurrently
Warning |
---|
Jobs submitted to |
CryoEM cluster:
- cryo-cpu: 15 CPU-only nodes, 2-day runtime limit
- cryo-gpu: 3 GPU nodes (P100), 2-day runtime limit
Edison cluster:
- edison: 9 GPU nodes, 2-day runtime limit
- edison_k40m: 5 GPU (k40m) nodes, 2-day runtime limit
- edison_k80: 4 GPU (k80) nodes, 2-day runtime limit
PI-specific cluster partitions:
- accardi_huang_reserve: node178, GPU node, 7-day runtime limit
- boudker_reserve: node179, GPU (P100) node
- cryo-gpu: 6 GPU nodes, 2-day runtime limit
- cryo-gpu-v100: 2 GPU, 2-day runtime limit
- cryo-gpu-p100: 3 GPU, 2-day runtime limit
PI-specific cluster partitions:
- accardi_gpu: 4 GPU nodes, 2-day runtime limit
- accardi_cpu: 1 CPU node, 7-day runtime limit
- boudker_gpu: 2 GPU nodes, 7-day runtime limit
- boudker_gpu-p100: 3 GPU nodes, 7-day runtime limit
- boudker_cpu: 2 CPU nodes, 7-day runtime limit
- sackler_cpu: 1 CPU node, 7-day runtime limit
- sackler_gpu: 1 GPU node, 7-day runtime limit
- hwlab-rocky_gpu: 12 GPU nodes, 7-day runtime limit
- blanchard_reserve: node180, GPU (P100) node
- eliezer-gpu: 1 GPU node, 7-day runtime limit
- cantley-gpu: 2 GPU (V100) nodes
Other specific cluster partitions:
- scu-res: 1 GPU, 7-day runtime limit
The list above will be updated as needed. To see an up-to-date description of all available partitions, run the command sinfo on scu-login02. For a description of each node's CPU core count, memory (in MB), runtime limit, and partition, use this command:
Code Block |
---|
sinfo -N -o "%25N %5c %10m %15l %25R" |
Or if you just want to see a description of the nodes in a given partition:
...
Interactive Session
Code Block |
---|
srun -n1 --pty --partition=scu-cpu --mem=8G bash -i |
To request GPUs, add a --gres option to your srun or sbatch command. The example below requests 1 GPU; up to 4 GPUs can be requested on a single node:
Code Block |
---|
--gres=gpu:1 |
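As a minimal sketch of how the --gres option fits into a complete interactive request, the hypothetical helper below assembles an srun command line for a given GPU count. The function name gpu_session_cmd is made up for illustration, and the choice of the scu-gpu partition and the 8G memory figure are assumptions borrowed from the examples on this page; check sinfo for the partition you are actually allowed to use.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): print an interactive srun command
# that requests N GPUs on one node. Assumes the scu-gpu partition and the
# 4-GPUs-per-node ceiling mentioned in the text above.
gpu_session_cmd() {
    local ngpus="${1:-1}"
    case "$ngpus" in
        [1-4]) ;;   # a single digit from 1 to 4 is accepted
        *) echo "error: request between 1 and 4 GPUs" >&2; return 1 ;;
    esac
    echo "srun -n1 --pty --partition=scu-gpu --gres=gpu:${ngpus} --mem=8G bash -i"
}

# Print the command for a single-GPU session; run the printed line on a login node.
gpu_session_cmd 1
```

Printing the command rather than executing it makes it easy to inspect the request before it is placed in the queue. In a batch script the same request would be written as an #SBATCH --gres=gpu:1 line.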
...
A simple job submission example
...
Code Block | ||||
---|---|---|---|---|
| ||||
#!/bin/bash -l
#SBATCH --partition=scu-cpu   # cluster-specific
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --job-name=hello_slurm
#SBATCH --time=00:02:00   # HH:MM:SS
#SBATCH --mem=1G   # memory requested, units available: K,M,G,T
#SBATCH --output hello_slurm-%j.out
#SBATCH --error hello_slurm-%j.err

source ~/.bashrc

echo "Starting at:" `date` >> hello_slurm_output.txt
sleep 30
echo "This is job #:" $SLURM_JOB_ID >> hello_slurm_output.txt
echo "Running on node:" `hostname` >> hello_slurm_output.txt
echo "Running on cluster:" $SLURM_CLUSTER_NAME >> hello_slurm_output.txt
echo "This job was assigned the temporary (local) directory:" $TMPDIR >> hello_slurm_output.txt
exit |
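A script like the one above is handed to the scheduler with sbatch, which normally replies with a line of the form "Submitted batch job <id>". The sketch below shows one way to capture that job ID for later use. The helper name job_id_from_sbatch, the file name hello_slurm.sh, and the job number in the sample line are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read sbatch's confirmation text on stdin and print
# only the numeric job ID (the last field of "Submitted batch job <id>").
job_id_from_sbatch() {
    awk '/^Submitted batch job/ {print $NF}'
}

# Sample confirmation line (the job number is made up), so the sketch runs
# even on a machine without Slurm installed.
echo "Submitted batch job 123456" | job_id_from_sbatch

# On the cluster, assuming the script was saved as hello_slurm.sh:
#   jobid=$(sbatch hello_slurm.sh | job_id_from_sbatch)
#   squeue -j "$jobid"
```

Keeping the job ID in a variable makes it straightforward to check that one job in the queue, or to locate its hello_slurm-<id>.out and hello_slurm-<id>.err files afterwards.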
...
Next, there are several #SBATCH lines. These lines describe the resource allocation we wish to request:
--partition=scu-cpu:
Cluster resources (such as specific nodes, CPUs, memory, GPUs, etc) can be assigned to groups, called partitions. Additionally, the same resources (e.g. a specific node) may belong to multiple cluster partitions. Finally, partitions may be assigned different job priority weights, so that jobs in one partition move through the job queue more quickly than jobs in another partition.
Every job submission script must request a specific partition--otherwise, the default is used. To see what partitions are available on your cluster, click here, or execute the command:
sinfo
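Because a job that names no partition falls back to the default, it can help to know which partition that is. sinfo marks the default partition by appending an asterisk to its name. The sketch below picks that name out of a partition listing; the helper name default_partition is hypothetical, and the sample input (including which partition carries the asterisk) is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read partition names on stdin, one per line, as
# printed by `sinfo -h -o "%P"`, and print the one flagged as default.
# sinfo appends "*" to the default partition's name.
default_partition() {
    awk '/\*$/ {sub(/\*$/, ""); print; exit}'
}

# Sample listing; which partition is the default here is an assumption.
printf 'scu-cpu*\nscu-gpu\ncryo-cpu\n' | default_partition

# On the cluster:
#   sinfo -h -o "%P" | default_partition
```

Even with a default in place, naming the partition explicitly in every submission script avoids surprises if the cluster configuration changes.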
...
The number of concurrently running tasks. Tasks can be thought of as processes; this is explained in more detail in Advanced Job Submissions. For this simple serial job, we only need 1 concurrently running task/process. Also, by default, each task is allocated a single CPU core. For additional information on parallel/multicore environments, click here.
--cpus-per-task=1:
The number of CPU cores allocated to each task.
--job-name=hello_slurm:
The job's name. This appears in the job queue and is publicly viewable.
...
...
Code Block | ||||
---|---|---|---|---|
| ||||
ssh -X pascal
ssh -X cwid@scu-login01.med.cornell.edu
srun --x11 -n1 --pty --partition=scu-cpu --mem=8G bash -i |
To test the session, try the following command:
...