...

While still logged in to an SCU login node, run the following:

Code Block
languagebash
titleSetting up the slurm environment
cat - >> ~/.bashrc <<'EOF'

# Source the Slurm environment script only when running inside an allocated job
if [ -n "$SLURM_JOB_ID" ]
then
        source /etc/slurm/slurm_source/slurm_source.sh
fi

# More informative squeue output: job ID, partition, name, user, state, elapsed time, time limit, GRES, node count, nodes/reason
alias squeue_long='squeue -o "%.18i %.9P %.8j %.8u %.8T %.10M %.11l %.6b  %.6D %R"'

EOF
source ~/.bashrc

This snippet sources a Slurm environment script whenever the shell is running inside an allocated job (i.e. when SLURM_JOB_ID is set), and defines an alias, squeue_long, for a more informative squeue listing.
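Once ~/.bashrc has been reloaded, the alias can be used on its own or combined with standard squeue options. As a quick, illustrative check of your own jobs (the -u filter is a standard squeue option; $USER is simply your login name):

Code Block
languagebash
titleExample: listing your jobs with squeue_long
# Show only your jobs, using the extended output format defined above
squeue_long -u $USER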

SCU clusters and job partitions

Available SCU HPC resources

The SCU uses Slurm to manage the following resources:

General purpose cluster:

  • The panda cluster (35 nodes): CPU-only cluster intended for general use

CryoEM cluster:

  • The cryoEM cluster (18 nodes): 15 CPU-only nodes, 3 GPU (P100) nodes. Available only for analysis of cryoEM data

PI-specific clusters:

  • The Edison cluster (9 GPU nodes): 5 K40m and 4 K80 nodes reserved for the H. Weinstein lab
  • node178: GPU (P100) node reserved for the Accardi and Huang labs
  • node179: GPU (P100) node reserved for the Boudker lab
  • node180: GPU (P100) node reserved for the Blanchard lab
  • cantley-node0[1-2] (2 nodes): GPU (V100) nodes reserved for the Cantley lab

All jobs, except those destined for the Edison cluster, should be submitted via our Slurm submission node, curie.pbtech. Jobs for the Edison cluster should be submitted from its own submission node, edison-mgmt.pbtech.

Warning

Note: Unless you perform cryoEM analysis, or otherwise have specific PI-granted privileges, you will only be able to submit jobs to the panda cluster.

Please see About SCU for more information about our HPC infrastructure.

Slurm partitions

Slurm groups nodes into sets referred to as 'partitions'. Each of the resources above belongs to one or more partitions, and each partition has its own job submission rules. Some nodes belong to multiple partitions; this gives the SCU the configurational flexibility needed to ensure fair allocation of managed resources.
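If you want to see the rules attached to a specific partition (its maximum runtime, node list, default limits, and so on), you can ask Slurm directly with scontrol; the partition name below is just an example:

Code Block
languagebash
titleExample: inspecting a partition's configuration
# Print the configuration of one partition, including MaxTime and its node list
scontrol show partition panda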

Panda cluster partitions:

  • panda: 35 CPU-only nodes, 7-day runtime limit, at most 100 concurrently running jobs

CryoEM cluster:

  • cryo-cpu: 15 CPU-only nodes, 2-day runtime limit
  • cryo-gpu: 3 GPU (P100) nodes, 2-day runtime limit

Edison cluster:

  • edison: 9 GPU nodes, 2-day runtime limit
  • edison_k40m: 5 GPU (k40m) nodes, 2-day runtime limit
  • edison_k80: 4 GPU (k80) nodes, 2-day runtime limit

PI-specific cluster partitions:

  • accardi_huang_reserve: node178, GPU (P100) node, 7-day runtime limit
  • boudker_reserve: node179, GPU (P100) node, 7-day runtime limit
  • blanchard_reserve: node180, GPU (P100) node, 7-day runtime limit
  • cantley-gpu: 2 GPU (V100) nodes, 7-day runtime limit
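To target one of these partitions, name it in your job script and submit the script from the appropriate submission node. The sketch below is illustrative only: the script name, job name, and resource amounts are placeholders, and you should substitute a partition you actually have access to.

Code Block
languagebash
titleExample: minimal job script targeting a partition
#!/bin/bash
#SBATCH --job-name=example_job      # placeholder job name
#SBATCH --partition=panda           # one of the partitions listed above
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4G                    # placeholder memory request
#SBATCH --time=01:00:00             # must fit within the partition's runtime limit

# Replace with your actual commands
echo "Running on $(hostname)"

Save this as, for example, example_job.sh and submit it from curie.pbtech with sbatch example_job.sh.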

The partition list above will be updated as needed; for an up-to-date description of all available partitions, run sinfo on curie. For a per-node listing of CPU cores, memory (in MB), runtime limit, and partition, use this command:

Code Block
# %N node name, %c CPU count, %m memory (MB), %l time limit, %R partition
sinfo -N -o "%25N %5c %10m %15l %25R"

Or if you just want to see a description of the nodes in a given partition:

Code Block
sinfo -N -o "%25N %5c %10m %15l %25R" -p panda # for the partition panda
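Putting the pieces together: once you know which partition you may use, you can request resources on it; inside the resulting allocation SLURM_JOB_ID is set, so the snippet added to ~/.bashrc above sources the Slurm environment automatically. A minimal, illustrative sketch using an interactive session (partition name and resource amounts are placeholders):

Code Block
languagebash
titleExample: interactive allocation (illustrative)
# Request an interactive shell on a partition; adjust resources to your needs
srun -p panda --ntasks=1 --mem=4G --time=01:00:00 --pty bash
# Inside this shell, SLURM_JOB_ID is set and slurm_source.sh has been sourced via ~/.bashrc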
