...

Code Block
#!/bin/bash
#SBATCH --job-name=n1_bench
#SBATCH -p cryo-gpu-v100
#SBATCH --mem=170g
#SBATCH --nodes=1
#SBATCH --ntasks=5
#SBATCH --ntasks-per-node=5
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:4

cd /athena/scu/scratch/dod2014/relion_2

source /software/spack/centos7/share/spack/setup-env.sh

# 3.1_beta skylake openmpi 4.0.1
spack load -r /ii7uzb5

# 3.0.8 w openmpi 4 + slurm
#spack load -r /sfp6sf5

# 3.1_beta w openmpi 4 + slurm
#spack load -r /ii7uzb5

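# Set user before it is used in the output paths below
# (falls back to $USER if SLURM_JOB_USER is not set in the job environment)
user=${SLURM_JOB_USER:-$USER}

mkdir -pv Refine3D/quackmaster/run_single_node/${user}_${SLURM_JOB_ID}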

mpirun  -display-allocation -display-map -v -np 5 relion_refine_mpi --o Refine3D/quackmaster/run_single_node/${user}_${SLURM_JOB_ID} --split_random_halves --i Select/job088/particles.star --ref 012218_CS_256.mrc --firstiter_cc --ini_high 30 --dont_combine_weights_via_disc --scratch_dir /scratchLocal --pad 2  --ctf --particle_diameter 175 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 5 --offset_step 2 --sym C1 --low_resol_join_halves 40 --norm --scale --j 4 --gpu --pool 100 --auto_refine

# Report whether relion exited cleanly.
# If it crashed, remove the particles it left on local scratch.
if [ $? -eq 0 ]
  then
     echo -e "$SLURM_JOB_ID exited successfully"
  else
     user=${SLURM_JOB_USER:-$USER}
     MY_TMP_DIR=/scratchLocal/${user}_${SLURM_JOB_ID}
     echo -e "$SLURM_JOB_ID failed so cleaning up $MY_TMP_DIR"
     rm -rf "$MY_TMP_DIR"
fi


Multi-node jobs:


If you want your job to finish sooner, and there are idle nodes, you can run a single job on multiple nodes at once.  Through the magic of OpenMPI and low-latency RDMA networking, this will allow each iteration, and the entire job, to finish sooner than it would on a single node.

...

In the above, notice the --ntasks=13 and mpirun -n 13 options.


These ensure that 13 total MPI processes are launched across the 3 nodes in the cryo-gpu-v100 partition, with 5 processes on the first node and 4 on each of the remaining nodes.  Note: with the exception of the first node, which also runs the process that coordinates MPI communication across all nodes, you want 4 MPI processes per node, given that each node has 4 GPUs.  Any more than 4 MPI processes per node, and thus more than 1 per GPU, will result in poor performance as the GPUs will be oversubscribed.


If you want to run this on 2 nodes instead of 3, use --ntasks=9 as well as mpirun -n 9.
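
The task counts used on this page follow one rule: one MPI process per GPU on every node, plus one extra coordinating process on the first node. A minimal sketch of that arithmetic, assuming 4 GPUs per node as on cryo-gpu-v100. The function name ntasks_for_nodes is hypothetical and is not part of Slurm, OpenMPI, or RELION:

```shell
#!/bin/bash
# Hypothetical helper: MPI task count for a given number of nodes.
# Assumes one worker process per GPU plus one coordinating process in total.
ntasks_for_nodes() {
    local nodes=$1
    local gpus_per_node=${2:-4}   # assumption: 4 GPUs per node
    echo $(( nodes * gpus_per_node + 1 ))
}

ntasks_for_nodes 1   # prints 5, as in the single-node script
ntasks_for_nodes 2   # prints 9
ntasks_for_nodes 3   # prints 13
```

Use the resulting number for both --ntasks and mpirun -n so that the Slurm allocation and the MPI launch agree.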


In the Slurm log output we can see the allocation, with three nodes, and the correct number of MPI processes:

...