
...

Computational jobs on the AI cluster are managed with the SLURM job manager. We provide an in-depth tutorial on how to use SLURM <placeholder>; this section covers some basic examples that are immediately applicable on the AI cluster.

Important notice:

Warning

Do not run computations on login nodes

...

  1. load python module

    Code Block
    languagebash
    # list existing modules with 
    module avail
    # load suitable python module (miniconda in this example)
    module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
  2. create new environment

    Code Block
    languagebash
    conda create -n myvenv python=3.10 # and follow prompts...
  3. once this new environment is created, activate it with

    Code Block
    languagebash
    # initialize conda (restart your shell or run "source ~/.bashrc" for this to take effect)
    conda init bash # or "conda init tcsh" if tcsh is used
    # list existing environments
    conda env list
    # activate newly installed env from previous step
    conda activate myvenv
    # if you need to deactivate (takes no environment name)
    conda deactivate
  4. After all this is done, conda will automatically activate the base environment upon every login to the cluster. To prevent this, run

    Code Block
    languagebash
    conda config --set auto_activate_base false
  5. With the new environment active, install Jupyter (and any other required packages)

    Code Block
    languagebash
    conda install jupyter
Working with pip
  1. load python module

    Code Block
    languagebash
    # list existing modules with 
    module avail
    # load suitable python module (miniconda in this example)
    module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
  2. Create new environment and activate it

    Code Block
    languagebash
    # create venv
    python -m venv ~/myvenv
    # and activate it
    source ~/myvenv/bin/activate
  3. Install Jupyter into this new virtual environment:

    Code Block
    languagebash
    pip install jupyter
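
Before submitting a job, it can help to confirm that the Jupyter executable is actually visible in the activated environment, whether it was created with conda or with pip. The following is a minimal sketch; the function name `require_cmd` is hypothetical and not part of conda, pip, or SLURM.

```bash
# Hypothetical helper (illustration only):
# succeed if the named command is on PATH, otherwise print a message and fail.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "command not found: $1 (is the environment activated?)" >&2
    return 1
}

# Example, to be run with the environment activated
# (the SLURM batch script calls jupyter-notebook):
# require_cmd jupyter-notebook
```

If the check fails, re-activate the environment and repeat the install step before moving on.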

Submit a slurm job

Once the environment with Jupyter is installed, submit a SLURM job. Prepare a SLURM batch script similar to the one below (use your own SBATCH arguments, such as the job name and the amount of resources you need; this is only an example):

Code Block
languagebash
#!/bin/bash

#SBATCH --job-name=<myJobName> # give your job a name
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:30:00 # set this time according to your need
#SBATCH --mem=3GB # how much RAM will your notebook consume?
#SBATCH --gres=gpu:1 # if you need to use a GPU
#SBATCH -p ai-gpu # specify partition

module purge
module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
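# if using conda
# (if "conda activate" is not recognized inside the job, first run:
#  source "$(conda info --base)/etc/profile.d/conda.sh")
conda activate myvenv
# if using pip
# source ~/myvenv/bin/activate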

# set log file
LOG="/home/${USER}/jupyterjob_${SLURM_JOB_ID}.txt"

# generate random port
PORT=$(shuf -i 10000-50000 -n 1)

# print useful info
cat << EOF > ${LOG}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~ Slurm Job $SLURM_JOB_ID
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hello from the Jupyter job!

In order to connect to this jupyter session
set up a tunnel on your local workstation with:
     --->  ssh -t ${USER}@ai-login01.med.cornell.edu -L ${PORT}:localhost:${PORT} ssh ${HOSTNAME} -L ${PORT}:localhost:${PORT}
(copy above command and paste it to your terminal).

Depending on your ssh configuration, you may be
prompted for your password. Once you are logged in,
leave the terminal running and don't close it until
you are finished with your Jupyter session.

Further down look for a line similar to
     ---> http://127.0.0.1:10439/?token=xxxxyyyxxxyyy
Copy this line and paste it into your browser
EOF

# start jupyter
jupyter-notebook --no-browser --ip=0.0.0.0 --port=${PORT} 2>&1 | tee -a ${LOG}
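
The script above picks a random port without checking whether it is already taken on the compute node. As a hedged sketch, the port selection could retry until it finds a port with no listener. The function names `port_in_use` and `pick_port` are hypothetical, and the check relies on bash's `/dev/tcp` redirection, which is assumed to be enabled in the cluster's bash build. The check only reflects the moment it runs, so a collision remains possible, just less likely.

```bash
#!/bin/bash
# Hypothetical helpers (illustration only, not part of SLURM or Jupyter).

# Return 0 if something on this host accepts connections on the given port.
port_in_use() {
    # The redirect succeeds only if a listener accepts the connection;
    # the subshell closes the descriptor again on exit.
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print a random port from the same range as the batch script,
# retrying up to 10 times if the candidate is already in use.
pick_port() {
    local port attempt
    for attempt in 1 2 3 4 5 6 7 8 9 10; do
        port=$(shuf -i 10000-50000 -n 1)
        if ! port_in_use "$port"; then
            echo "$port"
            return 0
        fi
    done
    return 1
}

PORT=$(pick_port) || { echo "no free port found" >&2; exit 1; }
```

In the batch script, the last line would take the place of the single `shuf` call.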

Save this file somewhere in your file space on the login node and submit it as a SLURM batch job with

Code Block
languagebash
sbatch script_name.txt 
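
On success, `sbatch` prints a confirmation line of the form `Submitted batch job <JOBID>`. The sketch below shows one way to turn that line into the path of the log file written by the batch script and to pull the tunnel command and the tokenized URL out of it. All three function names are hypothetical. It assumes your home directory is `/home/<user>`, as the script does, and it skips the placeholder URL that the script itself writes into the log.

```bash
#!/bin/bash
# Hypothetical helpers (illustration only, not part of SLURM or Jupyter).

# Extract the job ID from sbatch's confirmation line,
# e.g. "Submitted batch job 12345".
job_id_from_sbatch() {
    printf '%s\n' "$1" | awk '/Submitted batch job/ {print $NF}'
}

# Build the log path used by the batch script (assumes HOME is /home/<user>).
jupyter_log_path() {
    printf '%s/jupyterjob_%s.txt\n' "${HOME}" "$1"
}

# Print the ssh tunnel command and the most recent real Jupyter URL in the log.
show_connection_info() {
    grep -m1 -o 'ssh -t .*' "$1"
    grep -o 'http://127\.0\.0\.1:[0-9]*/[^ ]*token=[^ ]*' "$1" \
        | grep -v 'token=xxxxyyyxxxyyy' \
        | tail -n 1
}

# Usage on the login node (commented out because it needs the cluster):
# out=$(sbatch script_name.txt)
# log=$(jupyter_log_path "$(job_id_from_sbatch "$out")")
# show_connection_info "$log"
```

The URL only appears once the job has started and Jupyter is running, so the last command may need to be repeated after a short wait.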

Set up connection to the jupyter servers

The SLURM job will generate a text file ~/jupyterjob_<JOBID>.txt. Follow the instructions in this file to connect to the Jupyter session. The two steps that need to be taken are:

...