Table of Contents
...
Computational jobs on the AI cluster are managed with the SLURM job scheduler. We provide an in-depth tutorial on how to use SLURM <placeholder>, but some basic examples that are immediately applicable on the AI cluster are discussed in this section.
Important notice:

**Warning:** Do not run computations on login nodes.
...
load python module
```bash
# list existing modules
module avail
# load a suitable python module (miniconda in this example)
module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
```
create new environment
```bash
conda create -n myvenv python=3.10
# and follow prompts...
```
once this new environment is created, activate it with
```bash
# initialize conda
conda init bash   # or "conda init csh" if csh is used
# list existing environments
conda env list
# activate newly installed env from previous step
conda activate myvenv
# if you need to deactivate (takes no environment name)
conda deactivate
```
After all this is done, conda will automatically activate the `base` environment upon every login to the cluster. To prevent this, run
```bash
conda config --set auto_activate_base false
```
Use this environment to install Jupyter (and any other required packages):
```bash
conda install jupyter
```
working with pip
load python module
```bash
# list existing modules
module avail
# load a suitable python module (miniconda in this example)
module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
```
Create new environment and activate it
```bash
# create venv
python -m venv ~/myvenv
# and activate it
source ~/myvenv/bin/activate
```
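If you want to check the venv mechanics locally before working on the cluster, the following sketch shows what activation does. The `/tmp/myvenv-demo` path and the `--without-pip` flag are only there to keep the demo light; on the cluster, use plain `python -m venv ~/myvenv` as above.

```shell
# create a throwaway venv (illustrative path; --without-pip keeps it minimal)
python3 -m venv --without-pip /tmp/myvenv-demo
# activation sets VIRTUAL_ENV and prepends the venv's bin/ to PATH
source /tmp/myvenv-demo/bin/activate
echo "$VIRTUAL_ENV"
command -v python
# leave the venv again
deactivate
```

Running `command -v python` inside the activated venv should point at the venv's own `bin/python`, which is how pip-installed packages end up isolated from the system Python.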
Install Jupyter into this new virtual environment:
```bash
pip install jupyter
```
Submit a SLURM job
Once the environment is set up, submit a SLURM job. Prepare a SLURM batch script similar to the one below (use your own SBATCH arguments, such as the job name and the amount of resources you need; this is only an example):
```bash
#!/bin/bash
#SBATCH --job-name=<myJobName>   # give your job a name
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:30:00          # set this time according to your need
#SBATCH --mem=3GB                # how much RAM will your notebook consume?
#SBATCH --gres=gpu:1             # if you need to use a GPU
#SBATCH -p ai-gpu                # specify partition

module purge
module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a

# if using conda
conda activate myvenv
# if using pip
# source ~/myvenv/bin/activate

# set log file
LOG="/home/${USER}/jupyterjob_${SLURM_JOB_ID}.txt"

# generate random port
PORT=$(shuf -i 10000-50000 -n 1)

# print useful info
cat << EOF > ${LOG}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~ Slurm Job $SLURM_JOB_ID
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Hello from the Jupyter job!

In order to connect to this jupyter session, set up a tunnel
on your local workstation with:

---> ssh -t ${USER}@ai-login01.med.cornell.edu -L ${PORT}:localhost:${PORT} ssh ${HOSTNAME} -L ${PORT}:localhost:${PORT}

(copy the above command and paste it into your terminal).
Depending on your ssh configuration, you may be prompted for your password.
Once you are logged in, leave the terminal running and don't close it
until you are finished with your Jupyter session.

Further down, look for a line similar to

---> http://127.0.0.1:10439/?token=xxxxyyyxxxyyy

Copy this line and paste it in your browser.
EOF

# start jupyter
jupyter-notebook --no-browser --ip=0.0.0.0 --port=${PORT} 2>&1 | tee -a ${LOG}
```
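The script above picks a random high port with `shuf` so that multiple Jupyter jobs on the same node don't collide. You can sanity-check that trick on any Linux machine with GNU coreutils; no SLURM is needed:

```shell
# pick a random port in the same range the batch script uses
PORT=$(shuf -i 10000-50000 -n 1)
# confirm it falls inside the requested range
if [ "$PORT" -ge 10000 ] && [ "$PORT" -le 50000 ]; then
    echo "port $PORT is in range"
fi
```

Note that `shuf` does not check whether the port is actually free; if Jupyter fails to bind, simply resubmit the job to get a new port.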
Save this file somewhere in your file space on the login node and submit it as a batch SLURM job with
```bash
sbatch script_name.txt
```
Set up a connection to the Jupyter server
The SLURM job will generate a text file `~/jupyterjob_<JOBID>.txt`. Follow the instructions in this file to connect to the Jupyter session. The two steps that need to be taken are:
...
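Since the connection details are buried in the log file, you can pull the tunnel command out with a quick grep/sed one-liner. The sketch below fakes a log line in the `---> ssh ...` format the batch script writes (the path and the hostname are illustrative):

```shell
# fake a log line in the format written by the batch script (illustrative)
printf -- '---> ssh -t user@ai-login01.med.cornell.edu -L 12345:localhost:12345\n' \
    > /tmp/jupyterjob_demo.txt
# extract the tunnel command, stripping the arrow marker
grep -e '^---> ssh' /tmp/jupyterjob_demo.txt | sed 's/^---> //'
```

On the cluster you would point the same grep/sed at your real `~/jupyterjob_<JOBID>.txt` and paste the resulting `ssh` command into a terminal on your local workstation.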