...

Computational jobs on the AI cluster are managed with the SLURM job manager. We provide an in-depth tutorial on how to use SLURM <placeholder>; this section covers some basic examples that are immediately applicable on the AI cluster.

Important notice:

Warning

Do not run computations on login nodes
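
For orientation, here is a minimal sketch of submitting work through SLURM instead of running it on a login node. The script name, job name, and time limit are illustrative placeholders, not cluster-specific values:

Code Block
languagebash
# create a minimal batch script (all values here are illustrative)
cat > hello.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --time=00:05:00
echo "running on $(hostname)"
EOF
# submit it from a login node; the job itself runs on a compute node
sbatch hello.sh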

...

These instructions will allow users to launch a Jupyter notebook server on the AI cluster’s compute nodes and connect to this server using a local browser. A similar approach can be used to connect to other services.

Note

Jupyter jobs are interactive by nature, so all the considerations about interactive jobs and their inefficiencies apply here.

Prepare a Python environment (using conda or pip) on the cluster:

...

  1. Load a Python module

    Code Block
    languagebash
    # list available modules
    module avail
    # load a suitable Python module (miniconda in this example)
    module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
  2. Create a new virtual environment and activate it

    Code Block
    languagebash
    # create venv
    python -m venv ~/myvenv
    # and activate it
    source ~/myvenv/bin/activate
  3. Install Jupyter into this new virtual environment (a quick check follows this list):

    Code Block
    languagebash
    pip install jupyter
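
To confirm the setup worked, you can run a quick check; this is a sketch and assumes the venv from step 2 is still active:

Code Block
languagebash
# confirm the jupyter executable resolves to the venv, not a system install
which jupyter
# print the versions of the installed Jupyter components
jupyter --version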

Submit a SLURM job

Once the environment is prepared, submit a SLURM job. Prepare a SLURM batch script similar to the one below (use your own SBATCH arguments, such as the job name and the amount of resources you need; this is only an example):

...
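Purely as a sketch of the general shape such a script could take, assuming the venv created in the previous steps: the resource values, module name, and port below are placeholders, `<username>` and `<login-node>` must be replaced with your own values, and the output file follows the ~/jupyterjob_<JOBID>.txt convention described below.

Code Block
languagebash
#!/bin/bash
#SBATCH --job-name=jupyter
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=04:00:00

# load the same Python module used to create the venv, then activate the venv
module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
source ~/myvenv/bin/activate

# write a connection hint to the instructions file (placeholders: <username>, <login-node>)
node=$(hostname -s)
port=8888
echo "Tunnel from your local machine: ssh -N -L ${port}:${node}:${port} <username>@<login-node>" \
  > ~/jupyterjob_${SLURM_JOB_ID}.txt

# start the notebook server on this compute node without opening a browser,
# appending its log (including the access URL and token) to the instructions file
jupyter notebook --no-browser --ip="${node}" --port="${port}" \
  >> ~/jupyterjob_${SLURM_JOB_ID}.txt 2>&1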

Save this file somewhere in your file space on the login node and submit a SLURM batch job with

Code Block
languagebash
sbatch script_name.txt 
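
After submitting, you can verify that the job started with standard SLURM commands; a short sketch:

Code Block
languagebash
# check the job's state and note its JOBID
squeue -u $USER
# once the job is running, the instructions file appears in your home directory
ls ~/jupyterjob_*.txt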

Set up a connection to the Jupyter server

The SLURM job will generate a text file ~/jupyterjob_<JOBID>.txt. Follow the instructions in this file to connect to the Jupyter session. The two steps that need to be taken are:

...
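Typically these steps amount to creating an SSH tunnel from your local machine and opening the notebook URL in a local browser; a sketch, where the hostnames and port are placeholders to be taken from the instructions file:

Code Block
languagebash
# on your local machine: forward a local port to the compute node
# (take <node>, <username>, and <login-node> from ~/jupyterjob_<JOBID>.txt)
ssh -N -L 8888:<node>:8888 <username>@<login-node>
# then open http://localhost:8888 in your local browser, pasting the token
# from the instructions file if prompted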