...
Computational jobs on the AI cluster are managed with the SLURM job scheduler. We provide an in-depth tutorial on how to use SLURM <placeholder>; this section covers some basic examples that are immediately applicable on the AI cluster.
Important notice:

Warning: Do not run computations on login nodes.
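Instead of running work on the login node, request an interactive shell on a compute node with `srun`. The partition name and resource amounts below are illustrative assumptions; adjust them to your cluster:

```shell
# request an interactive shell on a compute node
# (partition, CPU, memory, and time values are examples only)
srun --partition=compute --cpus-per-task=2 --mem=4G --time=01:00:00 --pty bash
```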
...
These instructions will allow users to launch a Jupyter notebook server on the AI cluster's compute nodes and connect to this server from a local browser. A similar approach can be used to connect to other services.
Note: Jupyter jobs are interactive by nature, so all the considerations about interactive jobs and their inefficiencies apply here.
Prepare a Python environment (using conda or pip) on the cluster:
...
Load a Python module:

```bash
# list existing modules
module avail
# load a suitable python module (miniconda in this example)
module load miniconda3-4.10.3-gcc-12.2.0-hgiin2a
```
Create a new environment and activate it:

```bash
# create venv
python -m venv ~/myvenv
# and activate it
source ~/myvenv/bin/activate
```
Install Jupyter into this new virtual environment:

```bash
pip install jupyter
```
Submit a SLURM job
Once the environment is ready, submit a SLURM job. Prepare a SLURM batch script similar to the following (use your own SBATCH arguments, such as the job name and the amount of resources you need; this is only an example):
...
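As a minimal sketch of such a batch script (not the cluster's actual template: all SBATCH values are illustrative, and it assumes the `~/myvenv` environment created above and port 8888):

```shell
#!/bin/bash
#SBATCH --job-name=jupyter          # illustrative job name
#SBATCH --cpus-per-task=4           # adjust resources to your needs
#SBATCH --mem=16G
#SBATCH --time=04:00:00
#SBATCH --output=jupyterjob_%j.txt  # %j expands to the job ID

# activate the environment prepared earlier
source ~/myvenv/bin/activate

# start jupyter on the compute node without trying to open a browser there
jupyter notebook --no-browser --ip="$(hostname -i)" --port=8888
```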
Save this file somewhere in your file space on the login node and submit a batch SLURM job with
```bash
sbatch script_name.txt
```
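After submission, the job's state can be checked with standard SLURM commands (a sketch; output formats vary by cluster, and `<JOBID>` is the ID printed by `sbatch`):

```shell
# show your queued and running jobs
squeue -u $USER
# show accounting information for a specific job (replace <JOBID>)
sacct -j <JOBID>
```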
Set up a connection to the Jupyter server
The SLURM job will generate a text file ~/jupyterjob_<JOBID>.txt. Follow the instructions in this file to connect to the Jupyter session. The two steps that need to be taken are:
...
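The exact commands are printed in the generated file; typically they amount to an SSH tunnel plus opening the forwarded port locally. A sketch, assuming the file reports a compute-node hostname and port 8888 (the login host, node name, and username below are placeholders):

```shell
# on your local machine: forward a local port to the compute node via the login node
ssh -N -L 8888:computenode01:8888 username@login.cluster.example
# then open http://localhost:8888 in your local browser, supplying the token
# printed in ~/jupyterjob_<JOBID>.txt
```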