...

Computational jobs on the AI cluster are managed with the SLURM job manager. An in-depth tutorial on how to use SLURM is available at <placeholder>; this section covers a few basic examples that apply directly to the AI cluster.

Important notice

...

Warning
Do not run computations on login nodes
...

Stopping and monitoring SLURM jobs

To stop (cancel) a SLURM job, use:

Code Block

language	bash

scancel <job_id>
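
Besides a single job ID, scancel also accepts filters such as --user and --state. The sketch below shows one way to cancel only the pending jobs of one user. The function name cancel_pending_jobs and the DRY_RUN switch are hypothetical helpers introduced here for illustration; they are not part of SLURM or of the AI cluster setup.

```bash
# Hypothetical wrapper (illustrative, not part of SLURM): cancel only the
# PENDING jobs of one user. With DRY_RUN=1 the command is printed, not run.
cancel_pending_jobs() {
    if [ -z "$1" ]; then
        echo "usage: cancel_pending_jobs <cwid>" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be executed without touching the queue.
        echo "scancel --user=$1 --state=PENDING"
    else
        # --user and --state are standard scancel filters.
        scancel --user="$1" --state=PENDING
    fi
}
```

Setting DRY_RUN=1 before calling cancel_pending_jobs with your CWID prints the command first, which is a cheap way to check the filter before anything is cancelled.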

Once the job is running, there are a few tools that can help monitor its status. Again, refer to <placeholder> for a detailed SLURM tutorial, but here is a list of some useful commands:

Code Block

language	bash

# show status of the queue
squeue -l                      
# only list jobs by a specific user
squeue -l -u <cwid>            
# print partitions info
sinfo                          
# print detailed info about a job
scontrol show job <job_id>
# print detailed info about a node
scontrol show node <node_name>
# get a list of all the jobs executed within the last 7 days:
sacct -u <cwid> -S $(date -d "-7 days" +%D) -o "user,JobID,JobName,state,exit"
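
The output of scontrol show job is a long list of Key=Value pairs, so it is often convenient to pull out a single field in a script. The following is a minimal sketch, assuming the output contains a JobState=... field as in standard SLURM releases. The function name job_state is a hypothetical helper, not a SLURM command.

```bash
# Hypothetical helper (not part of SLURM): read `scontrol show job` output
# from stdin and print only the value of the JobState field.
job_state() {
    # Keep the first JobState=VALUE match, then strip the "JobState=" prefix.
    grep -o 'JobState=[A-Z_]*' | head -n 1 | cut -d= -f2
}

# Intended usage on the cluster (replace the placeholder with a real job ID):
#   scontrol show job <job_id> | job_state
```

Reading from stdin keeps the helper independent of how the job description is obtained, so it can also be applied to saved scontrol output.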

Version	Old Version 33	New Version 34
Changes made by	eud4002	eud4002
Saved on	Jan 30, 2025	Jan 30, 2025
