Overview
In collaboration with Dr. Mert Sabuncu from Radiology, the ITS team has established the framework for a new high-performance computing (HPC) cluster dedicated to AI/ML workflows, such as training neural networks for imaging and large language models (LLMs).
This cluster features high-memory nodes, NVIDIA GPU servers (A100, A40 and L40), an InfiniBand interconnect, and specialized storage designed for AI workloads.
Log in to the AI cluster
Use SSH to log in to the AI cluster, replacing <cwid> with your CWID.
ssh <cwid>@ai-login01.med.cornell.edu
Once logged on:
Last login: Fri Jan 3 11:35:53 2025 from 157.000.00.00
<cwid>@ai-login01:~$
<cwid>@ai-login01:~$ pwd
/home/<cwid>
<cwid>@ai-login01:~$
Home directories have limited space and are intended only for small files. Store all data in your designated scratch space.
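Because home space is limited, it can help to check how much you are using before it fills up. The following is a minimal sketch using only the standard `du`, `df`, `sort` and `head` utilities; `report_usage` is a hypothetical helper name, not a tool provided by the cluster, and the actual home quota is not stated here.

```shell
# report_usage is an illustrative helper, not a cluster-provided command.
# It prints the size of a directory, its largest entries, and the free
# space on the filesystem that holds it. Defaults to your home directory.
report_usage() {
    local dir="${1:-$HOME}"

    echo "Total size of $dir:"
    du -sh "$dir"                      # -s: one summary line, -h: human-readable

    echo "Largest non-hidden entries (KiB):"
    # The * glob skips dotfiles; errors for empty directories are suppressed.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n 10

    echo "Filesystem capacity:"
    df -h "$dir"
}

# Check your home directory
report_usage "$HOME"
```

If the largest entries turn out to be datasets or model checkpoints, those are candidates for moving to your scratch space.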
Storage
The AI cluster has the following storage systems configured:
Name | Mount point | Use | Is backed up? | Comment |
---|---|---|---|---|
Home | | Home filesystem, used to keep small files, configs, code, scripts, etc. | no | |
Midtier | | Each lab has an allocation, intended for data that is actively being used or processed, such as research datasets | no | |
AI GPFS | | tbd | no | Limited access, granted on special request |
Common File Management
# List all files and directories in the scratch directory
ls /midtier/labname/scratch/

# Navigate to a specific subdirectory
cd /midtier/labname/scratch/cwid

# Copy a file from the current directory to another directory
cp data.txt /midtier/labname/scratch/cwid/

# Move the copied file to a different directory
mv /midtier/labname/scratch/cwid/data.txt /midtier/labname/scratch/backup/

# Create a new directory
mkdir /midtier/labname/scratch/cwid/new_project/
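The commands above repeat the same long scratch path, which makes typos easy. One possible way to reduce that, sketched below, is to keep the path in a variable and wrap the "create directory, then copy" steps in a small function. `LAB`, `SCRATCH_ROOT` and `stage_file` are illustrative names and not something the cluster defines, and `labname` remains a placeholder for your lab's directory.

```shell
# LAB, SCRATCH_ROOT and stage_file are illustrative names, not cluster-defined.
# Replace "labname" with your lab's directory name.
LAB="${LAB:-labname}"
SCRATCH_ROOT="${SCRATCH_ROOT:-/midtier/$LAB/scratch}"

# Copy one file into a destination directory, creating the directory if needed.
#   $1: file to copy
#   $2: destination directory
stage_file() {
    local src="$1" dest="$2"

    # Refuse to continue if the source is not a regular file
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }

    mkdir -p "$dest" || return 1        # -p: create parents, no error if it exists
    cp -- "$src" "$dest"/ || return 1   # --: treat names starting with '-' as files
    ls -l "$dest/$(basename "$src")"    # show the copied file as confirmation
}

# Example call (assumes your login name matches your CWID directory):
#   stage_file data.txt "$SCRATCH_ROOT/$USER/new_project"
```

Using `mkdir -p` means the function can be rerun without failing when the directory already exists, unlike a plain `mkdir`.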