PACE

  • project directory
~/p-fschaefer7-0
  • home directory
/storage/home/hcoda1/6/shuan7
  • scratch (temp)
~/scratch

logging in

  • need to run GT VPN (GlobalProtect)
  • log in over ssh (kitty sets $TERM to a value the login node doesn't know, so override it)
TERM=xterm-color ssh shuan7@login-phoenix.pace.gatech.edu
  • authenticate with your GT password
  • to see headnodes
pace-whoami
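  • to skip typing the TERM override every time, a shell alias works (a sketch; the alias name is my own choice)
alias pace='TERM=xterm-color ssh shuan7@login-phoenix.pace.gatech.edu'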

transferring files
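  • scp/rsync go through the same login host (a sketch; the local path and destination directory are placeholders)
rsync -avz ./results shuan7@login-phoenix.pace.gatech.edu:~/p-fschaefer7-0/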

submitting jobs

  • charge account
gts-fschaefer7
  • see accounts
pace-quota
  • see queue status
pace-check-queue -c inferno
  • example job script (job.sbatch)
#!/bin/bash
#SBATCH -Jcknn-cg                 # job name
#SBATCH --account=gts-fschaefer7  # charge account
#SBATCH --nodes=1                 # number of nodes
#SBATCH --ntasks-per-node=1       # tasks per node
#SBATCH --cpus-per-task=8         # cores per task
#SBATCH --mem-per-cpu=22gb        # memory per core
#SBATCH -t48:00:00                # duration of the job (hh:mm:ss)
#SBATCH -qinferno                 # QOS name
#SBATCH -ojobs/cg_%j.out          # combined output and error messages file
cd $SLURM_SUBMIT_DIR              # change to working directory

# load modules
module load anaconda3

# enter conda virtual environment
conda activate ./venv

# run commands
lscpu
lsmem

time srun python -m experiments.cg
  • submit job to scheduler (two queues: inferno and embers)
sbatch job.sbatch
  • the job inherits the directory sbatch is run from ($SLURM_SUBMIT_DIR), so submit from the proper directory!

  • submitted job status

squeue -u shuan7
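  • stdout/stderr land in the -o file from the script, here jobs/cg_<jobid>.out; follow a running job with tail (the job id below is a placeholder)
tail -f jobs/cg_12345.out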

interactive session

salloc -A gts-fschaefer7 -q inferno -N 1 --ntasks-per-node=4 -t 1:00:00
  • for gpu
salloc -A gts-fschaefer7 -q inferno -N 1 --gres=gpu:A100:1 --mem-per-gpu=12G -t 0:15:00
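  • once the allocation starts, commands run on the allocated node (directly or via srun, depending on site config); for the GPU session a quick sanity check (my addition, not from the PACE docs)
nvidia-smi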

software

modules
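  • software is provided through environment modules (a sketch; the exact module names on Phoenix may differ)
module avail
module load anaconda3
module list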

conda
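  • the job script activates a conda environment at ./venv; one way to create it (a sketch; the python version is a placeholder)
module load anaconda3
conda create --prefix ./venv python=3.11
conda activate ./venv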