PACE

  • project directory
~/p-fschaefer7-0
  • home directory
/storage/home/hcoda1/6/shuan7
  • scratch (temp)
~/scratch

logging in

  • need to run GT VPN (GlobalProtect)
  • log in over ssh (kitty sets $TERM to a value the server doesn't recognize, so override it):
TERM=xterm-color ssh shuan7@login-phoenix.pace.gatech.edu
  • authenticate with your GT password
  • to see which headnode you are on
pace-whoami

transferring files

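  • a sketch, assuming the usual scp/rsync through the Phoenix login node (remote paths are illustrative; the project directory is the one listed at the top of these notes):
scp local_file shuan7@login-phoenix.pace.gatech.edu:p-fschaefer7-0/
rsync -avz ./results/ shuan7@login-phoenix.pace.gatech.edu:p-fschaefer7-0/results/
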
submitting jobs

  • charge account
gts-fschaefer7
  • see accounts
pace-quota
  • see queue status
pace-check-queue -c inferno
  • example job script (job.sbatch)
#!/bin/bash
#SBATCH -Jcknn-cg                   # job name
#SBATCH --account=gts-fschaefer7    # charge account
#SBATCH --nodes=1                   # number of nodes and cores per node required
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=22gb          # memory per core
#SBATCH -t48:00:00                  # duration of the job (hh:mm:ss)
#SBATCH -qinferno                   # QOS name
#SBATCH -ojobs/cg_%j.out            # combined output and error messages file

cd $SLURM_SUBMIT_DIR                # change to working directory

# load modules
module load anaconda3

# enter conda virtual environment
conda activate ./venv

# run commands
lscpu
lsmem
time srun python -m experiments.cg
  • submit job to scheduler (two queues: inferno and embers)
sbatch job.sbatch
  • the job inherits the directory sbatch was run from ($SLURM_SUBMIT_DIR), so run sbatch from the proper directory! (see the example below)
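  • e.g. (project directory from the top of these notes, job script name as above):
cd ~/p-fschaefer7-0
sbatch job.sbatch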

  • submitted job status

squeue -u shuan7

interactive session

salloc -A gts-fschaefer7 -q inferno -N 1 --ntasks-per-node=4 -t 1:00:00
  • for a GPU node
salloc -A gts-fschaefer7 -q inferno -N 1 --gres=gpu:A100:1 --mem-per-gpu=12G -t 0:15:00
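  • once the allocation is granted, run commands inside it (a sketch; the module and env names are copied from the job script above):
srun hostname                 # prints the allocated compute node
module load anaconda3
conda activate ./venv
srun python -m experiments.cg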

software

modules

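  • standard Lmod commands (anaconda3 is the module used in the job script above; other module names will vary):
module avail                  # list available modules
module load anaconda3         # load a module
module list                   # show currently loaded modules
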
conda
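  • a sketch for creating the project-local environment that the job script activates (Python version and package list are illustrative):
module load anaconda3
conda create --prefix ./venv python=3.11 numpy scipy
conda activate ./venv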