Cancel all Slurm jobs with a job ID larger than X

Sometimes we have a whole bunch of Slurm jobs running from different projects: some of them have been running for days, while others were just submitted. And then we notice that, damn, the 100 jobs we just fired off are wrong and need to be canceled. Unfortunately, there is no Slurm command that cancels everything above a given job ID, so it takes a bit of scripting.

The following script takes a Slurm job ID as input and cancels all jobs with a larger ID (only those that belong to the logged-in user).

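#!/bin/bash
# Cancel all of the current user's Slurm jobs whose job ID is larger than $1.

declare -a jobs=()

# Require a numeric job ID as the first argument.
if [[ ! "$1" =~ ^[0-9]+$ ]] ; then
    echo "Minimum Job Number argument is required.  Run as '$0 jobnum'" >&2
    exit 1
fi

minjobnum="$1"

myself="$(id -u -n)"

# Collect every job ID of this user that is larger than the threshold.
for j in $(squeue --user="$myself" --noheader --format='%i') ; do
  # Array tasks are listed as e.g. 123_4 and heterogeneous jobs as 123+0,
  # so compare only the numeric base ID.
  if [ "${j%%[_+]*}" -gt "$minjobnum" ] ; then
    jobs+=("$j")
  fi
done

# scancel needs at least one job ID, so stop here if nothing matched.
if [ "${#jobs[@]}" -eq 0 ] ; then
    echo "No jobs with an ID larger than $minjobnum found."
    exit 0
fi

scancel "${jobs[@]}"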

If you store this, e.g., as killLarger.sh somewhere in your PATH and make it executable, you can call it from anywhere to cancel all of your Slurm jobs with an ID larger than the one you pass as the argument.
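Since a wrong threshold would also kill jobs you want to keep, it can be worth previewing the selection first. The following is only a minimal sketch of that idea: filter_larger is a hypothetical helper name, not part of the script above and not a Slurm command. It assumes job IDs arrive on stdin one per line, in the form squeue prints with --format='%i'.

```bash
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original script):
# read job IDs from stdin, one per line, and print those whose numeric
# base ID is larger than the threshold given as $1.
filter_larger() {
  local min="$1" id base
  while read -r id ; do
    # Strip an array (123_4) or heterogeneous-job (123+0) suffix.
    base="${id%%[_+]*}"
    # Skip anything that is not a plain number (e.g. blank lines).
    [[ "$base" =~ ^[0-9]+$ ]] || continue
    if [ "$base" -gt "$min" ] ; then
      printf '%s\n' "$id"
    fi
  done
}

# Preview only: list what would be canceled, without calling scancel.
# squeue --user="$(id -u -n)" --noheader --format='%i' | filter_larger 12345
```

If the list looks right, run killLarger.sh with the same job ID.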
