Eagle Job Partitions and Scheduling Policies

Learn about job partitions and policies for scheduling jobs on Eagle.

Partitions

Eagle nodes are associated with one or more partitions. Each partition is defined by one or more job characteristics, including run time, per-node memory requirements, per-node local scratch disk requirements, and whether graphics processing units (GPUs) are needed.

Slurm automatically routes jobs to the appropriate partitions based on the node count, walltime, hardware features, and other resources specified in the submission. Jobs have access to the largest number of nodes, and therefore typically the shortest wait, when no partition is specified during job submission.
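For example, a submission like the following (using a hypothetical job script job.sh and a hypothetical allocation handle <project-handle>, which are placeholders rather than Eagle-specific values) lets Slurm choose the partition from the requested resources, while adding -p debug instead forces placement on the debug nodes:

  # Let Slurm route the job based on node count and walltime
  sbatch --account=<project-handle> --nodes=4 --time=4:00:00 job.sh

  # Explicitly request the debug partition for a short, 2-node test
  sbatch --account=<project-handle> --nodes=2 --time=1:00:00 -p debug job.sh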

The following table summarizes the partitions on Eagle.

debug
  Description: Nodes dedicated to developing and troubleshooting jobs. Debug nodes with each of the non-standard hardware configurations are available. The node-type distribution is:
    • 4 GPU nodes
    • 2 bigmem nodes
    • 7 standard nodes
    • 13 total nodes
  Limits: 1 job with a maximum of 2 nodes per user; 01:00:00 max walltime
  Placement condition: -p debug or --partition=debug

short
  Description: Nodes that prefer jobs with walltimes <= 4 hours
  Limits: No partition limit; no limit per user
  Placement condition: --time <= 4:00:00; --mem <= 85248 (1800 nodes) or --mem <= 180224 (720 nodes)

standard
  Description: Nodes that prefer jobs with walltimes <= 2 days
  Limits: 2100 nodes total; 1050 nodes per user
  Placement condition: --time <= 2-00; --mem <= 85248 (1800 nodes) or --mem <= 180224 (720 nodes)

long
  Description: Nodes that prefer jobs with walltimes > 2 days; the maximum walltime of any job is 10 days
  Limits: 525 nodes total; 262 nodes per user
  Placement condition: --time <= 10-00; --mem <= 85248 (1800 nodes) or --mem <= 180224 (720 nodes)

bigmem
  Description: Nodes that have 768 GB of RAM
  Limits: 90 nodes total; 45 nodes per user
  Placement condition: --mem > 180224

bigscratch
  Description: Nodes that each have a larger /tmp/scratch mount (24 TB SSD) for per-node large-data tasks
  Limits: 20 nodes total; 10 nodes per user
  Placement condition: --tmp > 1500000

gpu
  Description: Nodes with dual NVIDIA Tesla V100 PCIe 16 GB computational accelerators for GPU-based software
  Limits: 44 nodes total; 22 nodes per user; 2 GPUs per node
  Placement condition: --gres=gpu:1 (1 per node) or --gres=gpu:2 (2 per node); --time <= 2-00

gpul
  Description: Nodes with dual NVIDIA Tesla V100 PCIe 16 GB computational accelerators for GPU-based software
  Limits: 8 nodes total; 2 nodes per user; 2 GPUs per node
  Placement condition: --gres=gpu:1 (1 per node) or --gres=gpu:2 (2 per node); --time > 2-00

Memory (--mem) and local scratch (--tmp) values above are in megabytes, Slurm's default unit for those options.

Use the options listed above with the srun, sbatch, or salloc command, or in your job script, to specify the resources your job requires. Sample job scripts and the syntax for specifying the queue are available on the sample job scripts page.
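As a minimal sketch, assuming a hypothetical allocation handle <project-handle> and a placeholder executable (neither is an Eagle-specific value), a batch script requesting both GPUs on a single gpu-partition node could look like:

  #!/bin/bash
  #SBATCH --account=<project-handle>  # hypothetical allocation handle
  #SBATCH --nodes=1
  #SBATCH --time=1-00                 # 1 day; <= 2 days keeps the job eligible for the gpu partition
  #SBATCH --gres=gpu:2                # request both V100 GPUs on the node
  #SBATCH --job-name=gpu-example

  srun ./my_gpu_app                   # placeholder executable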

Job Scheduling Policies

The system configuration lists the four categories that Eagle nodes fall into based on their hardware features. No single user can have jobs running on more than half of the nodes in any one hardware category. For example, the maximum number of data analysis and visualization (DAV) nodes that a single user's jobs can occupy is 25.
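One way to see how many nodes your running jobs currently occupy, and so how close you are to a per-user limit, is the following command; it uses only standard squeue options and is not Eagle-specific:

  # Sum the node counts of all of your currently running jobs
  squeue -h -u $USER -t RUNNING -o "%D" | awk '{total += $1} END {print total}'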

Also learn how jobs are prioritized.

