Eagle Job Partitions and Scheduling Policies
Learn about job partitions and policies for scheduling jobs on Eagle.
Partitions
Eagle nodes are associated with one or more partitions. Each partition is associated with one or more job characteristics, which include run time, per-node memory requirements, per-node local scratch disk requirements, and whether graphics processing units (GPUs) are needed.
Slurm automatically routes jobs to the appropriate partitions based on the node count, walltime, hardware features, and other resources specified in the submission. If no partition is specified at submission, a job is eligible for the largest number of nodes and will therefore typically have the shortest wait.
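As a minimal sketch (the script name and resource values here are illustrative, not Eagle defaults), letting Slurm route a job versus pinning it to the debug partition looks like this:

```bash
# Let Slurm route the job based on the requested resources
# (largest eligible node pool, typically the shortest wait).
sbatch --time=4:00:00 --nodes=2 my_job.sh   # my_job.sh is a placeholder script

# Explicitly request the debug partition for quick troubleshooting runs.
sbatch -p debug --time=00:30:00 --nodes=1 my_job.sh
```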
The following table summarizes the partitions on Eagle.
Partition Name | Description | Limits | Placement Condition |
---|---|---|---|
debug | Nodes dedicated to developing and troubleshooting jobs. Debug nodes with each of the non-standard hardware configurations are available. | 1 job with a max of 2 nodes per user; 01:00:00 max walltime | -p debug |
short | Nodes that prefer jobs with walltimes <= 4 hours | No partition limit; no per-user limit | --time <= 4:00:00; --mem <= 85248 (1800 nodes) or --mem <= 180224 (720 nodes) |
standard | Nodes that prefer jobs with walltimes <= 2 days | 2100 nodes total; 1050 nodes per user | --time <= 2-00; --mem <= 85248 (1800 nodes) or --mem <= 180224 (720 nodes) |
long | Nodes that prefer jobs with walltimes > 2 days; maximum walltime of any job is 10 days | 525 nodes total; 262 nodes per user | --time <= 10-00; --mem <= 85248 (1800 nodes) or --mem <= 180224 (720 nodes) |
bigmem | Nodes that have 768 GB of RAM | 90 nodes total; 45 nodes per user | --mem > 180224 |
bigscratch | Nodes that each have larger /tmp/scratch mounts (24 TB SSD) for per-node large-data tasks | 20 nodes total; 10 nodes per user | --tmp > 1500000 |
gpu | Nodes with dual NVIDIA Tesla V100 PCIe 16 GB computational accelerators for GPU-based software | 44 nodes total; 22 nodes per user; 2 GPUs per node | --gres=gpu:1 (1 per node) or --gres=gpu:2 (2 per node); --time <= 2-00 |
gpul | Nodes with dual NVIDIA Tesla V100 PCIe 16 GB computational accelerators for GPU-based software | 8 nodes total; 2 nodes per user; 2 GPUs per node | --gres=gpu:1 (1 per node) or --gres=gpu:2 (2 per node); --time > 2-00 |
Use the options listed above with the srun, sbatch, or salloc command, or in your job script, to specify the resources your job requires. Sample job scripts and the syntax for specifying the queue are available on the sample job scripts page.
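For instance, a minimal sbatch script for a two-GPU job might look like the sketch below; the account handle, application name, and walltime are placeholders rather than Eagle-specific values.

```bash
#!/bin/bash
#SBATCH --job-name=gpu_test
#SBATCH --account=<allocation_handle>   # replace with your project allocation handle
#SBATCH --time=1-00                     # 1 day; <= 2 days routes the job to the gpu partition
#SBATCH --nodes=1
#SBATCH --gres=gpu:2                    # request both V100s on the node

srun ./my_gpu_app                       # placeholder executable
```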
Job Scheduling Policies
The system configuration page lists the four categories of Eagle nodes based on their hardware features. No single user can have jobs running on more than half of the nodes in any one hardware category. For example, the maximum number of data analysis and visualization (DAV) nodes a single job can use is 25.
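To see the partition limits currently in effect and how many nodes your own jobs are occupying, you can query Slurm directly. These are standard Slurm commands; the partition name below is taken from the table above.

```bash
# Show the configured limits (MaxTime, MaxNodes, etc.) for a partition.
scontrol show partition standard

# Summarize node availability per partition.
sinfo -s

# List your own running and pending jobs, including node counts.
squeue -u $USER
```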
Also learn how jobs are prioritized.