Find training resources for using NREL’s high-performance computing (HPC) systems as well as related online tutorials.
Schedule of Upcoming Training and Office Hours
Eagle uses the Slurm workload manager. The HPC Operations team is conducting a series of Slurm workshops to help users migrate from Peregrine's Torque/Moab to Eagle's Slurm. Beyond the new system capabilities of Eagle itself, Slurm brings a host of new features and functionality that are worth taking some time to investigate. These workshops assume basic familiarity with Eagle and Slurm as documented on the User Basics and Running Jobs pages of our website.
Slurm: Advanced Techniques
The second workshop in our series, Eagle Workshop - Advanced Slurm Techniques, will cover topics that are useful for job management:
- Job monitoring and forensics: usage examples for sreport, sacct, sinfo, and sview (FastX).
- Advanced srun and sbatch functions (flags).
- Parallelizing with Slurm.
- Remote exclusive GPU usage, requesting GPU nodes.
- Using srun in place of mpirun.
- Please bring your questions and working examples.
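As a rough illustration of two of the topics above (job forensics with sacct, and launching an MPI program with srun rather than mpirun), here is a minimal Python sketch that only assembles the command lines and does not run them. The job ID `123456`, the executable `./my_mpi_app`, the task count, and the helper names `sacct_command` and `srun_command` are hypothetical and chosen for illustration; the fields you want from sacct and the srun options your job needs may differ on Eagle.

```python
import shlex


def sacct_command(job_id, fields=("JobID", "JobName", "State", "Elapsed", "MaxRSS")):
    """Build the argument list for an sacct query on a single job.

    The default field list is an illustrative assumption, not a site recommendation.
    """
    return ["sacct", "-j", str(job_id), "--format=" + ",".join(fields)]


def srun_command(executable, ntasks, extra_args=()):
    """Build an srun launch line, used where `mpirun -np N` might otherwise appear."""
    return ["srun", f"--ntasks={ntasks}", executable, *extra_args]


if __name__ == "__main__":
    # Print the commands instead of executing them, so this runs anywhere.
    print(shlex.join(sacct_command(123456)))        # hypothetical job ID
    print(shlex.join(srun_command("./my_mpi_app", 72)))  # hypothetical executable
```

The printed lines can be pasted into a shell on a system where Slurm is installed, or the srun line can be placed inside a batch script submitted with sbatch.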
Wednesday, March 20th from 10 a.m. to 11 a.m. in the ESIF B208 Maxwell Conference Room
For registration and Webinar access:
Slurm: New NREL Capabilities
This workshop covered the following features, which are new to the NREL HPC workflow compared with what was possible on Peregrine and its job scheduler:
- Basic Slurm core functionality overview
- Slurm partitions - request by features
- Effective queue partition requests
- Request by resource needs
- GPU compute nodes
- Local scratch
- Memory requirements
- Job dependencies and job arrays
- Job steps
- Job monitoring and basic troubleshooting.
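Several of the features listed above map onto standard sbatch options: `--mem` for memory, `--tmp` for node-local temporary disk, `--gres=gpu:N` for GPUs, `--array` for job arrays, and `--dependency=afterok:<jobid>` for job dependencies. The following Python sketch assembles such a submission line without running it. The helper name `sbatch_command`, the script names, the job ID, and all resource values are hypothetical; appropriate values and any required account or partition options depend on the system and are not shown here.

```python
import shlex


def sbatch_command(script, *, time="01:00:00", nodes=1, mem=None, tmp=None,
                   gpus=None, array=None, after_ok=None):
    """Assemble an sbatch argument list from optional resource requests."""
    cmd = ["sbatch", f"--time={time}", f"--nodes={nodes}"]
    if mem is not None:
        cmd.append(f"--mem={mem}")              # memory per node, e.g. "64G"
    if tmp is not None:
        cmd.append(f"--tmp={tmp}")              # minimum node-local temporary disk
    if gpus is not None:
        cmd.append(f"--gres=gpu:{gpus}")        # number of GPUs per node
    if array is not None:
        cmd.append(f"--array={array}")          # job array index range, e.g. "0-9"
    if after_ok is not None:
        # Start only after the named job has completed successfully.
        cmd.append(f"--dependency=afterok:{after_ok}")
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # A ten-element job array with a memory request (hypothetical values).
    print(shlex.join(sbatch_command("preprocess.sh", mem="64G", array="0-9")))
    # A GPU job that waits for hypothetical job 123456 to finish successfully.
    print(shlex.join(sbatch_command("train.sh", gpus=2, after_ok=123456)))
```

Keeping each request as a separate keyword argument makes it easy to see which option corresponds to which resource need; the same options can equally be written as `#SBATCH` lines at the top of a batch script.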
The resources used during this presentation are available here:
Transition from Peregrine to Eagle
This workshop detailed changes, advice, and caveats for users adapting to Eagle, relative to what was standard on Peregrine. The resources used during this presentation are available here:
Tutorials and Guides
Please see our GitHub repository for cloneable walkthroughs and examples that you can follow along with in your shell.
NREL HPC Wiki
The GitHub repository wiki features more tips and tricks for developing effective workflows on HPC systems. Users are welcome and encouraged to contribute information they think will benefit the whole community. View WIKI