Eagle System User Basics

Follow these instructions to start using the NREL HPC Eagle system's resources.

Before you can access NREL HPC systems, you will need an HPC user account.

Please note that as of February 2024, the methods for external/non-NREL users to connect to Eagle have changed. External/non-NREL users must use the SSH Gateway or HPC VPN. Directions are available below.

This change does not apply to internal NREL users or to Kestrel.

Users With Prior High-Performance Computing Experience

Internal Connection

If you are on the HPC VPN, NREL External VPN, or on site with an NREL device, you may access Eagle login nodes at eagle.hpc.nrel.gov, which will round-robin forward you to one of:

el1.hpc.nrel.gov
el2.hpc.nrel.gov
el3.hpc.nrel.gov

Similarly, DAV nodes can be accessed at eagle-dav.hpc.nrel.gov, which will forward your sessions to one of:

ed1.hpc.nrel.gov
ed2.hpc.nrel.gov
ed3.hpc.nrel.gov
ed5.hpc.nrel.gov
ed6.hpc.nrel.gov

External Connections to Eagle

If you are an external HPC user, you will need a One-Time Password Multifactor token (OTP) for two-factor authentication. Please request a multifactor token registration code if you did not receive one with your account.

Once you have a multifactor token, you will need to either use the SSH Gateway or the HPC VPN to connect to Eagle.

Both methods require logging in with a password and the six-digit code generated by your authenticator app, entered one after the other with no spaces or separators. For example, if your password is "abc" and the code from your authenticator app is "123456", when prompted for "Password+OTP" you'll enter "abc123456".

If you use the SSH Gateway, you will be presented with a command line on the gateway machine, from which you can ssh to eagle.hpc.nrel.gov (or to any individual el or ed node listed above).

If you use the VPN, once connected you can use the SSH application or interface of your choice to connect to eagle.hpc.nrel.gov (or to any individual el or ed node listed above).

SSH Connection Examples

Here are examples of using SSH from a terminal to log in to HPC systems:

$ ssh username@eagle.hpc.nrel.gov     # Internal connection (NREL or HPC VPN)

$ ssh -Y username@eagle-dav.hpc.nrel.gov  # Internal connection with graphical capabilities (NREL or HPC VPN)
 
$ ssh username@hpcsh.nrel.gov      # External connection to SSH Gateway

Idle login sessions will be automatically logged out.

Learn more about connecting to our systems.

To get started quickly:

Use srun [...] --pty $SHELL to request an interactive job
Use sbatch with a job script to submit a job to be run without interaction

Both of these commands require that an account and a walltime be specified with -A and -t.

Use srun during a job to submit executables to the pool of nodes within your job after using either of the commands above (if you use srun outside of a job, it will request a resource allocation for you similar to salloc).
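As a minimal sketch of the two quick-start commands above, the snippet below builds and prints an interactive request and a batch request without submitting anything. The project handle "myhandle", the 30-minute walltime, and the file name my_job.sh are hypothetical placeholders, not values from this guide.

```shell
#!/bin/bash
# Sketch only: build and print the two job requests; nothing is submitted.
project_handle="myhandle"   # hypothetical project handle; use your own
walltime="30"               # a bare number is read by Slurm as minutes

# Interactive job: Slurm starts your shell on a compute node.
interactive_cmd="srun -A $project_handle -t $walltime --pty \$SHELL"

# Batch job: my_job.sh (hypothetical file name) runs without interaction.
batch_cmd="sbatch -A $project_handle -t $walltime my_job.sh"

echo "$interactive_cmd"
echo "$batch_cmd"
```

On a login node, either printed line can be pasted into the terminal once the placeholders are replaced with real values.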

Slurm will automatically route your job to the appropriate partition (known as a "queue" on Peregrine) based on the hardware features, walltime, node quantity, and other attributes of your job submission.

Use squeue to observe the current status of the job queue
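The full queue can be long, so it is often handy to list only your own jobs. The sketch below wraps squeue's -u option in a small helper; the helper name show_my_jobs and the fallback message are illustrative assumptions.

```shell
#!/bin/bash
# List only the current user's jobs. Where Slurm is not installed
# (for example on a laptop), print a hint instead of failing.
show_my_jobs() {
    if command -v squeue >/dev/null 2>&1; then
        squeue -u "$(id -un)"
    else
        echo "squeue not found: run this on an Eagle login node"
    fi
}

show_my_jobs
```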

For a more thorough guide on job submission practices, please see Running Jobs on Eagle.

We have also constructed a streamlined PBS to Slurm Analogous Command Cheat Sheet to get experienced HPC users going quickly.

Users With Limited High-Performance Computing Experience

Access to Eagle is available only from within the NREL firewall, using version 2 of the Secure Shell (SSH) protocol.

You may be accustomed to the graphical interface of your laptop or personal workstation, but tasks on HPC systems are typically executed with a command line interface via a terminal application. If you are using a Mac, your computer already has a terminal application and SSH.

If you are connecting from a Windows system, you need to install a terminal package that supports SSH, such as PuTTY. Configuring PuTTY to connect to Eagle is only necessary the first time you access Eagle. You may then log in to Eagle using your HPC account username and password. For instructions on configuring PuTTY to connect to Eagle, see Connecting to HPC Systems. You may also consider applications such as Git for Windows or Cmder, which provide a terminal emulator and a compatibility layer for many Linux commands in Windows (notably the ssh command), allowing you to follow the workflow detailed below on your Windows device.

To access Eagle from a system at NREL, start the terminal application and enter:

ssh <username>@eagle.hpc.nrel.gov

...where <username> should be replaced with your NREL HPC username. Press Return or Enter to run the command. Once your SSH login request reaches the system you specified, you will be prompted for your password.

Upon successfully logging into an HPC system, your command-line prompt should contain your username and the hostname of the system you landed on like so:

[username@el1 ~]$

Note that $ marks the end of the prompt, where your shell waits for input. In our documentation, we also use $ to indicate that the text following it is a command for you to enter in your terminal. Because the prompt is printed by your shell, do not actually type $ before any command. When used inside a command, $ is a special character in Bash (the default shell on our systems). A $ in a terminal prompt typically indicates that you are a standard user without elevated privileges on Linux operating systems.
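To illustrate what "special character" means here, the short sketch below shows how Bash treats $ inside a command under different quoting. The variable name cluster and its value are made up for the example.

```shell
#!/bin/bash
# Show how Bash treats $ inside a command under different quoting.
demo_dollar() {
    local cluster="eagle"           # hypothetical variable for illustration

    echo "Logged in to $cluster"    # double quotes: $cluster is expanded
    echo 'Logged in to $cluster'    # single quotes: $ is kept literally
    echo "Logged in to \$cluster"   # backslash: $ is escaped, so no expansion
}

demo_dollar
```

Only the first line prints the variable's value; the other two print the text $cluster unchanged.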

HPC systems run Linux, specifically CentOS. If you are unfamiliar with Linux systems or a standard command-line workflow, we encourage viewing a quick-guide to getting started with the Linux command line, or one of the National Institute for Computational Sciences seminars available for HPC-centric command line usage. If these resources prove to be insufficient, a quick web-search should reveal no shortage of introductions to terminal usage.

At this point, you have started a session on one of the system's login nodes (denoted by the el# hostname in your shell prompt such as el3—short for "Eagle Login node 3") which are systems that serve as a gateway to the rest of the system. Below is common etiquette you should follow to avoid inappropriate use:

Please do not run your intensive applications on login nodes, as you are likely sharing them with dozens of other users who will notice the degradation in responsiveness and notify HPC Operations. If you need to run arbitrary commands in real time before making a batch job, please see the Interactive Jobs page. Eagle comprises thousands of non-login "compute nodes," which are dedicated to running your applications.
Your /home directory has an enforced capacity of 50GB and should only store utility files, not data for jobs. Please use /scratch and /projects accordingly for job data, as you will get much faster file-manipulation throughput on those filesystems. See Eagle's System Configuration for more information on the intended usage of each of Eagle's mountpoints.
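To keep an eye on the /home capacity mentioned above, a rough check can be scripted with du. This is a sketch only: the helper name dir_usage_report is hypothetical, and it assumes that the enforced capacity counts roughly what du reports and that 1 GB is 1024 * 1024 KB.

```shell
#!/bin/bash
# Report a directory's disk usage against a limit given in GB.
# Defaults: your home directory and the 50 GB capacity quoted above.
dir_usage_report() {
    local dir="${1:-$HOME}" limit_gb="${2:-50}" used_kb

    # du -sk prints the total size in kilobytes followed by the path.
    used_kb="$(du -sk "$dir" 2>/dev/null | awk '{print $1}')"

    printf '%s: ' "$dir"
    awk -v used="${used_kb:-0}" -v limit="$limit_gb" 'BEGIN {
        used_gb = used / 1048576
        printf "%.2f GB used of %d GB limit (%.1f%%)\n", used_gb, limit, 100 * used_gb / limit
    }'
}
```

Calling dir_usage_report with no arguments checks your home directory against 50 GB; pass a different directory and limit to check another location.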

Login nodes are a shared resource, and are subject to process limiting based on usage. Each user is permitted up to 8 cores and 100GB of RAM at a time, after which the Arbiter2 monitoring software will begin moderating resource consumption, restricting further processes by the user until usage is reduced to acceptable limits.
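One way to act on the login-node etiquette above is to have heavy scripts check where they are running before doing any work. The sketch below classifies a host by the el# and ed# naming scheme described in this guide. The helper name node_kind is hypothetical, and treating every other hostname as "other" (for example, a compute node) is an assumption.

```shell
#!/bin/bash
# Classify a host by its short name:
#   el# = Eagle login node, ed# = DAV node, anything else = other (assumed).
node_kind() {
    local host="${1:-$(hostname -s 2>/dev/null)}"
    case "$host" in
        el[0-9]*) echo "login" ;;
        ed[0-9]*) echo "dav" ;;
        *)        echo "other" ;;
    esac
}

# Example guard for the top of a resource-intensive script.
if [ "$(node_kind)" = "login" ]; then
    echo "This looks like a login node; submit a job instead." >&2
fi
```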

In this context, any software or task you wish to run on the compute nodes is referred to as a "job." Eagle uses the Slurm Workload Manager to schedule jobs submitted by users across the system. Part of Slurm's responsibility is to make sure each user gets a fair, optimized timeshare of HPC resources, including any specific hardware features (e.g., GPUs or nodes with extra RAM). A job can be any executable file: a shell script that invokes several commands, a precompiled binary with MPI functionality, or anything else you could launch from the command line.

The most common job submissions are shell scripts which contain calls to several programs within them. Below is an example of what such a script might resemble. It is a simple shell script that tells each compute node in the job to output its ID, which is a unique number that represents that particular node during the job's duration.

#!/bin/bash

#SBATCH -t 1:00
#SBATCH --job-name=node_rollcall
#SBATCH --output=node_rollcall.%j.out
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=1

echo "Running on $SLURM_JOB_NUM_NODES nodes: $SLURM_NODELIST"
srun bash <<< 'echo "I am $SLURMD_NODENAME and my ID is $SLURM_NODEID"'

Using a text editor, you can create a file and paste in the contents of the above code block. The most common terminal-based text editors are listed below, each with a link to a quick-start guide:

vi - see Colorado State University's Basic vi Commands

emacs - see GNU Emacs web page

nano - see nano Command Manual

Assuming you name the file rollcall.slurm, here is how you would submit it as a job:

$ sbatch -A <project_handle> rollcall.slurm

...where <project_handle> is the handle for one of the HPC project allocations you are associated with (you may also specify this with an #SBATCH directive at the top of your batch script). Note that every job must have an account (handle) specified, and that arguments to sbatch and srun must precede the executable file or they will be ignored. The #SBATCH directives allow you to specify command-line arguments without having to supply them each time you call sbatch. However, these directives are ignored if you launch the script with srun or run it manually from within an interactive job.
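The sketch below shows what an account directive might look like at the top of a batch script, and why the directives have no effect when the file is run directly: to Bash they are ordinary comments. The handle "myhandle" and the helper name describe_launch are hypothetical. The check relies on the SLURM_JOB_ID environment variable, which Slurm sets inside a job.

```shell
#!/bin/bash
#SBATCH -A myhandle
#SBATCH -t 1:00
#SBATCH --nodes=1

# "myhandle" above is a hypothetical project handle; replace it with your own.
# sbatch reads the #SBATCH lines, but Bash treats them as comments, so they
# do nothing when this file is run directly or launched through srun.
describe_launch() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "Running inside Slurm job ${SLURM_JOB_ID}"
    else
        echo "Not inside a Slurm job; #SBATCH lines were treated as comments"
    fi
}

describe_launch
```

With the account set in the script, the submission reduces to sbatch followed by the script name.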

HPC systems at NREL set the environment variable $NREL_CLUSTER to help you identify which cluster your scripts are running on. The example below shows how you can determine which cluster a script is running on and submit jobs differently to accommodate each cluster's differences:

# Options must come before the batch file on both systems.
if [[ ${NREL_CLUSTER} = "peregrine" ]]; then
    qsub -A <project_handle> <batch_file>
elif [[ ${NREL_CLUSTER} = "eagle" ]]; then
    sbatch -A <project_handle> <batch_file>
fi
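The same idea can be written as a reusable function with a fallback for unrecognized clusters. This is a sketch under stated assumptions: the function name submit_command, the file name rollcall.slurm used in the call, the handle "myhandle", and the error branch are all illustrative, and the function only prints the command rather than submitting anything.

```shell
#!/bin/bash
# Print the submit command that matches $NREL_CLUSTER.
# The cluster names come from the example above; the fallback is an assumption.
submit_command() {
    local batch_file="$1" handle="$2"
    case "${NREL_CLUSTER:-unknown}" in
        peregrine) echo "qsub -A $handle $batch_file" ;;
        eagle)     echo "sbatch -A $handle $batch_file" ;;
        *)
            echo "unrecognized cluster: ${NREL_CLUSTER:-unset}" >&2
            return 1
            ;;
    esac
}

# Example: pretend to be on Eagle (on a real system the variable is already set).
( NREL_CLUSTER=eagle; submit_command rollcall.slurm myhandle )
# prints: sbatch -A myhandle rollcall.slurm
```

Returning a non-zero status for an unknown cluster lets a calling script stop early instead of submitting with the wrong scheduler.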

For more thorough demonstrations of Slurm's functionality and sample batch scripts, see Running Jobs on Eagle and its child pages.

NREL HPC on GitHub

The GitHub repository features more tips and tricks for developing effective workflows on HPC systems. Users are welcomed and encouraged to contribute information they think will benefit the whole community.

Visit GitHub