
System Transition from Peregrine to Eagle

Learn about migrating your workflow from Peregrine to Eagle, the latest high-performance computing (HPC) system.

To address feedback from the HPC user community, HPC Operations has acquired a state-of-the-art system. The new system, Eagle, was configured with features that better accommodate the needs expressed by our users. Eagle also comes with improvements to the operating environment that we believe will further streamline the HPC user experience compared to previous systems. Please see the sections below for guidance on getting started with Eagle.


Workshops

The HPC Operations team held workshops to provide live assistance with the move to Eagle and is developing similar sessions to help users get the most out of HPC resources. The materials used during these presentations are available here:

Transitioning from Peregrine to Eagle

Separate instructions for how to use Globus to migrate files quickly and reliably

PBS to Slurm Analogous Command Cheatsheet

Please see our schedule of upcoming workshops if you are interested in attending a session hosted by members of HPC Operations who are familiar with the nuances of Eagle's software environment.

Accessing Eagle

Users who had access to Peregrine will be able to access Eagle using the same username and password.

Internal users connected to the NREL network either onsite with an NREL device or offsite via VPN can access Eagle using these domain names:

Login Node: eagle.hpc.nrel.gov

DAV Node: eagle-dav.hpc.nrel.gov

External users connecting directly with the SSH protocol and a One-Time Password token may use these domain names without connecting to the NREL HPC VPN:

Login Node: eagle.nrel.gov

DAV Node: eagle-dav.nrel.gov
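
The host names above can be combined with ssh as in the following minimal sketch. The eagle_host function is a hypothetical convenience helper, not something provided on the HPC systems; it only maps the two sets of domain names above to a single host name.

```shell
# Hypothetical helper (not part of the HPC environment): print the Eagle host
# name for a given network location ("internal" or "external") and node type
# ("login" or "dav"), using the domain names listed above.
eagle_host() {
    location="$1"
    node="${2:-login}"

    case "$node" in
        login) name="eagle" ;;
        dav)   name="eagle-dav" ;;
        *)     echo "unknown node type: $node" >&2; return 1 ;;
    esac

    case "$location" in
        internal) echo "${name}.hpc.nrel.gov" ;;
        external) echo "${name}.nrel.gov" ;;
        *)        echo "unknown location: $location" >&2; return 1 ;;
    esac
}

# Example: connect to a login node from the NREL network or VPN.
# Replace <username> with your HPC account username.
#   ssh "<username>@$(eagle_host internal)"
```

Calling eagle_host with "external" instead selects the names intended for direct SSH access with a One-Time Password token.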

Sharing RSA Public Keys Between Systems for Ease of Access

First and foremost, please do not run ssh-keygen while logged into HPC systems. SSH keys are generated for your account automatically the first time you log in. This mechanism only detects the absence of your RSA keys, not any modification. If you generate new keys, the modified keys will not be propagated, and you will not be able to log in to compute nodes in any of your jobs.

You can access Eagle from Peregrine and vice versa. This is often done to transfer files with commands such as scp or rsync. Typing your password each time you access one system from the other quickly becomes tedious, so this section demonstrates how to safely copy your RSA public key between the two systems so that you no longer need to provide your password when switching between them.

First, log in to Peregrine via SSH. From your terminal, type:

[<username>@login4] $ ssh-copy-id <username>@eagle.hpc.nrel.gov

Replace <username> with your HPC account username. You will be prompted for your password directly after executing this command. If the command succeeds, the public key that was generated on Peregrine is installed as an authorized key on Eagle, so you will no longer have to provide your password.

This command will prompt you to log in to el.hpc.nrel.gov to make sure the key was copied successfully, which you should do as part of the next step.

Once logged into Eagle, run the following to complete two-way access:

[<username>@el4] $ ssh-copy-id <username>@peregrine.hpc.nrel.gov

You should now be able to log in to one system from the other without a password. Avoid nesting too many SSH sessions inside one another, as this may create other problems.
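
To confirm that key-based access works, one option is to ask ssh to fail rather than fall back to a password prompt. The sketch below relies on the standard OpenSSH BatchMode and ConnectTimeout options; the function name check_key_login is made up for illustration.

```shell
# Hypothetical helper: succeed only if key-based login to the given
# user@host works. BatchMode=yes disables password prompts, so a missing
# or rejected key makes ssh exit with an error instead of asking.
check_key_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Example, run from Peregrine (replace <username> with your HPC username):
#   check_key_login "<username>@eagle.hpc.nrel.gov" && echo "key login OK"
```

Running the same check from Eagle against peregrine.hpc.nrel.gov covers the other direction.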

Allocation Management and Accounting

Prior to Eagle, allocation usage statistics could be seen by running the alloc_tracker command while logged into Peregrine. alloc_tracker has been deprecated because it does not have sophisticated enough logic to track usage across both Peregrine and Eagle, and the successor script hours_report should be used instead. Please run hours_report --help to see detailed usage information and advanced querying options.

As with Peregrine, allocations that exhaust their allotment of NREL Hours may still submit jobs, but those jobs will be submitted with minimum priority.

Running Jobs

Eagle uses Slurm for job scheduling, whereas Peregrine used PBS. The overall workflow for submitting jobs is the same, but the job-submission command syntax differs greatly from Peregrine's.

For a comprehensive guide on using Slurm, see running jobs on the Eagle system.

Please see our PBS to Slurm Analogous Command Cheat Sheet if you are familiar with using Moab/Torque on Peregrine and would like to see equivalent usage commands for Eagle.
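
As a rough illustration of the syntax change, the sketch below writes a minimal Slurm batch script and notes the Slurm commands that replace their common PBS counterparts. The account name, resource values, and file names are placeholders, and the options shown are standard Slurm directives rather than a statement of Eagle's required settings.

```shell
# Write a minimal Slurm batch script. The file name and the <...> values
# are placeholders to adapt to your own allocation and workload.
cat > example_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --account=<project-handle>
#SBATCH --nodes=1
#SBATCH --time=00:10:00
#SBATCH --output=example-%j.out

# %j above expands to the job ID. srun launches the command on the
# allocated node(s).
srun hostname
EOF

# Common PBS commands and their Slurm counterparts:
#   qsub example_job.sh   ->  sbatch example_job.sh
#   qstat -u "$USER"      ->  squeue -u "$USER"
#   qdel <jobid>          ->  scancel <jobid>
```

The #SBATCH lines play the role that #PBS directives played in Peregrine job scripts.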

Filesystems and Data Transfer

Eagle was provided with new storage hardware, which features larger capacity and faster access. Consequently, Eagle will not share a filesystem with Peregrine, and users will need to copy or move any relevant data to the new filesystems. Users who need to transfer data should see our short overview on data storage and transfer, which discusses the best tools to use for different data sizes.

Please note: Globus is the method we recommend for transferring data from Peregrine to Eagle. To get set up and start transferring your data more quickly than with conventional methods, see our documentation on Globus.

Filesystem Mounts

Notable filesystem mounts on Eagle differ slightly from those on Peregrine. Below are the various mountpoints and their intended use:

/home

Each user has a personal directory under /home with a quota of 50 gigabytes. This is where shell startup files, scripts, source code, executables and other small files should reside.

/nopt

The /nopt directory on Eagle is where NREL-specific software, module files, and licensed software are kept.

/scratch

Each user has a directory under /scratch. This is where data that may be accessed by several nodes at once should reside. /scratch uses parallel network transfer protocols and offers much higher bandwidth than the normal NFS mounts listed above. Files in /scratch that have not been accessed in 30 days will be deleted.
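
Because of the 30-day deletion policy, it can help to check periodically which files are close to being removed. The following is a minimal sketch using the standard find utility; the stale_files name is hypothetical, the /scratch/$USER path in the usage line is an assumption about how user directories are named, and the access-time test only approximates whatever criterion the system actually applies.

```shell
# Hypothetical helper: list regular files under a directory whose last
# access time is more than DAYS whole days in the past (default 30).
# find's "-atime +N" matches files last accessed more than N 24-hour
# periods ago, so this approximates the deletion criterion described above.
stale_files() {
    dir="$1"
    days="${2:-30}"
    find "$dir" -type f -atime "+$days"
}

# Example (assumes your scratch directory is /scratch/$USER):
#   stale_files "/scratch/$USER" 21
```

Passing a threshold below 30, as in the usage line, lists files before they reach the deletion age so they can be copied elsewhere first.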

"Local Scratch"

Each Eagle node has a "local scratch" directory. During a job, its full path is available in the $LOCAL_SCRATCH environment variable and is /tmp/scratch on every Eagle node. A node can read and write only its own local scratch, not that of any other node. These directories are intended for fast per-node file manipulation.
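
One common way to use node-local scratch is to copy input data onto it, do the I/O-heavy work there, and copy the results back to a shared filesystem. The sketch below illustrates that pattern under stated assumptions: the function name, its arguments, and the use of sort as a stand-in for real work are all hypothetical; only the $LOCAL_SCRATCH variable comes from the description above.

```shell
# Sketch of a stage-in / compute / stage-out pattern for node-local scratch.
# LOCAL_SCRATCH is set by the system during a job (see above); the function
# name, arguments, and the "sort" stand-in for real work are illustrative.
run_on_local_scratch() {
    input="$1"          # input file on a shared filesystem
    results_dir="$2"    # shared directory that should receive the output

    if [ -z "${LOCAL_SCRATCH:-}" ]; then
        echo "LOCAL_SCRATCH is not set; run this inside a job" >&2
        return 1
    fi

    # Private working directory on this node's local scratch.
    workdir=$(mktemp -d "$LOCAL_SCRATCH/work.XXXXXX") || return 1

    # Stage in, do the work locally, then stage the result out.
    cp "$input" "$workdir/input.dat" &&
        sort "$workdir/input.dat" > "$workdir/output.dat" &&
        mkdir -p "$results_dir" &&
        cp "$workdir/output.dat" "$results_dir/"
    status=$?

    rm -rf "$workdir"   # remove the working directory when done
    return "$status"
}
```

Since other nodes cannot see this directory, anything that must outlive the job or be read elsewhere has to be copied out, as the final cp step does.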

/projects

Each project allocation has a directory in /projects to serve as a mutually accessible repository for all members of that project.

/shared-projects

Much like a directory under /projects allows members of a project to mutually access files, a directory in /shared-projects allows mutual access by members of several projects. A /shared-projects directory can be requested from the HPC Operations team. Projects that were split into several allocations due to recent changes in allocation policies may benefit from this, since it gives the child project allocations access to common data.

/datasets

The /datasets directory on Eagle hosts widely-used datasets that are accessible across project allocations.