Eagle System Configuration
Learn about the Eagle system configuration.
The Eagle system is a high-performance computing (HPC) system with different types of servers (nodes) configured to run compute-intensive and parallel computing jobs. All nodes run the Linux operating system: Red Hat Linux or the derivative CentOS distribution. The nodes and storage are connected by a high-speed 100 Gb/sec EDR InfiniBand network. A brief description of the configuration and features of the nodes, interconnect, and file systems is provided below.
Compute Node Hardware Details
Eagle has 2100 compute nodes available for HPC jobs. Additional nodes may be available, but are not guaranteed. A variety of node types are available:
| Number of Nodes | Memory | Processors | Accelerators | Local Storage |
|-----------------|--------|------------|--------------|---------------|
| 1728 | 96 GB | Dual Intel Xeon Gold Skylake 6154 (3.0 GHz, 18-core) processors | N/A | 1 TB SATA |
| 288 | 192 GB | Dual Intel Xeon Gold Skylake 6154 (3.0 GHz, 18-core) processors | N/A | 1 TB SATA |
| 48 | 768 GB | Dual Intel Xeon Gold Skylake 6154 (3.0 GHz, 18-core) processors | N/A | 10 nodes with 25.6 TB SSD; 38 nodes with 1.6 TB SSD |
| 50 | 768 GB | Dual Intel Xeon Gold Skylake 6154 (3.0 GHz, 18-core) processors | Dual NVIDIA Tesla V100 PCIe 16 GB Computational Accelerator | 10 nodes with 25.6 TB SSD; 40 nodes with 1.6 TB SSD |
Login Nodes
There are three login nodes on the system. The /home, /nopt, /scratch, /projects, /shared-projects, /datasets, and /mss file systems are mounted on all login nodes.
Users may connect to eagle.hpc.nrel.gov from the NREL network. This will connect to one of the three login nodes. Users also have the option of connecting directly to an individual login node using one of the following names:
To connect to Eagle from outside the NREL network, use eagle.nrel.gov.
Data Analysis and Visualization Nodes
The Data Analysis & Visualization (DAV) nodes are each equipped with dual Intel Xeon Gold Skylake 6154 (3.0 GHz, 18-core) processors and dual NVIDIA Tesla V100 PCIe 16 GB Computational Accelerators. These nodes support the OpenCL and CUDA programming models, as well as hardware-accelerated remote visualization of data using the FastX remote desktop and visualization software.
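As a quick check after logging in to a DAV node, you can confirm that the V100 accelerators are visible to the driver. The snippet below is a minimal sketch that shells out to the standard nvidia-smi utility; it assumes nvidia-smi is on the PATH of the GPU-equipped nodes.

```python
# Minimal sketch: list the GPUs visible on a DAV node via the standard
# nvidia-smi utility (assumed to be on PATH on GPU-equipped nodes).
import subprocess

result = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
for line in result.stdout.strip().splitlines():
    print("Found GPU:", line)
```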
Users may connect to ed.hpc.nrel.gov. This will connect to one of the three DAV nodes. Users also have the option of connecting directly to an individual DAV node using one of the following:
To connect to Eagle DAV/FastX from outside the NREL network, use eagle-dav.nrel.gov.
Interconnect
All nodes and storage are connected using an enhanced 8-dimensional hypercube topology over InfiniBand Enhanced Data Rate (EDR, 100 Gb/sec) links, which provides a bisection bandwidth of 26.4 TB/sec.
Home File System
The Home File System (HFS) subsystem on Eagle is a robust NFS file system intended to provide highly reliable storage for user home directories and NREL-specific software. HFS has a capacity of 182 TB. Snapshots (backup copies) of files on HFS are available for up to 30 days after a change or deletion.
The /home directory on Eagle resides on HFS and is intended to hold small files. These include shell startup files, scripts, source code, executables, and data files. Each user has a quota of 50 GB.
The /nopt directory on Eagle resides on HFS and is where NREL-specific software, module files, licenses, and licensed software are kept.
Parallel File System
The Parallel File System (PFS) on Eagle is a parallel Lustre file system intended for high-performance I/O. Use PFS storage for running jobs and any other intensive I/O activity. The capacity of 14 PB is provided by 28 Object Storage Servers (OSSs) and 56 Object Storage Targets (OSTs) with 3 Metadata Servers, all connected to Eagle's Infiniband network with 100 Gb/sec EDR. The default stripe count is 1, and the default stripe size is 1 MB.
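When many processes will read or write the same large file, it can help to increase the stripe count before the file is created. The sketch below is illustrative only: it wraps the standard Lustre lfs getstripe and lfs setstripe commands, and the directory path and stripe settings are placeholders, not recommendations.

```python
# Minimal sketch: inspect and change Lustre striping using the standard lfs utility.
# The directory path and stripe settings are placeholders, not recommendations.
import subprocess

target = "/scratch/username/large_io_dir"  # hypothetical directory

# Show the current stripe layout (Eagle defaults: stripe count 1, stripe size 1 MB).
subprocess.run(["lfs", "getstripe", target], check=True)

# Stripe new files in this directory across 4 OSTs with a 4 MB stripe size.
subprocess.run(["lfs", "setstripe", "-c", "4", "-S", "4M", target], check=True)
```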
The PFS hosts the /scratch, /projects, /shared-projects, and /datasets directories.
There are no backups of PFS data. Users are responsible for ensuring that critical data is copied to Mass Storage or another data storage location.
Each user has their own directory in /scratch. Data in /scratch is subject to deletion after 28 days of inactivity.
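Job scripts often build this path programmatically. The snippet below is a small sketch that assumes the per-user directory is named after the login username; check the actual path on the system.

```python
# Minimal sketch: locate the per-user /scratch directory.
# Assumes the directory is named after the login username.
import getpass
import os

scratch_dir = os.path.join("/scratch", getpass.getuser())
print("User scratch directory:", scratch_dir, "exists:", os.path.isdir(scratch_dir))
```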
Each project/allocation has a directory in /projects intended to host data, configuration, and applications shared by the project.
Projects may request a /shared-projects directory to host data, configuration, and applications shared by multiple projects/allocations.
The /datasets directory on Eagle hosts widely used data sets.
Common Data Sets
There are multiple big data sets that are commonly used across various projects for computation and analysis on NREL's HPC systems. We provide a common location on Eagle's Lustre file system, /datasets, where these data sets are available for global reading by all compute nodes on Eagle. Each data set contains a readme file that covers background, references, an explanation of the data structure, and Python examples.
The National Solar Radiation Database (NSRDB) is a serially complete collection of meteorological and solar irradiance data sets for the United States and a growing list of international locations for 1998-2017. The NSRDB provides foundational information to support U.S. Department of Energy programs, research, and the general public.
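The readme that accompanies each data set is the authoritative guide to its layout and includes worked Python examples. As a hedged illustration, the snippet below shows how one might browse an NSRDB HDF5 file under /datasets with h5py; the file path and the 'ghi' dataset name are placeholders, not actual locations.

```python
# Minimal sketch: browse an NSRDB HDF5 file under /datasets with h5py.
# The file path and the 'ghi' dataset name are placeholders; see the data
# set's readme for the actual layout and worked examples.
import h5py

path = "/datasets/NSRDB/example_year.h5"  # hypothetical file name
with h5py.File(path, "r") as f:
    print("Top-level datasets:", list(f.keys()))
    ghi = f["ghi"]          # assumed name for global horizontal irradiance
    print("Shape:", ghi.shape, "dtype:", ghi.dtype)
    first_site = ghi[:, 0]  # read a single site's time series without loading the whole array
```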
The Wind Integration National Data Set (WIND) Toolkit consists of wind resource data for North America and was produced using the Weather Research and Forecasting Model (WRF).
Node File System
Each Eagle compute node has a local solid-state drive (SSD) for use by compute jobs. Drive sizes vary with the node feature requested: 1 TB (standard), 1.6 TB (bigmem), or 25.6 TB (bigscratch). There are several scenarios in which a local disk may make your job run faster. For instance, your job may access or create many small (temporary) files, many parallel tasks may access the same file, or your job may do many random reads/writes or memory mapping.
The local disk is mounted at /tmp/scratch, and its path is set in the $LOCAL_SCRATCH environment variable during a job. A node has read and write access only to its own local scratch, not to any other node's. This directory is cleaned once the job ends, so you will need to transfer any files you want to keep to another file system.
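A common pattern is to stage heavy temporary I/O on the node-local SSD and copy only the results back to a shared file system before the job exits. The sketch below uses the $LOCAL_SCRATCH variable described above; the subdirectory name and destination path are placeholders.

```python
# Minimal sketch: do temporary I/O on the node-local SSD, then copy results
# back to a shared file system before the job ends (local scratch is wiped).
import os
import shutil

local_scratch = os.environ.get("LOCAL_SCRATCH", "/tmp/scratch")
workdir = os.path.join(local_scratch, "my_job_tmp")  # hypothetical subdirectory
os.makedirs(workdir, exist_ok=True)

result_file = os.path.join(workdir, "result.dat")
with open(result_file, "w") as f:
    f.write("intermediate output\n")  # stand-in for real job output

# Copy anything worth keeping to /scratch or /projects before the job exits.
shutil.copy(result_file, "/scratch/username/result.dat")  # placeholder destination
```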
For more information about requesting this feature, please see Resource Request Descriptions on the Eagle Batch Jobs page.