Announcements

Read announcements for NREL’s high-performance computing (HPC) system users. 

Applications and Frameworks available for testing

March 03, 2020

The following applications and frameworks are installed on Eagle and can be accessed from the test modules collection. They can be made visible with the command

module use /nopt/nrel/apps/modules/test/modulefiles
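
As a minimal sketch of how the test collection might be used in practice (the module name below is a placeholder, not an actual package from the collection):

  # Make the test modulefiles visible alongside the production modules
  module use /nopt/nrel/apps/modules/test/modulefiles

  # List what is now available, then load a package to try it out
  module avail
  module load example-app/1.0

  # Remove the test collection from your module path when finished
  module unuse /nopt/nrel/apps/modules/test/modulefiles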

Reminder to be proactive with data

March 03, 2020

Please remember to periodically review the HPC data retention policy at https://www.nrel.gov/hpc/data-retention-policy.html. NREL HPC has only a limited responsibility to manage the data associated with your research. Critical data should be routinely backed up and moved via Globus.
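
As one hedged example of scripting such a backup with the Globus command-line interface (this assumes the Globus CLI is installed and that you know the endpoint UUIDs; the UUIDs and paths below are placeholders):

  # Authenticate once per session
  globus login

  # Recursively transfer a results directory from Eagle to another endpoint.
  # Replace both UUIDs and both paths with your own values.
  EAGLE_EP="aaaaaaaa-0000-0000-0000-000000000000"
  DEST_EP="bbbbbbbb-0000-0000-0000-000000000000"
  globus transfer --recursive --label "project backup" \
      "$EAGLE_EP:/projects/myproject/results" \
      "$DEST_EP:/backups/myproject/results"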

March Eagle Status

March 03, 2020

During the system time February 10-16, HPC Operations worked closely with HPE and DDN to patch, upgrade, and adjust configurations intended to make the eaglefs filesystem (/scratch and /projects) more reliable. So far those updates appear to be effective. We continue to monitor Eagle closely and to work with HPE and DDN. Some changes you may want to be aware of...

New Monitoring Tool

January 14, 2020

The “threads” tool is an HPC utility that shows how a process’ threads are distributed across CPU cores, along with CPU utilization and CPU frequencies. It was designed to monitor and troubleshoot MPI and OpenMP applications on Eagle, and it provides combined monitoring of the performance and behavior of processor cores and their associated threads. It shows thread placement and utilization on each core (for example, whether a thread is actively running or waiting), and the performance data can be captured for post-analysis of core affinity, thread performance, and other application characteristics.

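The tool’s own command-line interface is described in the full announcement; purely as an illustration of the kind of information it reports, standard Linux tools can already show which core each thread of a running process last ran on (<pid> is a placeholder):

  # For each thread of the process: thread ID, core last run on (psr),
  # CPU utilization, and state
  ps -L -o pid,tid,psr,pcpu,stat,comm -p <pid>

  # Refresh the view every second to watch threads move between cores
  watch -n 1 "ps -L -o tid,psr,pcpu,stat -p <pid>"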

Eagle /scratch purging

January 14, 2020

Per HPC policy, we will continue purging any files in Eagle /scratch that have not been accessed in over 90 days. Users should copy or move any Eagle /scratch files they need to keep to MSS.
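
A quick way to see which of your files are at risk (this assumes your scratch space is at /scratch/$USER and that GNU find is available):

  # List files under your scratch directory not accessed in more than 90 days
  find /scratch/$USER -type f -atime +90 -ls

  # Count them and total their size
  find /scratch/$USER -type f -atime +90 -printf '%s\n' \
      | awk '{n++; s+=$1} END {printf "%d files, %.1f GB\n", n, s/1e9}'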

Eagle non-FY2020 /projects to be deleted in February

January 14, 2020

Per HPC policy, any project whose allocation expired after September 30th, 2019 is deleted after 3 months. Users with expired FY2019 projects should move their files to MSS.

New Applications Available for Testing

January 14, 2020

From the Eagle command line, running "module use /nopt/nrel/apps/modules/test/modulefiles" will add a collection of binaries (applications, libraries, middleware, etc.) that are available for testing by the user community. We are especially interested in problems that people run into, so we can correct them before moving things into production.

Binaries available for testing include:


Tip of the Month: Transitioning from Python2 to Python3

January 14, 2020

Dear NREL HPC users, if you currently use Python2 in your workflow, this is a reminder that Python2 will no longer be supported after January 1st, 2020. If you haven't started porting your Python2 code to Python3, now would be a good time to begin that effort. Python2 will officially retire, and all support will be focused on Python3. For a countdown to Python 2.7 being dropped from support, see the Python website. Some helpful links covering the transition from Python2 to Python3 can be found here:
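
As a small, hedged example of one common starting point (not the only approach), the 2to3 tool that ships with Python can preview and apply the mechanical parts of a port; myscript.py is a placeholder:

  # Preview the changes 2to3 would make (prints a diff, modifies nothing)
  2to3 myscript.py

  # Apply the changes in place, keeping a backup of the original
  2to3 -w myscript.py

  # Run the converted script under Python 3 and fix anything 2to3 could not
  # translate automatically (e.g. bytes/str handling)
  python3 myscript.py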

HPC Interactive GPU MPI Changes

November 14, 2019

Changes have been made to the way GPU nodes are requested for interactive jobs. Please see the instructions on using the GPUs for interactive jobs at this link.
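
The linked instructions are authoritative; purely as a generic Slurm sketch, an interactive session on a GPU node is typically requested with something like the following, where the account, partition name, and resource counts are placeholders:

  # Request an interactive shell on a GPU node via Slurm (values are placeholders)
  srun --account=<project-handle> --partition=gpu --gres=gpu:2 \
       --nodes=1 --time=01:00:00 --pty $SHELL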

Eagle Status

November 14, 2019

Over the summer we had a few issues with Lustre (/scratch, /projects, /shared-projects, and /datasets). A combination of a firmware issue and a hardware issue would cause one controller out of a dual set to fail, and then the second one would not take over. We applied a firmware fix during a previous system time. During the October system time we replaced all the controllers. We also resolved any remaining issues with Lustre and returned to service the targets (the Lustre equivalent of disks) that had been made read-only and had their data migrated off. We also added additional targets to the cluster.

We’ve been investigating issues with some MPI jobs at scale. Although Intel MPI generally functions well, we are currently working with multiple vendors to determine the underlying causes of the observed problems.