No Time, No Chips: No Problem for NREL
Researchers Build Two Computing Solutions Amid Global Supply Shortages
Imagine spending $6.5 million in 30 seconds. For automakers, that was a price worth paying to advertise their latest electric vehicles to over 100 million viewers during the 2022 Super Bowl.
Celebrity cameos and special effects may pique interest, but they cannot overcome the barriers that prevent people from buying electric vehicles or adopting other clean energy technologies.
A 2020 survey by Consumer Reports cited the cost of new electric cars and limited access to charging stations as the biggest roadblocks for the public.
Creating clean energy options that are affordable and accessible for everyone is possible at the National Renewable Energy Laboratory (NREL), where researchers rely on high-performance computing to transform data into the models and simulations of their breakthrough discoveries.
NREL has its foot on the (electric) pedal to clean up our entire energy economy at an unprecedented speed and scale. No delay—not even a pandemic-induced global supply shortage—can nudge the laboratory off course, with its computational science experts achieving some dazzling feats of creative problem-solving.
A New Challenge
Whether it is creating more efficient, cleaner transportation or developing better buildings, grids, and solar, water, geothermal, and wind energy generation and storage, the U.S. Department of Energy (DOE) relies on NREL to tackle a wide range of energy challenges. In fact, of all 17 national laboratories, NREL is the only laboratory solely dedicated to energy efficiency and renewable energy research for DOE.
Each of those energy challenges requires the powerful computing capabilities of NREL's supercomputers—like Eagle and the highly anticipated Kestrel—to help researchers rapidly identify insights and accelerate solutions.
About 85% of NREL's high-performance computing (HPC) time is dedicated to DOE projects. But in the final months of 2020, DOE's Vehicle Technologies Office (VTO) asked NREL to plan for an anticipated doubling of VTO's computing needs by 2022.
A Swift Solution
The advanced computing and computational science experts at NREL were tasked with a considerable challenge: design a world-class HPC resource almost half the size of Eagle that could be operational within a year. That timeline would be aggressive in a normal year, and global semiconductor chip shortages and supply chain delays caused by COVID-19 made it harder still.
Nonetheless, the resulting machine, appropriately named Swift, was completed and became operational in NREL's Energy Systems Integration Facility (ESIF) last summer. Although Swift physically occupies only one server row in the ESIF, it packs 2 petabytes of storage and over 28,000 compute cores (for multiple, simultaneous processes) across 440 nodes. For context, Facebook relies on 1.5 petabytes to store its users' 10 billion photos.
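To put Swift's figures in proportion, the short Python sketch below works through the arithmetic. It uses only the numbers quoted above and treats "over 28,000" as a lower bound, so the per-node result is an approximation rather than a published specification. The function and constant names are illustrative.

```python
# Figures quoted in the article. "Over 28,000" cores is treated as a
# lower bound, so results derived from it are approximate.
STORAGE_PB = 2.0            # Swift storage, in petabytes
COMPUTE_CORES = 28_000      # Swift compute cores (lower bound)
NODES = 440                 # Swift compute nodes
FACEBOOK_PHOTOS_PB = 1.5    # storage cited for 10 billion photos


def cores_per_node(cores: int, nodes: int) -> float:
    """Average number of compute cores on each node."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return cores / nodes


def storage_ratio(system_pb: float, reference_pb: float) -> float:
    """How many times larger the system's storage is than the reference."""
    if reference_pb <= 0:
        raise ValueError("reference_pb must be positive")
    return system_pb / reference_pb


if __name__ == "__main__":
    per_node = cores_per_node(COMPUTE_CORES, NODES)
    ratio = storage_ratio(STORAGE_PB, FACEBOOK_PHOTOS_PB)
    print(f"Average cores per node: at least {per_node:.1f}")
    print(f"Storage relative to the photo archive: {ratio:.2f}x")
```

By this estimate, each node averages roughly 64 cores, and Swift's storage is about one-third larger than the photo archive used for comparison.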
In anticipation of future demands, NREL researchers designed Swift with flexibility in mind. That is why they selected Spack—packaging software from DOE's Office of Science Exascale Computing Project—to serve as Swift's software environment.
“Spack is an international project focused on providing easily deployable software in complex high-performance computing environments,” said NREL Computational Scientist Jon Rood, who emphasized why Spack makes strategic sense for the long term. “Spack's popularity continues to increase as it evolves to serve system administrators, scientific software developers, and end users of supercomputers to provide them with a consistent platform in which productivity is paramount.”
Rood added, “Further advantages to Spack are its ability to tie into the Extreme-Scale Scientific Software Stack ecosystem, also known as E4S, where researchers can benefit from pre-built software applications and containers, which provide some of the most popular scientific software—with no wait time between downloading the applications and utilizing them to obtain results.”
Placing Swift inside NREL's ESIF reflects a strategy that interlaces NREL's advanced computing operations and computational science expertise. NREL's world-class design and delivery of computing solutions enables rapid data movement and economies of shared support infrastructure. In 2022 and beyond, the combination of Eagle (or Kestrel) and Swift will provide robust support for the VTO portfolio. In addition, the user and application engagement team is continually optimizing the software environment, and future releases will enable performance optimization, more flexibility, and harmonization of resources reporting to NREL's HPC.
Living on the Edge
Remember that 15% slice of NREL's HPC capacity? It is dedicated to NREL's laboratory-directed research and development and Technology Partnership Program portfolio, which is targeting NREL's vision: a clean energy future for the world. If 15% of computing capacity does not sound like enough to support a bold vision like that, it is not; as NREL researchers crafted and delivered the Swift solution for DOE, they simultaneously did the same for fellow NRELians with Vermillion.
Vermillion is the first phase of a flexible, on-premises cloud resource tailored for prominent NREL projects like artificial intelligence (AI) training. This on-premises cloud computing, or edge computing, is performed near the original data source, instead of accessing the data on the cloud at one of a dozen data centers around the world. The latency, or time delay, of accessing cloud-based information is what makes edge computing necessary: autonomous vehicles, for example, rely on split-second data access to keep passengers safe. Other AI-based energy solutions (for smart grids and buildings) also benefit from edge computing.
NREL is a living laboratory—we simulate and test our proposed solutions to see how they may work in our complex, interconnected world. With Vermillion, NREL can now experiment with HPC, commercial cloud computing, and edge computing to envision more clean energy technology scenarios. Vermillion is designed to be accessible and flexible to accommodate the needs of researchers now and in the future.
The system software is built on powerful open-source standards utilizing Linux, OpenStack, and Kubernetes Infrastructure, known in the technical world as LOKI. This software stack pools virtual resources for dynamic grouping, providing greater flexibility to match NREL's demanding workflows. It also leverages Slurm scheduling to quickly assign and execute computational tasks and maximize job throughput.
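For readers unfamiliar with Slurm, work is typically submitted as a batch script whose header lines tell the scheduler what resources a job needs. The Python sketch below assembles such a script using standard `sbatch` options. It is an illustration only, not Vermillion's actual configuration: the job name, resource counts, and the `./my_app` executable are hypothetical.

```python
def render_batch_script(job_name: str, nodes: int, tasks_per_node: int,
                        walltime: str, command: str) -> str:
    """Build the text of a minimal Slurm batch script.

    All argument values are supplied by the caller; nothing here reflects
    a real NREL system configuration.
    """
    if nodes <= 0 or tasks_per_node <= 0:
        raise ValueError("nodes and tasks_per_node must be positive")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={walltime}",  # wall-clock limit, HH:MM:SS
        "",
        f"srun {command}",  # srun launches the tasks inside the allocation
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example: a two-node job running a placeholder program.
    print(render_batch_script("demo-job", 2, 4, "01:00:00", "./my_app"))
```

A script like this would normally be saved to a file and submitted with `sbatch`. The scheduler then queues the job and starts it once the requested nodes are free, which is how it keeps throughput high across many users.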
In true NREL style, Vermillion's name is inspired by the natural world and alludes to growing possibilities. Named for a tributary of the Green River, Vermillion is the first exclusively NREL-dedicated computing resource that will feed a roaring flow of research. Already, Vermillion is positioned to evolve and track the cutting edge of both the NREL workload and the computing industry. Just as more tributaries magnify the force of a river, NREL researchers are eagerly anticipating the amplified effects of what may soon join Vermillion.
A global crisis brought the supply chain to its knees, and those impacts continue to reverberate across industries. But nothing has been able to slow NREL researchers down on the path to a clean energy future.
If your organization has computational needs that NREL could support, contact Aaron Andersen or Jennifer Southerland for more information. The Vermillion complex is intended to be augmented over time.