New Ultra-Efficient HPC Data Center Debuts
March 11, 2013
Scientists and researchers at the U.S. Department of Energy's National Renewable Energy Laboratory (NREL) are constantly innovating, integrating novel technologies, and "walking the talk." Since 1982, NREL has won 52 R&D 100 Awards — known in the research and development community as "the Oscars of Innovation" — for its groundbreaking work.
When it came time for the lab to build its own high performance computing (HPC) data center, the NREL team knew it would have to be made up of firsts: the first HPC data center dedicated solely to advancing energy systems integration, renewable energy research, and energy efficiency technologies; the HPC data center ranked first in the world for energy efficiency; and the first petascale HPC system to use warm-water liquid cooling and reach an annualized average power usage effectiveness (PUE) rating of 1.06 or better.
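For readers unfamiliar with the metric, PUE is conventionally defined as total facility energy divided by the energy delivered to the IT equipment, so a rating of 1.06 means roughly 6 percent overhead for cooling, power distribution, and other support systems. The short Python sketch below illustrates that arithmetic. The function names and the 1-megawatt, year-round IT load are illustrative assumptions, not figures or tools from NREL.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(it_equipment_kwh: float, pue_value: float) -> float:
    """Energy spent on everything except the IT equipment at a given PUE."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_equipment_kwh * (pue_value - 1.0)


# Assumed for illustration only: a constant 1,000 kW IT load for 8,760 hours.
annual_it_kwh = 1000.0 * 8760

print(f"Overhead at PUE 1.06: {overhead_kwh(annual_it_kwh, 1.06):,.0f} kWh/year")
print(f"Overhead at PUE 1.80: {overhead_kwh(annual_it_kwh, 1.80):,.0f} kWh/year")
```

The 1.80 comparison value is also hypothetical; it is included only to show how quickly overhead energy grows as PUE rises.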
To accomplish this, NREL worked closely with industry leaders to track rapid technology advances and to develop a holistic approach to data center sustainability in the lab's new Energy Systems Integration Facility (ESIF).
"We took an integrated approach to the HPC system, the data center, and the building as part of the ESIF project," NREL's Computational Science Center Director Steve Hammond said. "First, we wanted an energy-efficient HPC system appropriate for our workload. This is being supplied by HP and Intel. A new component-level liquid cooling system, developed by HP, will be used to keep computer components within safe operating range, reducing the number of fans in the backs of the racks."
Next, the NREL team, which included the design firms SmithGroupJJR and the Integral Group, created the most energy-efficient data center it could to house and provide power and cooling to the HPC system. High-voltage (480 VAC) electricity is supplied directly to the racks rather than the typical 208 V, which saves on power electronics equipment, power conversions, and losses. Energy-efficient pumps largely replace noisy, less-efficient fans.
"Last but not least, we wanted to capture and use the heat generated by the HPC system," Hammond said. "Most data centers simply throw away the heat generated by the computers. An important part of the ESIF is that we will capture as much of the heat as possible that is generated by the HPC system in the data center and reuse that as the primary heat source for the ESIF office space and laboratories. These three things manifest themselves in an integrated 'chips-to-bricks' approach."
Like NREL's Research Support Facility, the ESIF HPC data center did not cost more to build than the average facility of its kind. It actually cost less to construct than comparable data centers and will be much cheaper to operate. NREL's approach was to minimize the energy needed, supply it as efficiently as possible, and then capture and reuse the heat generated.
"Compared to a typical data center, we may save $800,000 of operating expenses per year," Hammond said. "Because we are capturing and using waste heat, we may save another $200,000 that would otherwise be used to heat the building. So, we are looking at saving almost $1 million per year in operation costs for a data center that cost less to build than a typical data center."
Warm-Water Cooling Boosts Data Center Efficiency
The ultra-efficient HPC system in NREL's new data center has been designed in collaboration with HP and Intel. The HPC system will be deployed in two phases that will include scalable HP ProLiant SL230s and SL250s Generation 8 (Gen8) servers based on eight-core Intel Xeon E5-2670 processors as well as the next generation of servers using future 22nm Ivy Bridge architecture-based Intel Xeon processors and Intel Many Integrated Core architecture-based Intel Xeon Phi coprocessors. The first phase of the HPC installation began in November 2012, and the system will reach petascale capacity in the summer of 2013.
In the spirit of overall energy efficiency, the Intel Xeon Phi coprocessor delivers on several fronts. According to Intel, complete applications can be ported to it in a short time, so software engineers won't need specialized tools or new languages to support significant software packages. "Intel coprocessors also increase the efficiency of computer resource usage," said Stephen Wheat, general manager of high performance computing at Intel. "The methods of code optimization for Xeon Phi are identical to what one does to make the most of Xeon processors. Finely tuned optimizations for Xeon Phi almost always result in a better-performing source code for Xeon processors. As the optimized and tuned application is run in production, the achieved performance per watt on both Xeon Phi and Xeon processors allows achieving the results with the lowest energy use."
While some of the NREL HPC components may be off the shelf, the team is taking a different approach in cooling this supercomputer.
"In traditional computer systems, you have a mechanical chiller outside that delivers cold water into the data center, where air-conditioning units blow cold air under a raised floor to try to keep computer components from overheating," Hammond said. "From a data center perspective, that's not very efficient; it's like putting your beverage on your kitchen table and then going outside to turn up the air conditioner to get your drink cold."
"NREL's ultimate HPC system is currently under development and will be a new, warm-water cooled high-performance system," said Ed Turkel, group manager of HPC marketing at HP. "It will be a next-generation HPC solution that's specifically designed for high power efficiency and extreme density, as well as high performance — things that NREL requires."
Starting this summer, NREL's HPC data center will require just over 1 megawatt of power to operate. "That's a lot of power; the heat dissipated from that is very substantial," Hammond said. "Getting the heat directly to liquid rather than through air first and then to liquid is the most efficient way to utilize it."
Water supplied to the servers will be approximately 75 degrees Fahrenheit; the water returning from the HPC system will be in excess of 100 degrees Fahrenheit and is designed to be the primary source of heat for ESIF's office and lab spaces. Data center waste heat will even be used under the front plaza and walkway outside the building to help melt snow and ice, so the heat byproduct from the data center will also improve safety around the ESIF.
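To give a sense of scale, the standard heat-transfer relation Q = m × c × ΔT links the heat load, the water flow rate, and the temperature rise. The Python sketch below estimates the flow needed to carry away a given heat load between the supply and return temperatures quoted above. It assumes, for illustration only, that a full 1 megawatt ends up in the water and that the return temperature is exactly 100 degrees Fahrenheit; the function names are hypothetical.

```python
WATER_SPECIFIC_HEAT = 4186.0  # J/(kg*K), approximate value for liquid water


def fahrenheit_delta_to_kelvin(delta_f: float) -> float:
    """Convert a temperature difference (not an absolute temperature) from F to K."""
    return delta_f * 5.0 / 9.0


def required_flow_kg_per_s(heat_watts: float, supply_f: float, return_f: float) -> float:
    """Mass flow of water needed to absorb heat_watts over the given temperature rise."""
    delta_k = fahrenheit_delta_to_kelvin(return_f - supply_f)
    if delta_k <= 0:
        raise ValueError("Return temperature must exceed supply temperature")
    return heat_watts / (WATER_SPECIFIC_HEAT * delta_k)


# Assumed for illustration: 1 MW of heat, 75 F supply, 100 F return.
flow = required_flow_kg_per_s(1_000_000.0, 75.0, 100.0)
print(f"Required flow: {flow:.1f} kg/s")
```

Under these assumptions the result is roughly 17 kilograms of water per second. A higher return temperature, as the article indicates, would reduce the required flow.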
Compared to a typical data center, NREL's HPC data center will be much warmer. The 75-degree-Fahrenheit design point is a higher starting temperature for computer cooling. Starting at this temperature allows NREL to eliminate compressor-based cooling systems and instead use cooling towers. In a data center, this is comparable to a homeowner using an energy-efficient swamp cooler rather than an air conditioner. In addition, the pump energy needed to move liquid in the cooling system is much less than the fan energy needed to move the air in a traditional data center. Water is about 1,000 times more effective than air in terms of the thermodynamics, or the heat exchange.
"We're quite enamored with NREL being able to reuse the heat for the building and parts of the campus," Wheat said. "While others have done this before, here we are looking at a combined total efficiency goal and not just harvesting heat. We're looking to see how this can be the best solution for the entire campus."
Using the HPC's waste heat to boost the ESIF's sustainability and finding unique solutions to cut the data center's PUE are just what NREL does. "This is in our DNA; it is part of our mission at the lab, and we want others to follow suit," Hammond said. "NREL isn't one of a kind in what we are doing, but we've set out to be the first of a kind. For us, it just makes sense. Others can follow suit if it makes dollars and sense."
The lab's industry partners also see a long-term relationship for energy efficiency and HPC, especially when it comes to exascale computing.
"We see the area of HPC as being insatiable; people will have a need for ever-greater performance," Wheat said. "One of the things we are mindful of is that while our systems are becoming denser in terms of footprint, they are becoming more power efficient. NREL is the premier place to demonstrate a means to continue the growth of HPC capability in an environmentally friendly way."
HP's Turkel echoes that sentiment: "As power-efficient and dense as our HPC systems are, to meet our customers' rapidly expanding requirements for performance, we would need to grow even our most powerful and efficient system to be impractically large and complex, while consuming enormous amounts of energy.
"To get to the levels of scale that our customers are demanding of us, we have to fundamentally change the dynamic around power, density, and performance," Turkel added. "We have to be able to do it in a much smaller package using less energy. This project is a step in that direction — and it's apropos that NREL is a partner in the effort."
"eBay, Facebook, and others have data centers that are water capable, but there aren't any products on the market now that are providing liquid cooling," Hammond said. "NREL is getting the first product that is direct-component liquid cooled. We're going to show it's possible, efficient, safe, and reliable."
Expanding NREL's View into the Unseen
The $10 million HPC system will support the breadth of research at NREL, leading to increased efficiency and lower costs for research on clean energy technologies including solar photovoltaics, wind energy, electric vehicles, buildings technologies, and renewable fuels.
The new system is crucial to advancing NREL's mission and will enable scientists to address challenges that have been intractable to date. The new system will greatly expand NREL's modeling and simulation capabilities, including advancing materials research and developing a deeper understanding of biological and chemical processes. "Modeling and simulation capability is a key part of advancing our technologies," Hammond said. "It allows us to do research we can't do experimentally because it would be too expensive, or would take too long if actual systems were built. We can mathematically model and run numerical simulations that allow us to understand things through direct observation."
Before building an HPC data center in the ESIF, NREL had a small system on its campus, while collaborating with Sandia National Laboratories on the RedMesa supercomputer to bridge the gap until NREL had a facility to house its own HPC system. For the past two years, the NREL/Sandia solution has been oversubscribed. "We averaged 92% utilization day in and day out; we needed much more capable systems to meet growing demand for modeling and simulation," Hammond said.
According to Hammond, NREL will also reach out to the local utility to study demand-response scenarios. "There are times in summer when electricity demand is high that we could shed load with the data center to help Xcel Energy." NREL could alter workloads and schedule particular jobs to run in mornings when there's a high demand for heat and cooling is less expensive. In another scenario, NREL could schedule workloads to take advantage of lower electricity costs or be mindful of when rates are higher to help reduce operating expenses. "There is a lot of interest in looking at how to integrate the HPC system in the building automation system as part of the energy systems integration work that we're doing," Hammond said.
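The price-aware scheduling scenario described above can be sketched in a few lines of Python. This is only an illustration of the idea, not NREL's scheduler: the `Job` class, the function names, and the hourly prices are all hypothetical, and the sketch ignores system capacity, job dependencies, and heat demand.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    hours: int        # run time in whole hours
    deferrable: bool  # True if the job can wait for cheaper electricity


def cheapest_start_hour(prices: list[float], hours: int) -> int:
    """Return the start index of the contiguous window with the lowest total price."""
    if not 1 <= hours <= len(prices):
        raise ValueError("Job length must fit within the price horizon")
    best_start, best_cost = 0, float("inf")
    for start in range(len(prices) - hours + 1):
        cost = sum(prices[start:start + hours])
        if cost < best_cost:  # strict comparison keeps the earliest cheapest window
            best_start, best_cost = start, cost
    return best_start


def plan(jobs: list[Job], prices: list[float], now_hour: int = 0) -> dict[str, int]:
    """Start deferrable jobs in their cheapest window; start the rest immediately."""
    return {
        job.name: cheapest_start_hour(prices, job.hours) if job.deferrable else now_hour
        for job in jobs
    }


# Hypothetical hourly prices (dollars per kWh) for a 24-hour day.
prices = [0.05] * 6 + [0.08] * 6 + [0.15] * 6 + [0.09] * 6
jobs = [Job("materials-sim", 4, True), Job("urgent-analysis", 2, False)]
print(plan(jobs, prices, now_hour=13))
```

A production scheduler would also need to weigh the building's demand for waste heat, which the article notes is highest in the mornings.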
"The computational activities at NREL had to be part of the efficiency equation," said Wheat. "A motivation for Intel to work with NREL was the ability to work together to validate how to do an efficient data center. We needed to be able to assure that we had the right balance of processor performance for the workload — with a performance-per-watt focus. Being a partner with NREL in this process is of value to us for demonstrating leadership in our community. Others are taking notice of what NREL has done; I believe we all benefit from that."