Universities face fierce competition for research grants as federal funding continues to dry up. Although their supercomputers were often acquired to serve data-driven student needs, universities that house them are now leveraging those machines to secure funding for AI work in fields such as chemistry, civil engineering, physics, and evolutionary biology. For many universities, a supercomputer is a key to becoming more competitive in the federal grant process. And the name of the grant game is simple: go big or go home!
How big is big?
In the spring of 2021, Big Red 200 entered production at Indiana University; it features 672 compute nodes, each equipped with 256 GB of memory and two 64-core, 2.25 GHz, 225-watt AMD EPYC 7742 processors. Not to be outdone, the University of Minnesota's supercomputer, Agate, has a theoretical peak performance of approximately seven quadrillion floating-point operations per second (seven petaflops), with two petaflops provided by the cluster's general-purpose processors and five petaflops by its GPU subsystem. And no university supercomputer list would be complete without the University of Florida's HiPerGator 3.0, with 240 AMD EPYC Rome machines with 1,024 GB of RAM and 30,720 total cores, as well as 150 AMD EPYC Milan machines with 512 GB of RAM and 19,200 total cores. These supercomputers are the research technologies enabling advanced work in artificial intelligence, medical science, machine learning, and data analytics.
Mundane in comparison to the supercomputers, but extremely important for other services, is the broader academic network supporting in-class and remote students, faculty, and a myriad of point-of-sale systems operating in cafeterias and bookstores. In fact, the typical university campus is a microcosm of the world of computing and faces the same real-world problems: staffing, budget shortages, security, and obsolescence. However, when you slice these university networks into revenue-generating or business-operating functions, they all share one common denominator that enables them to function: power. Every component of the power chain, from overhead busways to rack power distribution, is crucial for enabling compute availability.
Diverse power needs
As a rule of thumb, 6kW to 10kW covers the power needs of most racks in the data center. The range climbs to an average of 8kW to 12kW for higher-end compute needs, and to 20kW to 30kW-plus for supercomputing applications. Across all of these energy requirements, it's critical to have a Power Distribution Unit (PDU) design that can work under varying conditions. IT personnel responsible for university networks should seek a product that can operate reliably at 60°C (the PDU's maximum temperature rating) while also offering the ability to monitor cabinet environmental conditions through open APIs and SNMP traps and alerts. In most of these environments, the vertical, or zero-U, form factor is the PDU of choice because it mounts in the back of the cabinet, out of the way, in the hot aisle, where monitoring the temperature is especially important. However, when it comes to powering supercomputers, a higher-density and more robust solution is typically required.
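To make these tiers and thresholds concrete, a monitoring script sitting behind a PDU's open API might classify a rack's measured load and flag hot-aisle temperature readings. This is a minimal sketch: the function names and alert wording are hypothetical, and the thresholds are simply the figures quoted above.

```python
# Hypothetical monitoring helpers; kW tiers and the 60 C PDU rating
# come from the article, everything else is illustrative.

MAX_PDU_TEMP_C = 60.0  # maximum operating temperature of the PDU

def classify_rack_load(kw: float) -> str:
    """Map a measured rack load (kW) to the power tiers discussed above."""
    if kw <= 10:
        return "standard (6-10 kW)"
    if kw <= 12:
        return "higher-end compute (8-12 kW)"
    if kw < 20:
        return "between tiers"
    return "supercomputing (20-30 kW plus)"

def temp_alert(temp_c: float, margin_c: float = 5.0) -> str:
    """Return an alert level for a hot-aisle temperature reading."""
    if temp_c >= MAX_PDU_TEMP_C:
        return "critical: above PDU rating"
    if temp_c >= MAX_PDU_TEMP_C - margin_c:
        return "warning: approaching PDU rating"
    return "ok"

print(classify_rack_load(25))  # supercomputing (20-30 kW plus)
print(temp_alert(57))          # warning: approaching PDU rating
```

In practice, the temperature readings would come from the PDU's environmental sensors via SNMP or its REST API, with the alert feeding the campus monitoring system.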
Supplying the right power to supercomputers
When power requirements enter the 20kW-to-30kW-and-above range, intelligent PDUs that offer ultra-high-density power to fuel HPC clusters for computational research are a must. For example, 400V three-phase, high-power PDUs with a 60A infeed can deliver 30kW-plus per rack, enabling research facilities and data centers to run higher voltages and higher currents. 400V distribution is a more efficient way to achieve the higher kW rates required for supercomputer operations. In addition, high-density power distribution units are needed to match both the increased load requirements of these computer systems and the sheer number of power cords feeding into a single rack. For this purpose, universities should consider PDUs with 36 to 54 outlets per strip.
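The 30kW-plus figure follows directly from the three-phase power formula, P = √3 × V(line-to-line) × I. A quick sketch of the arithmetic; the 80% continuous-load derating is an assumption based on common North American electrical practice, not a spec from any particular PDU.

```python
import math

def three_phase_kw(line_voltage: float, amps: float, derate: float = 0.8) -> float:
    """Continuous three-phase power in kW: sqrt(3) * V_LL * I * derating."""
    return math.sqrt(3) * line_voltage * amps * derate / 1000.0

# A 400 V, 60 A three-phase infeed, derated to 80% for continuous load,
# yields roughly 33 kW per rack, i.e., the "30kW-plus" quoted above:
print(round(three_phase_kw(400, 60), 1))  # 33.3

# Without derating, the nominal capacity is about 41.6 kW:
print(round(three_phase_kw(400, 60, derate=1.0), 1))  # 41.6
```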
When considering receptacle types (a.k.a. outlets) for supercomputers, keep in mind that a single node within a high-performance rack can require three or more receptacles and any number of plug types, so it's important to have multiple plug types available. Network administrators should look for a hybrid of the standard C13 and C19 outlets that accommodates both C14 and C20 plugs in a single outlet, and that can support multiple voltages where power comes in at 415VAC and is distributed to the outlet at 240VAC.
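The 415VAC-in, 240VAC-out relationship is simply the line-to-line versus line-to-neutral voltage of the same three-phase feed, and it bounds what each outlet can deliver. A rough sketch, assuming the standard IEC 60320 coupler current ratings of 10A for C13/C14 and 16A for C19/C20:

```python
import math

# Line-to-neutral voltage at each outlet when the PDU infeed is
# 415 VAC line-to-line: V_LN = V_LL / sqrt(3)
outlet_voltage = 415 / math.sqrt(3)

# IEC 60320 coupler current ratings (assumed here; check your PDU's spec):
C13_AMPS, C19_AMPS = 10, 16

print(round(outlet_voltage))             # 240  (VAC at the outlet)
print(round(outlet_voltage * C13_AMPS))  # 2396 (max W per C13 outlet)
print(round(outlet_voltage * C19_AMPS))  # 3834 (max W per C19 outlet)
```

This is why C19 outlets matter for HPC nodes: a single C13 tops out around 2.4kW, while multi-kilowatt GPU servers often need the higher-rated C19/C20 connection.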
For supercomputing rack power distribution, universities need to make sure their PDUs have:
- Ultra-high-density rack power distribution for powering high-performance computing (HPC) clusters.
- Reliable power distribution to bolster research across various disciplines.
- Physical security and a controlled access system to protect sensitive data at the rack level.
- Actionable insights that provide information to aid in eliminating wasteful power, enabling sustainable computing.
Supercomputers are a different breed, so they require more thought when it comes to powering all those cores and processors; not just any off-the-shelf PDU will do. A lot rides on keeping the operating system, application programs, and data flowing to those massive banks of RAM. If the compute power fails, it's not only the university's reputation that could be tarnished; grant money and the thousands of individuals who have paid to use all that processing are also at stake. Today's rack-mounted PDUs pack in more density, flexibility, and intelligence to effectively deliver the power required by high-performance computing, yet can also scale down to the more mundane 6kW rack needs. So do your homework and research the industry's most intelligent, flexible PDUs: ones that can handle 6kW to 30kW-plus loads while also giving the university the ability to monitor environmental conditions at the rack level. This will ensure continuity and sustainability, and, just as important, preserve the university's reputation as a reliable supercomputing service.
Calvin Nicholson is Senior Director of Product Management at Legrand DPC (Data, Power and Control Division), which includes the Server Technology and Raritan brands. He is responsible for overseeing product strategy and product management for PDUs and other related products. He holds a number of patents in both the power/data center and gaming industries and has held various positions within Server Technology Inc., including Director of Product Marketing and Director of FW Engineering.