High-Performance Computing

How the investment in heavy-duty computer processing power is paying off in research output

Imagine thinking thousands of thoughts at the same time. What if each thought were one piece of a really big problem—a problem now solvable in hours or days rather than years because of this simultaneous thought process? That’s what high-performance computing (HPC) does.

HPC is a lot of computer processing power linked together, thinking through and solving different pieces of big problems in parallel. HPC makes it possible to answer big questions in a fraction of the time it would take using individual desktop PCs. With HPC resources, some problems can be solved that could never be solved before.
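
The idea is easiest to see in miniature. The short Python sketch below splits one large job into pieces that are worked on at the same time; the workload, the chunking, and the eight-worker count are purely illustrative, and a real HPC system spreads such pieces across thousands of processors.

```python
# Minimal illustration of the parallel idea behind HPC: split one big job
# into independent pieces and work on them at the same time. The "analysis"
# here is a toy stand-in, not a real scientific workload.
from multiprocessing import Pool

def analyze_chunk(chunk):
    """Pretend analysis of one piece of a much larger problem."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = range(10_000_000)
    n_workers = 8  # a real HPC cluster spreads work over thousands of cores

    # Carve the big problem into independent chunks, one per worker.
    step = len(data) // n_workers
    chunks = [data[i * step:(i + 1) * step] for i in range(n_workers)]

    # Each worker processes its chunk simultaneously; results are combined at the end.
    with Pool(n_workers) as pool:
        partial_results = pool.map(analyze_chunk, chunks)

    print("combined result:", sum(partial_results))
```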

With this problem-solving capability, scientists start to ask questions they would never have thought to ask. And universities are increasingly the hubs of the HPC systems that ask these questions, solve these problems, and change the world.

High-performance computing does, however, come at a high cost. “The typical investment varies from under $100,000 per year (within an individual research group) to tens of millions per year,” says Henry J. Neeman, director of the OU Supercomputer Center for Education & Research (OSCER) at the University of Oklahoma.

While the majority of colleges and universities don’t make this investment, nearly one-quarter do. When colleges and universities with high, medium, and low research intensity were surveyed by the Educause Center for Applied Research, 23 percent of respondents reported that central IT provides centralized HPC services for research computing. That’s according to the November 2012 report “Research Computing: The Enabling Role of Information Technology.”
Here’s what researchers at three institutions are doing to make use of their investments in HPC.

HPC and Cancer Research

In the University of Pittsburgh Department of Biomedical Informatics, Xinghua Lu and his colleague Songjian Lu study signaling pathways in tumors, research that is leading to improved cancer treatments. Their research relies on use of the Pittsburgh Supercomputing Center’s Sherlock, which is a modified YarcData uRiKa graph analytics appliance from Cray. A National Science Foundation Strategic Technologies for Cyberinfrastructure award funded the Sherlock project at a cost of approximately $1 million, shares Nicholas A. Nystrom, director of strategic applications for the Pittsburgh Supercomputing Center.

Signaling pathways, according to the NCI Dictionary of Cancer Terms, are groups of molecules in a cell that work together to control one or more cell functions. As Lu and Lu note, these pathways consist primarily of signaling proteins that transmit cellular signals, which regulate cell activity such as cell death and cell proliferation.

“Cancers are mainly caused by the mutations in the proteins affecting certain signaling pathways,” says Songjian Lu, a post-doctoral associate who works in Xinghua Lu’s lab.

They identify chains of signals that cause cancer, using existing technologies to compare tumor cells with normal cells and detect the mutated genes in tumors. Then, using knowledge mining, data mining, and graph models, they ultimately connect the genetic mutations in the tumors to the mutated proteins that cause cancer.
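
To give a rough sense of what a graph model looks like, here is a toy sketch in Python. The gene and protein names and the interactions are invented for illustration, and a simple breadth-first search stands in for the far more sophisticated graph analytics the researchers run on Sherlock.

```python
# Toy sketch of the graph-model idea: represent genes, proteins, and cell
# behaviors as nodes, and known interactions as edges, then search for a
# chain that links an observed mutation to a downstream signaling effect.
# The graph below is invented for illustration; it is not the researchers' data.
from collections import deque

interactions = {
    "mutated_gene_X": ["protein_A"],
    "protein_A": ["protein_B", "protein_C"],
    "protein_B": ["cell_proliferation_signal"],
    "protein_C": [],
}

def find_chain(graph, start, target):
    """Breadth-first search for a signaling chain from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(find_chain(interactions, "mutated_gene_X", "cell_proliferation_signal"))
# -> ['mutated_gene_X', 'protein_A', 'protein_B', 'cell_proliferation_signal']
```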

The calculations these researchers run to discover the mutated proteins are huge. Because biological systems are complex, they explain, many of the computational problems that arise from the biological ones are so difficult that even supercomputers cannot solve them exactly. This does not, however, prevent them from using HPC to pinpoint the mutated proteins through which genetic mutations drive cancer.

In the end, this research is very important because new and existing drugs that treat cancer often target cellular signaling pathways. “A better understanding of perturbed [changed] signaling pathways will help design new drugs to target these pathways,” says Xinghua Lu.

And for institutional administrators, highlighting research projects is an opportunity to share a good story with potential donors and students.

Big Data for Global Climate Change

Derek J. Posselt, an associate professor of atmospheric, oceanic, and space sciences at the University of Michigan, uses HPC to study the interactions between global climate change and local-scale clouds and rainfall. Posselt conducts his research on an InfiniBand-based cluster that the University of Michigan built out of products from Dell, Mellanox, and DataDirect Networks.

The hardware, software, staff, and facilities for the system cost roughly $12 million to purchase and run for five years, says Brock Palen, senior HPC system administrator for the university’s CAEN Advanced Computing office.

The resulting research is incredibly practical.

“We have discovered that, while rainfall and snowfall generally increase in a warming atmosphere, the patterns of precipitation become more local and intense,” says Posselt. The amount, frequency, and intensity of rain and snow have far-reaching effects on human population, ranging from agriculture to city planning to pollution management to ecosystem management and protection, he adds.

HPC is critical to Posselt’s research in two key areas:

  • First, climate change happens on large, continental scales. Studying those scales requires simulations of large horizontal areas. At the same time, circulations of clouds and rainfall happen on scales of only a few kilometers. Studying both the larger and smaller processes simultaneously requires extremely large computational resources—that is, high-performance computing systems.
  • Second, it is not enough to predict a single possible outcome of climate change. “It is important to understand the uncertainties inherent in the [weather] system and in our models of the system,” says Posselt. HPC makes it possible to predict a host of different outcomes by running hundreds to millions of computer simulations in a short time, as the sketch after this list illustrates.
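
As a rough illustration of that second point, the sketch below runs many copies of a toy “model” with perturbed inputs and summarizes the spread of outcomes. The model, the perturbation range, and the run count are placeholders; Posselt’s actual ensembles involve real atmospheric models running on thousands of processors.

```python
# Sketch of the ensemble idea: run many copies of a model with slightly
# perturbed inputs and look at the spread of outcomes, not just one answer.
# The "climate model" here is a toy placeholder, not Posselt's actual code.
import random
import statistics
from concurrent.futures import ProcessPoolExecutor

def toy_model(perturbation):
    """Stand-in for one simulation run; returns a made-up 'precipitation change'."""
    return 5.0 + perturbation + random.gauss(0, 0.5)

if __name__ == "__main__":
    perturbations = [random.uniform(-1.0, 1.0) for _ in range(1000)]

    # On a cluster, each run would go to its own nodes; here, local processes.
    with ProcessPoolExecutor() as pool:
        outcomes = list(pool.map(toy_model, perturbations))

    print(f"mean change: {statistics.mean(outcomes):.2f}")
    print(f"spread (std dev): {statistics.stdev(outcomes):.2f}")
```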

Climate change and its effect on local weather are among the problems that researchers could not solve if their institutions had not made an investment in HPC. “Most of the results we have obtained would have been impossible without the use of HPC resources. For those results that we could have obtained with ordinary desktop computing, HPC has increased our efficiency by at least an order of magnitude. We obtain results in days with HPC that would have taken months on an ordinary desktop computer,” explains Posselt. 

Simulation Technology Development

Mark S. Shephard, the Johnson Professor of Engineering and director of the Scientific Computation Research Center at Rensselaer Polytechnic Institute (N.Y.), uses HPC to create simulation tools that help model and improve products and technologies. The HPC investment went into an IBM Blue Gene/Q system, which cost $6.6 million, plus nearly $1 million more in related facilities costs, he explains.

Simulation tools help scientists use mathematical models to represent many kinds of physical phenomena. “The emphasis of the work is to take the mathematical model and make sure we solve it very well, on as big a problem as we would like, even on problems with billions and billions of unknowns,” explains Shephard.
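
To make the notion of “unknowns” concrete, here is a small, self-contained Python sketch. Discretizing a physical model typically yields a system of equations A x = b, where x holds the unknown values; the toy system below has only a few hundred unknowns and is solved with NumPy, whereas the large-scale simulations Shephard describes distribute billions of unknowns across an entire supercomputer.

```python
# Tiny illustration of what "unknowns" means in a simulation: discretizing a
# physical model produces a system of equations A x = b, where x holds the
# unknown values. Real HPC simulations solve billions of unknowns in parallel;
# this toy version solves a few hundred on one machine with NumPy.
import numpy as np

n = 200  # number of unknowns (billions, in the large-scale runs described above)

# A simple 1D diffusion-style operator: tridiagonal matrix with 2 on the
# diagonal and -1 on the off-diagonals.
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)  # stand-in for source terms / boundary data

x = np.linalg.solve(A, b)  # the solver step that HPC systems scale up massively
print("first few unknowns:", x[:5])
```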

In a practical application of simulation, the company HeartFlow collaborated with Shephard to apply the technology to heart image data collected from individual patients. “[Physicians] image the patient’s arterial system and simulate the blood flow in that system to decide whether the patient needs to be stented,” he explains.

Without these simulations, a physician would decide whether to stent the patient based on an observed reduction in blood flow alone. The accuracy of these simulations is saving a significant number of people from having to be stented, Shephard points out.

Institutional Benefits and Beyond

The Rensselaer Polytechnic Institute’s IBM Blue Gene/Q system has produced numerous benefits for the university, its faculty, and students, says Christopher D. Carothers, a professor in the Department of Computer Science. The school has received $50 million in external research funding that it probably would not have received without the HPC platform. Also, since 2007, the university has not had to buy any new faculty members their own research computer cluster; they all use the HPC resource instead.

In addition, the HPC system is substantial, delivering almost half a petaflop of processing power (a petaflop is a thousand trillion floating-point operations per second). This allows marketers in human resources and in admissions to tell potential faculty and students that if they come to RPI, they will get time on the HPC platform.

Another benefit of the system to faculty and students is that faculty members do not have to waste research students’ time by having them maintain a departmental computer cluster. And finally, faculty and students can scale their research as large as they need because of the capacity of the HPC platform.

An alternate way to measure the value of a high-performance computing investment is by usage.

At Clemson University (S.C.), 36 of 54 academic departments use the Clemson Palmetto HPC Cluster, according to Jim Bottum, a research professor of electrical and computer engineering. The Clemson Palmetto HPC Cluster consists of computer processors from vendors such as Dell, IBM, and HP.

Approximately 80 faculty user groups from colleges and universities across South Carolina use the Clemson Palmetto HPC Cluster. It has drawn more than $47 million in research funding to its users since fiscal year 2010. According to Bottum, officials at his institution measure ROI from the Clemson Palmetto HPC cluster by the number of research publications that HPC makes possible, the number of grants received for HPC-based research, the number of Ph.D.s awarded where the candidate’s research is dependent on HPC, and the increasing use of HPC resources by nontraditional users.

In other words, with high-performance computing, it’s not a simple ROI equation. Yet the results and benefits of HPC-enabled research are simply limitless.

