Solving for the future: Investment, new coalition levels up research computing infrastructure at UW–Madison

Bolstered by a $4.3 million investment from the Wisconsin Alumni Research Foundation (WARF), UW–Madison’s research computing hardware is getting a significant upgrade—giving UW researchers the sustained shared infrastructure they need to help them push the bounds of science.

“Our researchers on campus can breathe easier now,” explains Miron Livny, John P. Morgridge Professor of Computer Sciences and director of the Center for High Throughput Computing (CHTC) in UW–Madison’s School of Computer, Data & Information Sciences, and investigator and chief technology officer at the Morgridge Institute for Research.

Ranked 8th in the nation for volume of research, with $1.3 billion in research expenditures in 2020, UW–Madison has long been positioned as a leading “R1” research institution. But UW–Madison is the only school in the Big Ten that doesn’t have dedicated research computing data center space. At the same time, UW lags far behind its Big Ten peers in the number of centrally administered computing cores per research principal investigator (PI).

And in UW’s CHTC, aging servers couldn’t be replaced because the one-time investment to install them—nearly a decade ago—didn’t include ongoing funding to keep them up to date.

“It’s very simple. The thing was falling apart,” Livny explains. “We needed to do a refresh—and then, we needed to make an annual commitment at the campus level in order to sustain it.”

That commitment began to take shape this year, seeded by the WARF investment and a new coalition of campus partners that formed the Advisory Committee on Research Computing. The committee is co-led by Amy Wendt, professor of electrical and computer engineering and divisional associate vice chancellor for research in the Office of the Vice Chancellor for Research and Graduate Education (OVCRGE), and UW–Madison Chief Technology Officer Todd Shechter of the Division of Information Technology (DoIT).

The coalition was a natural and necessary evolution after the launch of ResearchDrive in 2019, explains Shechter. With ResearchDrive, PIs across campus can receive 5 terabytes of storage for their research data at no cost, with additional storage available for an annual fee.

“ResearchDrive was our first attempt at supporting researchers across the campus with an allocation of storage that was at no cost to the researchers,” Shechter explained. “And it was a good start.”

“But as we listen to the needs on campus and think about what comes beyond storage, that’s when we think about computing—and providing what’s necessary for researchers to conduct world-class research at an R1 university.”

Boosting computational power

Modern academic research runs on computational power—and in today’s research labs, central processing units (CPUs) and graphics processing units (GPUs) are critical tools on a scientist’s workbench.

So with funding secured through the WARF investment, CHTC began executing a major hardware refresh this summer, adding 207 new servers representing over 40,000 “batch slots” of computing capacity. Brian Bockelman, a Morgridge Institute for Research investigator, and his team took on the task of translating this investment into operational computing capacity, and under his leadership most of these servers are already available to CHTC users.

That means more CPUs, more GPUs, more memory—and, in essence, more computing power.

And perhaps just as powerful as the computational resources themselves? The fact that they’re centrally supported and operated, notes Steve Ackerman, vice chancellor for research and graduate education.

When Ackerman joined the Office of the Vice Chancellor for Research and Graduate Education in 2012, he says, faculty members immediately began knocking on his door. They all shared the same concern: “We need some kind of computational infrastructure that’s sustainable, so we’re not having to stand up our own facilities,” Ackerman recalls.

“This investment—and the plan to sustain it—takes away some of that angst for individual PIs,” says Ackerman.

And by leveraging the expertise of partners across campus—DoIT’s experience running data centers and networks coupled with CHTC’s mastery of computing throughput, for example—we can build a centrally coordinated, well-run shared research computing infrastructure to keep pace with our peer institutions, Ackerman explains.

“This is something the faculty have been asking for, for a decade. The need has been there,” said Ackerman, whose OVCRGE is also contributing $3 million to the initial investments. “And now, everything’s coming together at the right point.”

“What it all really comes down to is, we’re helping to provide a sustainable path for researchers to harness computing capacity to do their best,” Shechter added.

And increasingly—across almost all disciplines—that path involves high throughput computing, pioneered on our campus.

What is high throughput computing?

Think about what it takes to accomplish a huge task, like building a house. It doesn’t all go up at once, right? Construction involves breaking up the bigger build into many smaller tasks.

And you wouldn’t try to build a house with just a hammer and a nail. Similarly, a researcher trying to execute a massive number of computational tasks—which may touch thousands of files and consume millions of processing core hours—can’t simply use an average laptop.

In both of these scenarios, you need to break it all down. And you need the right tools to handle the files and tasks.

“High throughput computing”—a concept pioneered by Livny more than 30 years ago—relies on a similar strategy. When a researcher breaks up a single computational task into many smaller tasks, they can scale out their computation—and, as a result, dramatically improve their productivity.

Software tools developed by the CHTC enable researchers to leverage the computing power of thousands of machines assembled in a network—forming a capacity “pool.” With the HTCondor Software Suite managing that pool, data-intensive research that might otherwise require years to complete on conventional computing can be accomplished in days, or even hours. With greater computational capacity and the power of automation, a scientist can analyze massive amounts of data—or explore very large parameter spaces with little effort—while running many tasks concurrently, Livny explains.
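
To make the idea concrete, here is a minimal Python sketch of how one independent task in a large parameter sweep might be written. The file name, function and parameter range are hypothetical illustrations, not part of CHTC’s actual tooling; a workload manager such as HTCondor would simply launch many copies of a script like this, handing each copy a different task index.

    # sweep_task.py -- illustrative sketch only: one independent slice of a parameter sweep.
    # A workload manager (for example, HTCondor) would start many copies of this script,
    # passing each copy a different task index.
    import sys
    import json

    def run_simulation(parameter: float) -> float:
        # Stand-in for a real computation; here, a trivial function of the parameter.
        return parameter ** 2

    def main() -> None:
        task_index = int(sys.argv[1])    # which slice of the sweep this copy handles
        total_tasks = int(sys.argv[2])   # how many tasks the sweep was split into

        # Carve the full parameter range [0, 1000) into equal, independent slices.
        parameters = range(task_index, 1000, total_tasks)
        results = {p: run_simulation(p) for p in parameters}

        # Each task writes its own output file; results are combined after all tasks finish.
        with open(f"results_{task_index}.json", "w") as fh:
            json.dump(results, fh)

    if __name__ == "__main__":
        main()

Because each copy reads nothing from the others, a pool of machines can run thousands of them at once on whatever capacity happens to be free, and the outputs can be gathered afterward; that independence is what lets high throughput computing scale a workload out so dramatically.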

The end result? Researchers can harness more computing capacity and produce computationally heavy research results more quickly and with less effort, driving science forward, faster and at lower cost.

Wisconsin Alumni Research Foundation Chief Executive Officer Erik Iverson says that’s exactly what drove WARF to invest in strengthening the university’s research computing infrastructure.

“WARF’s mission is to support research at the University of Wisconsin and we have been privileged to do that for nearly a century,” Iverson explained. “This investment helps enable and advance the exceptional work of our UW colleagues, to continue their research and benefit the world—part of our commitment to supporting UW.”

“Research innovations are made more possible when we work together,” Iverson added.

Leveraging CHTC expertise

Every year, the CHTC team works with hundreds of UW–Madison researchers across nearly all academic disciplines, from the sciences to the arts: Capturing the first-ever images of the supermassive black hole at the center of our galaxy. Predicting fuel cell behavior at nuclear facilities. Improving the efficiency of plant breeding. Using AI technologies to help farmers manage their livestock.

The one thing these pursuits all have in common? They benefit from research computing facilitation, workload handling services and computing capacity offered by the CHTC.

And by offering centrally supported research computing infrastructure, coupled with high throughput computing technologies and expertise from CHTC, coalition members contend we’re leveling the playing field for researchers—who may not have access to big budgets that match their big ideas.

“This really opens the door for innovation. If you have an idea, you can try it out. It’s a ‘fair share’ philosophy for the shared capacity available to every campus researcher, for everyone,” Livny says. “And you don’t have to worry about, ‘OK, it’s here this year. But what about next year?’”

“This investment demonstrates the longstanding and long-term UW–Madison commitment to open access to research computing capacity on campus,” Livny added. “And we’re proud to provide it, and eager to move to the next phase.”

A growing coalition

The research computing investments, equipment upgrade and services to support researchers are the result of a growing collaboration among several campus entities, including WARF, the OVCRGE, DoIT, the CHTC and the Morgridge Institute for Research.

The coalition included support from former Chancellor Rebecca Blank, Provost John Karl Scholz, Vice Chancellor for Finance and Administration Rob Cramer, Associate Vice Chancellor for Finance David Murphy, College of Letters & Science Dean Eric Wilcots, Morgridge Institute Chief Executive Officer Brad Schwartz, and Chief Information Officer and Vice Provost for Information Technology Lois Brooks.

“Creating a coalition of investment and operational partners allows us to take a giant leap forward in the computational capacity and services available to our researchers,” says Brooks. “This will better support researchers’ ability to innovate and solve the most challenging problems.”

Stay tuned to TechNews

Watch for more stories as we share the impact this investment is having among researchers on our campus.

Are you a UW–Madison researcher looking for research computing resources and consultation help to accelerate your science? You can get started by contacting the Center for High Throughput Computing.