Where's Your Computer's Hotspot? U.Va. Engineering School Professors Create Thermal Modeling Tools to Better Inform Chip Design

July 6, 2007 -- As summer temperatures climb, people often shut down their computers for a week or so and head off on vacation. Powered off, the computers get a vacation of their own, and a cooler one at that.

Every time your computer’s Central Processing Unit (CPU) — which contains the integrated electronic circuitry that performs the instructions of a computer’s programs and coordinates its hardware components — performs a task, an area on the CPU chip heats up. Ideally, once the task is completed, that specific area can begin to cool, and another area of the chip (the area responsible for the next task) will begin to warm. Unfortunately, “hot” tasks can run for extended periods of time, or the next task might require the still-hot area on the chip, or a space very close by, so the “hotspot” never gets a chance to completely cool down.

It is at this point that users will hear the whirring of the machine’s internal fan as it tries to cool the CPU so that it can quickly and reliably perform the next task assigned. But the growing severity of cooling challenges for many categories of integrated circuits encompasses more complex problems than the annoyance of an internal fan’s whir. That’s why Engineering School colleagues Kevin Skadron, associate professor in the Department of Computer Science, and Mircea Stan, associate professor in the Charles L. Brown Department of Electrical and Computer Engineering, teamed up to create HotSpot, a software tool that provides an efficient way for chip architects and designers to account for thermal effects as they explore various design options.

“Until recently, most research on thermal design has been focused on better cooling mechanisms such as a bigger fan,” says Skadron. “But today’s chips generate so much power that they are difficult and expensive to cool. The smarter idea is to design a chip so that it doesn’t dissipate as much heat in the first place.”

With power density and, thus, cooling costs rising exponentially, temperature-aware design has become a necessity. There is an urgent need for design techniques that help control or reduce heat dissipation, especially autonomous hardware techniques that can regulate operating temperature on the fly when the package’s cooling capacity is exceeded. Such runtime responses include “throttling” (slowing down to avoid overheating), “dynamic voltage scaling” (a special case of throttling that involves simultaneously changing the processor supply voltage and clock frequency to reduce power consumption) and “migrating computation” (wherein tasks that are localized to on-chip hot spots move to another part of the chip). These techniques provide safe cooling and prevent thermal emergencies by changing the processor’s behavior rather than relying on costly thermal packaging or software solutions (which might be vulnerable to hacking). Evaluating such techniques, however, requires a thermal model that is practical for early design stages, such as architectural studies in which the number and layout of computational resources are determined or entirely new chip organizations may be considered.
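
To make the idea of dynamic voltage scaling concrete, here is a minimal sketch in Python. It is an illustration only, not code from HotSpot or from any real processor: the operating points, the switched capacitance, the 85-degree trigger and the function names are all assumed values chosen for readability. The one piece of standard engineering in it is the textbook estimate that the switching power of CMOS logic scales as capacitance times voltage squared times frequency, which is why lowering voltage and clock speed together cuts power so sharply.

```python
# Hypothetical (volts, hertz) operating points, fastest first.
# These are illustrative values, not figures from any real chip.
OPERATING_POINTS = [(1.2, 3.0e9), (1.0, 2.0e9), (0.8, 1.0e9)]

# Assumed effective switched capacitance, in farads.
SWITCHED_CAPACITANCE = 1.0e-8


def dynamic_power(voltage, frequency, capacitance=SWITCHED_CAPACITANCE):
    """Textbook CMOS switching-power estimate: P = C * V^2 * f (watts)."""
    return capacitance * voltage ** 2 * frequency


def pick_operating_point(temperature_c, trigger_c=85.0, step_c=5.0):
    """Choose a (voltage, frequency) pair from the measured temperature.

    Below the trigger the chip runs at full speed. At or above it, the
    controller steps down one operating point, plus one more for every
    additional `step_c` degrees, stopping at the slowest point.
    """
    if temperature_c < trigger_c:
        return OPERATING_POINTS[0]
    steps = 1 + int((temperature_c - trigger_c) // step_c)
    index = min(steps, len(OPERATING_POINTS) - 1)
    return OPERATING_POINTS[index]


if __name__ == "__main__":
    for temp in (70.0, 86.0, 95.0):
        volts, hertz = pick_operating_point(temp)
        watts = dynamic_power(volts, hertz)
        print(f"{temp:.0f} C: {volts:.1f} V at {hertz / 1e9:.1f} GHz, {watts:.1f} W")
```

In this toy model, dropping from the fastest to the slowest operating point cuts the clock speed to a third but the estimated power to less than a sixth, because the voltage term is squared. Deciding whether such a policy actually keeps a chip cool is the kind of question that calls for a thermal model.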

Enter HotSpot. Designers can download HotSpot and connect it to a chip simulator (any power-performance model used in the computer-architecture community), and it will simulate the temperature at every location on the chip while particular tasks are being performed. Designers can, in effect, see a map of how temperatures across the chip respond to different functions. For the first time, a tool gives the computer design community detailed on-chip temperature information in a fast yet accurate way during early design stages, when decisions have the biggest impact on a chip design. Chip architects can now rapidly explore large design spaces to find configurations that balance thermal performance with execution speed.
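
The general idea of turning a power trace into a temperature map can be sketched in a few lines of Python. The example below is a deliberately tiny stand-in, not HotSpot's code or its interface: it assumes a chip made of just two neighboring blocks, treats heat flow like current in an electrical circuit of thermal resistances and capacitances, and steps forward in time with simple forward-Euler integration. Every number and name in it (the ambient temperature, the resistances, the `simulate` function) is an assumption made for illustration.

```python
AMBIENT_C = 45.0   # assumed air temperature inside the case, deg C
R_VERTICAL = 2.0   # assumed block-to-ambient thermal resistance, K/W
R_LATERAL = 4.0    # assumed block-to-block thermal resistance, K/W
CAPACITANCE = 0.05 # assumed thermal capacitance per block, J/K
TIME_STEP = 0.001  # seconds per simulation step


def simulate(power_trace, start=(AMBIENT_C, AMBIENT_C)):
    """Integrate a two-block thermal network with forward Euler.

    power_trace: list of (watts_block0, watts_block1), one entry per step,
                 as a power-performance simulator might supply.
    Returns a list of (temp_block0, temp_block1) after each step.
    """
    t0, t1 = start
    history = []
    for p0, p1 in power_trace:
        # Heat flowing sideways from block 0 into block 1 (negative if reversed).
        lateral = (t0 - t1) / R_LATERAL
        # Net heat into each block: power in, minus heat lost to ambient,
        # minus (or plus) the heat exchanged with the neighbor.
        d0 = (p0 - (t0 - AMBIENT_C) / R_VERTICAL - lateral) / CAPACITANCE
        d1 = (p1 - (t1 - AMBIENT_C) / R_VERTICAL + lateral) / CAPACITANCE
        t0 += d0 * TIME_STEP
        t1 += d1 * TIME_STEP
        history.append((t0, t1))
    return history


if __name__ == "__main__":
    # Block 0 runs a "hot" 10 W task for two seconds while block 1 idles.
    heating = simulate([(10.0, 0.0)] * 2000)
    print("after the hot task:", heating[-1])
    # Both blocks then idle for two seconds and drift back toward ambient.
    cooling = simulate([(0.0, 0.0)] * 2000, start=heating[-1])
    print("after cooling down:", cooling[-1])
```

Even this two-block version shows the behavior described earlier: the busy block becomes the hotspot, its idle neighbor warms up as well because heat spreads sideways, and both need time to cool once the task ends. A practical tool has to model many more blocks along with the chip's package and cooling hardware.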

“Lots of thermal modeling simulators are available, but most assume that you are modeling with a detailed design already in hand,” Stan says. “HotSpot gives designers more flexibility by requiring less detailed information so the feedback can inform the design process. After all, it is too late to look at these issues once the chip is designed, because then you have to go back to the drawing board, which takes time and resources.”

James H. Aylor, dean of U.Va.’s Engineering School, adds, “HotSpot is revolutionary because the software makes it possible to study thermal evolution during the design process. This is a prime example of interdisciplinary research collaborations that will have far-reaching impacts on academia and industry.”

For everyday computer consumers, this translates to lower purchase prices and operating costs, better performance, smaller devices, longer battery life and a longer computer lifespan.

HotSpot has been downloaded 1,280 times since its release in June of 2003, and the paper that introduced HotSpot has been cited 196 times, according to Google Scholar. Skadron and Stan, together with associate professor Robert Ribando in the Department of Mechanical and Aerospace Engineering and assistant professor Sudhanva Gurumurthi in the Department of Computer Science, were recently awarded a $485,000 National Science Foundation (NSF) grant to expand and improve HotSpot. The project has also been supported by Intel, IBM, the Semiconductor Research Corporation, the Army Research Office and several other NSF grants. In June of 2007, the team will release HotSpot version 4.0.

“This team lets us achieve our ultimate goal,” says Skadron, “which is to create a thermal simulator for an entire computer system — to go beyond the CPU and be able to thermally model the other parts of computers that are temperature sensitive like the graphics card, memory and disc drive. It’s comprehensive, optimized temperature-aware design.”

For further information on HotSpot, visit http://lava.cs.virginia.edu/HotSpot/index.htm.