• A Sterling Future For HPC

    For the past decade, keynote speakers at the International Supercomputing Conference (ISC) have examined the major accomplishments in HPC during the preceding year. This time the talk is more ambitious. At ISC '13 in Leipzig, Germany in June, Thomas Sterling will deliver a keynote that examines the HPC accomplishments over the last decade. He plans to reveal "the true achievement of our field."

    You already know Sterling, of course. He's famous as the "father of Beowulf," the commodity computing cluster he and NASA Goddard colleague Donald Becker pioneered in 1994, for which they won a Gordon Bell Prize.

    He's now Professor of Informatics and Computing at the Indiana University School of Informatics and Computing, leading a team conducting research associated with the ParalleX advanced execution model for extreme scale computing. The goal: to develop a new model of computation that will enable a new generation of extreme scale computing systems and applications.

    He's also Chief Scientist and Associate Director of the PTI Center for Research in Extreme Scale Technologies (CREST), Adjunct Professor at Louisiana State University, and CRSI Fellow at Sandia National Laboratories. He has co-authored six books and holds six patents. To top it off, he's one of HPCwire's People to Watch for 2013!

    His speech will examine the innovations in technology and architectures in HPC, as well as their contributions to science and other fields. He'll also offer a collection of predictions for the next decade from key HPC leaders.

    In anticipation of that talk, HPCwire asked Dr. Sterling to make a few predictions of his own.

    HPCwire: It seems like the push toward exascale has lost some momentum over the last year. Do you think exascale will slip into the next decade?

    Sterling: This is a complicated issue, but my view is that, if anything, momentum towards exascale in the US is building, not waning. There are two tracks to exascale, both being led by DOE in the US.

    NNSA [National Nuclear Security Administration] is driving the incremental track. That is an attempt to extend conventional practices, both in architecture and programming, to deploy an exascale version of what we have today. This is prudent, responsible, and low-risk. It will support important mission-critical workloads, and will present a ready, if not seamless, migration path for legacy codes. However, it's likely to be limited in applicability, scalability, and efficiency for many problems.

    OS/ASCR is guiding the advanced track. This approach is to create innovations in architecture, system software, and programming models and methods. It could achieve exascale-era computing systems that are truly general-purpose, usable, reliable, and cost-effective (in terms of both operations and power.) It's possible that we'll even shift paradigms to a new execution model.

    NNSA is likely to deliver its incremental platform to the national labs sometime between 2018 and 2020. R&D timeline projections suggest an advanced-class system is likely by 2022 or shortly after.

    Still, the process of a congressionally-validated plan is complex. Its formulation is well along and is being refined, but there are other issues related to how it moves through the obscure (at least to mere mortals such as myself) layers of authorization.

    The apparent path for supercomputing is now entering a multifaceted period. We have matured, I think, beyond the adolescent obsession of the next Linpack number. The trends leading to exascale should be measured in terms of progress toward unprecedented accomplishments in science, engineering, societal, commercial, and defense-related goals. I think we are sustaining a mid-course correction that is placing us on the new trend lines: the ones that actually matter.

    HPCwire: Will another nation beat the US to the exascale milestone? Which one has the best shot?

    Sterling: It is possible of course that another nation will beat the US to the exascale milestone.

    However, there is an unstated assumption that "the exascale milestone" is 1 exaflops Rmax [maximal LINPACK performance]. Such systems don't have to emphasize networking capability or even memory capacity (which, in combination, are the most expensive part of the system when balanced) to gain high marks. Any nation that wants the stature of being the first exascale system by this definition can probably do so in five years or slightly more, if they are willing to pay for it, by deploying a stunt machine.

    Who may get to 1 exaflops Rmax first? History shows that, if not the US, it is likely to be Japan or China, but otherwise I have no deep insight. The EU is taking on new leadership in hardware and is expanding its energies in software infrastructure. Japan continues to extend its own advances with, for example, Kei and Tsubame-2. The Chinese have announced Tianhe-2, to exceed 100 petaflops by 2015.

    But the US, guided by DOE programs, is pursuing opportunities with radically different approaches for true general-purpose exascale computation. The X-stack program begun in September 2012, for example, is getting dramatic improvements to efficiency, scalability, generality, and programmability, and is aggressively pursuing innovations to improve power consumption and reliability. If the milestone is general purpose exascale computing, then I think the US is in a compelling leadership position through the DOE partnership of Thuc Huang and Bill Harrod.

    Still, I wish we had a science accomplishments benchmark – something like the X-prize. Perhaps some end-game computational achievement, like proving the process producing gamma ray bursts (including neutrinos); or some microbiology challenge involving viruses; or perhaps demonstrating climate change at a level that is provably predictive (and yes, I know it's inherently chaotic.) We need something that matters. We need to stop playing the horses and ensure that we can pull the plow.

    Next >>