Opinions
I co-chaired a workshop on Complex Systems for the National Science
Foundation in fall 2008. Here is a link to the
Complex Systems Report.
During my term as President of SIAM, I had more occasion than usual
to express my opinions on a range of scientific topics. Here
are links to three documents:
- An article about simulation published in the Notices of the American
Mathematical Society
- My past president's address to SIAM, published in SIAM News
- Remarks on the role of numerical computation in computer
science, published in the CRA Newsletter and in SIAM News
As Director of Research Programs for the Cornell Theory Center from 1991 to 1997,
I interacted with federal agencies that coordinated high performance
computing activities. Here are written remarks prepared around 1995
that still ring true to me.
Response to Questions Posed by the HPCCIT Subcommittee
John Guckenheimer, Director of Research Programs
Cornell Theory Center
1. What are the most important technical trends in high
performance computing and communications and how will they affect your
center?
There are two trends that dominate high performance computing today:
ubiquitous network access and the emergence of scalable computing
systems. The impact of both is profound, and together they have the
potential to transform large parts of the scientific enterprise. Within
the scientific enterprise, data is fundamental. Yet, the
primary
data upon which scientific theory rests has been largely inaccessible,
sequestered in lab notebooks in individual laboratories or stored in
tape archives that were written with obsolete equipment. Universal
connections to the internet and increased bandwidth create the
technical capability to make all the primary data associated with
published experimental and computational science available to the
scientific public. Where research rests upon the analysis of large data
sets, such as the US census or medical images, these data can be made
much more readily available than has been the case heretofore. Already,
within the realm of algorithms and software, the proliferation of ftp,
gopher and www servers has fostered the sharing of scientific work in a
way that was simply impossible a few years ago. Results (i.e.,
programs) can even be distributed without the intervention of
publishers, mail, etc.
What does this have to do with HPCC? That depends upon the operational
definition of high performance computing. Traditional views of HPCC
have been limited to expensive, specialized hardware. This need no
longer be the case. Little more than a decade ago, a Cray 1
supercomputer did linear algebra calculations at a rate of
approximately 10 Mflops. These were the machines that created the
demand for academic supercomputer centers. Today, machines of this
speed are breaking through the $10,000-$20,000 workstation market to the
$1,000-$5,000 PC market. If one takes
the figure of 10 Mflops as a threshold for high performance computing,
then high performance computers can be placed on the desktops of every
working scientist in this country at reasonable cost. In such a world,
what is the role of centers going to be? Here are two
possibilities.
- There will always be problems that require far more
computing
resources than we have available. This is true of the grand challenge
problems selected in the competitions of the past few years.
National and state centers are a vehicle for providing a select
community of researchers access to the latest, fastest and largest
computing environments. They are also a vehicle for mediating between
the computer manufacturers and the grand challenge researchers when new
hardware and software are being tested and refined by their application to
challenging problems. HPCC centers can become a means for enabling much
broader communities to make use of the high performance
computers
on their desktops. This possibility will be addressed in the answer to
the next question.
- The centers can support multidisciplinary research. The
scientific community is being asked to work on strategic problems that
have social and economic impact. Many of these problems do not have as
clear a disciplinary focus as traditional research. This is
particularly true of computational science, where diverse intellectual
skills are required and the underlying problems do not have sharp
formulations. The high performance computing centers could become a
milieu that develops a culture to support such cross-disciplinary
interchange. Solving the problems on the national agenda requires
extensive cross-disciplinary communication that deeply affects the
disciplines contributing to their solutions.
There is a final aspect of the trend towards scalable computing that
deserves mention. The cost advantages of large scale production are
compelling. The largest high performance computers are being
built from the same components as desktop systems. This fact has
implications for the performance ratio of supercomputers to
workstations. The emerging generation of high performance computers
consists of multiprocessor machines built from the
same RISC CPUs used in workstations. The largest of these machines that
are being assembled have approximately 1000 CPUs. Allowing for the fact
that these machines often use the fastest chips available and that
multiprocessor workstations are becoming commonplace, the performance
ratio is likely to remain at approximately 1000. In this arena, costs
scale linearly with size, and economic benchmarks are $20K for
a workstation and $20M for a large high performance computer. This
raises the question of what can be done with this factor of 1000.
What are the problems that can be solved on a 100 Gflop computer with
100 GB of RAM that cannot be solved on a 100 Mflop computer with 100 MB of
RAM? To project trends conservatively five years into the future, these
numbers might be multiplied by 10. The computational cost of most
scientific problems grows far more rapidly than linearly. The cost of
increasing the resolution of a three-dimensional fluid dynamics
simulation scales like the fourth power of the resolution, so a
performance ratio of 1000 buys an increase in resolution of less than a
factor of 6. For some problems, this may be critical, but
for many others sheer increases in computing speed will bring only
incremental improvements in our problem solving capability. In
computational science, the gradient from the easily solvable to the
absurdly complex is often very, very steep. The goals for HPCC need to
include a balanced set of priorities that reflect what can realistically
be accomplished with the machines we build.
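The arithmetic behind this claim can be made explicit. Here is a minimal
sketch in Python; the fourth-power cost model and the $20K/$20M benchmarks
come from the remarks above, and the variable names are mine:

    # Assumed cost model from the text: a three-dimensional fluid dynamics
    # simulation costs roughly resolution**4, so usable resolution grows as
    # the fourth root of the available computing power.
    performance_ratio = 20e6 / 20e3            # $20M machine vs. $20K workstation -> 1000
    resolution_gain = performance_ratio ** 0.25
    print(performance_ratio)                   # 1000.0
    print(resolution_gain)                     # about 5.6

Multiplying the raw figures by 10 to project five years forward changes the
constants but not the fourth-root bottleneck.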
This statement about relative performance of large and small computers
is counterbalanced by the fact that large multiprocessor machines are
genuinely different from the current generation of vector
supercomputers. The architecture of vector supercomputers has been
highly optimized for linear algebra computations with large vectors
and matrices, and their performance on problems that do not readily
vectorize is not much better than that of the fastest superscalar
workstations. Rather than concentrating upon the problem of trying to
adapt these MPP machines to do what vector machines were designed
specifically to do, it seems more productive to find new classes of
problems that can be solved well on MPP machines and continue to build
machines with diverse architectures that are adapted to different
problem domains.
2. What are the obstacles to the most effective use of
high
performance computing and communications resources such as those
at your center, and what needs to be done to overcome those
obstacles?
The obstacles to the effective use of HPCC resources are technical,
organizational and cultural. The technical obstacles are not absolute,
but rather reflect the imbalance between different parts of a computing
environment. At the present time, the critical bottleneck is data
communications as a component of computation. Moving bits is as
expensive as computation with those bits. Optimizing computational
performance usually requires careful staging of data movement as part
of algorithmic design. On RISC CPUs, the performance penalty for a
cache miss is large, and poorly designed codes sometimes spend far more
time paging data among hierarchical levels of memory than in
computations. For
remote operations, the actual performance of the ftp program in moving
data across typical nodes on the internet is perhaps 30KB/sec. This is
three to four orders of magnitude slower than the HPCC objective of
gigabit rates. If one thinks about the task of working with datasets of
1 GB (say, a Landsat Thematic Mapper image), then this is the difference
between a file transfer of eight hours and one of eight seconds. Thus network
speed is a real obstacle for researchers wanting to make remote use of
a center for data intensive applications. The ability to
compute
remotely, transfer data and study it locally is limited. The bandwidth
to support this in an easy, interactive fashion is not yet present. The
timing of its future availability is an important factor in planning
what types of facilities centers need during the next few years.
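As a rough check on these figures, here is a minimal sketch in Python; the
30 KB/sec and 1 GB numbers come from the text above, and treating a gigabit
as 10^9 bits per second is my assumption:

    # Time to move a 1 GB dataset at a typical ftp rate versus a gigabit rate.
    dataset_bytes = 1e9
    ftp_rate_bytes_per_sec = 30e3           # roughly 30 KB/sec over typical internet links
    gigabit_bytes_per_sec = 1e9 / 8         # one gigabit per second, in bytes per second
    print(dataset_bytes / ftp_rate_bytes_per_sec / 3600)   # about 9 hours
    print(dataset_bytes / gigabit_bytes_per_sec)           # 8.0 seconds

Either way, the gap spans three to four orders of magnitude: the difference
between an overnight transfer and an interactive one.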
The technical obstacles to the effective use of the centers include a
set of more amorphous issues that are usually lumped under the headings
of software, algorithms and problem solving environments. To understand
the varied aspects of this problem, consider an analogy between the
airline industry and high performance computing. The government has
stimulated the growth of air travel as the primary means of long
distance transport by subsidizing the construction of airports and
supporting research on aircraft design. These federal activities are
necessary for a healthy airline industry, but they are not sufficient.
To make a workable system, a whole set of peripheral infrastructure
needs to be put into place as well: local transportation systems for
airport access, reservation systems, an air traffic control system and
safety standards for aircraft maintenance. The role of the government
in these different peripheral systems varies, and the division between
public and commercial enterprises is a continuing political issue. Note
also that the creation of the airline industry has given rise to new
commercial opportunities like the ability to deliver fresh
fish from around the world to supermarkets in places like Ithaca.
The high performance computing community has not done much to stimulate
its own widespread diffusion into new domains. Every time an
individual steps onto a commercial airplane, he is a user of the air
transportation system. Easy use of the high performance computing
centers has not been part of their legacy, but the Branscomb panel has
called renewed attention to broadening the high performance computing
agenda. From the perspective of individuals in
areas that have not seen intensive high performance computing activity,
there is a dilemma. Since there are few tools, individuals are forced
to choose between making the development of those tools a
central aspect of their work, finding someone else to do the technical
work, or waiting for something better to come along. Today, use of high
performance computing in these fields requires a pilot's license or
direct access to a pilot.
3. How can the HPCC Program help your center accomplish its
goals?
The peripheral institutions required for widespread use of high
performance computing need to be nurtured. Among the centers
supported by NSF, NCAR is a role model for how high performance
computing can take place in a setting that makes sense for a particular
research community. The HPCC program can help other centers provide
similar support for diverse research communities. This is an issue that
is perhaps more relevant to state and NSF centers than to those of
mission-oriented agencies. It is also an issue that looms large in the
development of the national information infrastructure and the support
of computational tools for small businesses.
While computers scale, people do not. As the problems we
tackle
become larger and more complex, there is a tension between the
facilitation of individual creativity on the one hand and the
development of institutional structures directed at large-scale
problems on the other. There are limits to what both individuals and institutions
can do. An individual cannot write
millions of lines of code to create a large software system, but
organizations do not have the creative insight and intelligence of
individuals. Thus, we need organizational structures adequate to
support work in data intensive areas without fettering individual
insight. For example, social scientists need better access to
large databases and tools for extracting information from these
databases. Facilities that provide interactive access to such
databases over the internet are feasible, but they do not yet exist.
The question here is how this is to be organized.
There are at least four different kinds of groups that potentially have
a role to play in broadening the scope of high performance computing in
diverse fields: individuals, consortia of researchers (like the grand
challenge groups), institutions (the centers, government labs, etc.)
and commercial vendors. The collective tasks are large enough and the
time scale for significant developments is slow enough that it would be
helpful for the HPCC Program to formulate policies that will establish
a dependable context for stimulating the work that needs to be done.
Commercial vendors have been discouraged from entering the high
performance computing business by the small size of the market, but
perhaps the market has been defined too narrowly. If one relies upon
scalability and targets a market that starts with the desktop
supercomputer, then the commercial opportunities expand dramatically.
If the possibilities for remote HPC services are included, the
potential markets are even larger. It is important that planning take
into account the continued evolution from today's high performance
computer to tomorrow's desktop.
High performance computing centers differ significantly in their
constituencies and their missions. From its inception, the Cornell
Theory Center has been dedicated to the promulgation of parallel
computing and to providing the best support for scientific research
that it can. The HPCC Program can help the center accomplish these goals by
providing
support for partnerships that will establish the needed infrastructure
for high performance computing in selected but diverse areas.
Priorities need to be established and choices need to be made. We feel
that these choices need to reflect better coordination with scientific
disciplines that are eager to be large users of high performance
computing, recognizing that their modes of HPC use may be very
different from those of currently mature HPC areas. The
Theory
Center operates under a cooperative agreement with the National Science
Foundation, and we need to approach these tasks cooperatively - seeking
together the means for realizing the opportunities created by the
breathtaking advances in high performance computing hardware that
continue to happen as we speak. The creation
of the internet has had a dramatic impact upon scientific culture. The
revolution that will come from effective use of the high performance
computers on our desktops has only begun.