By Alan Beck, Editor-in-Chief, HPCwire
Timed for release during Supercomputing 2003 in Phoenix, AZ, the high-performance computing news service HPCwire ran an interview with Maxine Brown, Project Manager on the Calit²-led OptIPuter project (www.optiputer.net). Brown, who works for the Electronic Visualization Laboratory at the University of Illinois at Chicago, was interviewed by HPCwire Editor-in-Chief Alan Beck for a special LIVEwire edition from SC2003.
Brown is also the Guest Editor of a special issue of Communications of the ACM, about a "Blueprint for the Future of High-Performance Networking" (http://www.calit2.net/research/labs/features/11-19_cacm.html).
HPCwire: How would you define an "application-empowered" network, and how does such a network differ from a conventional one?
MAXINE BROWN: I define today's research and education networks as "best effort"; that is, scientists get the best bandwidth available at the time, and as the number of users sharing the links increases or decreases, the corresponding throughput adjusts accordingly. Today's Grid is built on these "best effort" shared TCP/IP networks. In other words, the network is simply the glue that holds the middleware-enabled computational resources together. In contrast, we are interested in developing application-empowered networks (or "experimental" networks, which was the term used several years ago), in which the networks themselves are schedulable Grid resources. These application-empowered deterministic networks are becoming a necessary component of cyberinfrastructure, complementing the conventional "best effort" networks that provide a general infrastructure with a common set of services to the broader research and education community.
The main application drivers for these new application-empowered networks are high-performance e-science projects, where e-science represents very large-scale applications -- such as high-energy physics, astronomy, earth science, bioinformatics and environmental science -- that study very complex micro- to macro-scale problems over time and space. In the future, these networks will conceivably migrate to other domains, including education, emergency services, health services and commerce.
E-science will require distributed petaops computing, exabyte storage and terabit networks in the coming decade. The National Science Foundation (NSF), Department of Energy, National Institutes of Health, NASA and other federal agencies are already investing in large-scale equipment and facilities, or "cyberinfrastructure," to address the specialized needs of these advanced scientific communities. Internationally, I am aware of cyberinfrastructure initiatives by Canada, the Asian-Pacific community, the European Union and the United Kingdom Research Councils.
Empowering these e-science applications are cross-cutting technologies dependent on networking, such as remote access to instruments, streaming high-definition video, specialized visualization displays, data mining, high-performance distributed supercomputing systems and real-time collaboration with distant colleagues. To optimally make use of these technologies, scientists want high-bandwidth connectivity "with known and knowable characteristics," which was the outcome of two NSF-funded workshops: the Workshop on Grand Challenges in e-Science, held December 2001, and the Workshop on Experimental Infostructure Networks, held May 2002.
Moreover, e-scientists expect deterministic and repeatable behavior from networks before they will rely on them for persistent e-science applications. A scientist wants to understand the operational characteristics of his/her networks: How much bandwidth can I be guaranteed? How much bandwidth can I schedule? What is the guaranteed latency that I can expect?
Application-empowered networks create a distributed platform that encourages experimentation with innovative concepts in middleware and software design while providing "known and knowable characteristics" with deterministic and repeatable behavior on a persistent basis.
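To make the contrast between "best effort" and schedulable networks concrete, here is a minimal Python sketch. It is an illustration added to this article, not software described in the interview or used by the OptIPuter project. The names `best_effort_share` and `LambdaScheduler`, the equal-share model of a shared link, and the hour-based reservation table are all assumptions chosen for simplicity.

```python
from dataclasses import dataclass, field


def best_effort_share(link_gbps: float, active_users: int) -> float:
    """Idealized equal share of a shared link (assumption: real TCP is messier)."""
    if active_users < 1:
        raise ValueError("need at least one active user")
    return link_gbps / active_users


@dataclass
class LambdaScheduler:
    """Toy reservation table for one fiber carrying several dedicated lambdas."""
    num_lambdas: int
    gbps_per_lambda: float
    # Each booking is (lambda_index, start_hour, end_hour); end_hour is exclusive.
    bookings: list = field(default_factory=list)

    def reserve(self, start_hour: int, end_hour: int):
        """Book a lambda that is free for the whole interval.

        Returns the lambda index, or None if every lambda is already booked
        for some part of the requested interval.
        """
        if start_hour >= end_hour:
            raise ValueError("start_hour must be before end_hour")
        for lam in range(self.num_lambdas):
            clash = any(
                b_lam == lam and start_hour < b_end and b_start < end_hour
                for b_lam, b_start, b_end in self.bookings
            )
            if not clash:
                self.bookings.append((lam, start_hour, end_hour))
                return lam
        return None


if __name__ == "__main__":
    # Best effort: the per-user share moves as users come and go.
    print(best_effort_share(10.0, 4))   # 2.5
    print(best_effort_share(10.0, 20))  # 0.5

    # Scheduled: a granted request holds a full lambda for its interval.
    fiber = LambdaScheduler(num_lambdas=2, gbps_per_lambda=10.0)
    print(fiber.reserve(9, 12))   # 0
    print(fiber.reserve(10, 12))  # 1
    print(fiber.reserve(11, 13))  # None
    print(fiber.reserve(12, 14))  # 0
```

In this toy model, a scientist asking "how much bandwidth can I schedule?" gets a definite answer: either a full lambda for the requested interval or an explicit refusal, rather than a share that varies with load.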
HPC: Does our need for new networking infrastructures require new technological breakthroughs? Why or why not?
MB: Yes -- by the end of this decade, large-scale e-science applications will require advanced networking capabilities, not general services -- to guarantee bandwidth, to guarantee latency and/or to guarantee resource scheduling (including the networks themselves). And, there is clearly more than one way to implement these advanced services, and more than one set of advanced services needed.
I am aware of major research activities in intelligent signaling and dynamic control of optical networks, new application toolkits, advanced Grid middleware and high-performance transport protocols. One major goal is to design and develop software that will allow the applications themselves to control advanced, all-optical, IP-over-wavelength metro, regional, national and international networks, based on Dense Wavelength Division Multiplexing (DWDM) and photonic switching. The development efforts underway support applications distributed over photonic networks that will conceivably consist of hundreds of independent wavelengths (or "lambdas") per fiber.
For example, University of Illinois at Chicago (UIC) is working with Northwestern University and University of Amsterdam on several research efforts, such as: developing application-signaling methods to dynamically control switched optical networks; developing signaling and control methods that work both within and across network domains; developing Authentication, Authorization and Accounting policy-driven implementations of dynamic vLANs; accelerating data distribution for applications that require access to massive amounts of data at extremely high speeds; and developing new streaming video and multicast techniques for optical networks.
These schools are also part of the OptIPuter project under the leadership of Larry Smarr, director of the California Institute for Telecommunications and Information Technology, which spans the University of California-San Diego (UCSD) and University of California-Irvine. The OptIPuter is not just looking at networking infrastructure, but is taking a systems approach. For the OptIPuter, all the computational resources, regardless of location, are tightly coupled over parallel optical networks. Essentially, the OptIPuter is a "virtual" parallel computer in which the individual "processors" are widely distributed clusters; the "memory" is in the form of large distributed data repositories; "peripherals" are very large scientific instruments, visualization displays and/or sensor arrays; and the "motherboard" uses standard IP delivered over multiple dedicated lambdas. This research requires a re-optimization of the entire Grid stack of software abstractions.
The Canadians believe that general-purpose IP Research & Education Networks will be replaced by many parallel high-performance ad-hoc "application-specific" IP networks; they are already developing optical networks with high-density DWDM and user-controlled optical cross-connects to enable this new architecture. Each network will have its own routing and discovery topology; yet, each network will peer with other networks at numerous optical Internet exchanges, such as StarLight (in Chicago) and NetherLight (in Amsterdam) so data can flow among research groups who are otherwise connected. Canada's CANARIE is developing tools to enable international teams of scientists to create these networks "on the fly" and optimize their topologies and peerings to meet specific research and/or application requirements for their "Communities of Interest." CANARIE has already deployed application-specific networks, such as the WestGrid High-Performance Computing distributed backplane, the NEPTUNE undersea collaboratory and the SHARCNET HPC distributed backplane.
HPC: What are the most critical challenges facing enablement of effective, efficient application-empowered networks? How should these challenges be approached for solutions?
MB: Application-empowered networks may become the basis for the wired Internet infrastructure underlying future e-science, education, emergency services, health services and commerce, but that remains to be proven. We're not so much developing a network as a laboratory where we can experiment with fine-tuning and more tightly integrating all the layers -- from the physical infrastructure to the protocols to the middleware to the toolkits to the application. This involves a diverse group of people, from network engineers, software developers, system administrators and computer scientists to application programmers and discipline scientists. Networks, by their very nature, cross geographical boundaries, so there also needs to be cooperation among various funding agencies and institutions about what they hold important. So, the biggest challenge has less to do with technology than with the sociology of having multi-institutional, multi-disciplinary, multinational teams working together to build these networks and distributed laboratories.
As co-principal investigator of the NSF-supported STAR TAP, StarLight and Euro-Link initiatives with my colleague Tom DeFanti, I have been involved with several groups from Canada, CERN, the Czech Republic, Japan, the Netherlands, Sweden, the United Kingdom and the United States, representing national research networks, consortia and institutions, who are making their lambda-based networks available for global, networked experiments. The result of this grassroots effort has been the establishment of the "TransLight" initiative, which encourages scientists to request and schedule lambdas for global experiments, some of which are being demonstrated here at SC2003. This group, along with a broader cross-section of the high-performance community, has met annually for the past three years to discuss the development of a global lambda laboratory. This August at our annual meeting, held in Reykjavik, Iceland, we formed the "Global Lambda Integrated Facility (GLIF)," a virtual, global facility, or environment, providing not only networking infrastructure, but committing network engineering, systems integration, middleware and applications support, to accomplish real work.
HPC: What is the mutual influence between application-empowered networks and traditional high performance computing?
MB: I remember when it was originally called High Performance Computing and Communications. The term has been around for at least a decade, and has referred to scientists using both the most advanced computational tools and the most advanced networks of their time to do their science. Scientists at the national supercomputer centers were using vector processors in the '80s, massively parallel processors in the early '90s, distributed shared memory machines in the mid-to-late '90s, and massively parallelized PC clusters, such as the TeraGrid, this decade. Moore's Law dominated, and parallelization of microprocessors allowed for super-exponential growth in computing power. Yet, during this rapid upgrade in supercomputing power, the national R&E Internet backbone has grown from 56Kbps to 10Gbps, which is roughly 180,000 times more bandwidth.
This decade, the exponential growth rate in bandwidth capacity is even greater than Moore's Law, caused, in part, by the use of parallelism, as in supercomputing a decade ago. However, this time the parallelism is in multiple lambdas on single optical fibers, creating "supernetworks." So, in addition to the availability of extreme computing power, we now have the ability to move data among computers, storage devices, instruments, visualization displays and people at rates in which endpoint-delivered bandwidth is greater than individual computers can saturate.
The ability to interact with very large data stores in close to real time will change the fundamental nature of how scientists interact with their data and collaborate with one another.
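As a rough check on the backbone growth figures quoted in this answer, the arithmetic can be sketched in a few lines of Python. This is an illustration added to this article, not part of the interview. The function names are hypothetical, and the 17-year span used for the doubling-time estimate is an assumption, since the interview does not date the 56Kbps backbone.

```python
import math


def growth_factor(old_bps: float, new_bps: float) -> float:
    """How many times larger the new rate is than the old one."""
    return new_bps / old_bps


def doublings(factor: float) -> float:
    """Number of successive doublings needed to reach the given factor."""
    return math.log2(factor)


def doubling_time_months(factor: float, span_years: float) -> float:
    """Average months per doubling if the growth took span_years."""
    return span_years * 12 / doublings(factor)


if __name__ == "__main__":
    factor = growth_factor(56e3, 10e9)  # 56Kbps -> 10Gbps
    print(round(factor))                # 178571
    print(round(doublings(factor), 1))  # 17.4

    # Assumption: the growth took about 17 years; this span is not from the interview.
    ASSUMED_SPAN_YEARS = 17
    print(doubling_time_months(factor, ASSUMED_SPAN_YEARS))
```

Going from 56Kbps to 10Gbps is a factor of about 1.8 x 10^5, or a little over 17 doublings. Under the assumed span, that averages out to about one doubling per year, which can be compared with the 18 to 24 months commonly quoted for Moore's Law.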
HPC: What will application-empowered networks look like at SC2008?
MB: The growing dependence on information technology, and the benefits to e-science research that derive from new levels of persistent collaboration over continental and transoceanic distances, coupled with the ability to process, disseminate and share information on unprecedented scales, will transform cyberinfrastructure and empower e-science research and education.
Several emerging applications have reached the limits of the capabilities inherent in conventional "best-effort" networks, which are based on statically routed and switched paths. In the longer term, the ability to dynamically create and tear down wavelengths rapidly and on demand through cost-effective wavelength routing is a natural match for the peer-to-peer interactions required to meet the needs of leading-edge, data-intensive science. The integration of intelligent photonic switching with high-performance transport protocols, Grid middleware and application toolkits will become an effective basis for efficient use of application-empowered networks, and holds the promise of bringing future terabit networks within reach, technically and financially, of scientists in all world regions with openly accessible fiber.
HPC: Are there any other critical points that our readers should bear in mind?
MB: First and foremost, attend our SC2003 panel on "Strategies for Application-Empowered Networks," where we will be discussing exactly the issues you've raised.
I also recommend a related panel on "SuperNetworking Transforming Supercomputing," moderated by Steve Wallach with panelists Dan Blumenthal, University of California, Santa Barbara; Andrew Chien, UCSD; Jason Leigh, UIC; Larry Smarr, UCSD; and Rick Stevens, Argonne National Laboratory/University of Chicago.
Also, I encourage those SC2003 attendees who aren't members of ACM to stop by the ACM booth at the conference and pick up a copy of Communications of the ACM (CACM). I am guest editor of this month's special issue on "Blueprint for the future of high-performance networking." These articles explain how optical networking technology is being integrated into today's cyberinfrastructure for the benefit of e-science. This issue contains a full description of "TransLight: a global-scale LambdaGrid for e-science" by Tom DeFanti, Cees de Laat, Joe Mambretti, Kees Neggers and Bill St. Arnaud; a survey of "Transport protocols for high performance" by Aaron Falk, Ted Faber, Joseph Bannister, Andrew Chien, Robert Grossman and Jason Leigh; middleware considerations for "Data integration in a bandwidth-rich world" by Ian Foster and Robert Grossman; "The OptIPuter" systems architecture by Larry Smarr, Andrew Chien, Tom DeFanti, Jason Leigh and Philip Papadopoulos; and, a description of e-science requirements in "Data-intensive e-science frontier research," by Harvey Newman, Mark Ellisman and John Orcutt.
If people aren't attending SC2003, see http://www.acm.org/cacm .
Copyright 1993-2003 HPCwire http://www.hpcwire.com/