
"Add Another Zero: An Interview with Larry Smarr"

EDUCAUSE 11.12.03 -- Calit² director Larry Smarr talks about the future of the Internet and the impact of Gigabit-scale networking on higher education in a wide-ranging 10-page interview with the journal EDUCAUSE Review. For the publication's November-December issue, on newsstands this week, California State University senior research associate Steven Daigle visited Smarr to hear about a variety of projects now underway on the campus of UCSD and at its Calit² partner institution, UC Irvine. The result: a Q&A titled "Add Another Zero: An Interview with Larry Smarr."

In the discussion, Smarr talks about the "perfect storm" he sees ahead with the convergence of 'info-bio-nano-technologies'; explains how he came up with the slogan 'Gigabit or Bust' for California's broadband initiative; and argues that the U.S. will have to work harder if it wants to catch up to the lead some Asian and European countries have taken in broadband and Grid computing. Smarr also explains what he hopes to accomplish at Calit², in particular with the OptIPuter project.



To read a PDF version with all graphics, click here.

To read an HTML version without graphics, but including video clips of Smarr responding to EDUCAUSE Review questions, read below and click on the video links [RealPlayer required]:

EDUCAUSE Review, vol. 38, no. 6 (November/December 2003): 82–92.

Add Another Zero: An Interview with Larry Smarr

Larry Smarr is a pioneer in building national information infrastructure to support academic research, governmental functions, and industrial competitiveness. In 1985 Dr. Smarr became the founding Director of the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC). In 1997 he took on additional responsibility as founding Director of the National Computational Science Alliance, composed of over fifty colleges and universities, government labs, and corporations linked with NCSA in a national-scale virtual enterprise to prototype the information infrastructure of the twenty-first century. In 2000 Dr. Smarr moved to the Department of Computer Science and Engineering in the University of California at San Diego (UCSD) Jacobs School of Engineering, where he holds the Harry E. Gruber Chair. Shortly after joining the UCSD faculty, Dr. Smarr was named the founding Director of the California Institute for Telecommunications and Information Technology—known as Calit²—which spans the Universities of California at San Diego and Irvine.

Dr. Smarr received his Ph.D. from the University of Texas at Austin and conducted observational, theoretical, and computational astrophysical sciences research for fifteen years before becoming Director of NCSA. He is a member of the National Academy of Engineering and a Fellow of the American Physical Society and the American Academy of Arts and Sciences. In 1990 he received the Franklin Institute's Delmer S. Fahrney Gold Medal for Leadership in Science or Technology. Dr. Smarr is currently a member of the NASA Advisory Council, Chair of the NASA Earth System Science and Applications Advisory Committee, and a member of the Advisory Committee to the Director, NIH. He also served on the President's Information Technology Advisory Committee (PITAC).

In the following conversation with EDUCAUSE Review, Dr. Smarr discusses the future of broadband, the "Gigabit or Bust" initiative in California, and Grid technologies, among other topics.

EDUCAUSE Review: Larry, your career has covered a wide range of interests and disciplines. How would you characterize your professional goals and interests at this stage of your career?

Smarr: I've been privileged to work with many pioneers not only in computer science and my original fields of physics and astrophysics, but more broadly in earth sciences, environment, medicine, biology, and nanotechnology. I guess I think of myself as a perpetual student because there is so much to learn and yet even more is being discovered.

My current position is Director of the California Institute for Telecommunications and Information Technology. In a way, this is a dream position. At Calit², we were able to start with a blank piece of paper and create both an institutional structure and a disciplinary set of topics that we're going to be examining over the next five, ten, twenty years. We picked the fundamental goal of looking at the evolution of the Internet, enabling us to explore a lot of wireless technology areas as the Internet moves throughout the physical world.

Many of the devices that the wireless Internet will connect will be based on micro-electro-mechanical systems (MEMS) technologies and ultimately nano-based technologies. That is why, in the two Calit² buildings at the University of California at San Diego and the University of California at Irvine, there are clean-room facilities to support bio-MEMS, material characterization, nano-fabrication, and so forth. This has allowed me personally to branch out into areas of physics and chemistry and electrical-engineering optics that I hadn't studied before.

On the other hand, we're applying these technologies to areas that I think will evolve a great deal as the Internet evolves: for example, we will be able to put massive sensor nets into the environment, our transportation systems will be based on peer-to-peer sharing of information among the automobiles and the infrastructure of the highways, and even our own bodies will move online, with biologically appropriate sensors that can measure our vital signs and report over the Internet in a secure and private fashion. Another emerging arena is the world of networked computer gaming, in which hundreds of thousands of people are choosing to live their lives in a virtual cyber-world and to create a cyber-civilization in which they have totally different roles and personas than they do in the physical realm.

I guess if I had to pick a phrase that characterizes my future interests, it would be this intersection of the physical world and the cyber-world, this blend of atoms and bits that is beyond anything we've yet experienced as humans. It is, I think, a very deep area and one that will involve not just technical folks, but also performance artists, social scientists, even philosophers.

EDUCAUSE Review: Higher education and government will command a smaller and smaller portion of U.S. networking infrastructure as broadband moves to the home. Similarly, U.S. networks will represent a decreasing portion of global infrastructure as other countries build out. Are there important implications here for leadership and innovation?

Smarr: Clearly, the ability to get true broadband to hundreds of millions of homes and small businesses in the United States and throughout the world will be the next big driver of the economy. We've seen studies over the last five years from industry groups, from the National Research Council on the academic side, and from states that are developing projects like California's "Gigabit or Bust" initiative. All agree: the current wired Internet coupled to the personal computer is an S-curve that has reached its flat, mature top. We therefore cannot expect to see the high rates of growth that derived from the coevolution of the wired Internet and the personal computer, a development that drove the economic miracle and bull market of 1982 through 2000.

We have to look at new S-curves. One of them I described before is the wireless Internet, which links to cell phones, PDAs, cars, and sensor nets. Another new S-curve is broadband, either wired or wireless, to the home. This represents a new coevolution of technologies that many people believe will pull the cork out of the bottle and release the next wave of economic growth. One of the big barriers is caused by regulations, and therefore change is much slower than it should be. Our economy suffers as a result.

Already a number of countries, like Korea, are very far ahead of the United States in the penetration of broadband to the home—and of course by "broadband to the home" today we typically mean only one megabit per second or less. Yet many states in the United States are now beginning to experiment with different approaches to accelerating the broadband buildout. And that's the great thing about the United States: we have fifty laboratories for innovation. States like California, with the "Gigabit or Bust" program, are ambitiously attacking the problem. So I think we're going to have winners and losers. I think the countries and the states that are aggressive about getting broadband to the home—creating a new market with literally tens of millions, maybe eventually one hundred million participants, and creating a whole new set of devices in the home, not just entertainment but Internet-linked washing machines, dryers, refrigerators—are going to unleash a great wave of new companies and all kinds of new ideas. So we're pushing very hard for the idea at Calit² and certainly here in California.

EDUCAUSE Review: The goal of the "Gigabit or Bust" initiative is to bring one gigabit per second of broadband to every home, classroom, and business in California by 2010. How realistic a goal is that?

Smarr: I have to take some blame for the "Gigabit or Bust" slogan. When the Corporation for Education Network Initiatives in California (CENIC) and the Next Generation Internet (NGI) Roundtable were considering broadband to the home, we looked at all the previous reports. Most reports talked about a national decadal goal of 100 million bits per second going to the home, which would be an improvement of roughly one hundred times the speed of today's broadband. So that would not be an insignificant change. But I felt that if California is going to think of itself as a leader, then it needed to "add another zero." Besides, a gigabit seems like such a nice, round number, and "one hundred megabits" just doesn't roll off the tongue as easily.

But seriously, think how absurd the situation is today. If you buy a Macintosh laptop today, it comes with built-in gigabit Ethernet, included in the price. So, our personal computers in 2003 have a gigabit input or output, and yet people are saying that in seven years we can't get that kind of bandwidth to our homes? In seven years, what do you think laptops are going to have for bandwidth? There's this crazy mismatch right now caused by the "last-mile" problem (which isn't a mile at all—it's more like the last-twenty-or-thirty-feet problem, from the curb to the house): we have these islands of data in our computers and in our servers, and we have this absurd bandwidth bottleneck between them. We have to smash that bottleneck and unleash the enormous peer-to-peer bandwidth capability that the intrinsic system of servers and personal computers allows for.

Larry Smarr 'Gigabit or Bust'
"We have to smash [the bandwidth] bottleneck and unleash the enormous peer-to-peer bandwidth capability that the intrinsic system of servers and personal computers allows for."
Smarr helped author the expression that has rallied Californians behind an initiative to bring one gigabit per second of broadband to every classroom, home, and business in the state by 2010. Here, he explains why the goal is so important. Length: 4:40 [Video]

The "Gigabit or Bust" initiative is ambitious, and it is controversial. So one of the things that the NGI Roundtable and CENIC have come up with—and that I'm very excited about—is an annual competition in California, in a variety of categories, for the best examples of communities, of school districts, and of companies that are leading the charge toward the "Gigabit or Bust" goal. We're publicizing these awards because in these technology transitions, the biggest barrier is that there is no model to follow. A few good examples will inspire others. Thus this contest that CENIC and the NGI Roundtable have established could be one of the most important reasons that California gets, at the least, very close to the goal. In 2010 there will be many houses with a gigabit per second, and there will be some houses with only up to 100 million bits per second—but if we can get to that point, it'll be a great success.

EDUCAUSE Review: High-end research is certainly one driver of ubiquitous broadband, but do you believe that pressures from the popular culture—specifically the Net generation—may be equally important in building that future infrastructure?

Smarr: Almost every component of today's information infrastructure started as a result of federal funding of university research. I believe we will get a lot of technology momentum driving for gigabit to the home from experiments on college campuses, which are, after all, small towns. At UCSD, 40,000 people are associated with the campus. There are streets and utilities, and we have to dig from the curb to the dorms, but we already have 100 megabits to lots of students' rooms at UCSD in 2003, roughly 100 times the bandwidth of what passes for "broadband" to the home today. College campuses therefore give us a time machine because they offer us a "living laboratory" of how our society may live five to ten years from now. People who are interested in this issue of broadband to the home should be studying our college campuses intensively today, yet I don't think there's nearly enough of that being done.

Larry Smarr 'Living Laboratories'
"College campuses give us a time machine because they offer us a 'living laboratory' of how our society may live five to ten years from now."
Smarr says the advent of broadband connectivity, including the wireless Web, on campuses makes the university a unique place to experiment and come up with technologies before they enter the mainstream. Length: 4:44 [Video]

On the other hand, look at the way that the kids are driving the network gaming industry: requirements for computer graphics to support computer gaming are now exceeding those of feature-length movies, and the revenue streams are beginning to move in that direction. There is a wonderfully insidious driving force in every home with children in California or anywhere else in the country or the world. A kid without broadband at home can't get to the beautiful, high-resolution, 3-D graphical worlds of these modern networked games. This kid is going to say: "I can't have a proper social life—I can't grow up as a good kid with my peers—unless I have high bandwidth at home." That kind of grass-roots, bottom-up pressure will be one of the most important drivers.

EDUCAUSE Review: A recent "Gigabit or Bust" roundtable focused on several public-policy areas that may offer potential "killer apps" for ubiquitous broadband. Do you have any favorite killer apps in health, government, business, and the like?

Smarr: Again, I think we can study the sociology of how a broad citizen base will interact with a ubiquitous broadband environment by looking at college campuses today. Internet2 connects about 200 campuses, with between 5,000 and 50,000 or so students at each of these campuses. They have 10 or 100 megabits to their rooms, and their personal computers typically have 50- to 100-gigabyte drives and 2- to 3-gigahertz Pentium chips. Then the campuses are all hooked to each other over Internet2, which by now offers gigabit to tens-of-gigabits connectivity. So if you want to ask about the broadband killer app, you should look at this broadband "living laboratory" connected over vastly different geographic distances and interests. What you see is that in this laboratory, the killer app that has emerged is the sharing of multimedia objects. And this, of course, has raised very challenging intellectual property issues. But the fact that we have a controversy over intellectual property shouldn't blind us to the fact that if people are given broadband access, there will be an intense hunger for the sharing of multimedia objects such as music and video. This is such an overwhelming drive that people are willing to go to jail for it. Now that's what I call a true killer app.

EDUCAUSE Review: You coined the term "metacomputing" in 1988 to describe the integrated utilization of a variety of distributed computing resources. How important are such things as the Grid community, the Grid toolbox, and the Grid infrastructure in 2003 and beyond?

Smarr: The transition in the Web from the early 1990s to the late 1990s is a classic S-curve. A similar transition is happening for the Grid between the mid-1990s and the end of this decade. The early days, at the bottom of the letter S, form the era of early adopters. There is typically a lot of experimentation: different people have different ideas about the right way to do something complex—for example, middleware. Next is a period of consolidation, during which those ideas get sorted out and people decide that one particular brand of software, or some standard, is the way to go. After that is take-off—the knee at the bottom of the S-curve. I think we're about at that point now with Grid middleware. This middleware is the "operating system" of the metacomputer I envisioned almost fifteen years ago—but with the ability to authenticate and, in a secure fashion, to reserve computing, storage, networking, visualization, files, data sets, people, and instruments. This middleware enables the integration, from everything that's tied together by the Internet, of a specific electronic metacomputer that you and the other people hooked to it need to use for the next ten minutes, or one second, or two hours.
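To make the idea concrete, here is a minimal, purely illustrative Python sketch of the kind of broker such middleware provides: it authenticates a user and then reserves a bundle of distributed resources for a fixed time window. All of the names here are hypothetical; this is not the Globus API or any real Grid toolkit, only the general shape of the concept.

    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class Resource:
        # Hypothetical description of one Grid resource (illustrative only).
        name: str        # e.g. "compute-cluster", "tape-archive"
        kind: str        # "compute", "storage", "network", or "instrument"
        capacity: float  # CPU count, terabytes, gigabits per second, etc.

    class MetacomputerBroker:
        """Toy broker: authenticate a user, then reserve a bundle of
        distributed resources for a fixed time window."""

        def __init__(self, resources):
            self.resources = {r.name: r for r in resources}

        def authenticate(self, user, credential):
            # Stand-in for the Grid's single sign-on / certificate check.
            return credential is not None

        def reserve(self, user, credential, wanted, start, duration):
            if not self.authenticate(user, credential):
                raise PermissionError("authentication failed")
            granted = [self.resources[n] for n in wanted if n in self.resources]
            # A real Grid scheduler would negotiate each reservation with the
            # resource's local owner; here we simply return the assembled bundle.
            return {"user": user, "resources": granted,
                    "window": (start, start + duration)}

    broker = MetacomputerBroker([
        Resource("compute-cluster", "compute", 512),
        Resource("tape-archive", "storage", 100.0),
        Resource("optical-link", "network", 10.0),
    ])
    session = broker.reserve("researcher", "x509-credential",
                             ["compute-cluster", "optical-link"],
                             start=datetime.now(), duration=timedelta(minutes=10))
    print(session["window"])

The point of the sketch is only the pattern: authentication first, then a time-bounded reservation that assembles a temporary metacomputer out of whatever resources the Internet ties together.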

I think it's clear that Globus, which was developed by Ian Foster and Carl Kesselman, has emerged as the gold standard in Grid middleware. It is being adopted by the Europeans, by the United Kingdom, and by many Grid efforts in Asia and in the United States as well. One of my concerns is that the United States is lagging in the race to build out Grid middleware under its large shared science projects. But the National Science Foundation (NSF) will soon launch a distributed cyber-infrastructure initiative, which should help us catch up to the Europeans and the British. It's always healthy to have some competition, and you don't necessarily always want to be the first to adopt something. But many of these ideas were developed in the United States, and I think it would be a shame if the United States did not reap the benefits. On the other hand, it's very exciting to see the internationalization so early in the history of the Grid. So really the only issue is whether the United States will be able to carry its weight as the global scientific Grid develops as the essential information infrastructure foundation for discipline after discipline—from particle physics to earthquake engineering to ecology to astronomy. I'm very encouraged by what I see happening at the NSF right now. There has also been pioneering work at the Department of Energy, and very good work is coming from the National Institutes of Health and from NASA.

Larry Smarr Global Grid
"The only issue is whether the United States will be able to carry its weight as the global scientific Grid develops as the essential information infrastructure foundation for discipline after discipline."
Smarr talks about heated competition to roll out Grid networking and software in Europe and Asia, noting that U.S. broadband to the home has a long way to go to catch up with a country such as South Korea. Length: 4:14 [Video]

EDUCAUSE Review: Will the term "supercomputer center" become obsolete as Grid technologies support the widespread distribution of computational resources, support staff, and users? Will the focus shift to applications in "virtual centers of collaboration" as computing power becomes a utility? Where will all of this lead?

Smarr: I find it difficult to predict the future without understanding the past. When you look at previous infrastructures—the railroads, the electrical power system, the air traffic system—what you find is that as the infrastructure develops, it is "lumpy," with very different sizes of "lumps." I think the easiest way to understand this is to look at the air transport system. There are many airports around the world. I used to fly from Willard Airport, a very small airport in Urbana-Champaign, Illinois, up to Chicago's O'Hare, one of the largest and busiest airports in the world. Just because there's a distributed set of airports doesn't mean that they're the same size. Any developed, mature infrastructure will display this inhomogeneous distribution of capability.

Consider your nervous system. At the top of your spine is this big neural "lump," called the brain, which has specialized sections: cerebellum, cerebrum, and so forth. You have a medium-sized neural lump called the solar plexus; you have different sizes of nodes of neural-processing capacity all through your body, down to the individual sense cells in your fingers that feel pain or cold or hot. Likewise, it's completely clear to me that this distributed computing capability is beginning to develop in the Grid: from individual personal computers to Linux clusters in our laboratories, to the super-nodes on the Grid, which contain the most extreme computing or storage capability or the most extreme instruments.

In a Grid world, we are going to need a few large national centers every bit as much as we do today. In fact, I would argue that we will need them even more because someone has to help design, prototype, and build out these large-scale Grid engineering systems. Examples of these hyper-data-intensive shared scientific systems include:

- CERN's Large Hadron Collider particle accelerator, with the hundreds of universities and tens of thousands of users that share it;
- NASA's current fleet of twenty Earth-observing satellites, sending down a terabyte per day of new data that goes out through NASA's Earth Observing System, delivering millions of data products a year to hundreds of thousands of end users; and
- the National Institutes of Health's Biomedical Imaging Research Network, which connects many universities to form a federated repository of multi-scale brain images.

All these and many other shared scientific systems ultimately should depend on a universal, standards-based, distributed cyber-infrastructure that globally ties together all of these sources of data with all of the consumers of data and with all of our scientific instruments.

To build such a Grid, we're going to need large centers like the supercomputer centers again. The most important thing about the supercomputer centers is not that they have supercomputers, though that certainly is important. It is that they bring together a large group of experienced and dedicated technical professionals, from many technical specialties, who create a team that serves the U.S. user community. These teams have built a great deal of what we now know as our information infrastructure. We're not going to need fewer of these people because we're building a more complex system—we're going to need more of them.

Does this mean that we'll need only a few large centers? No. There will be a logarithmic-distribution function of capability spread across the campuses. In addition to the larger centers, there will be centers of specific focus on many campuses. There will be many Linux clusters, with associated storage and visualization in university laboratories studying chemistry, physics, biology, astronomy, and the engineering field sciences. There will be a large, distributed set of different-scale centers building out the cyberinfrastructure. But I think there are going to have to be some premier large centers that do the national work because that's what I see in every infrastructure that I look at.

EDUCAUSE Review: Your current responsibilities at Calit² involve building collaborative research agendas and working relationships among a wide range of accomplished scholars. The sociological challenges in all of this must be interesting. In managing this sort of research, are you finding some patterns or frameworks that others might adopt?

Smarr: Governor Gray Davis declared, in December 2000, that California was taking the bold initiative of establishing four institutes for science and innovation: Calit²; the California NanoSystems Institute (CNSI); the Center for Information Technology Research in the Interest of Society (CITRIS); and the Institute for Bioengineering, Biotechnology, and Quantitative Biomedical Research (QB3). Richard C. Atkinson, president of the UC System, endorsed the idea, and a bipartisan vote in 2002 by the California legislature approved the capital funds for these new institutes. We also received a lot of support from industry and from faculty winning federal government grants. I had learned a lot from being involved in previous national experiments of this sort, so I looked at these earlier experiences and tried to understand what would be a natural way to organize the two hundred faculty involved in Calit² across not just the San Diego and Irvine campuses but also San Diego State University, the University of Southern California, and other campuses that are working on particular projects around the country.

One of the big differences between NCSA and Calit² is that in NCSA, most of the members were academic professionals, with only a few faculty. Calit², in contrast, is a faculty-driven organization with a few technical professionals. In fact, Calit² needs many more technical professionals, but the operating budget won't allow for it right now. So I knew from the start that we would have a number of levels, all of which would be run by faculty.

We developed a "layer concept." At the bottom are the new materials and devices that the future Internet will require; next is the networked infrastructure; and then come the interfaces and software systems. Then we have our four driving applications areas: environment and civil infrastructure; intelligent transportation and telematics; digitally enabled genomic medicine; and new media arts. We then added policy, management, and socioeconomic issues on top. Education and industry cut through all these layers.

Each layer or applications area is led by two faculty researchers, one from each primary campus. All "layer leaders" report to the two campus Division Directors, who in turn report to me. Working with me are Ron Graham, Chief Scientist, and Stephanie Sides, Director of Communication. This has worked pretty well. With roughly ten faculty for each layer leader and with about ten layer leaders for each division director, there is a manageable numerical scale. And yet the structure allows for great decentralization of innovation.

My job, meanwhile, is to try to see the emerging themes that come from such a broad, cross-disciplinary set of scientists. I spend time educating myself so that I can understand the trends in many of the underlying technologies or science areas, understand where the federal agencies are going, understand what the global competition is like, and then try to lead us in directions that are productive.

I think creating this collaborative structure is fairly natural. I can imagine it being applied to any campus. I think the scale is perhaps bigger than most campuses would want to undertake, but it could be used for essentially any topic.

EDUCAUSE Review: Some applied disciplines find themselves stuck in the pre-digital environment, in terms of both equipment and processes. Based on your work with researchers in the medical community, what role do you see for higher education in testing and promoting digitally based "sensor nets" and real-time feedback, diagnosis, and treatment?

Smarr: Our country underestimates the value of campuses as places to work out innovation in "systems." We've optimized our campus research environment for individual professors to do curiosity-driven research, and that certainly will continue to be the foundation on which advances are built. But somehow, along the way, we've lost the concept of "systems" as being an important area for research. When we look back at what John Hennessy was doing at Stanford when he was head of the MIPS project in the 1980s, we see that campuses can be engaged in very complex systems research. I think we're going to see more and more of this kind of systems research join the hyper-specialized individual research, but we're also going to need new institutions to support it. This is one of the reasons that Calit² is an interesting experiment—to see if a sustained collaborative framework can be built into a preexisting campus structure of deans and departments and individual researchers.

For example, Bill Griswold, in the Department of Computer Science & Engineering at UCSD, has handed out nearly one thousand Windows CE Pocket PCs with Wi-Fi and spatial software to students. Thus a small community of students, all undergraduates, is living in the kind of world that will exist in the future—not that far in the future, within the next year or two—when all of our cell phones will be geo-located. Many already are. Then, when you are working with people over the Internet, you will have not only their names but also their locations, if they want to disclose that information to you.

Where else can we do this kind of experiment on such a scale? And the same is true in medicine. I think many medical schools will realize that they can begin to prototype wireless embedded medical sensors, digital medical records, and the ability to do data-mining across large numbers of human medical records—things that we're not capable of doing today. I would like to see much more emphasis on this kind of research on college campuses.

EDUCAUSE Review: Many institutions have IT strategic plans that involve, among other things, wireless networks. Do you think that Wi-Fi is the next big thing?

Smarr: I certainly think that mobile Internet is the next big thing. Whether the specific unlicensed spectrum, 802.11 and its variants, is the next big thing—whether it's cellular Internet like 1xRTT in the United States or whether it's a bridging of these—that's for the market to sort out. But there is an explosive growth in mobile connectivity. As I recall, a recent study found that across public universities in 2002, roughly 20 percent of the educational area of classrooms and dorms and eating places was covered with Wi-Fi. Of course, on some campuses the percentage is much higher. Three years ago Carnegie Mellon was already deploying a great deal of Wi-Fi. At UCSD, I think 96 percent of the educational areas are covered by Wi-Fi. So there are pioneering schools that are learning how to live in a world in which the Internet is everywhere. And people don't really understand how big a change that is. It means that you can put a small device anyplace, and the device will be able to connect to the Internet because the Internet is already there in wireless form. On the other hand, to get bits from one place to another using wireless as an underlying medium, rather than electrons going down copper wire or photons going down clear glass, is vastly more complex. We have a great deal yet to learn about the electrical engineering of how to get the Internet throughout the physical world.

EDUCAUSE Review: You are very interested in nanotechnologies and in the general ideas of embedded intelligence and communications. When intelligent communicating devices get really small, what will change in our living, working, and learning environments and processes?

Smarr: One of the biggest mistakes people make when they predict the future of the Internet is that they consider only the networking aspect. But I think what's going to happen over the next two decades is that we're going to see an accelerating rate of change, similar to what Ray Kurzweil has talked about. This is going to induce what I call "the perfect storm."

Larry Smarr The Perfect Storm
"One of the biggest mistakes people make when they predict the future of the Internet is that they consider only the networking aspect."
The Calit² director predicts massive upheaval in technology as three distinct 'storms' hit at the same time: information technology and telecommunication; post-genomic biology; and nanotechnology, coming from physics, engineering and chemistry. Length: 3:57 [Video]

In the movie The Perfect Storm, a boat got into trouble because several storms moved together and merged into a super-storm, which created chaos and violence on a scale that no one expected. I propose that something similar is going to happen with the Internet. The storms are (1) information technology and telecommunication, (2) post-genomic biology, and (3) nanotechnology, coming from engineering, physics, and chemistry. Each is a huge revolution in its own right. But they're all happening at the same time, and they're all going to merge into one large storm of info-bio-nano-technology.

The world that we have built in academia, a world based on specialization, will have a real problem dealing with this perfect storm. Doctors don't think they have to know anything about information technology or physics. Physicists don't think they need to know anything about biology or information technology. And computer scientists don't think they need to know anything about biology or physics. By contrast, kids understand that this perfect storm is brewing; they are sliding across these stovepipes and are picking up biology, physics, chemistry, and information technology. They are living on the Net, sharing all these advances, and they're going to be much better prepared for this perfect storm. They're going to ride it out. But I'm afraid that many people—specialized faculty, in particular—are going to be in for a rude shock.

EDUCAUSE Review: What are some of the policy, fiscal, or technical challenges that data-intensive sciences will pose for colleges and universities in the years ahead?

Smarr: As the data Grid develops both nationally and internationally, campuses are going to have to adapt to it—much as they adapted in the mid-1980s when the NSF decided to link the five NSF supercomputer centers together using TCP/IP derived from the ARPANet (Advanced Research Projects Agency Network). Campuses had to decide whether or not they wanted to participate in providing access for their faculty to these remote resources. It was quite amazing to watch the reaction because there were actually only a handful of potential supercomputer users on each campus—normally not enough for institutions to get bulldozers to come out to trench the quads or to make capital investments in laying cable. But they did. And campuses did so because they knew they would lose their best faculty if they did not. And so, without anybody having to tell institutional leaders that they needed to carry out some infrastructure changes, they naturally did it.

I think the same thing is going to happen as we move from this supercomputer-driven to a data-intensive world. There are probably ten experimentalists or observers among the scientific community for every one theorist. As we enter this data-intensive world, we're going to get a much larger participation than we did fifteen years ago in the supercomputer world, which was largely driven by theorists. What we will need very soon is a standardized laboratory environment—integrating computing, storage, and visualization—that is very easy to set up.

Because the Grid is data-driven, there will be a new utility needed, a data-storage utility, which is going to have to be provided by the campus as a whole. If you've got all of these faculty on campus pulling down large data sets from remote federal repositories through the campus gateway to their individual laboratories, and you don't have a large data cache for the whole campus, then you're going to have to spend a ton of money on a giant gateway. Because every time faculty want to get a new data set, they're going to run out of space in their lab, so they will erase the older data. Then when they want to look at that older data again, they're going to run the same data set through the gateway, and this just doesn't make any technical sense. Campuses need large storage—say, one hundred terabytes or more of rotating storage that is available to anyone on campus. Then individual labs might need only a few terabytes for their own daily use of data, with every data set being stored in the shared campus cache for perhaps six months or so. When the users want the data again, they need only to go back to the center of the campus and not halfway across the country.
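As a rough illustration of the policy Smarr describes, here is a minimal Python sketch, with hypothetical names and a plain local directory standing in for the shared campus store: labs ask the campus cache first, only misses travel through the gateway, and data sets untouched past a retention window are evicted.

    import os
    import shutil
    import time
    from urllib.request import urlretrieve

    SIX_MONTHS = 180 * 24 * 3600  # retention window of roughly six months

    class CampusDataCache:
        """Toy model of a shared campus data cache: every lab checks the cache
        before pulling a data set across the wide-area network."""

        def __init__(self, cache_dir, retention_seconds=SIX_MONTHS):
            self.cache_dir = cache_dir
            self.retention = retention_seconds
            os.makedirs(cache_dir, exist_ok=True)

        def _path(self, dataset_id):
            return os.path.join(self.cache_dir, dataset_id)

        def fetch(self, dataset_id, remote_url, lab_dir):
            """Copy a data set into a lab workspace, hitting the campus cache
            when possible instead of the campus gateway."""
            cached = self._path(dataset_id)
            if not os.path.exists(cached):
                # Cache miss: one trip through the gateway serves the whole campus.
                urlretrieve(remote_url, cached)
            os.makedirs(lab_dir, exist_ok=True)
            return shutil.copy(cached, lab_dir)

        def evict_stale(self):
            """Drop data sets that no lab has touched within the retention window."""
            now = time.time()
            for name in os.listdir(self.cache_dir):
                path = self._path(name)
                if now - os.path.getatime(path) > self.retention:
                    os.remove(path)

With something like this in place, a lab that erases its local copy can re-stage the data from the center of campus instead of pulling it across the country a second time.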

This doesn't sound like such a big challenge. But if two hundred universities are trying to figure out this storage architecture—each on its own—this is not going to be very efficient. Soon campuses will need to begin to think about finding common solutions. One thing that we need more of in the United States is dialogue among the chief information officers about these standardized solutions—whether for the individual laboratory or across campus. CIOs do not often get involved in what scientists put in individual laboratories, but maybe they should. If we could find more common solutions, we might save everybody a lot of time, and we could get on with doing the science.

Steve Daigle, Senior Research Associate, Information Technology Services, California State University Office of the Chancellor, served as contributing writer and facilitator for this interview.