calit2

Peter Arzberger Interviewed by GRIDtoday

Special Features:

PRAGMA PUTS PEOPLE AT CORE OF CYBERINFRASTRUCTURE
By Derrick Harris, Editor

Peter Arzberger, chair of the PRAGMA steering committee, talks with GRIDtoday editor Derrick Harris about the recent PRAGMA 7 workshop (Sept. 15-17 at SDSC), what the future holds for this international assembly, and why people are the most important resource of any Grid deployment.


GRIDtoday: First off, I'd like to ask what Grid computing means to you?

PETER ARZBERGER: Let me recast your question in terms of cyberinfrastructure, a term now used by many, in particular by the National Science Foundation. I like the definition given by my colleague, Philip Papadopoulos:

"Cyberinfrastructure (CI) is the integration of hardware, networks, software and policies that gives researchers seamless access to remote and geographically-distributed resources. For example, a user of a high-power electron microscope with high-resolution digital camera can use CI software to easily have a remote supercomputer do real-time image analysis that compares the live image to a large database of similar specimens. This computing with data augmentation can dramatically improve the entire data acquisition process. Cyberinfrastructure research itself includes building new networking protocols, determining how to discover and access remote resources, providing programming abstractions of distributed resources to make them appear local, and understanding how science can take advantage of the networked world."

I want to call out explicitly that a frontier component of cyberinfrastructure is on the "edges," where researchers are extending the physical infrastructure to sensors and sensor networks wirelessly. CI includes these activities, as well.

Gt: Moving on to PRAGMA 7, what is your overall impression of how the workshop went?

ARZBERGER: PRAGMA 7 was an incredibly successful workshop on several levels.

First, at the working group level, the workshop gave members of the teams an opportunity to meet face-to-face to review progress and make plans for the future.

Second, this workshop is the seventh in the series of workshops (which started in San Diego in March of 2002) and is the first time we have returned to any location. The value of hosting a meeting is that it allows the host site to introduce researchers for the first time into the PRAGMA community. We were certainly able to do that in PRAGMA 7.

Third, we worked very hard to expand the types of applications and technologies of interest to PRAGMA. In this case we highlighted optical computing, bioinformatics, and the geosciences, with presentations from Larry Smarr of OptIPuter, an NSF-funded project named for its use of Optical networking, Internet Protocol, and computer storage, processing and visualization technologies, which envisions an infrastructure that will tightly couple computational resources over parallel optical networks using the IP communication mechanism; Mark Ellisman of the Biomedical Informatics Research Network, an NIH initiative that fosters distributed collaborations in biomedical science by utilizing information technology innovations; and Chaitan Baru of the Geosciences Network, an NSF-funded project to enable scientific discoveries and improve education in earth sciences through information technology. In addition, we had "Birds of a Feather" sessions on such topics as earthquake engineering, which were not covered in previous conferences but were of interest to many institutions.

Fourth, we introduced into PRAGMA a new component, called Pacific Rim Undergraduate Experience (PRIME). PRIME is an NSF-sponsored program that supports the research experiences of UCSD undergraduate students during a nine-week period at one of the PRAGMA sites. This program was just launched in April of this year, and we were able to support a total of nine students, three each at the Cybermedia Center of Osaka University in Japan, the National Center for High-performance Computing in Taiwan, and Monash University in Australia. The students participated in the meeting, and we devoted a session to their research and the experiences that such a research internship provides for undergraduate education.

Fifth, the conference's location, on the east side of the Pacific Ocean for the first time in six years, gave an opportunity for researchers from institutions in Canada, Mexico and Chile to attend a PRAGMA workshop for the first time. Other U.S. institutions such as the University of Washington and the University of Michigan also attended.

Sixth, we were extremely fortunate to have the new UCSD chancellor address the PRAGMA group. We feel that PRAGMA reflects all three of the key themes of innovative, interdisciplinary and international, and is a model for other organizations.

Finally, such a workshop is an opportunity to work and interact with the PRAGMA family of members. PRAGMA, an institution-based organization, consists of 23 member institutions. Its mission is to build sustainable collaboration and to advance the use of the Grid technologies in applications. At each workshop, in addition to the incredible amount of work, we also take the time to interact as people, and share with our hosts a bit of their culture and cuisine. This component has been critical to building a community of researchers, colleagues, friends and, ultimately, an extended global family.

We were fortunate to have support from a number of organizations that allowed us to provide the participants with an environment in which they could work and at the same time share some cultural experiences in San Diego. Let me cite the support from the National Science Foundation, the San Diego Supercomputer Center, the California Institute for Telecommunications and Information Technology, TransPAC/Indiana University and Cray Inc.

Gt: What are some of the accomplishments that came out of PRAGMA 7?

ARZBERGER: One of our largest challenges is making the Grid work on a routine basis for researchers. In PRAGMA 5, we proposed to work toward building, from the bottom up, a testbed to make a Grid routinely usable. We formulated a plan at PRAGMA 6, and followed through in building a testbed of resources at 10 institutions. That experience gave us many lessons learned. At PRAGMA 7, we reviewed that progress, and we set forth on a plan to expand the number of applications running on this testbed and to extend the testbed to more sites. This is an important approach to take, as most Grid computing will be done in a heterogeneous environment. Furthermore, we are learning how to share resources across national boundaries -- something that will be important in the future.

In addition, PRAGMA is a platform in which middleware developers can work with application scientists to improve overall performance of codes. At this meeting there were several examples of getting software to work together. One notable example is between the Grid Datafarm distributed file system (gfarm) developed by the Grid Technology Research Center of the National Institute of Advanced Industrial Science and Technology (AIST) in Japan and the integrated Genome Annotation Pipeline (iGAP) developed at UCSD. Working toward software interoperability provides feedback to the developers and gives more users access to the software.

New groups became involved in PRAGMA. The Korea Basic Science Institute (KBSI) and the National Grid Office (NGO), Singapore, are now officially institutional members of PRAGMA (their applications for membership were accepted by the PRAGMA Steering Committee). As mentioned above, many other institutions participated for the first time, which is an essential first step toward membership. PRAGMA is an organization whose members have agreed to work together and abide by its operating principles and procedures.

Finally, there are countless other interactions that were started, several of which will likely produce longer lasting collaborations.

Gt: Are there any areas that left questions unanswered and, on a related note, do the working groups continue discussions and work on their various areas outside of the scheduled PRAGMA gatherings?

ARZBERGER: Working groups absolutely continue to work between meetings. Just in the week since the conclusion of PRAGMA, the next round of tests on the testbed is being set up. Furthermore, over the next six weeks before Supercomputing (SC'04), members of working groups will be finalizing demonstrations to present at SC'04.

PRAGMA members are involved in many other activities. Two activities took place right after PRAGMA. One is Clusters 2004, which involved members of the resources group. The other involves members of the Telescience/EcoGrid group, which convened application scientists in lake metabolism and coral reef ecology to determine how to build global networks based on Grid and sensor network technologies.

Gt: You (in your opening remarks) mentioned a handful of themes of the workshop, and PRAGMA in general (including building partnerships to construct the Grid, developing applications and middleware to ease use of the Grid, and getting more people to use the Grid). What progress did PRAGMA 7 make in regard to these themes?

ARZBERGER: We developed a workshop theme of "working groups working together" to promote a great deal of interaction among the groups and possible new groups.

The four themes you refer to (partnerships to build a Grid, applications to drive Grid development, middleware to ease the use of the Grid, and people to use the Grid, grow the partnership and develop the tools) are themes of the new annual report: Collaboration Overview, which will be ready for Supercomputing 2004. These themes come directly from our mission and drive our activities.

In PRAGMA 7, we took steps forward in fulfilling the ideals represented in each of the themes.

Gt: PRAGMA is unique in that it really makes the "people" aspect a focal point of its work. How important are the relationships of the people involved in making any Grid project work?

ARZBERGER: People are the critical component of making the Grid work. In the diagram of cyberinfrastructure that I use, people are at the core. The Grid is built and used by people.

There are many aspects of making the Grid work. The objective part is the technology, which makes systems and software work together, or interoperate. But other components of Grid interoperability are often just as difficult to address, and are often overlooked in national and international arenas. Some important questions we must confront are:

  • Cultural and Behavioral: What are the incentives to build codes or frameworks for interoperability, given the reality of national funding streams and promotion criteria at many academic institutions? How can credit be given? Are there differences for integration of code versus development of code?

  • Institutional and Managerial: Who is responsible for fixing errors in the process of developing tools in a global setting? What are the administrative challenges we encountered to make the cycle of deployment and testing of tools successful? Are there issues that we encountered with users and developers distributed across many institutions?

  • Legal and Policy: Are there intellectual property issues that limit ability to integrate code at an international level, or an institutional level?

  • Financial and Budgetary: Which country, funding agency, institution is responsible for building interoperability for codes? For integrating codes from a variety of institutions?

As many of the larger Grid efforts in the U.S. have discovered, the "social engineering" is the biggest challenge.

Note: PRIME, the program for undergraduate researchers, would not work without the people network. Trust is essential with code developers.

Gt: What additional challenges exist in deploying an international Grid/high-performance network versus deploying either on a state-wide or nationwide scale?

ARZBERGER: As mentioned above, a functioning Grid consists of many components. International Grid efforts face challenges associated with all of these components. Let me list some additional challenges and benefits in the international arena.

One obvious challenge is that funding comes not just from different agencies, but from different economies, each of which places a high priority on ensuring that its investments benefit its own citizens. Additional challenges involve different policies on access to software and data produced from public funding. Building interfaces in different languages to use or access the middleware or databases is yet another challenge.

Benefits of working in an international arena are many, including the ability of individual institutions to draw upon different funding resources (e.g., Japanese researchers have access to funding agencies that are not available to U.S. researchers); the ability to bring together colleagues in the same discipline from many countries (for example, using the PRAGMA connections to help bring together researchers in lake metabolism); and ultimately, the exposure to different cultures to gain new insights into various use patterns of the Grid.

Gt: In his keynote speech at PRAGMA 7, Larry Smarr mentioned opportunities for collaboration between Calit2 and PRAGMA. How do you see the two organizations working together? Have they worked together previously?

ARZBERGER: Larry was at the very first PRAGMA meeting and has been extremely supportive of our efforts. Calit2 in general, and Larry in particular, have also been helpful in making ties with colleagues in Mexico. Many of the key players in PRAGMA at UCSD are also associated with various Calit2 projects, including Mark Ellisman of the NIH-funded Biomedical Informatics Research Network and the NSF-funded OptIPuter, and Philip Papadopoulos of BIRN, OptIPuter and Quartzite, a new NSF-funded project to provide OptIPuter and other researchers with a novel wave-selective switch not yet commercially available. I am the principal investigator of an NIH award, the National Biomedical Computation Resource, whose goal is to catalyze the use of Grid technologies in the biomedical community, and serve as deputy layer leader for the Digitally Enabled Genomic Medicine (DeGeM) layer in Calit2. PRAGMA is considered a Calit2 project.

Calit2 is the lead for the eight-campus OptIPuter project and is currently discussing expanding to link to other research sites, including some in PRAGMA. As Larry mentioned in his session, PRAGMA will be provided some space in the new UCSD Calit2 building, which will be opening in Spring 2005. This will allow short-term and long-term PRAGMA visitors to work directly with OptIPuter to build out the OptIPuter and cultivate other long-term collaborative projects.

Finally, we are all looking forward to the iGrid2005 meeting that will be hosted by Calit2 in September 2005, which will have strong PRAGMA participation.

Gt: In your opening remarks, you cited winning some Bandwidth Challenge awards at SC'03 as an accomplishment since last year's meeting. Where does PRAGMA stand going into SC'04? Will it be taking home some more hardware?

ARZBERGER: During the High-Performance Bandwidth Challenge, a highlight of SC2003, contestants from science and engineering research communities around the world demonstrated the latest technologies and applications for high-performance networking, many of which are so demanding that no ordinary computer network could sustain them. The awards won by two demonstrations by PRAGMA members, the "Trans-Pacific Grid Datafarm" team and the "Multi-Continental Telescience" team, emphasized that collaborative science applications are a significant force behind the development of high-performance networking.

The "Trans-Pacific Grid Datafarm" team won the Distributed Infrastructure Award for a geographically distributed file system that took advantage of multiple physical paths to achieve high performance over long distances. For the competition, the National Institute of Advanced Industrial Science and Technology of Japan (AIST) replicated terabyte-scale experimental data between the United States and Japan over several OC-48 links. Five clusters in Japan, three in the United States, and one in Thailand constituted a Grid virtual file system of 70TB capacity and 13 Gb per second of parallel disk I/O performance using the Grid Datafarm Data Grid Middleware. Worldwide parallel and distributed analysis of astronomical object survey data was performed on the terabyte-scale archive using the testbed. In replicating 1.1 terabytes of data, the team achieved a stable network flow of 3.79 Gb per second out of a theoretical peak of 3.9 Gb per second (97 percent) using 11 node pairs.

The "Multi-Continental Telescience" team won the Application Award. The team presented a multidisciplinary entry that showcased technology and partnerships encompassing Telescience, microscopy, biomedical informatics, optical networking, next-generation protocols and collaborative research. High network bandwidth over IPv6 allowed participants to control multiple high-energy electron microscopes, demonstrating the ability to use high-quality, low-latency HDTV to navigate a specimen in Osaka University's 3.0 MeV electron microscope via IPv6. The demonstration was performed in synchrony with parallel processing and visualization of data from the Biomedical Informatics Research Network from the instrument over a global Grid of heterogeneous resources located at five institutions worldwide.

During our recent meeting, we started outlining our plans for SC'04. There will be booths by CCS/U Tsukuba, GTRC/AIST-Titech, KISTI, Osaka University, NCHC, NCSA and SDSC. There will be participation from many other sites as well. Details about our plans will be on our Web site before SC'04.

Gt: What effects will the PRIME program have in the Grid field? Will its international element help to ensure the future of PRAGMA and other international Grid initiatives?

ARZBERGER: PRIME, which was funded by NSF and Calit2, stands for the Pacific Rim Undergraduate Experience program. PRIME supports students traveling to institutions along the Pacific Rim and conducting research in a nine-week internship in the area of Grid technologies, either by developing technologies or by using them for applications. This was our first year, and we were fortunate to have nine very qualified UCSD undergraduate students selected to be our first "class" for the program. They have set the bar very high for subsequent participants. Three students went to each of three sites: the Cybermedia Center (CMC) at Osaka University in Japan; the National Center for High-performance Computing (NCHC) in Hsinchu, Taiwan; and Monash University in Melbourne, Australia.

There are multiple effects that we can anticipate from programs like PRIME. Directly, the students contributed to the building of tools, some of which are being integrated into the global cyberinfrastructure. Another product of the students' activity, in which they act as glue between researchers in the U.S. and at the host institutions, is the collaborations that are started or strengthened. This year several new collaborations were started, which provide some UCSD researchers with access to tools and expertise not available on campus. These ties are essential to cyberinfrastructure. Finally, these students have become more mature users of the cyberinfrastructure, which will help spread its impact in science, engineering, and education.

For PRAGMA, the students have helped push forward several activities, and brought more applications to us. Furthermore, they have demonstrated that the PRAGMA platform can benefit from and be enriched by students.

Gt: Are there plans to extend the PRIME program outside of the Pacific Rim -- perhaps into Europe (although the name would have to change)?

ARZBERGER: We are definitely looking to expand PRIME in several ways. First, there are several institutions in PRAGMA that want to accept students next summer. Second, there is another department at UCSD that wants to use the PRIME platform to send students to a Pacific Rim institution. Additionally, we want to include other U.S. PRAGMA institutions as sources of students.

The flow of students should not be one way. We are exploring how to have students conduct short-term research internships at UCSD next summer. We would expect these students to benefit PRAGMA in much the same way as the PRIME students have, by building tools and relationships that will be important for the global research endeavor.

Finally, we are looking to include both graduate students and postdocs in the PRAGMA platform. Each of these levels of students provides benefit to all concerned. Ideally, we would like to create a source of funding to allow PRAGMA to select postdocs from anywhere in the world and have them rotate to PRAGMA institutions.

Having a PRIME-like program elsewhere would be fabulous. The leads on PRIME will be working toward the goal of having all students participate in an international research experience as part of their education.

Such a change in education would change the world!

Gt: Finally, I would like to ask what the future holds for PRAGMA, and how you see the organization changing as Grid computing, and the idea of e-science/cyberinfrastructure, continues to evolve?

ARZBERGER: Before looking forward, I'd like to reflect on how PRAGMA got started and grew.

First and foremost, PRAGMA was, and is, an experiment in which all members were and continue to be willing to participate. As early as 1995, we had been encouraged by several at NSF to take advantage of our location on the Pacific Rim and build ties with other centers of computing and computation around the Pacific Rim. But only after much investment in networking, and the establishment of groups like the Global Grid Forum, did it seem feasible to start and sustain collaborations. Several of us at UCSD, including Philip Papadopoulos, Fran Berman, John Wooley, Mark Ellisman and Larry Smarr, each had individual ties. We thought we should "pool" our experiences and bilateral relationships into a forum for multilateral activities.

UCSD and SDSC hosted the first workshop in March 2002, but thanks to Philip Papadopoulos' insights, we planned the first three workshops in advance. Our colleagues in the Korea Institute of Science and Technology Information (KISTI) agreed to host a second even before we submitted the proposal to NSF. Similarly, our colleagues at the National Institute of Advanced Industrial Science and Technology (AIST) and the Cybermedia Center at Osaka agreed to host a third meeting.

While the first meeting was a great success, the second one in Korea raised the bar for participant expectations, which was matched by the third. This provided a very solid trajectory for the future.

We established working groups, and after the fourth meeting we had developed such a level of trust that we proposed new experiments using resources that would be dedicated by many sites. This experiment has generated a great deal of knowledge for establishing "bottom-up" Grids for production.

We are also experimenting with how to combine middleware developed throughout PRAGMA. We see this in building Rocks rolls for SCE and Ninf-G, and the localization of Rocks, in K-Rocks.

In another experiment, the knowledge gained by the collaboration between the National Center for High-performance Computing (NCHC) and Telescience -- that is, how to control a microscope remotely -- was applied to sensors and cameras positioned in the ecological parks. This allowed PRAGMA activities to move to and push the edge of cyberinfrastructure (where many exciting challenges exist). This also allowed us to bring in new communities of limnologists and researchers studying coral reefs in parallel but closely connected meetings.

Finally, the PRIME program for undergraduate students was an experiment, which we feel was wildly successful.

We will continue to evolve and continue to experiment. We will bring in new partners as needed, engage new application communities, and provide a platform for middleware testing and hosting of middleware from the region and in fact the world. We will continue to focus on collaborations where all parties bring expertise to the table and take away benefits, and we will expand the training and education components for students and postdocs.

It is worth pointing out that NSF has been a solid partner every step of the way on our path. This has been a key to what we have been able to achieve. Another key to our success that is vital to our future is the people interactions that we have been able to foster. We have built up a level of trust and collaboration that is essential for experimentation, integrating software, launching programs for students, and expanding our activities. Over the last two years we have built friendships that will last a lifetime. This was manifest in spring of 2003 with the outpouring of support for our colleagues in China and Taiwan as they were battling SARS, and later with the outpouring of support for people in San Diego during the fires that raged here. Larry Smarr stated in his presentation that PRAGMA was "ahead of its time" when it was started. We want to continue to lead.

More information about PRAGMA, the Pacific Rim Application and Grid Middleware Assembly, can be found at www.pragma-Grid.net. More information about PRIME, the Pacific Rim Undergraduate Experience, can be found at prime.ucsd.edu.