COLLABORATORIUM

The Promise of Collaboratoriums
High-speed computation has provided the means to examine physical systems at a level of detail and accuracy that has made simulation a full partner with experiment. The combination of computation with large-scale databases allows for the analysis of prodigious amounts of information coming from today's experiments and simulations. By themselves, these new frontiers in computing and data storage would have great impact on scientific research. When these technologies are coupled with new capabilities in communications, however, an opportunity is created that revolutionizes both the scope and process of scientific investigation. Communications capabilities enable researchers to access instruments, data, and expertise, independent of their location. Emerging technologies include desktop videoconferencing, electronic notebooks, shared work spaces (whiteboards, shared displays and visualization), remote instrument operation, and distributed computing.

The revolutionary nature of these advances was recognized early by Professor William A. Wulf, who chaired a 1993 National Research Council panel that published a study of the collaboratory concept entitled "National Collaboratories: Applying Information Technology for Scientific Research." While the term "collaboratory" (or collaboratorium) has often been used to refer only to the technologies, it means much more. The adoption of electronic collaboration technologies will provide geographically distributed research groups with the same capabilities for organization, close-knit interaction, and rapid response that a single co-located group has today. It is the synthesis of advanced communications technologies and a collaborative culture that promotes effective use of unique research facilities, increased individual specialization, a broader and more comprehensive program focus, and improved scalability and efficiency in developing solutions to complex scientific problems, all of which will be required for the nation's research agencies to meet their mission goals.

High field NMR research is a particularly fertile ground for the development of collaboratoriums. Leading edge instruments are highly suited to operation as user facilities, with teams of researchers performing experiments and analyzing data both locally and remotely. Also, the interdisciplinary nature of many high field NMR experiments in structural biology, materials science, physiology, etc., can take great advantage of the ability of collaboratoriums to bring a wide variety of knowledge to bear on a problem, independent of the location of the experts. In the NMR community a number of collaboratory efforts have been initiated, covering a wide range of electronic collaboration capabilities, NMR virtual research facilities, and community databases. Although none of these projects is fully mature, it is clear that fully developed collaboratoriums will be indispensable components of the next generation of major user facilities for high field NMR.

Collaboratorium Success Factors
Several themes emerged from the Washington Conference that are success factors for developing collaboratoriums in scientific research, particularly involving leading edge high field NMR facilities. First and foremost, collaboratories are built upon partnerships. Willing scientific partners, who individually have something to gain from the collaboration, are essential. It is also essential, however, to establish partnerships between the domain scientists performing the research and the computer scientists who develop and support the software tools that enable collaboration at a distance. Collaboratoriums are not off-the-shelf items. To obtain the full benefit of information technology in the discovery process, the process of day-to-day NMR research must be supported by the collaboration tools, enabling scientists to share instruments and data, and to analyze, integrate and discuss results using paradigms that are familiar to them. For example, at the University of Wisconsin, Madison, a computerized desktop management system has been developed for NMR, providing a flexible view of the whole lab environment. These tools enable collaborating scientists to manage the whole experimental cycle. (See Appendix I for further details.)

Ease of use is also very important in successful collaboratoriums. The main focus of NMR researchers is getting results, not learning the intricacies of numerous computer tools. The DOE2000 Collaborative Research Environment project at Pacific Northwest National Laboratory integrates a suite of generic and specialized tools (e.g., audio, video, whiteboard, screen sharing, molecule viewing) into the Habanero framework from the National Center for Supercomputing Applications. Scientists simply invoke a single session manager and check off the tools they need. The CORE2000 framework knows how to start (or terminate) each tool locally and remotely, on heterogeneous mixes of PC, Macintosh and Unix systems.
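The session-manager idea described above can be illustrated with a minimal Python sketch. It is not the CORE2000 or Habanero interface: the class names (Tool, Participant, SessionManager), the host names, and the launch commands are all hypothetical, and the sketch only builds a launch plan rather than starting real processes.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    commands: dict  # platform -> launch command (hypothetical values)


@dataclass
class Participant:
    host: str
    platform: str  # "pc", "mac", or "unix"


class SessionManager:
    """Tracks which checked-off tools are running for which participant."""

    def __init__(self, tools):
        self.registry = {tool.name: tool for tool in tools}
        self.running = []  # (host, tool name) pairs

    def start(self, selected, participants):
        """Return (host, command) pairs for every selected tool and participant."""
        plan = []
        for name in selected:
            tool = self.registry[name]
            for person in participants:
                command = tool.commands.get(person.platform)
                if command is None:
                    continue  # tool is not available on this platform
                plan.append((person.host, command))
                self.running.append((person.host, name))
        return plan

    def terminate(self, name):
        """Stop one tool everywhere and return the (host, tool) pairs stopped."""
        stopped = [pair for pair in self.running if pair[1] == name]
        self.running = [pair for pair in self.running if pair[1] != name]
        return stopped


# Usage: two collaborators on different platforms share two tools.
tools = [
    Tool("whiteboard", {"pc": "wb.exe", "mac": "Whiteboard", "unix": "wb"}),
    Tool("molecule_viewer", {"unix": "molview"}),
]
manager = SessionManager(tools)
team = [Participant("nmr-lab", "unix"), Participant("remote-desk", "pc")]
plan = manager.start(["whiteboard", "molecule_viewer"], team)
```

Keeping the per-platform launch details inside the tool registry is one way to let the scientist deal only with tool names, which is the ease-of-use point made above.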

When researchers are geographically separated, they can no longer share information in traditional ways, e.g., paper notebooks. Yet capturing and sharing the stages of the scientific process is vital to successful collaborations. Therefore an electronic lab notebook is crucial. Electronic notebooks make information available immediately to remote collaborators. There are real advantages to electronic notebooks since they can accept media in many more forms than a paper notebook, e.g., 3D structural models, video, instrument data streams, and they can be easily annotated and searched. Extensibility, the ability to add functionality easily, is important in electronic notebooks as different research areas have different kinds of data and different applications for analyzing data. The ability to electronically sign pages is a capability currently being addressed.
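As a rough illustration of the extensibility, annotation, and search properties described above, the following Python sketch models a notebook whose media types are added by registering an indexer function. It does not represent any existing electronic notebook product: the names Entry, Notebook, and register_media_type, the media type labels, and the sample entries are all hypothetical, and electronic signing is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    author: str
    media_type: str  # e.g. "text" or "structure-3d" (hypothetical labels)
    content: object
    annotations: list = field(default_factory=list)


class Notebook:
    def __init__(self):
        self.entries = []
        # media_type -> function that turns content into searchable text
        self.indexers = {"text": lambda content: content}

    def register_media_type(self, media_type, indexer):
        """Extend the notebook with a new kind of data."""
        self.indexers[media_type] = indexer

    def add(self, author, media_type, content):
        if media_type not in self.indexers:
            raise ValueError(f"unknown media type: {media_type}")
        entry = Entry(author, media_type, content)
        self.entries.append(entry)
        return entry

    def search(self, term):
        """Return entries whose indexed content or annotations contain term."""
        term = term.lower()
        hits = []
        for entry in self.entries:
            text = self.indexers[entry.media_type](entry.content)
            searchable = " ".join([text, *entry.annotations]).lower()
            if term in searchable:
                hits.append(entry)
        return hits


# Usage: a text entry plus a 3D structure entry added through the extension hook.
notebook = Notebook()
notebook.register_media_type("structure-3d", lambda model: model["title"])
note = notebook.add("alice", "text", "Shimmed the magnet before the HSQC run")
model = notebook.add("bob", "structure-3d", {"title": "calmodulin complex", "atoms": []})
model.annotations.append("compare with scattering data")
```

Here a search for "scattering" finds the structure entry through its annotation, while a search for "hsqc" finds the text entry. The registry of indexers is only one possible design for supporting new kinds of data without changing the notebook itself.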

In many respects, the success of electronic collaboration in research hinges on the scientific data, especially in NMR research. A geographically distributed team's ability to share the raw NMR data and the data from analyses (e.g., structures) is vital. Files and databases accessible by a distributed team are key to many collaboratories. Domain standards for data interchange are important to streamline software development and enable better integration of NMR applications with each other and with collaboratory tools. As an example, the Nucleic Acid Database at Rutgers targets rapid retrieval and good data validation to meet user requirements for reliable structures, geometry and correlates.

Advances in interactive instrument control are also needed. NMR user facilities will perform many kinds of experiments and have users with a broad range of expertise, e.g., K-12 students remotely observing chicken embryo development through the University of Illinois's magnetic resonance imaging project, Chickscope. Some collaborative instruments being developed can tailor the user interface to the experiment and the user. The ability to access the NMR spectrometer securely over the internet is essential.

As the development and use of collaboratoriums expand, other success factors will undoubtedly emerge. It is already clear, however, that collaboratoriums and the development of hardware (instruments) and software capabilities will be essential to the success of the next generation of high field NMR facilities.

Combining Facilities for Complex Problems
Scientists are increasingly dealing with more complex systems that require multiple technologies and diverse expertise to understand. One striking example is in molecular structural biology. Since the mid-twentieth century our objectives have evolved from identifying and determining the structures of individual biological molecules, to seeking to understand how the genome and its entire suite of products (proteins and RNA) interact in order to achieve biological function. This goal is a heady challenge for scientists moving into the next millennium, but the promise of medical and biotechnology payoffs provides significant motivation to take it on.

In order to address problems with this level of complexity we are increasingly dependent on large scale facilities such as synchrotron and neutron beam lines, sophisticated technologies such as mass spectrometry and NMR at high magnetic fields, and advances in biological technologies such as gene sequencing and molecular biology tools for protein expression and modification, including strategic isotope labeling. Ultimately, the information obtained with these experimental tools has to be analyzed and integrated using computational tools in order to develop a picture of how biomolecules function in a coordinated manner in their complex environments. In order to achieve this level of understanding it is increasingly important for scientists from different disciplines and with different expertise to be able to network, communicate, and share.

As an example, recent studies of the calcium-binding protein calmodulin and its regulatory target myosin light chain kinase (MLCK) highlight the way information from different sources must be integrated in order to gain an understanding of biological function at the molecular level. The first direct structural evidence for the autoinhibitory hypothesis of kinase activation has been obtained by combining the results of X-ray and neutron solution scattering with specific deuterium labeling of the calmodulin/MLCK complex, high field NMR solution studies of calmodulin complexed with its binding domain from MLCK, X-ray crystallographic studies of the catalytic core of the kinase, Monte Carlo techniques, and computational modeling. The activation mechanism involves calcium/calmodulin binding to a sequence segment of MLCK, collapsing about that sequence segment and pulling it away from the surface of the catalytic core while at the same time removing the neighboring autoinhibitory sequence segment. Thus the catalytic site of the kinase is exposed for substrate binding and modification. This calmodulin/MLCK interaction serves as an important model for calcium/calmodulin activation of kinase function in general. No single technique of the suite of techniques used to solve this problem was able to provide this understanding. The ability to exchange and integrate information was key.

Challenges Facing Shared Instrumentation Facilities
The administration and staff of a state-of-the-art NMR facility have to contend with four major problems associated with user support and operations: (1) encouraging the participation of latent users and developing satisfactory distinctions between independent users and collaborators, (2) optimizing the operation of the facility to accommodate and take full advantage of the wide range of technical expertise among the clientele, (3) overcoming barriers imposed on some users by their geographical separation from the facility, and (4) minimizing the hurdles that impede adequate communication among diverse collaborators.

There exists an untapped group of latent users (i.e., scientists who have developed biological problems that would benefit from NMR investigation) who are unaware of the potential of the technique or who have been unable to surmount the technological barriers that stand in the way of their using the approach. A challenge is to find a way for them to gain access to the facility. Independent users are ones who are able to collect and analyze data with only routine assistance from the facility staff members. In fairness to the scientific staff of the facility, projects that demand large amounts of their time and intellectual input should be collaborations in which they receive a measure of credit for their contributions. Collaborations need to be embarked upon with the full agreement of all participants, and all participants must agree upon the project's scope, goals, methods, and procedures for analysis, and on the interpretation of the results. A challenge is to define these clearly.

The experience gradient ranges from local experts on the facility staff and visitors from laboratories that specialize in NMR but do not have the particular equipment available at home, to users who have a very interesting problem to solve but who know very little about NMR. For experienced users and staff members a challenge is to make the instrumentation and software intuitive to use so that time is not wasted in having to learn trivial details that may differentiate one commercial package from another. In addition, thorough documentation must be easily available on the capabilities and specifications of various components of the system. Another challenge is for the local experts to fully appreciate the complexities of handling users' samples, which may be labile, pH and temperature sensitive, and very valuable.

The geographic gradient poses another set of problems. Users and collaborators span a wide geographical range from people down the hall in the same building to those located in a distant part of the United States or abroad. A major challenge then is how a limited facility staff can maximize the support for a variety of technically demanding problems, particularly when remote users cannot spend enough time in the facility to learn how to conduct all aspects of the experiment themselves from face-to-face interactions with the staff and hands-on use of the instrumentation.

Communications are another challenge to successful collaborations. Constant, effective and efficient communication is required at many levels so that:

  • all involved have a common goal and clearly understand the objectives,
  • each individual or subgroup has a well defined role (these may overlap at times) and knows the roles of others in the collaboration,
  • each individual or subgroup is kept up-to-date with developments and/or results coming from other members of the collaboration,
  • members have a general understanding of the needs and limitations of the other members in terms of the technology being used,
  • each member has access to the same relevant information, and
  • all members are aware of and understand the nature of the scientific problems so that they can contribute to their solution.

A growing body of collaborative NMR experiments, supported by a new generation of collaboration tools, has begun to demonstrate the value of collaboratoriums in high field NMR research and suggests that the challenges mentioned above can be overcome. Collaboratoriums increase the accessibility, utilization, and integration of unique research facilities, adding significant value. They enhance the ability to assemble and support multidisciplinary science teams, bringing NMR technology to the latent users. Finally, collaboratoriums enable new science by supporting timely interactions across a gamut of tasks, by sharing and visualization of information, and by drawing on the complementary strengths of different techniques and expertise. Emerging computing, network and NMR technologies, together with close cooperation of NMR researchers with computer and information scientists, draw on the strengths of each domain to make these collaboratoriums possible.