www.gridcomputingplanet.com/features/article.php/1573391
By Paul Shread
January 22, 2003

Last week's alpha release of the Globus Toolkit 3.0 was a big step toward a Grid infrastructure standard, but Grid technology will need to provide even more advanced features to address the needs of commercial IT systems, according to a paper by Fujitsu researchers.

The paper, "OGSA Fundamental Services: Requirements for Commercial GRID Systems," was authored by Hiro Kishimoto and Andreas Savva of Fujitsu Laboratories and David Snelling of Fujitsu Laboratories of Europe.

Snelling said the paper "addresses quite high level functions, where GT3 [Globus Toolkit 3.0] will be mostly an infrastructure release. You might find some of the control and information services a good start, but most of what we are looking to develop for commercial Grids will need to be built on top."

Widespread use of information technology, especially easy access to broadband networking technologies, is placing new requirements on the creation of commercial IT systems, the Fujitsu researchers wrote. IT systems must provide "innovative features while ensuring high performance and high availability under unpredictable workloads in an extremely heterogeneous and distributed environment," they said. "Systems must be developed at high speed with short development cycles and provided at low initial cost so as to reach the maximum number of IT system customers."

Systems Integrators, Internet Data Center Administrators Face Issues

System integrators and Internet Data Center (IDC) administrators face a number of issues, the paper noted.

For system integrators, constructing heterogeneous systems is very difficult. Problems include making end-to-end performance predictions and guarantees, ensuring 24/7 availability, provisioning so as to avoid the Internet spike problem, and responding to frequent service specification changes. "It is almost impossible to design and construct robust IT systems in a timely manner," Kishimoto, Savva and Snelling wrote.
"Often robustness suffers; hence we see almost weekly reports of service disruptions due to malfunctioning IT systems."

For IDC administrators, IT system customers are interested in end-to-end Service Level Agreements (SLA), such as agreements based on the number of end users' requests processed per second or the maximum response time to end users' requests. "Unfortunately, with current tools it is virtually impossible for IDC administrators to determine what resources are needed to ensure that a given SLA is guaranteed," the Fujitsu paper said. "Often the result is over-provisioning."

In order to minimize the total cost of ownership, IDC administrators try to increase the utilization ratio of their own resources. However, since they do not have effective predictive tools, the actual utilization ratio is often less than one-third, one-fifth, or even worse, Kishimoto, Savva and Snelling said.

Some resources are reserved for failover and provisioning and so are not put to productive use. "It should be possible to share such resources among multiple systems, with physical location not being the single determining factor whether sharing is possible or not," the paper said.

IT systems provide infrastructure essential to daily life. "Undisrupted operation must be ensured even in the event of disasters such as earthquakes, fires, or acts of terrorism," Kishimoto, Savva and Snelling wrote. Independent but networked IDCs can be used to provide the necessary physical infrastructure, they said, but the technologies to provide transparent and seamless operation across such an environment are not yet available.

OGSA Could Provide A Solution

Several cutting-edge technologies and products already in the market attempt to solve one or more of these challenges, they said, citing IBM's Oceano Project, Sun's N1 vision, and Terraspring (acquired by Sun). But they added that "such attempts take a proprietary approach and have limited scope."

The Open Grid Services Architecture, which Globus Toolkit 3.0 implements, "is an open, extensible, and comprehensive architecture, which can be used to address the difficult problems described so far," Kishimoto, Savva and Snelling wrote. "In particular, commercial IT systems, with all their differences from scientific IT systems, should form one important Service Domain of OGSA."

Part of the solution could be a common hosting environment on top of various hardware and OS platforms, the paper said. Although not a requirement of OGSA, implementations may be based on a Java hosting environment. Commercial IT systems, however, need more than a common hosting environment, they said.

Commercial IT systems need support for at least three kinds of job requests, the paper said.

Java Program: Business process applications written in Java. The requested job will run as an EJB component in an EJB container hosting environment or as a Java Servlet in a Servlet Container. Most newly written business process applications are expected to be of this type. Job lifetimes vary.

Batch Process: Current business process applications also include batch-type jobs, including periodic ones such as monthly summary report creation or employee payment transfers. In some cases, workflow management is also required. Requirements may be the same as for job execution using the GRAM interface of Globus Toolkit 2. Job requests of this type have relatively short lifetimes.

System composition: An entire business process application may be submitted as a single request. Such requests require the composition of a system suitable for running the application and deployment of the application on it. IDC administrators may submit this kind of request. Such requests are infrequent and are expected to have very long lifetimes.

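For readers who think in code, the three job request kinds can be pictured as a small data model. The following Python sketch is purely illustrative: the names `JobKind`, `JobRequest`, `EXPECTED_LIFETIME`, and `describe` are hypothetical and do not come from the Fujitsu paper, OGSA, or the Globus Toolkit. It only records the lifetime expectations summarized above.

```python
from dataclasses import dataclass
from enum import Enum


class JobKind(Enum):
    """The three job request kinds named in the paper (identifiers are hypothetical)."""
    JAVA_PROGRAM = "java_program"              # EJB component or Java Servlet
    BATCH_PROCESS = "batch_process"            # e.g. monthly summary reports
    SYSTEM_COMPOSITION = "system_composition"  # whole application plus its system


# Lifetime expectations as the article summarizes them.
EXPECTED_LIFETIME = {
    JobKind.JAVA_PROGRAM: "varies",
    JobKind.BATCH_PROCESS: "relatively short",
    JobKind.SYSTEM_COMPOSITION: "very long",
}


@dataclass(frozen=True)
class JobRequest:
    """A minimal, assumed shape for a job request; not an OGSA schema."""
    kind: JobKind
    name: str


def describe(request: JobRequest) -> str:
    """Return a one-line summary of a request and its expected lifetime."""
    lifetime = EXPECTED_LIFETIME[request.kind]
    return f"{request.name}: {request.kind.value} job, expected lifetime {lifetime}"


if __name__ == "__main__":
    print(describe(JobRequest(JobKind.BATCH_PROCESS, "monthly-summary")))
```

A scheduler built along these lines could, for example, treat a system composition request very differently from a batch job, since the former is expected to be rare and long-lived.
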

In addition to handling the above job requests, the Fujitsu paper said commercial IT systems require the following OGSA characteristics:

- Heterogeneous environment support, such as a variety of hardware, operating systems, and applications;
- Scalable and hierarchical organization;
- Open standard interfaces; and
- Ability to overlay existing, multiple, underlying resource instrumentations (SNMP, CIM).

With respect to functionality, the paper said commercial IT systems need the following services:

System configuration management: The OGSA Common Resource Model is used to model the system and its resources. Generated events are handled according to user-set policy.

Job execution management: Time-, priority-, and space-based scheduling of jobs. In case of application failure, jobs are retried based on applicable policy.

Resource management: Dynamic and flexible resource management is essential. At the same time, resource isolation between different jobs is crucial, not only for access control but also to ensure that there are no unexpected performance dependencies.

Autonomic management: Clustering features to handle resource failures and provisioning features for adaptive resource allocation should be provided to enable autonomic management. The actual behavior should be based on client-provided policies.

Infrastructure services: User management, accounting management, logging and tracing.

The paper addresses these needs in detail. These "fundamental services," the authors concluded, "are required for effective Grids in a commercial environment."