A Facility Providing Flexible Processor Pools for
WWW/Java-based Distributed Applications

Yoshimasa Masuoka, Toyohiko Kagimasa, Katsuyoshi Kitai,
Fumio Noda, Jinghua Min and Shigekazu Inohara
Central Research Lab., Hitachi, Ltd.

Abstract

We propose a facility to form processor pools for WWW- and Java-based distributed applications. In the proposed facility, each user configures his or her own processor pool by using a middleware tool called the Unified Computing Environment Manager, or UCEM, thereby reducing the administrative cost to a minimum. A UCEM can temporarily share its processor pool with other UCEMs, enabling users to flexibly extend their processor pools. This dynamic sharing of processor pools facilitates efficient execution of distributed applications during client-server interaction in a WWW environment.

Keywords: Java; Load Balancing; Processor Pool; Client/Server Application

1. Introduction

The application domains of the World Wide Web (WWW) [1] and of the Java programming language [2] are extending to complex and mission-critical distributed systems, including database access, electronic commerce, and decentralized control. Intensive research and development of new technologies are underway to apply WWW and Java to these new domains. For example, Joe middleware [3] provides interoperability between Java programs and distributed objects via an object request broker like CORBA [4]. The Aglet [5] framework enables Java programs to run on the server side.

By exploring the platform-independent nature of Java, these facilities can distribute applications to multiple computers regardless of their hardware and operating systems heterogeneity. These facilities, however, do not solve the issue of gathering and maintaining processor information from the Internet and large-scale intranets.

Because there is no system that maintains a set of processors for load balancing underlying these facilities, we cannot use them to handle even simple situations like following examples:

As WWW and Java are applied to more complex distributed applications, users may often browse Web pages that contain multiple Java applets. Execution of these Java applets on one computer places a heavy load on the computer. As a result, each Java applet runs very slowly. This problem will become even worse when Web browsers are run on low-cost network computers and mobile PCs whose processors are not as powerful as desktop PCs and workstations.
The emerging use of WWW and Java in high-bandwidth intranets, i.e., replacing existing client-server applications with WWW and Java tools will bring about the same performance issue we face with conventional client-server applications: server computers will become heavily loaded if they run Java programs for many clients concurrently.

A common approach to handling these issues in a LAN environment is to put the computers connected to the network into a pool so that the clients and servers can exploit the combined processing power of the pool. While a number of studies have looked at load balancing in a LAN environment, extending load balancing to the Internet and to intranets will not be easy because of the many more users and computers. Not only will it be harder to form processor pools, but it will be expensive to maintain them.

We thus propose a new facility for providing processor pools for Java programs in the Internet and intranet environments. In our facility, each user is assigned his or her own processor pool to minimize the cost of maintaining the pools. In addition, each processor pool can be dynamically extended by merging with another pool. This makes it easy to maximize processor availability in each request from a client to a server. For example, a WWW server can use the computers available to a user making a request to the server.

2. Issues in Internet and Intranet Load Balancing

Although load balancing for distributed systems has been studied for various applications (e.g., LANs [9, 10] and parallel databases [11]), little has been done for the Internet and intranets. Load balancing for the Internet and intranets is different from that for a local area primarily in terms of scale. It is different from that for a single system because of the heterogeneity of the overall system and because of the lack of a central authority. Our system tackles load balancing for the Internet and intranets, namely, from all three aspects: its scale, heterogeneity and autonomy.

Looking at the scale aspect, in load-balancing systems for LAN environments and for parallel machines (including Condor [6], LoadLeveler [12], and TaskBroker [13]), administrators first define a "master" server and "worker" machines (or processors in parallel machines). The master gathers the load information of the workers (usually periodically) and distributes incoming CPU-intensive processes to workers to maximize the throughput of the overall system. If we apply this method to the Internet and intranets, where the "worker" machines are in the thousands or even in the millions [14], the master server will be overloaded by the information coming from the workers, or it will be unable to obtain information in real time. The net effect in both cases is that the master server is unable to level the load of the workers. Besides, if we were to attempt serving all users in the Internet or in one intranet, the administrative cost (i.e., addition and deletion of users and computers) will become prohibitive.

Looking at the heterogeneity aspect, gathering only the load information is not sufficient for load balancing in the Internet or in an intranet. The information needed to enable processes to be distributed to workers includes:

the kind of operating system (and list of commands available) on each worker's machine,
the bandwidth and latency between the worker and the user,
the worker machine's ability (or access rights) to read and write other workers' files, and
the charge if running a process on the worker is not free (for example, if we attempt to use a parallel machine in a supercomputing center).

Note that some of this information is determined only when a user's identity and location, along with the process to be executed, are given.

Looking at the autonomy aspect, even if technically possible, a central server for load balancing in the Internet or in an intranet is not realistic because of the financial and/or political bounds in these wide-area networks. It is natural for a company to disallow outside users to dispatch their processes to machines owned by the company. We thus have to consider that in the current computing environment, the set of "available" computers differs from user to user.

3. The Unified Computing Environment: the Basis for the Processor Pool

Our proposed middleware framework for wide-area and large-scale distributed systems is called the Unified Computing Environment (UCE for short). A UCE consists of a set of cooperating user-level daemons called UCE Manager (UCEM) running on multiple computers connected to a network: these UCEMs integrate the underlying computing resources (files, processors, etc.). Each set of UCEMs thus works like the distributed operating systems first explored in the mid 80's (for example, the V distributed system [5], Amoeba [15], Sprite [16], and Accent [17]). As shown in Fig. 1, the core features of the UCE are as follows.

Each user and server can construct their own distributed system by running an UCEM on each computer they are allowed to use. This enables them to transparently utilize the underlying computing resources and thus balance the load. In Fig. 1, applications A and B, executed by different users, are both running on computer 2, but on top of a different UCE. These applications are sharing the computing resources belonging to computer 2 without realizing that they are doing so.
If two applications on top of different UCEs start communicating with each other (for example, one establishes a TCP connection to the other), the corresponding UCEMs also start cooperating. These two applications can thus utilize computing resources available to the other application. In Fig. 1, if applications A and B start communicating, two UCEMs on computer 2 exchange information about their computing resources. Applications A and B can then utilize the computing resources belonging to computers 1 to 3. When the communication finishes, these UCEMs discard the exchanged information, reforming the processor pools to the original ones.

Fig. 1. Concept of Unified Computing Environment (UCE).

3.1. Basic Architecture of Processor Pool

Our facility for providing processor pools for WWW- and Java-based applications is based on the UCE. As shown in Fig. 2, a proxy server runs on top of each UCE, and a Web browser (even a commercial browser with no modification) is configured to use the proxy server as its proxy. This proxy server acts as a regular HTTP proxy server, except that it transforms Web pages written in HTML into pages that make the browser communicate with Java programs running on remote computers.

Fig. 2. Architecture of processor pool and mechanism
for balancing load of Java applets.

3.2. Usage of Processor Pools for WWW/Java-based Distributed Applications

In Fig. 2, a browser user can use the processor pool consisting of computers 1 and 2 for Java applets. When the browser sends a request for a Web page to a Web server httpd, the request is first received by the proxy server, which transfers it to the specified Web server (step 1). The httpd returns the Web page to the proxy server (step 2). The proxy server examines the Web page (written in HTML) and if it contains Java applet tags, the server transforms them into tags that invoke a plug-in software module described later. The server then sends the modified Web page to the browser (step 3). The proxy server then requests the Java bytecodes specified in the Web page and receives them from the httpd (step 4). The proxy server then determines on which computers these bytecodes should be executed. This is done by referring to the information maintained by the UCEM. After determining the computers, the proxy server invokes the required Java execution environments (e.g., a Java interpreter like AppletViewer) on the selected computers and sends the Java bytecodes to the computers (step 5).

It is easy to utilize computers shared by many users. As shown in Fig. 3, a proxy server 2 is running on top of an UCE, forming the processor pool consisting of computer 2 (and more) shared by many users. A browser and its proxy server 1 are running on computer 1 which is the private resource of the browser user. When the browser sends a request, it is sent to the server httpd through the proxy server 1 and 2 (step 1). Because this makes the server 1 start communicating with the server 2, the two processor pools are temporarily merged to one consisting of computer 1, 2 and more. The httpd then returns the Web page to the proxy server 2 which transfers it to the server 1 without transformation (step 2). Subsequent steps 3 to 5 are followed as in Fig. 2 except that the proxy server 1 selects computers from the merged processor pool. Other users, whether each has his or her own resource or not, can utilize computer 2 and more by following the same steps.

Fig. 3. Utilizing processors shared by many users.

It is also easy to share processors between clients and servers. As shown in Fig. 4, for example, a server httpd is running on computer 2, along with a proxy server for the server httpd, forming a processor pool consisting of computer 2. A browser and its proxy server are running on computer 1. When the browser sends a Java program as a request to server httpd, the request is first received by the proxy for the browser (step 1). This proxy transfers the request to the proxy for the httpd (step 2). Once the proxy for the browser establishes a connection to the proxy for the httpd, the two UCEMs on computers 1 and 2 start to cooperate (step 3). This enables the proxy for the httpd to utilize computer 1. If the proxy for the httpd decides to execute the Java program on computer 1, it invokes an execution environment for the program (step 4). Subsequent interaction between the running Java program and the httpd takes place through the network.

Fig. 4. Sharing processors between client (browser) and server (httpd).

In these cases, access controls over processors remain reasonable. Note the following points:

The proxy for each user cannot utilize processors not allowed to use because he or she can execute UCEMs only on computers he or she has a permission.
Proxy servers can utilize other processor pools only if they are communicating with another server running on top of a different processor pool.

4. Implementation

In this section, we describe the implementation of the UCE Manager (UCEM) and the plug-in software module.

4.1. UCE Manager

As shown in Fig. 5, in the current implementation, each UCEM has a resource table, containing the following information on computers inside its processor pool:

the current load on the computer and
the network addresses of any cooperating UCEMs.

Because UCE constructs a per-user environment for load balancing, including more information in the UCEMs poses no problem. For example, future UCEMs could maintain such additional information as the kind of operating system, the bandwidth and latency between the worker and the user, the worker machine's ability to read and write other workers' files, and the charge for the worker. Each of the cooperating UCEM updates its own resource table by multicasting new load information (if any) to all other cooperating UCEMs, referring the resource table in itself.

Because there may be multiple UCEMs that are not cooperating with each other running on the same computer, another daemon, called the Local Monitor (LM), runs on each computer, to provide the load information for that computer. This keeps the load information for all UCEMs running on that computer the same. When a UCEM knows that the load on the computer will change (because the UCEM has invoked a Java execution environment on the computer), it notifies the LM of the load change (step 1), and the LM multicasts it to all UCEMs running on the same computer (step 2).

Fig. 5. Current implementation of UCEM.

Two processor pools can share each other's resources by exchanging their resource tables. If a proxy server establishes a TCP connection to another proxy server, it gets the resource table from the corresponding UCEM and send it to the other proxy server. It then receives the resource table from the other server and sends it to the corresponding UCEM. The UCEM then temporarily appends the received resource table to its own resource table.

4.2. Browser Plug-In

When balancing the load of multiple Java applets in a single Web page (as in Fig. 2), a plug-in [8] module is used to let the browser communicate with Java applets on remote computers and thus let the browser user interact with the applets. This module acts as a proxy for the user interfaces: it receives bitmap images from the actual Java programs and transfers them to the underlying window system, and it takes the mouse and keyboard input and transfers it to the Java programs. As described above, the proxy server shown in Fig. 2 does some transformation of the HTML received from Web servers in order to invoke the plug-in module. What the proxy server really does is replace HTML '<APPLET>... </APPLET>' tags with '<EMBED> ... </EMBED>' tags. The invoked modules listen to TCP ports, and wait for actual Java applets to start running and connect to the port. The port number is transferred from the invoked modules to the proxy server, then to the invoked Java programs.

4.3. Current Status

We have almost completed the UCEM and plug-in modules. The current prototype is being built on UNIX desktop workstations, with a cluster parallel machine as a processor pool. The prototype includes a set of UCEMs on these hardware platforms, a WWW proxy server that communicates with the UCEMs and distributes the Java applets, a small program that captures the visual output of the Java applets, and a Netscape plug-in module that accepts output from remote applets and displays it. Fig. 6 is a sample screen snapshot of our current prototype. The browser displays five Java applets, each of which runs on a different processor in the processor pool and computes a 3D image of a rotating DNA helix.

5. Conclusion and Future Work

Our proposed facility provides flexible processor pools for WWW and Java programming environments. Load balancing by forming and utilizing processor pools will become an important solution to the performance issues currently facing WWW and Java applications. The facility described in this paper will work for complex WWW- and Java-based applications because of its minimum administration cost and overhead, and because its flexibility maximizes the availability of processors by sharing resources between clients and servers.

Currently we are working on an advanced prototype, but there are many issues still remaining to be solved. One of them is security. In particular, authentication facility suitable to UCE is needed.

Also needed is a quantitative study of administrative overhead this facility involves. We believe that the cost of this overhead will be negligible, but this conclusion is based only on qualitative considerations. It remains to be proven in a real-world environment.

Fig. 6. Screen snapshot of WWW browser displaying
output of remote Java applets.

References

[1] T. Berners-Lee, A. Cailliau, A. Luotonen, H. Nielsen, and A. Secret, "The World-Wide Web," Comm. ACM, vol. 37, pp. 76-82, 1994.

[2] J. Gosling and H. McGilton, "The Java Language Environment, a White Paper," May 1996, <http://java.sun.com/doc/language_environment/>, (10 Dec. 1996).

[3] Sun Microsystems Inc., "Joe: Developing Client/Server Applications for the Web," <http://www.sun.com/solaris/neo/whitepapers/Joe-wp-new.html>, (10 Dec. 1996).

[4] D. B. Lange and D. T. Chang, "IBM Aglets Workbench," Sep. 1996, <http://www.ibm.co.jp/trl/aglets/whitepaper.htm>, (10 Dec. 1996).

[5] R. Ben-Natan, "CORBA: A Guide to Common Object Request Broker Architecture," McGraw-Hill, 1995.

[6] M. J. Litzkow, M. Livny, and M. W. Mutka, "Condor - A Hunter of Idle Workstations," Proc. 8th Int'l Conf. on Distributed Computing Systems, IEEE, pp. 104-111, 1988.

[7] D. R. Cheriton, "The V Distributed System," Comm. ACM, vol. 31, pp. 314-333, 1988.

[8] Netscape Communications, Co., "The Plug-in Developer's Guide," <http://home.netscape.com/eng/mozilla/3.0/handbook/plugins/pguide.htm>, (12 Jul. 1996).

[9] N. G. Shivaratri, P. Krueger, and M. Shinghal, "Load Distributing for Locally Distributed Systems," IEEE Computer, vol. 25, no. 12, pp. 33-44, 1992.

[10] J. M. Smith, "A Survey of Process Migration Mechanisms," ACM Operating Systems Review, vol. 22, no. 3, pp. 28-40, 1988.

[11] R. Alonso, D. Barbara, and H. Garcia Molina, "Data Caching Issues in an Information Retrieval System," ACM Transactions on Database Systems, vol. 15, no. 3, 1990.

[12] IBM Corporation, "What's new in LoadLeveler 1.2.1 -- technical summary," Sep. 1996, <http://www.rs6000.ibm.com/software/sp_products/llnews.html>, (10 Dec. 1996).

[13] Hewlett-Packard Inc., "HP Task Broker for HP 9000 Servers and Workstations Version1.2," Oct. 1995, <http://www.hp.com/wsg/ssa/task.html>, (10 Dec. 1996).

[14] Network Wizards Inc., "Internet Domain Survey, July 1996," Jul. 1996, <http://www.nw.com/zone/WWW/report.html>, (10 Dec. 1996).

[15] S. J. Mullender, G. van Rossum, A. S. Tanenbaum, R. van Renesse, and H. van Staveren, "Amoeba -- A Distributed Operating System for the 1990s," IEEE Computer, vol. 23, no. 5, pp. 44-53, 1990.

[16] J. Ousterhout, A. Cherenson, F. Douglis, M. Nelson, and B. Welch, "The Sprite Network Operating System," IEEE Computer, vol. 21, no. 2, pp. 23-36, 1988.

[17] R. F. Rashid and G. Robertson, "Accent: A Communication Oriented Network Operating System Kernel," Proc. 8th Symp. on Operating System Principles, pp. 64-75, 1981.

Return to Top of Page
Return to Posters Index

A Facility Providing Flexible Processor Pools for WWW/Java-based Distributed Applications