HAWA: A Client-side Approach to High-Availability Web Access


Yi-Min Wang
AT&T Labs, Research
ymwang@research.att.com
P. Emerald Chung
Bell Laboratories
Lucent Technologies
emerald@bell-labs.com
Chih-Mei Lin
AT&T Labs, Research
cmlin@research.att.com
Yennun Huang
Bell Laboratories
Lucent Technologies
yen@bell-labs.com


Abstract

In this paper, we describe a client-side applet-based approach, named HAWA, to improving the availability and quality of Web accesses. HAWA allows users to bookmark any HTTP requests, organize them into groups of equivalent services, and invoke the services through the group names. With the built-in mechanisms for automatic retry and parallel accesses, HAWA can mask access failures, provide fast responses, and present multiple responses in a customizable fashion. Several examples are used to demonstrate the practical usefulness of this approach. An implementation using applet-based filtering is described.

1. Introduction


The World Wide Web has become a primary source of information for our daily lives. We store in our browser bookmark files those URLs that we frequently access for information such as stock quotes, news, weather forecasts, product prices, etc. With the explosive growth of popularity of the Web, however, we may not be able to obtain the information we want within a reasonable response time either because the Internet is congested or because the Web servers are overloaded. It is therefore important to investigate the availability issues of Web accesses. Consider the scenario where you eagerly want to see the current prices of the stocks that you own, but the quote server is simply not responding. What can be done to improve such a situation?

From the client side, we can view the entire Web as a slow and unreliable information server with heavy redundancy. There are at least three things we can do to improve the quality of Web accesses. First, since server unavailability is sometimes a transient problem, automatic retries can relieve users of the frustration of having to repeatedly submit the same request and see the same error messages. Second, for any kind of popular information, it is almost guaranteed that there will be multiple Web sites providing that information. Automatically retrying another equivalent site when one site is not responding can greatly improve availability. Better yet, the response time can also be improved by issuing parallel accesses to multiple equivalent sites at the same time, and presenting to the user the first reply that comes back. Third, not every service unavailability can be automatically detected. For example, it is not uncommon for a Web server to reply with a normal HTML page containing arbitrary error messages. In this case, the best solution is to present to the user all the responses from multiple parallel accesses, and let the user decide which response to use. As demonstrated later, presenting side-by-side the responses from multiple equivalent sites also has many other advantages.

In this paper, we describe the design and implementation of HAWA (High-Availability Web Access), a client-side approach to providing high availability and ease of access. HAWA allows a user to organize URLs that provide similar information into a group. Once a group is specified, the user can then access the information using the group name, and select the capabilities of automatic retry or parallel accesses.


2. The Design of HAWA


HAWA consists of a registration applet, an access applet, and a few auxiliary HTML files. The target application environment is one in which Internet service providers bundle HAWA with the browser software that is sent to customers and installed on the customers' machines.

2.1. Registration

The first step in using HAWA is to access the registration page, which contains the registration applet. This page allows a user to create groups, add URLs to groups, and specify retry parameters. In addition, it supports three enhanced bookmark functions: POST request registration, variable substitution, and auto-scrolling.

2.2. Access

After registration, the user can go to the access page to invoke HAWA-enabled services. The access page consists of two frames: an access frame and a data frame. The access frame contains the access applet, which displays the existing groups, the URLs in the selected group, and the available access modes, as shown in the top frame in Figure 2. After the user clicks on the HAWA Access button, the access applet is responsible for sending out requests according to the specified access mode, and finally displaying the response in the data frame below the access frame. (In Figure 2, the data frame consists of the three lower subframes.) HAWA provides four basic access modes to address the issues discussed in the Introduction: Same-site retry, Sequential retry, Parallel-any, and Parallel-all.





Figure 3: Parallel-all access mode for on-line shopping.

3. Implementation


3.1. Applet-based request/response filtering

A basic mechanism used in the implementation of HAWA is to intercept and filter HTTP requests and responses. For example, a fragment tag needs to be inserted into a response page to enable auto-scrolling; any outgoing POST request may need to be captured for registration. A natural way to perform request/response filtering is to insert a proxy server that sits between the client and the server, and intercepts every HTTP request and response. An earlier version of HAWA was built using such a proxy-based implementation. It was later migrated to the current applet-based implementation for the following two reasons. First, HAWA is not intended for improving the availability of arbitrary HTTP requests. It is therefore desirable to activate HAWA only when the user requests HAWA-enabled services, without getting in the way of the user's other browsing activities. The applet-based implementation allows HAWA to be activated only when the registration or access page is being accessed. Second, since the target users are customers of commercial Internet access providers, it is much easier for them to go to the HAWA pages to access the service than to start a separate proxy server on their PCs.

Figure 4 shows the architecture for applet-based filtering used in HAWA. In the access page, the access applet starts a thread which opens a server socket (at port number 8282, for example) that listens for requests coming from the browser. Another thread sends out requests and filters responses according to the user selections. When the final response is ready, the applet invokes the showDocument() call with the URL argument http://localhost:8282/. The effect is that the browser will send a request to the server socket, and the applet then supplies the final response through that socket to be displayed in the data frame. If security is a concern, the URL can also contain a one-time password generated by the applet. When the server socket receives a connection request, it checks the client IP address and verifies the password to make sure that it is indeed the containing browser making the request.
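The check performed by the server socket can be sketched in Java as follows. This is only an illustration under stated assumptions, not the HAWA source: the class name LocalResponseServer, the method names, and the choice to carry the one-time password as the URL path are all hypothetical. The applet would pass a URL such as http://localhost:8282/ followed by the token to showDocument(), and then serve a single request.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical sketch: names and URL layout are assumptions, not taken from HAWA.
class LocalResponseServer {
    /** Generates a random 32-character hex token to embed in the localhost URL. */
    static String newToken() {
        byte[] raw = new byte[16];
        new SecureRandom().nextBytes(raw);
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    /** Accepts only loopback clients whose request line asks for "/" + token. */
    static boolean isAuthorized(InetAddress client, String requestLine, String token) {
        if (client == null || !client.isLoopbackAddress() || requestLine == null) return false;
        String[] parts = requestLine.split(" ");
        return parts.length >= 2 && parts[1].equals("/" + token);
    }

    /** Serves exactly one browser request with the filtered response, then returns. */
    static void serveOnce(ServerSocket server, String token, String html) throws IOException {
        try (Socket s = server.accept()) {
            BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.ISO_8859_1));
            String requestLine = in.readLine();
            // Drain the remaining request headers before answering.
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) { /* skip */ }
            boolean ok = isAuthorized(s.getInetAddress(), requestLine, token);
            byte[] body = (ok ? html : "Forbidden").getBytes(StandardCharsets.UTF_8);
            String head = "HTTP/1.0 " + (ok ? "200 OK" : "403 Forbidden") + "\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + body.length + "\r\n\r\n";
            OutputStream out = s.getOutputStream();
            out.write(head.getBytes(StandardCharsets.ISO_8859_1));
            out.write(body);
            out.flush();
        }
    }
}
```

Generating a fresh token for every response keeps the password one-time: a request that arrives with a stale or missing token is answered with 403 rather than the filtered page.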




Figure 4: Applet-based filtering. (The numbers in parentheses indicate the order of events.)

To register a POST request, the user first types in the URL of the site providing the POST form in a text field inside the registration frame. The registration applet calls showDocument() to ask the browser to fetch the URL and display it in the data frame. In addition, the applet opens a server socket at port number 8383 and gets ready to act as a temporary HTTP proxy to intercept outgoing requests. After the user fills out the form in the data frame, he changes the browser proxy setting by pointing the HTTP proxy to localhost:8383. When he hits the Submit button, the browser sends the entire request message to the proxy socket, where the applet receives it and displays it in a text field in the registration frame. The user can then click the Register button to save the request message together with the retry parameters in a group. If the user would like to use the variable substitution feature, he can edit the content of the request message before clicking the button. If desired, the same process can also be used to register GET requests.
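The capture step on the proxy socket amounts to reading one complete HTTP request message: the request line and headers up to the blank line, followed by a body whose size is given by the Content-Length header. The Java sketch below illustrates this under assumptions of our own. The class and method names are hypothetical, and it handles only requests that declare Content-Length, which is the usual case for form submissions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of capturing a browser request on the temporary proxy socket.
class RequestCapture {
    /** Returns the Content-Length declared in the request head, or 0 if absent. */
    static int contentLength(String head) {
        for (String line : head.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                return Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        return 0;
    }

    /** Reads one request (head plus Content-Length bytes of body) and returns it as text. */
    static String readRequest(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        final byte[] end = {'\r', '\n', '\r', '\n'};
        int matched = 0; // how many bytes of the blank-line terminator have been seen
        int b;
        while (matched < 4 && (b = in.read()) != -1) {
            buf.write(b);
            matched = (b == end[matched]) ? matched + 1 : (b == '\r' ? 1 : 0);
        }
        String head = new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
        int length = contentLength(head);
        byte[] body = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(body, off, length - off);
            if (n < 0) break; // connection closed early; keep what was received
            off += n;
        }
        return head + new String(body, 0, off, StandardCharsets.ISO_8859_1);
    }
}
```

The returned text is what would be shown in the registration frame for the user to inspect or edit before saving it in a group.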

3.2. Implementation of access modes

When one of the retry modes is selected, the access applet tries to open a socket connection to the requested site. It also starts a timer thread based on the user-specified timeout value. If the connection request results in an I/O exception or if the timer expires, another connection request is automatically initiated. When the applet successfully receives a response, it performs necessary filtering and supplies the response to the browser through the server socket. The Parallel-any access mode follows a similar procedure except that multiple threads are created, each starting a connection attempt to one of the equivalent sites. Upon the first successful connection, all the other threads are destroyed.
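The retry and Parallel-any procedures can be sketched with the java.util.concurrent utilities. This is a modern illustration under stated assumptions rather than the original implementation, which manages its own threads and timers. The Fetcher interface is a hypothetical stand-in for the socket code that contacts one site, and both methods expect a non-empty URL list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the retry and Parallel-any access modes.
class AccessModes {
    /** One access attempt against one site; assumed to throw on any I/O failure. */
    interface Fetcher { String fetch(String url) throws Exception; }

    /**
     * Tries each URL in order, for up to `rounds` passes, abandoning an attempt
     * when it fails or exceeds timeoutMs. A one-element list gives Same-site retry.
     */
    static String sequentialRetry(List<String> urls, Fetcher f, long timeoutMs, int rounds)
            throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (int r = 0; r < rounds; r++) {
                for (String url : urls) {
                    Callable<String> task = () -> f.fetch(url);
                    Future<String> attempt = pool.submit(task);
                    try {
                        return attempt.get(timeoutMs, TimeUnit.MILLISECONDS);
                    } catch (TimeoutException | ExecutionException e) {
                        attempt.cancel(true); // interrupt the stalled attempt
                        last = e;
                    }
                }
            }
            throw new Exception("all attempts failed", last);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Contacts all URLs at once and returns the first successful response. */
    static String parallelAny(List<String> urls, Fetcher f, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(urls.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) tasks.add(() -> f.fetch(url));
            // invokeAny returns the first result that completes without an exception.
            return pool.invokeAny(tasks, timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow(); // cancel the attempts that lost the race
        }
    }
}
```

Note that cancellation here relies on thread interruption, so a Fetcher blocked in plain socket I/O would also need a socket timeout to stop promptly.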

When the Parallel-all(f) option is selected, the applet first supplies the browser with a multi-frame HTML page with the number of frames equal to the number of URLs in the selected group. If auto-scrolling is not specified for a URL, the corresponding frame tag contains that URL, which is then directly fetched by the browser without filtering. If auto-scrolling is specified for a URL, the frame tag contains the file name of an empty page and defines the frame name. The applet fetches the URL, parses the response to find the user-specified keyword, and inserts an HTML fragment tag. It then issues a showDocument() call to ask the browser to overwrite the empty frame with the filtered response and scroll to the fragment tag. The Parallel-all(w) mode is implemented by calling showDocument() with undefined frame names.
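The two page manipulations behind Parallel-all(f) can be sketched as plain string operations in Java. The sketch rests on assumptions of our own: the class name, the frame names (hawa0, hawa1, ...), and the equal-height row layout are hypothetical, and the keyword search is a naive text match that does not check whether the keyword falls inside a tag or attribute.

```java
import java.util.List;

// Hypothetical sketch of the Parallel-all(f) page construction and filtering.
class ParallelAllPage {
    /** Builds a frameset with one equal-height row per frame source. */
    static String frameset(List<String> sources) {
        StringBuilder rows = new StringBuilder();
        for (int i = 0; i < sources.size(); i++) rows.append(i == 0 ? "*" : ",*");
        StringBuilder html = new StringBuilder();
        html.append("<html><frameset rows=\"").append(rows).append("\">\n");
        for (int i = 0; i < sources.size(); i++) {
            // Each source is either the site URL itself or an empty placeholder page
            // that is later overwritten with the filtered response.
            html.append("<frame name=\"hawa").append(i)
                .append("\" src=\"").append(sources.get(i)).append("\">\n");
        }
        return html.append("</frameset></html>\n").toString();
    }

    /** Inserts a named anchor before the first occurrence of keyword, if any. */
    static String insertAnchor(String page, String keyword, String anchorName) {
        int at = page.indexOf(keyword);
        if (at < 0) return page; // keyword absent: leave the page unchanged
        return page.substring(0, at) + "<a name=\"" + anchorName + "\"></a>" + page.substring(at);
    }
}
```

Asking the browser to show the filtered page at a URL ending in "#" plus the anchor name, targeted at the named frame, would then scroll that frame to the keyword.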

4. Related Work

Several techniques have been proposed to provide transparent fault tolerance and load balancing by using more than one machine to serve the same URL [1,4,7,9]. This is similar to the idea of Parallel-any. One difference is that these techniques are developed from the page providers' point of view, and the emphasis is on providing information from any of several hosts serving exactly the same content. In contrast, HAWA is developed from the users' point of view, and the emphasis is on obtaining information from any of several hosts serving approximately equivalent contents as defined by each individual user. Moreover, most of those techniques use server-side approaches, while HAWA is a client-side approach. The Smart Client scheme [9] has a similar flavor to HAWA's Parallel-any in that it also provides a Java applet at the client side to dynamically select one of several equivalent sites. However, it is still a service-centric approach: the service provider provides the applet, the name resolver within the applet uses a service-specific mechanism, and the load and fault tolerance information is also service-specific. In contrast, HAWA is a user-centric approach: the applets that HAWA provides are not tied to any specific services, and the group information is all defined by the user.

The Internet Engineering Task Force (IETF) has been working on the issue of location-transparency for many years. The Uniform Resource Name (URN) scheme has been introduced to provide a location-independent naming mechanism [10]. Again, the URN architecture mainly allows a service provider to change the mapping between URLs and the resource names. HAWA's approach can be viewed as providing a personalized URN scheme.

A study by Crovella and Carter [6] showed that the access latency to a given site is not strongly correlated to the physical location or to the number of hops between the client and the server. Given a list of similar services, dynamic selection based on polling in general outperforms static selection. This study confirms that using the Parallel-any access mode on a group of URLs in general provides a better response time than always accessing a particular URL.

Web sites that provide one-stop shopping have a similar flavor to HAWA's Parallel-all. For example, the Computer ESP Web site provides one-stop shopping for computer software and equipment [5]. It is implemented by indexing several on-line stores and providing a single search engine to the end users. The search engine is set up as a cgi-bin program at their Web site. Compared to HAWA, this kind of domain-specific cgi-bin-based approach to Parallel-all can usually provide a better integration of the responses from multiple sources. However, it is domain-specific and not generally applicable. Also, the availability of the Web site that provides such a service greatly affects the availability of information.

The paper by Ladd et al. [8] described MHTML as an extension to HTML for defining Multi-Head/Multi-Tail (MHMT) links. A multi-headed link points at multiple nodes all of which are opened in separate browser windows when the link is followed. This is similar to the Parallel-all(w) option. Compared to HAWA, the MHTML scheme provides Parallel-all for general HTTP links, instead of specific bookmarked groups. However, it requires an enhanced browser to understand the MHTML extension in order to provide this general functionality.

Finally, proxy-based services for providing value-added content transformations have been quite popular and successful [2,3]. As discussed earlier, HAWA was migrated from a proxy-based implementation to the current applet-based implementation because its focus is on providing value-added filtering only for user-specified URLs, and it needs to be tightly integrated with the browsers for the target application environment.

5. Conclusions

We have defined four access modes for obtaining information from groups of URLs that provide similar services. The Same-site retry mode is useful for masking transient access failures. The Sequential retry mode allows the users to specify a personalized list of backup sites for each type of information. The Parallel-any mode provides a single-service-name image for highly available and fast Web accesses. The Parallel-all mode presents multiple responses in a customizable way to enable the best interpretation of all available information. In addition, we have identified several enhanced bookmark functions such as POST request registration, variable substitution, and auto-scrolling, that further provide flexibility and convenience. We have implemented these functionalities in HAWA using an applet-based filtering architecture to provide ease of use through tight integration with the browsers.


References

[1] E. Anderson, D. Patterson, and E. Brewer, The Magicrouter, an Application of Fast Packet Interposing, USENIX Symposium on Operating Systems Design and Implementation (OSDI), 1996.

[2] C. L. Brooks, M. Mazer, S. Meeks, and J. Miller, Application-specific Proxy Servers as HTTP Stream Transducers, Fourth International World Wide Web Conference, 1995.

[3] C. L. Brooks, Wide Area Information Browsing Assistance Final Technical Report.

[4] Cisco Local Director, http://www.cisco.com/warp/public/751/lodir/index.html.

[5] Computer ESP Search site.

[6] M. E. Crovella and R. L. Carter, Dynamic Server Selection in the Internet, In Proceedings Third IEEE Workshop on the Architecture and Implementation of High Performance Communication Subsystems, Aug. 1995.

[7] T. Kwan, R. McGrath, and D. Reed, NCSA's World Wide Web Server: Design and Performance, IEEE Computer, pp. 68-74, Nov. 1995.

[8] B. Ladd, M. Capps, P. Stotts, and R. Furuta, Multi-Head Multi-Tail Mosaic: Adding Parallel Automata Semantics to the Web, Fourth International World Wide Web Conference, 1995.

[9] C. Yoshikawa et al., Using Smart Clients to Build Scalable Services, USENIX '97.

[10] C. Weider and P. Deutsch, A Vision of an Integrated Information Service.



