Journal reference: Computer Networks and ISDN Systems, Volume 28, issues 7–11, p. 1105.

Enhanced Graph Models in the Web: Multi-client, Multi-head, Multi-tail Browsing

Michael Capps, Brian Ladd, P. David Stotts
University of North Carolina at Chapel Hill

Abstract

One key to the wide and rapid acceptance of the World Wide Web is the simplicity of its model. We see this in its short-lived connections and generally stateless servers, as well as in the relationship between nodes (HTML pages) and embedded links. Though these factors contribute to the Web's success, they also come at a cost: precise control of how documents are presented to the user is beyond this basic model.

Richer graph models permit authors to "program" the browsing behavior they want readers to see by turning the hypertext into a hyperprogram with specific semantics. Multiple browsing streams can be started under the author's control and then kept in step through the synchronization mechanisms provided by the graph model. Our current work adds a Semantic Web Graph Layer (SWGL) which allows dynamic interpretation of link and node structures according to graph models.

As a motivating example of the utility of the SWGL, we have chosen to implement the graph model for Colored Petri Nets (CPNs). The previous MMM project [LC95] implemented a limited subset of the Petri net model to give Web authors the ability to control concurrency and synchronization in a single reader's browsing session. CPNs extend this protocol to give control of multiple readers in a like fashion.

This paper details the SWGL and its architecture, some sample protocol implementations, and the latest extensions to MHTML [LC95] that were necessary to support these enhancements.

Keywords

Hypertext, hypermedia, WWW, World Wide Web, Petri nets, collaboration, CSCW, graph semantics, protocol, MMM, SWGL, HTML extensions

Introduction

Precise control of how documents are presented to the user is beyond the basic model of the Web. One can see the problem, and its ad hoc solutions, at many levels. The use of monolithic documents and linear linking to ensure a certain browsing path is used at the expense of clarity, and indeed, loss of the motivating concept of hypertext. Complex server workarounds are built to allow storage of state in a system with a stateless communication protocol, and recently we see plug-in technology to render pages in more complete description formats than HTML. Often only extensions to Web protocols permit authors greater control over readers' browsing options [LC95].

Richer graph models permit authors to "program" the browsing behavior they want readers to see by turning the hypertext into a hyperprogram with specific semantics. Multiple browsing streams can be started under the author's control (not unlike the Netscape "FRAME" extension) and then kept in step through the synchronization mechanisms provided by the graph model.

Our previous work with Multi-head Multi-tail Mosaic [LC95] extended the Web graph model to include the programming power and semantics of parallel finite automata (PFA). This current work extends the browsing streams which can be synchronized beyond those of a single user. Authors can now write documents which keep multiple users' browsing paths synchronized.

Coordinated multi-user browsing streams have obvious utility. A trivial example is a two-player, head-to-head game that requires synchronization between the browsing streams of two separate processes. This behavior can be specified with a simple multi-user graph model such as Colored Petri Nets (CPNs).

Rather than just increase the power of a single extended graph model, our current work also focuses on inserting a semantic graph layer in the browser. The layer provides an API for addition of new graph models, which allows the browser to dynamically reinterpret links and nodes within a new model. The author may suggest, and the reader may choose from, multiple graph models such as PFAs, monochromatic Petri nets, CPNs, And/Or graphs, the Dexter Hypermedia model [HS94], and so forth.

The remainder of this paper is laid out as follows:

Related Work - Addresses work in the hypertext/hyperprogramming field.
Semantic Web Graph Layer - Explains the graph layer concept and its architecture.
Multi-Client Models - Discusses how multi-client graph models are handled and presents a scenario to show their usefulness.
Implementation - Explains how HTML and browsing clients were extended to handle the semantic graph layer.
Conclusion and Future Work - Examines future research directions and conclusions gathered so far.

Related Work

From Vannevar Bush [B45] and his trail-blazers, to collaborative Web surfing projects such as the Sociable Web [DR94] and collaborative authoring tools such as CoReview [MA-W95], the concept of distributed cooperation using hypertext has a long history. These projects attempt to permit cooperation among users in annotating hypertexts, to enable "shoulder-surfing" (in which the contents of one reader's browser reflect that of another's), and the collaborative authoring of hyperdocuments. Our current work supports a more extended model of collaborative reading than shoulder-surfing, permitting document authors to describe the interaction they desire multiple readers to have when browsing their hypertexts.

Permitting authors to script a reader's interaction with a hypertext has been tried before in different ways [T88, Z88]. The current work is similar in approach to Trellis [SF89] where hypertexts are designed as Petri nets with specific browsing behaviors programmed into the network. The act of browsing is then the act of marking the net and firing activated transitions. Our graph layer generalizes this approach by offering authors a number of options for how their nodes and links are to be interpreted.

Our work offers a way to provide multi-head and multi-tail links. One recent non-standard Netscape extension to HTML, Frames [N96b], offers multi-tail capabilities. The functionality of frames is subsumed by our work; we have more complete synchronization between multiple browsing streams and can coordinate the browsing of multiple readers.

Semantic Web Graph Layer

Figure 1: SWGL architecture

Description and Motivation

The Semantic Web Graph Layer (SWGL) is an interpretive layer which filters the interaction between the reader and the content of the Web. This is where the various graph models mentioned above are implemented (see Figure 1). This section will describe the graph layer in greater detail, with an eye toward architecture, its interface, and its extensibility.

The SWGL interprets an extended version of HTML (see Language Extensions below), filtering the interaction between the reader and the Web content in the context of the current graph model. This means nodes and links can be dynamically interpreted as if they were part of a monochromatic Petri net, a parallel finite automata, or even part of the Web graph model. Setting the current graph model is usually handled by the document's author though these settings can be overridden with the SWGL user interface.

What we are calling graph layer protocols are descriptions of specific graph models combined with methods for properly rendering pages in that graph model for the reader. For example, the parallel finite automaton modeled in the original MMM project required different text colors to represent the difference between active and inactive transitions; a PFA protocol would combine that sort of information with a specification of PFA semantics.

This extension of the MMM architecture yields great flexibility for graph theory research. Significant work has already been invested in analyzing graph semantics, and a user-definable, dynamically-switchable interpretive layer offers a significant tool for such research. As well, the SWGL has a straightforward API which facilitates trial specification and implementation of experimental graph protocols. The original MMM project's PFA protocol was based on a specific driving problem in computer aided instruction, as is the CPN for this project, and we believe both graph protocols have proven particularly useful.

The SGWL allows us to combine features of existing semantics with others to investigate additional possible protocols. And in addition, the SWGL was implemented so that its use requires only a version of Netscape, a popular web browser publicly-available on a variety of computing platforms. It is hoped that this easy availability will allow the SGWL to be used as a tool to catalyze graph research projects that we have not envisioned.

Architectural View: Filter Stream

Freely combining graph semantics uses the fact that the SWGL is implemented as a stream of filters which interpret extended HTML as graph content. Filters can be composed, the exact order and number of filters being the specification of a particular graph layer protocol (see Figure 2). Protocol filter streams have been explored in other collaborative tools such as XPEL [M93].

As an example of the power of filter composition, consider that the difference between PFAs and Petri nets is that PFAs do not keep track of multiple markings of a given place; additional tokens are simply discarded. Given the specification of a PFA graph layer protocol, the Petri net protocol requires only the addition of a "token counting" filter to the filter stream implementing the PFA graph layer protocol.

Figure 2: Filter streams implementing graph protocols

We exploit the generally hierarchical nature of the relationship between graph semantics to allow rapid extension of available protocols. The above graphs, as well as CPNs, all share a basic digraph with active/inactive transitions and marking modes. Adding protocols for completely unrelated graph types would still require the addition of a great many filters, but that addition in turn gives a starting point to achieve all protocols which are closely related in the graph hierarchy. In general, the addition of each filter will nearly double the number of available protocols.

Application Programmer's Interface

The Application Programmer's Interface (API) was designed with this extensibility in mind. The entire SWGL is implemented in the object-oriented C++ language, which allows flexible inheritance for filter types in new protocols and convenient re-use of existing protocols. The API is also complete with online documentation, in hopes that other users of the system will be able experiment with protocols with a minimum of programming expense.

Currently, the API's function categories include groups for tasks such as HTTP communications, MHTML parsing, window management, and protocol interpretation (e.g. token counting). The details of specific functions are not within the scope of this paper, but the documentation is included in the public release of the SWGL package.

Multi-Client Models

Concurrent browsing, having multiple browsing paths active at the same time, suggests a logical next step of synchronizing multiple readers using multiple browsers. This addition of synchronization between multiple, remote browsers is an important step in this research and that protocol was a primary motivation for the construction of the graph layer.

Petri Nets

Petri nets are a model of computation used to describe parallel processes. Petri nets are bipartite graphs with place nodes and transition nodes connected by links. Links are constrained to connect only either a place to a transition or a transition to a place. Places can contain tokens, which are markers representing the state of the computation. In a traditional monochromatic Petri net, when all of the places pointing into a transition contain at least one token, the transition is active. Computation in the Petri net proceeds by firing an active transition, removing one token from each of the incoming places and adding one token to each of the outgoing places from the fired transition. Figure 3.a shows a simple Petri net; the transition is active because the incoming places are all marked.

This model of computation can be applied to browsing a hypertext [SF89] document. In Web terms: a page is a place, a page is made visible in our browser when it has at least one token, and our specialized Multi-HTML constructs are transitions, complete with the list of links into and out of the transition. The previous work used a slight simplification of this interpretation, parallel finite automata, that are like Petri nets except that they do not track any more than a single token in a place.

An interesting extension to monochromatic Petri nets is the Colored Petri Net (CPN). Here there are different classifications of tokens, known as colors, and transitions can have more complicated formulae for activation such as "A token in each of the two input places, but they must not be of the same color." The transition in Figure 3.a would not be active if this rule were added, but the transition in Figure 3.b can be fired because the colors of the tokens are different. In Figure 3.c the rule is modified to make the color requirements constant, so the transition is no longer active. Our enhanced browser interprets each person in a session (see Sessions below) to use different-colored tokens. Hence the author can specify the desired browsing semantics for a group by describing the transition activation rules.

Figure 3: Petri net transitions

Sessions

Sharing running applications is a common but difficult problem in distributed computing [H92]. Sharing can be done by sharing the view of a single running program [MA-W95] or by keeping two copies of a program synchronized on equivalent input streams [DR94]. We use the second approach, keeping the multiple browsers synchronized by pointing them at appropriate URLs in the Web space.

A standard problem in synchronous collaboration is "finding" others to collaborate with. Our approach is to group users into collaborative sessions; sessions are maintained by a centralized session manager. A session, then, is a group of running browsers which can communicate among themselves and be synchronized. The session manager handles the allocation of token colors to the various browsers, listing members of a session, and changes to that list during the session. It is implemented as a centralized database server through which the group members communicate distributed events. The manager linearly orders those events to prevent users from taking conflicting actions. Each action that could change the state of the collaboration is checked and ordered, and only if the server deems it acceptable is it then broadcast and applied. Though this causes a minor delay between the time a user clicks on a link and the time the display is updated, we rely on the fact that users of the Web are quite accustomed to such delays and that communication delay may in fact be subsumed by page fetch time.

Scenarios

Given the above descriptions, we will now revisit the simple two-player game example from the introduction. This illustration is sufficient to show how the system works with CPNs. Our research is aimed, however, at exploring how this system can be used in classroom-based computer-aided instruction situations. The Trellis [SF89] project has shown how CPNs can be used to program hypertexts for use in moderated meetings, more complex gameplaying, and other learning tasks. The graphs for those tasks, however, are too complicated to be instructive about CPNs within the scope of this paper.

The CPN for a simple game is shown in Figure 4. Places are shown as circles, transitions as rectangles, and links leading from a place to a transition are marked with the rules for activating that transition. The variables on a transition can be matched with any color (or color constants can be added to the expressions to bind one of the colors to a particular color).

When two potential opponents in the same session enter the Find Opponent page, the Start Game transition becomes active and is displayed as a click-able hyperlink in both browsers. Either opponent clicking it will advance both to the Take Turn page; note that this does mean that the behavior of one user affects the state of another. As each player commits his move, she clicks on the Finish Turn transition and advances to the Awaiting Next Turn page. When both are finished, the Start Next Turn transition becomes active and they can advance. Note that an extension to CPN called timed CPNs (similar to the Netscape client-pull constructs [N96a]) which would fire the activated Start Next Turn transition automatically. The Trellis project contained timing on arcs [SF90, SF91], but we have not yet implemented timed CPNs for the MMM project.

Figure 4: CPN for two-player game

Implementation

Programming

The first MMM project's most basic requirement was that it be easy for users to integrate into their systems. In order to display even the standard Web graph differently, changes to the Web client were unavoidable. However, it was possible to avoid altering the server by keeping the additional multi-link information in the HTML pages. This had the added benefit of keeping link control in the hands of the author, the original goal of the project, rather than in server functions. The great majority of active web browsing software was downloaded from the Internet, so we believed that users might be comfortable downloading our modified client.

Since the time of previous implementation, advancements in browsers have allowed us to do our window management using standard HTML techniques rather than client modifications. The Frames syntax developed by Netscape allows our specially-prepared pages to have controls for history-list traversal and for each content node to be displayed in a distinct "window". The SWGL has been developed as a proxy server rather than a client modification; with caching disabled, all HTTP requests from the client must go through the proxy, and there the SWGL returns the proper pageset to the browser. The SWGL proxy server was designed to be compatible with Netscape 2.0 beta (and now 2.0), and it is currently compiled to run on various popular Unix platforms. Versions for other platforms, and specifically Microsoft Windows 95, are planned.

We knew that the multi-client protocols would require session management. With the requirement that there be no server modifications, our options were reduced to finding a broadcast communications network across the clients. We experimented with a star communications topology, but eventually found that having the first client start a special server was more feasible. Though this creates a communications bottleneck, we envision our collaborative protocols to be used by either very small groups of two to four researchers or classroom environments in which we would expect the instructor to have a higher-performance workstation on the same local network as the students' machines.

HTML Language Extensions

Multi-link: <MHMT [gr] [tx | dtx | tx dtx]> body </MHMT> gr: GRAPH="text" tx: TEXT="text" dtx: DISPLAY_TEXT="text" text: any plain text string body: (text | iref | anchor)* iref: <IREF HREF="url" [act]> act: ACTIVATE=text anchor: <A HREF="url"> text </A> url: Universal Resource Locator e: Empty string

The previous implementation of MMM introduced MHTML[LC95], or Multi-HTML. MHTML adds the <MHMT> (Multi-Head, Multi-Tail) environment tag to standard HTML [H95]. The <MHMT> tag encodes a transition in the contents of an HTML page; it also includes information on how to display the transition. There are attributes for the text to render and what to display on the status line when the pointer passes over the transition. There are also lists of the transition's incoming and outgoing links.

MHTML is designed to permit rational behavior in browsers that do not have a SWGL addition. This takes advantage of the general behavior that browsers ignore tags they fail to recognize. A browser without a SWGL will ignore the <MHMT> and <IREF> tags, rendering only the <A HREF="..."> tags; the MHTML transition is replaced with a set of hyperlinks corresponding to the outgoing links from the transition.

Two major additions to MHTML v2.0 are the GRAPH attribute on the <MHMT> tag and the ACTIVATE attribute on the <IREF> tag. The GRAPH attribute specifies what graph model intended the current multi-link to be browsed with. This can be overridden by the reader; we currently depend on social protocols to "enforce" the intended browsing semantics.

The ACTIVATE attribute specifies the activation conditions for a link leading into a transition (that is the relationship between colors of tokens which must be on that page and the color of the current browser). A simple set of boolean connectives is supported with color constants, variables which are any color but the current browser's color, and the current browser's color ("current"). The default value for ACTIVATE is "current", making a default CPN behave as a monochromatic Petri net.

Conclusion and Future Work

As in the previous paper, we end with a look to the future. Because this project is unsupported research, we have no specific plans for future work; we do have some new research directions, some of which are detailed below. Current information about the MMM project is available at http://www.cs.unc.edu/~stotts/MMM/.

Authoring Environment

The hardest part of using the extended graph models is writing the complex link and node structures. In addition to choosing a suitable graph protocol the author must manually insert the appropriate MHTML link constructs in each page of the document. Automated authoring support is necessary to make these extensions conveniently usable, especially for making graphs that are interesting for more than a single protocol.

Authoring support should provide an overview of the graph structure which can be edited to add nodes, links, and transitions as well as an editable view of the contents of the various nodes. The editor should produce navigable MHTML files with appropriately structured links. The Trellis [SF89] project has a Petri net editor, xTed, which has already implemented many of the visual aspects of such a graph editor. We hope to leverage xTed by adding MHTML export filters so we can have a graph editor as soon as possible.

More protocols

In addition to the editor, another next step for the research is taking advantage of the general graph layer by developing further graph protocols. Timed CPNs and And/Or graphs are obvious choices though we hope to spend some time looking at novel combinations of our various filter capabilities. We also expect to provide these graph protocol layers to user text groups so we can study the real-world implications of the additional semantic power of the various graph models.

Conclusion

To conclude, we feel that the additional semantic power offered by further graph models gives authors more precise control over how their documents are browsed. The Semantic Web Graph Layer offers a convenient abstraction for providing graph model controls, as well as a convenient architecture for implementing such. Though we have not as yet fully explored the potential of the SWGL, we feel that the multi-client requirements of the Colored Petri Net protocol was an adequate test of the architecture. As well, the CPN protocol provides the semantic control needed for specific situations, such as moderated meetings or group browsing sessions, that have often been solved in an ad hoc and limited manner. Initial studies show that the system is well-suited to the classroom environment for which is intended--as well as to multi-player Web games.

Acknowledgements

We would like to thank members of the University of North Carolina at Chapel Hill Computer Science Graduate Department, specifically those enrolled in the Software Engineering course in the 1995-1996 year. These students did the bulk of the programming work on the Semantic Web Graph Layer and the CPN protocol, as well as often giving valuable input into the design and research process. Though they are too many to credit directly for the paper, each of them deserves that distinction. Team members were Lars Bishop, Gentaro Hirota, Jayant Kolhe, Jason Smith, Narendra Tulpule, and Jason Wilson.

References

[B45]: V. Bush, "As We May Think," Atlantic Monthly, July 1945, pp 101-108.
[DR94]: J. Donath and N. Robertson, "The Sociable Web," Electronic Proceedings of the "Second World Wide Web Conference '94: Mosaic and the Web", Chicago, IL, October 1994.
[HS94]: F.G. Halasz, and M. Schwartz. "The Dexter hypertext reference model," Communications of the ACM 37,2 (Feb. 1994), pp. 30-39.
[H92]: R. Hill, "Languages for the Construction of Multi-User Multi-Media Synchronous (MUMMS) Applications," Languages for Developing User Interfaces, Jones & Barlett Publishers: Boston, 1992, pp. 125-143.
[H95]: HTML 2.0 Proposed Standard (RFC 1866).
[LC95]: B. Ladd, M. Capps, P.D. Stotts, R. Furuta. "Multi-Head Multi-Tail Mosaic," Proc. of the 4th Annual World Wide Web Conference, Boston, MA, December 1995, pp. 433-440.
[M93]: J. Menges. "The X Engine Library: A C++ Library for Constructing X Pseudo-servers," Proc. of the 7th Annual X Technical Conference, Boston, MA, January 1993.
[MA-W95]: K. J. Maly, H. Abdel-Wahab, et. al., "Mosaic + XTV = CoReview," Electronic Proceedings of the "Third World Wide Web Conference '95: Technology, Tools and Applications", Darmstadt, Germany, April 1995.
[N96a]: Netscape Communications Corporation, "AN EXPLORATION OF DYNAMIC DOCUMENTS."
[N96b]: Netscape Communications Corporation, "FRAMES: AN INTRODUCTION."
[SF89]: P. D. Stotts and R. Furuta, "Petri Net Based Hypertext: Document Structure with Browsing Semantics," ACM Trans. on Information Systems, vol. 7, no. 1, January 1989, pp. 3-29.
[SF90]: P. D. Stotts and R. Furuta, "Temporal Hyperprogramming," Journal of Visual Languages and Computing (Academic Press), vol. 1, no. 3, September 1990, pp. 237-253.
[SF91]: P. D. Stotts and R. Furuta, "Dynamic Adaptation of Hypertext Structure," Proc. of Hypertext '91 (ACM), December 15-18, 1991, San Antonio, Texas, pp. 219-231.
[T88]: R. Trigg, "Guided Tours and Tabletops: Tools for Communicating in a Hypertext Environment," ACM Transactions on Office Information Systems, vol. 6, no. 4, October 1988, pp 396-414.
[Z88]: P. Zellweger, "Active paths through mulitmedia documents," Document Manipulation and Typography, J.C. van Vliet, Ed. Cambridge University Press, New York, 1988, pp 19-34.

About the authors

Michael Capps is a doctoral student in Computer Science at the University of North Carolina at Chapel Hill, where he also received his B.S. with honors in Computer Science and Creative Writing. His research area is techniques in engineering collaborative virtual reality environments, but enjoys the WWW and hypertext immensely in his off hours. He has also been a WWW site administrator at UNC for three years, and founded a WWW publishing company long before anyone was willing to pay for that service. He is currently funded by an NSF Graduate Research Fellowship.

Brian Ladd
No biographical information available.

David Stotts
No biographical information available.