Using the Total Access System with the World Wide Web

Neil G. Scott

Senior Research Engineer
The Archimedes Project, CSLI
Stanford University, Stanford, CA, 94305

This presentation will discuss and demonstrate web access using the Total Access System developed by the Archimedes Project at Stanford University. This Total Access System makes it easier for disabled individuals to use any computer or computer-based device by clearly separating access requirements from applications. Individual access needs are handled by a personal "accessor" device which connects to any computer through a standardized Total Access Port (TAP). The Archimedes Project is developing a selection of accessors which cover the needs of disabled individuals, regardless of the type or severity of their disabilities, and a selection of TAPs to provide a standard connection to any computer. An appropriate accessor enables a disabled person to use the same browsing tools as everyone else to access information on the web. Currently available accessors provide a selection of alternative input capabilities including: special keyboards, speech input, head tracking, and eye tracking. Ongoing research is focused on providing blind users with alternative access to visually presented information, and deaf users with alternatives to spoken text.

Introduction

In a perfect world, there would be no need to manufacture special access devices for disabled individuals. Products would automatically include all of the input and output capabilities necessary for them to be usable by anyone, regardless of particular physical and cognitive capabilities. In the real world, however, the variability in the needs of individuals with different disabilities makes it impractical to include all of the necessary input and output capabilities in a single device. One real-world need that is becoming increasingly apparent is access to the Internet and World Wide Web. Many companies are working on building accessible web tools. The Archimedes Project at Stanford University is working on a universal approach to access which automatically ensures access to the web. Stanford's Total Access System gives a disabled person the ability to use any computer in the same manner as anyone else. If a person has full access to a computer, it follows that he or she has full access to the web. This paper explains the Total Access System.

There is growing acceptance among access specialists that all products fall into one of three classes:

Class 1

Class 2

Class 3

Most existing products fall into class 3. While the goal is that all products would be class 1, the reality is that most products will fall into class 2. With class 2 products, it makes sense to include access features that are widely used and do not have a large detrimental effect on performance or cost. It also makes sense to leave out access features that make a product more difficult to use or more expensive while benefiting only a few users. An appropriate cost/benefit analysis of each product will show which features should be built in and which features should be provided by external devices or processes. The key to making class 2 products practical is having a standard for connecting external augmentative devices to any product.

The Total Access System (TAS), developed by the Archimedes Project at Stanford University, directly addresses the need to accommodate class 2 and class 3 products. The basic philosophy behind the Total Access System is that products should incorporate access features that are readily achievable and provide a standard TAS interface to handle everything else. This approach shares the responsibility of providing access in a sensible and cost-effective way. Readily achievable access features are those that can be incorporated by the manufacturer within the cost, performance and physical constraints of the product. TAS allows manufacturers to create products which can be used by everyone without the burden of supporting complex features that may be required by only a small number of disabled people. From the perspective of a disabled person, TAS provides a consistent interface which makes it easier for them to use different products.

The main advantage of TAS is that it separates special access features from the application that is being used. It achieves this by adopting a modular, plug-and-play approach. A personal, portable computer-based device called an Accessor performs all input and output functions using a selection of strategies which best suit the needs and capabilities of the particular individual. A standardized data communications protocol has been developed to enable any accessor to interact with any computer-based product. At present, the standardized product interfaces are provided by small add-on devices called Total Access Ports or TAPs. All devices equipped with a TAP appear identical to any Accessor. TAPs should eventually become a built-in component of all computers and computer-based devices. This will make it possible for a disabled person with a single personal accessor to use every electronic product that is produced. Nondisabled people will also benefit if TAPs are included in electronic products, since this gives them the option to use personally preferred interfaces not anticipated by product designers.

Reasons for Making Products Accessible

All applications of computers are linked in some way to information. Computers are tools which receive, process, store, retrieve and deliver information. Information is at the core of our modern society. Anyone who doesn't have adequate access to information is potentially severely disadvantaged. In itself, however, information is neither accessible nor inaccessible. It is the form in which it is represented which makes it one way or the other. To succeed in today's world, individuals must have access to information in a form that they can use. Increasingly, workers' productivity and effectiveness are directly linked to how well they are able to find and process information. When they have access tools, disabled workers are as capable as everyone else in an information-driven society. While there may be many options for making information universally accessible, factors such as product cost and convenience to all users must be considered in making a choice.

In the past, access has been provided on a case-by-case basis. The needs and capabilities of the user are assessed by experts and the working environment is adapted to accommodate individual needs. This is a time-consuming and inefficient way to provide access and it often penalizes disabled users by forcing them to work at a single computer in a single location. In the course of a normal day, most of us use a variety of computers, often in quite different guises and locations, and would be greatly inconvenienced if restricted to doing everything through a single personal computer. Researchers at the Archimedes Project have developed a universal access system which directly addresses the problem of enabling a disabled person to use any computer or computer-based device. The Archimedes Total Access System (TAS) separates the special needs of a disabled user from the applications that are being used. It achieves this by breaking the access problem into two parts: one which deals with providing information to, and receiving information from, a particular user, and the other which deals with connecting to specific host computers and other electronic devices.

Archimedes researchers solved the first part of the problem by developing a personal, portable device called an accessor. An accessor enables a disabled person to perform all input functions and retrieve all output information in whatever manner or format best suits his or her needs and capabilities. They solved the second part of the problem by developing a Total Access Port (TAP) which provides external connections to all user inputs and user outputs on the host device. The two partial solutions are connected by a standardized communications protocol called the Archimedes Protocol which allows any accessor to interact with any TAP.

The divide-and-conquer strategy provided by the Total Access System leads to many advantages over existing methods for making equipment accessible.

Access solutions are easier to define and implement since attention can be focused on the needs and capabilities of the disabled individual rather than the problems of integrating special hardware and software into an existing computer system.
There is a clear demarcation of skills. On the one side, access specialists can focus on the needs of the disabled person and don't require a detailed knowledge of computer hardware, operating systems and applications. On the other side, computer specialists don't require detailed knowledge of disability-related issues. In strong contrast to this, existing access solutions require specialists who have a very thorough understanding of both issues.
Systems are easier to maintain. The separation of access functions from applications makes it very easy to identify the source of a problem. For instance, since a single TAP design handles every computer of a given type, it is easy to maintain spares and swap out defective units. With existing access solutions, corporate maintenance personnel are reluctant to touch a disabled person's computer because of the tightly interwoven links that are necessary between the special access software and the standard operating system and applications software.
With the Total Access System, special access tools such as speech recognition and eye tracking can be used with a host computer system without degrading its performance. In conventional systems, the amount of computation required to perform these functions within a host system can severely reduce its overall performance.
The range of solutions required to provide universal access is significantly reduced. The existing approach can be described as a many-to-many solution since a unique solution must be developed for every combination of disability that is to be accommodated and host device that is to be accessed. The Total Access System reduces the solution to the combination of a many-to-one solution for accommodating disabilities, and a one-to-many solution for connecting to host devices. The significance of this is shown by the following example. If we wish to enable ten differently disabled individuals to operate ten different computers, the existing approach would require one hundred unique solutions, i.e., the product of the number of disabilities and the number of computers. The TAS approach requires only twenty: ten accessors, one for each disability, and ten TAPs, one for each computer, i.e., the sum of the number of disabilities and the number of computers.
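
The arithmetic in the last point can be checked with a short sketch. The function names are illustrative only and are not part of TAS; the sketch assumes one accessor per disability and one TAP per type of host.

```python
def solutions_many_to_many(disabilities: int, hosts: int) -> int:
    """Existing approach: one bespoke solution per (disability, host) pair."""
    return disabilities * hosts


def solutions_tas(disabilities: int, hosts: int) -> int:
    """TAS approach: one accessor per disability plus one TAP per host type."""
    return disabilities + hosts


if __name__ == "__main__":
    # Show how the gap widens as the number of disabilities and hosts grows.
    for n in (2, 10, 50):
        print(n, solutions_many_to_many(n, n), solutions_tas(n, n))
```

The product grows quadratically while the sum grows linearly, so the saving becomes larger as more disabilities and host types are supported.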

Implementation

The Total Access System consists of two main components:

1. A personal accessor which handles all disability access functions, and

2. A Total Access Port (TAP) which provides a standardized interface to any type of host computer or computer-based device.

Accessors

Accessors are designed to perform input and output functions in whatever manner best suits the needs and capabilities of a particular user. In addition, they may incorporate acceleration techniques such as word prediction to minimize the amount of effort required from a user to perform routine operations. Accessors can be as large or as small, as simple or as complicated, as cheap or as expensive, as is necessary to perform the required functions. Accessors can be designed to use any of the techniques that have been previously developed for making existing computers accessible. Additionally, accessors support the use of input and output techniques that would otherwise be impractical due to the computational burden they would place on a host device.
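
Word prediction, mentioned above as an acceleration technique, can be illustrated with a minimal sketch. The frequency-based ranking and the WordPredictor class are assumptions chosen for illustration; the prediction method used in the Archimedes accessors is not described here.

```python
from collections import Counter
from typing import List


class WordPredictor:
    """Suggests completions for a typed prefix, most frequent first."""

    def __init__(self) -> None:
        self.counts = Counter()

    def learn(self, text: str) -> None:
        """Count every word in text the user has previously entered."""
        self.counts.update(word.lower() for word in text.split())

    def predict(self, prefix: str, limit: int = 3) -> List[str]:
        """Return up to `limit` known words that start with `prefix`."""
        prefix = prefix.lower()
        matches = [word for word in self.counts if word.startswith(prefix)]
        # Most frequent first; ties are broken alphabetically.
        matches.sort(key=lambda word: (-self.counts[word], word))
        return matches[:limit]


predictor = WordPredictor()
predictor.learn("the total access port connects the total access system to the host")
print(predictor.predict("t"))
```

A user who selects a suggestion after typing one or two letters avoids entering the rest of the word, which is the saving of effort the paragraph refers to.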

Total Access Port (TAP)

The Total Access Port provides external access to all input and output functions on a host device. It achieves this by emulating the electrical functions of the physical input and output devices connected to the host. Emulation of the physical input/output devices is necessary because it provides full control of the host device without requiring the addition of special hardware or software inside the host. While it is possible to emulate some of the I/O functions on a host computer through its serial or parallel port, this approach was rejected for the TAP because it necessitates special software to be running in the host and it ties up a port that may be required by another device such as a modem or printer. We jokingly refer to the TAP as stealth technology because the host cannot tell that it is connected. There is no difference between signals coming from an accessor and those coming from physical devices such as the keyboard or mouse.

Input Functions

TAPs must provide interfaces to input functions such as keyboards and pointing devices, and output functions provided by a screen display or loud speaker. Our initial research has been focused on the input functions performed by the keyboard and mouse. After investigating a variety of options, we chose to emulate the physical and electrical characteristics of the keyboard and mouse. This leads to a stable design for the TAP since it is not possible for manufacturers to make arbitrary changes to hardware interfaces without causing major compatibility problems. In contrast, software interfaces can be quite unpredictable because they can be arbitrarily changed by any application software.

A unique TAP design is required for each type of computer because of the differences in their physical and electrical interfaces. In general, one TAP design handles all of the systems for each major manufacturer. To date, input TAPs have been produced for Sun, SGI, Mac and IBM PC computers. These four TAPs handle most of the workstations in the workplace because many of the workstation manufacturers are adopting the IBM PS/2 standard in place of their own proprietary keyboard and mouse interfaces. A new TAP is being developed to match the Universal Serial Bus which is due to appear on many new computer products. TAPs are also being developed for a variety of appliances.

Output Functions

TAPs were conceived as devices which provide simultaneous, external access to all input and output functions on a host system. However, because retrieving output information from a computer is considerably more difficult than entering input data, the project was divided into two phases. The first phase, which is almost completed, focused on providing reliable and transparent input to any supported workstation. The second phase, which is just now beginning, focuses on the problems of retrieving output information from the screen of a host system and translating it into whatever form is most suitable for the individual user.

Individuals who cannot see what is displayed on a screen must rely on alternative output representations such as speech or Braille for text, and sound images or haptic devices for non-textual elements. Visually impaired users may require large character displays for text or highly magnified images for graphical elements. The greatest problem in providing these alternative forms of access is obtaining the information from the computer in an appropriate format. The primary reason for this is that almost all computer software is designed to use the video screen as the default output and there is no provision for direct access to the information that is displayed. Because of this, programs which provide alternative outputs must capture the screen information from within the operating system before it is formatted for the screen. As operating systems and applications continue to become more complex, this is becoming incredibly difficult to do.

The Graphical User Interface (GUI) is the major factor contributing to the difficulties of recovering raw screen data from within a computer. Typical GUI displays are a collection of separate, rectangular windows containing textual or graphical information. Each window is created and managed independently by the operating system at the request of the different applications that are running in the machine. Each application can own windows created at different times and located anywhere on the screen. The windows may be arranged on the screen in many different ways: side-by-side like tiles, cascading patterns of overlapping rectangles, or randomly scattered with some windows fully visible and others either partly or fully obscured. At any time, only one window has the focus, meaning it is the one that the user can send information to. It is incredibly difficult to keep track of which window belongs to which application, and exactly what is visible to the user.

An additional complication with the GUI is that much of the information is represented by graphical elements in the form of small picture icons and various symbols representing selection buttons, information boxes and slider bars for scrolling through the contents of the windows. In many situations, all of the meaningful data is contained in these graphical elements and there is no textual representation available.

Yet another complication of the GUI arises from the multi-tasking capabilities of the computer which leads to many different windows actively displaying and changing data at the same time. In contrast to older systems in which one stream of output information was presented to the user in a sequential manner, a GUI presents many streams of output information simultaneously. Some of the changes are important, e.g., warning messages, and must be brought to the attention of the user immediately. Other changes, such as the moving hands on an image of a clock face, are part of the routine system operation which the user may or may not take any notice of.

The GUI also leads to related input problems for various groups of users. Activities known as pointing and clicking are used to convey the user's intentions to the GUI. Pointing is performed by physically moving a mechanical pointing device, typically a mouse. The pointing device causes a small cursor image, such as an arrow head, to move about the screen. The clicking is performed by pressing a physical button when the cursor is at a desired location. The meaning of a button click depends on what screen object is under the cursor when the button click occurs. Different screen objects are used to represent the various activities available to the user. Making choices from a menu, specifying the way a program is to behave, and selecting fragments of text to be edited are typical operations that are performed with a pointing device. Pointing and clicking operations require a high level of dexterity and hand-to-eye coordination. As a consequence, many individuals with visual impairments, blindness and/or physical disabilities find the GUI difficult or even impossible to use.

Current methods for accessing a GUI use an add-in program called a screen reader. This gathers information about what is displayed on the screen and transforms it into a format that has some meaning for the user. Typically, text is converted to speech or Braille, and graphics are either ignored or converted to some form of sound or touch representation. The information required for creating the alternative outputs is obtained by keeping track of all the messages that pass between the application program and the operating system. These messages are used to construct a text-based off-screen model of the GUI screen. The screen reader software interacts with the off-screen model instead of the GUI screen. If this sounds complicated, it is. Because of the intimate real-time relationships that must be maintained with the operating system and the applications, screen reading programs are among the most complex pieces of software currently used on computers. The growing complexity of GUI systems, and the specialized knowledge and programming resources required to create and maintain a screen reader, make it extremely difficult for access designers to keep up with the ongoing developments in operating systems. Some developers believe we are rapidly reaching a point where only large organizations like IBM or Microsoft will have the knowledge and resources to build screen reading software in a timely manner.

The Total Access System provides an alternative method for giving blind and visually impaired users access to the output from a computer. Instead of working inside the host computer, the TAS retrieves information directly from the screen display. With appropriate hardware and software, the image on the screen can be captured as a bit map and the information contained within it can be retrieved by pattern recognition and Optical Character Recognition (OCR). While this approach is quite simple in concept, it is extremely challenging to implement due to the amount of information contained in a full screen display and the rapid refresh rates that are used on a video monitor.

This direct retrieval approach is being developed for the Total Access System because it is the only strategy that will meet our design goals of being able to work with any type of host computer without interfering with its operation or requiring the addition of internal software. Several novel approaches have been formulated for reducing the amount of information that must be captured and processed in real time.
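
The approaches formulated by the project are not described here. Purely as an illustration of how the volume of captured data might be reduced, the sketch below compares two successive bit maps tile by tile so that only the tiles that changed need to be passed to pattern recognition or OCR. The function name, the tile size and the list-of-rows frame format are all assumptions, and this is a generic technique rather than the method used in TAS.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values; both frames must be the same size


def changed_tiles(prev: Frame, curr: Frame, tile: int) -> List[Tuple[int, int]]:
    """Return the (top, left) origin of every tile that differs between frames."""
    height, width = len(curr), len(curr[0])
    dirty = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            # A tile is dirty if any of its row segments changed.
            if any(prev[r][left:left + tile] != curr[r][left:left + tile] for r in rows):
                dirty.append((top, left))
    return dirty


previous = [[0] * 4 for _ in range(4)]
current = [row[:] for row in previous]
current[3][2] = 1  # a single pixel changes in the bottom-right tile
print(changed_tiles(previous, current, tile=2))
```

When most of the screen is static, only a small fraction of the tiles needs further processing for each captured frame.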

Connecting an Accessor to a TAP

The basic philosophy of the Total Access System is that any accessor can be used with any TAP. Since an accessor can be designed to match the needs and capabilities of any individual, and a TAP can be designed to handle the inputs and outputs of any host, this means any individual should be able to operate any host device. The communication path which connects a TAP to an accessor must be able to handle all of the necessary information in a fast, accurate and secure manner. It is immaterial whether a physical or wireless connection is used. However, the communications protocols used for the communication are critical since they must handle a wide diversity of information and they must continue to operate as the system evolves and new types of TAPs and accessors are added to the system.

The communication protocol is the crucial component which gives the Total Access System its universal access capabilities. Information is passed back and forth in an abstract form that is independent of the physical and electrical characteristics of the equipment that is being used and the task that is being performed. For example, input information conveys the user's intention to achieve a desired result, such as press the key that has the letter "a" displayed on it, or move the pointing device upwards at a particular speed. The user, the accessor, and the communications link have no knowledge of how the host achieves these operations or the codes used by the host keyboard or mouse to represent them. It is the user's intentions that are passed to the TAP where they are converted into the idiosyncratic forms required by the particular host. Similarly, when the host generates a particular output, the TAP retrieves the information and transforms it into a neutral format that can be passed to any accessor. A second transformation occurs in the accessor to make the information directly accessible to the particular user.
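
As a minimal sketch of this separation, the fragment below models an abstract key-press intent and a TAP that converts it into host-specific bytes. The names KeyPress and Tap, the table layout and the codes for the second host are assumptions made for illustration; they do not represent the actual message format of the Archimedes Protocol, which is not given here. The PS/2 entry uses the scan code set 2 make and break codes for the A key.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class KeyPress:
    """Abstract intent: press and release the key labelled `label`."""
    label: str


class Tap:
    """Converts abstract intents into the byte codes of one type of host."""

    def __init__(self, host_name: str, key_table: Dict[str, bytes]) -> None:
        self.host_name = host_name
        self.key_table = key_table

    def translate(self, intent: KeyPress) -> bytes:
        try:
            return self.key_table[intent.label]
        except KeyError:
            raise ValueError(
                f"{self.host_name} TAP has no key labelled {intent.label!r}"
            ) from None


# PS/2 scan code set 2: make code 0x1C, then break prefix 0xF0 plus 0x1C.
ps2_tap = Tap("ps2", {"a": bytes([0x1C, 0xF0, 0x1C])})
# Hypothetical second host with made-up codes, to show the contrast.
other_tap = Tap("hypothetical-host", {"a": bytes([0x01, 0x81])})

intent = KeyPress("a")  # the accessor sends the same intent to either TAP
print(ps2_tap.translate(intent).hex(), other_tap.translate(intent).hex())
```

The accessor side of the sketch never sees a scan code; only the table inside each TAP changes from one host to another.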

A fundamental consideration in the development of the communications protocol is that it must handle accessors and TAPs that don't yet exist. For instance, an accessor that exists today must still be able to operate with a TAP that appears in five years' time and contains technology that has not been invented yet. This level of device independence can be achieved if the level of abstraction is chosen carefully and provision is made for enabling accessors and TAPs to automatically communicate their particular needs and capabilities and make the necessary operational adjustments.
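
One way such an exchange of capabilities could work is sketched below: each end advertises a set of capability names and the session uses only those that both ends understand, so an older accessor simply ignores features it has never heard of. The negotiate function and the capability names are hypothetical; the mechanism actually used by the protocol is not described here.

```python
from typing import FrozenSet, Iterable


def negotiate(accessor_caps: Iterable[str], tap_caps: Iterable[str]) -> FrozenSet[str]:
    """Return the capabilities both ends understand; unknown ones are ignored."""
    return frozenset(accessor_caps) & frozenset(tap_caps)


# An accessor built today meets a later TAP that advertises extra features.
todays_accessor = {"keyboard", "pointer"}
future_tap = {"keyboard", "pointer", "screen-text", "haptic-output"}

session = negotiate(todays_accessor, future_tap)
print(sorted(session))
```

Because unknown names are dropped rather than treated as errors, adding a new capability to a later TAP does not break existing accessors.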

Description of the current implementation of TAS

The input functions of TAS have been implemented on four of the most widely used families of computers. TAPs are now commercially available for IBM PC, Macintosh, SGI, and Sun workstations. The TAPs provide identical access to all keyboard and mouse functions on each of the supported platforms.

While it is possible to create many different types of accessor, most of our development effort has been directed to speech accessors. The reason for this is quite simple. If a person can speak reasonably well, speech recognition provides the fastest and most convenient alternative to the keyboard for entering text into a computer. However, speech falls short if a large amount of pointing is required. While we have developed a speech interface that allows all mouse functions to be performed entirely by speech, it is clumsy compared to the normal use of a mouse. The limitations of the speech-driven mouse have been overcome by combining speech recognition with head tracking and eye tracking. One of our researchers has been using a combination of head tracking and speech recognition for more than two years. He works with a Macintosh computer and is able to work as quickly and accurately as anyone using the keyboard and mouse. The Total Access System allows these two access technologies to be combined without any side effects or incompatibilities.

Archimedes researchers have also developed several eye tracking accessors that can be used as stand-alone communicators or as accessors. A user is able to switch from one type of operation to the other merely by looking at a selection button on the display screen. When used as an accessor, the eye tracker enables a severely paralyzed person to perform all keyboard and mouse functions merely by looking at objects displayed on a computer screen. A full range of mouse functions is supported by the eye tracker, but, because of the involuntary movements that occur as people shift their gaze from one place to another, it is extremely challenging to provide fast and accurate mouse control. We are now working on a system which combines both eye and head tracking with voice. We believe that this system will provide the most natural form of text entry and pointing for anyone with suitable speech and sufficient head and eye movement.

At this stage, the TAPs are functioning very well and new versions are planned for additional workstations such as HP and DEC. The present accessors are limited by the state of the special input technologies such as speech recognition and eye tracking. One major problem is that new speech recognizers are designed to run in the Windows 95 environment and are slow because of the sluggish operation of Windows 95. A great deal of careful software design has been necessary to get the speed of the Windows 95 version of the speech accessor close to that of its DOS predecessor. Our highest priority is to develop faster accessors by eliminating the overheads imposed by the Windows 95 operating system.

Setting up a TAS system

Setting up a TAS system is extremely straightforward. After the host computer is turned off, the keyboard and mouse cables are disconnected from the host and reconnected to marked inputs on the TAP. New keyboard and mouse cables are connected from the TAP to the input ports on the host. At this point the host can be turned back on and rebooted. Everything on the host should come up and behave as usual.

An accessor is either connected to the TAP with a serial null-modem cable or else it is used in a wireless mode to send input information directly to the TAP. All mouse and keyboard functions on the host are immediately available and the user can perform the equivalent of manual keyboard and mouse entry at this point. However, the full advantages of using the TAS require the user to train a selection of TAP Macros that allow large and complex operations on the host to be performed with the input of a single utterance or eye movement to the accessor.
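
The idea of a trained macro can be sketched as a table that maps one trigger, such as a recognized utterance, to a stored sequence of abstract input events. The MacroTable class, the event format and the example trigger are assumptions for illustration only; how TAP Macros are actually stored and replayed is not described here.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str]  # (kind, value), for example ("key", "enter")


class MacroTable:
    """Maps a single trigger to a recorded sequence of input events."""

    def __init__(self) -> None:
        self._macros: Dict[str, List[Event]] = {}

    def train(self, trigger: str, events: List[Event]) -> None:
        """Store the events to replay when the trigger is recognized."""
        self._macros[trigger.lower()] = list(events)

    def expand(self, trigger: str) -> List[Event]:
        """Return the stored events, or an empty list for an unknown trigger."""
        return list(self._macros.get(trigger.lower(), []))


macros = MacroTable()
# Hypothetical macro: one utterance replaces three manual input steps.
macros.train("Open mail", [("key", "m"), ("key", "enter"), ("pointer", "click")])
print(macros.expand("open mail"))
```

A single utterance then stands in for the whole sequence, which is where the saving over manual keyboard and mouse entry comes from.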

Training requirements

Training a person to use an accessor is usually quite straightforward since the accessor is performing only two main functions, i.e., receiving and processing user input, and sending messages to a TAP. Modern speech recognizers learn the characteristics of a user's voice in well under an hour but it takes several hours for the person to become confident about how to talk to the host device. In our experience, most new users require between eight and twelve hours of one-on-one training to become proficient at using the speech recognizer. Some computer savvy users become proficient with about four hours of training.

Conclusions

The Total Access System provides a viable, cost-effective method for making any computer or computer-based device accessible. While its initial purpose is to make it easier for disabled individuals to remain at work or to get into the workforce, the long-term potential is to use the system to increase productivity and to prevent computer users from developing cumulative trauma disorders (CTDs). Initial versions of the system have proven the effectiveness of the concept and we are now receiving serious attention from many large organizations, both in the US and overseas.

Ongoing work is focused on improving the speed of the accessors and developing the hardware and software necessary for enabling blind users to retrieve output information from the host without the need to connect to internal hardware or software components.
