Portable minimal web servers

Győző Harka
Computing Services Centre, Faculty of Science
University of Pécs
H-7624 Pécs, Ifjúság út 6.
carlos@gamma.ttk.pte.hu

Csaba Z. Béres
Department of Informatics and General Technology
University of Pécs
H-7624 Pécs, Ifjúság út 6.
chubby@gamma.ttk.pte.hu

László Pere *
Department of Informatics and General Technology
University of Pécs
H-7624 Pécs, Ifjúság út 6.
pipas@linux.pte.hu

* L. Pere also works at the Computer and Automation Research Institute of the Hungarian Academy of Sciences (MTA SZTAKI).

ABSTRACT

In our poster we present two functional examples of highly portable, minimalistic web server applications. One of them is written in ANSI C, while the other is simply a Bourne Again Shell script. These simple programs provide a surprisingly large variety of features and services, including all of those that can be regarded as fundamental. They may serve as a counterpoint to the current trend towards extensible but highly resource-consuming server software.

Keywords

C.2.2 Network Protocols Applications
H.4.3 Communication Applications Miscellaneous
web server, portability, performance, reliability

1. INTRODUCTION

It has become apparent in the last few years that the World Wide Web is now the most frequently used application of the Internet. It has overtaken the traditional functionality of FTP, Gopher, USENET news, etc. Moreover, many e-mail users use SMTP indirectly through a WWW front-end. It would therefore be hard to overemphasize the importance of the questions and problems surrounding the development of the underlying server-side software.

Both the HTTP standards and World Wide Web servers have undergone a lot of development and improvement since the appearance of the CERN HTTP Daemon. Regarding the protocols, MIME, support for keep-alive requests, server-side cache control, and ETags may be listed as some of the key steps in this development. Meanwhile, named virtual host support, auto-indexing of the filesystem, proxy support, encrypted communication, and support for several scripting languages may be listed among the natural requirements for up-to-date server software.

Examining this development of WWW server software more closely, one finds, however, that the servers have become rather complicated, and in many cases their operation can be regarded as byzantine. When installing new functionality on a working web server, one can easily damage its existing functionality. This may be the consequence of sophisticated dependencies and a less rigorous hierarchy. A typical Apache server loads about forty modules for its operation.

The question naturally arises: is this growing complexity due to purposeful development, or is it simply an ad hoc side effect of increasingly complex, and even conflicting, requirements? One may claim that the situation is similar to that of the operating systems of the third generation (integrated circuits and multiprogramming) in the 1970s. The appearance of universal operating systems had induced a demand for a huge number of applications. This was beyond the scope of the technical level and programming methodology of the age, which led to practically useless operating systems with a huge number of untreatable bugs. Only the development of system design concepts and the underlying methodology led to a solution.

The main aim of our poster is to draw attention to the possibility of a similar solution for web servers. We show that it is indeed possible to create web server software that supports surprisingly many of today's requirements while using very limited resources and keeping the code itself simple and tractable. Our main concern was not merely to produce working code: several such attempts have already been made (see e.g. [1]). Rather, we intend to demonstrate certain ideas. We have constructed two web server applications that are functional demonstrations of these principles.

This paper is organized as follows: in Section 2, a minimal web server written in the C programming language is introduced. In Section 3, a working web server written in Bourne Again Shell (bash) is introduced. Section 4 summarizes our conclusions.

2. A MINIMAL WEB SERVER IN C

This server may be regarded as an HTTP skeleton written in POSIX ANSI C. It uses no external libraries, only the standard POSIX features of the operating system. It is therefore highly portable and can be used on any POSIX-compliant operating system. (For the sake of simplicity, it depends slightly on minor features of the init available on the Debian GNU/Linux platform (version 3.0) on which it was developed, but this dependency can easily be eliminated.) It is a stand-alone network server, which performs extremely well at high hit rates. It supports HTML and image files, but may easily be extended to support all MIME types. Interactions are logged via the syslog function.

Our demonstration version has certain limitations: neither auto-indexing nor scripting languages are supported. These could, however, be added easily.

The operation of the server can be described simply: it can even be understood by tracing its system calls. The program first issues an accept() system call, which waits for an incoming connection on its listening socket. When a connection arrives, a fork() call is issued, which duplicates the process image, so that the server can accept another connection. The request is then handled with a read() call, which reads the client's request from the network. Next, a stat() call checks whether the file named in the request exists. If it does, the file is loaded into memory by an open()-read()-close() call triplet. This is followed by two write() calls, which send the header and the file itself. The network connection is closed by a shutdown() call, and then the forked process terminates with an exit() call. (Remember that meanwhile another copy of the server is running in order to receive further connections.)
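
The call sequence described above can be sketched in C as follows. This is a minimal illustration and not the authors' 169-line source: the function names, the docroot parameter, the fixed buffer sizes, and the bare status lines are assumptions made for the example, and the listening socket is assumed to have been prepared by the caller with socket(), bind(), and listen().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all n bytes, retrying after short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w <= 0)
            return -1;
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* stat() the file, then load it with open()-read()-close().
 * Returns a malloc()ed buffer, or NULL if path is not a readable regular file. */
static char *load_file(const char *path, size_t *len)
{
    struct stat st;
    char *buf;
    size_t got = 0;
    ssize_t n;
    int fd;

    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
        return NULL;
    if ((fd = open(path, O_RDONLY)) < 0)
        return NULL;
    buf = malloc((size_t)st.st_size + 1);
    while (buf != NULL && got < (size_t)st.st_size &&
           (n = read(fd, buf + got, (size_t)st.st_size - got)) > 0)
        got += (size_t)n;
    close(fd);
    *len = got;
    return buf;
}

/* Serve one request on conn; docroot is a hypothetical parameter.
 * Anything that is not a GET for an existing regular file gets a 404. */
void handle_client(int conn, const char *docroot)
{
    static const char ok[] = "HTTP/1.0 200 OK\r\n\r\n";
    static const char not_found[] = "HTTP/1.0 404 Not Found\r\n\r\n";
    char req[1024], target[512], path[1024];
    char *body = NULL;
    size_t len = 0;
    ssize_t n;

    assert(docroot != NULL);
    n = read(conn, req, sizeof req - 1);            /* the client's request */
    if (n > 0) {
        req[n] = '\0';
        if (sscanf(req, "GET %511s", target) == 1 && !strstr(target, "..")) {
            snprintf(path, sizeof path, "%s%s", docroot, target);
            body = load_file(path, &len);
        }
    }
    if (body != NULL) {
        if (write_all(conn, ok, sizeof ok - 1) == 0)  /* header ...      */
            write_all(conn, body, len);               /* ... then file   */
        free(body);
    } else {
        write_all(conn, not_found, sizeof not_found - 1);
    }
    shutdown(conn, SHUT_RDWR);
}

/* accept() connections forever, fork()ing one child per client. */
int serve_forever(int listen_fd, const char *docroot)
{
    signal(SIGCHLD, SIG_IGN);       /* finished children are not kept as zombies */
    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (fork() == 0) {          /* child: serve this one client, then exit */
            close(listen_fd);
            handle_client(conn, docroot);
            _exit(0);
        }
        close(conn);                /* parent: back to accept() */
    }
}
```

The sketch uses _exit() rather than exit() in the child, a common precaution after fork(), and it reads the request with a single read(), so a request split across several network segments would not be handled.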

The program source consists of 169 lines and produces a single binary of about 6 kilobytes. (This is 0.7% of the size of the Apache binary on the same system.) Though it is of limited functionality, it might even be practically useful, e.g. for serving local documentation available on the operating system. Though it is a very minimal application, it provides partial support for the HTTP/1.0 standard, and therefore it offers more features than the original CERN HTTPD.

3. A WEB SERVER IN BASH

On a system that runs a POSIX Internet superserver daemon (inetd or xinetd), a single UNIX shell such as the Bourne Again Shell (bash), supplemented by a few standard POSIX file utilities, enables us to create a minimal web server resembling today's most frequently used web servers. (In detail, bash ver. 2.0.3, grep ver. 2.5.1, file ver. 3.39, cat ver. 4.5.10, cut ver. 4.5.10, date ver. 4.5.10, ls ver. 4.5.10, and rev have been used in the actual implementation.) The tools used make this software extremely portable. Our aim was to provide a demonstration server with full HTTP/1.0 and partial HTTP/1.1 compliance. In addition, it provides logging features and uses configurable modules providing auto-index and vhost support. In contrast with the servers used in today's practice, this server requires practically no memory when no requests are present; it is therefore a good solution for infrequently used servers.

Though this server still does not support features such as ETags, it may be regarded as useful in practice. While the use of a high-level programming language makes the software easy to modify and develop, it also introduces certain disadvantages. Indeed, although our bash-based server supports more features than the one introduced in the previous section, it is much more limited in performance. Security issues are partially transferred to the operating system itself.

Regarding the operation of the server, we should first note that communication with the client is carried out via the standard input and output (STDIN/STDOUT) of the script itself. Thus this program is not a stand-alone application; it is inetd-based. The script is essentially an infinite loop that terminates when no keep-alive request is received from the client. The loop is needed because starting an interpreted program takes more time, so the running script is reused for further requests on the same connection.
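
The inetd model and the keep-alive loop can be illustrated with a short sketch. The authors' implementation is a bash script; the following is a C rendering of the same control flow, written under our own assumptions. The function name serve_stream is hypothetical, and the reply is a placeholder with an empty body rather than a real document.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Serve requests arriving on in and reply on out until the client stops
 * asking for keep-alive. Returns the number of requests handled.
 * Under inetd, in and out would simply be stdin and stdout. */
int serve_stream(FILE *in, FILE *out)
{
    char line[1024];
    int handled = 0;

    assert(in != NULL && out != NULL);
    for (;;) {
        int keep_alive = 0;

        /* Request line; a real server would parse method and path here. */
        if (fgets(line, sizeof line, in) == NULL)
            break;                              /* client closed the connection */

        /* Header lines up to the empty line that ends the header. */
        while (fgets(line, sizeof line, in) != NULL &&
               strcmp(line, "\r\n") != 0 && strcmp(line, "\n") != 0) {
            if (strncasecmp(line, "Connection:", 11) == 0) {
                const char *value = line + 11;
                while (*value == ' ' || *value == '\t')
                    value++;
                keep_alive = (strncasecmp(value, "keep-alive", 10) == 0);
            }
        }

        /* Placeholder reply with an empty body. */
        fputs("HTTP/1.0 200 OK\r\nContent-Length: 0\r\n", out);
        if (keep_alive)
            fputs("Connection: Keep-Alive\r\n", out);
        fputs("\r\n", out);
        fflush(out);
        handled++;

        if (!keep_alive)
            break;                              /* no keep-alive: leave the loop */
    }
    return handled;
}
```

A program built around this function would call serve_stream(stdin, stdout) and be registered in the inetd configuration, so that no process exists while no client is connected.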

The header received from the client is read line by line and parsed with the standard grep utility. Meanwhile, a timeout provided by the read builtin guards against broken communication with the client (possibly due to an attack). The full header may be logged if the debug option is enabled in the configuration file.
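
Reading header lines under a timeout can be sketched as follows. The script relies on the timeout of the bash read builtin; this C version uses poll() instead, which is a substitute technique chosen for the illustration. The function name and the millisecond parameter are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Read one header line from fd into buf (capacity cap), dropping the
 * trailing CR LF. Returns the line length, or -1 on timeout, end of
 * input, read error, or an overlong line. The timeout applies to each
 * wait for input, not to the line as a whole. */
int read_line_timeout(int fd, char *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd;
    size_t len = 0;

    assert(buf != NULL && cap > 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    for (;;) {
        char c;

        if (poll(&pfd, 1, timeout_ms) <= 0)
            return -1;              /* nothing arrived in time, or poll failed */
        if (read(fd, &c, 1) != 1)
            return -1;              /* end of input or read error */
        if (c == '\n')
            break;
        if (len + 1 >= cap)
            return -1;              /* line does not fit into buf */
        buf[len++] = c;
    }
    if (len > 0 && buf[len - 1] == '\r')
        len--;
    buf[len] = '\0';
    return (int)len;
}
```

A return value of 0 marks the empty line that ends the header, while -1 tells the caller to give up on a silent or misbehaving client. Reading one byte per read() call keeps the sketch short at the cost of efficiency.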

It is important to verify whether a double-dot string is present in the request. This is not allowed by the HTTP/1.0 standard, as it would make it possible to reach beyond the document root of the server. A 404 error code is sent as a reply in this case. In the next step, it is verified whether the requested file is a regular file or a directory. If it is a regular file, it is sent; if it does not exist, a 404 error is sent. If it is a directory, the presence of the standard trailing slash is verified, and a 301 redirect is sent if the slash is missing. Next, the requested directory is checked for an index file. The possible names of index files can be set in the configuration file. If an index file is present, it is sent to the client. In the absence of index files, the auto-index module of the script is called if auto-indexing is enabled. If auto-indexing is disabled, a 403 error code is sent.
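
The decision chain above can be condensed into one function that maps a request to a status code. This is a C sketch of the logic, not a transcription of the bash script: the function name, the built-in list of index file names (which the script reads from its configuration file), and the treatment of special files as 404 are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical index file names; the script takes them from its configuration. */
static const char *const index_names[] = { "index.html", "index.htm" };

/* Return the HTTP status the decision chain yields for target, a request
 * path such as "/docs/", resolved below docroot. */
int decide_status(const char *docroot, const char *target, int autoindex)
{
    char path[1024], candidate[1100];
    struct stat st;
    size_t i, len;
    int n;

    assert(docroot != NULL && target != NULL);
    if (strstr(target, "..") != NULL)
        return 404;                         /* could escape the document root */
    n = snprintf(path, sizeof path, "%s%s", docroot, target);
    if (n < 0 || (size_t)n >= sizeof path)
        return 404;                         /* assumption: overlong paths are refused */
    if (stat(path, &st) != 0)
        return 404;                         /* does not exist */
    if (S_ISREG(st.st_mode))
        return 200;                         /* regular file: send it */
    if (!S_ISDIR(st.st_mode))
        return 404;                         /* assumption: special files are hidden */

    len = strlen(path);
    if (path[len - 1] != '/')
        return 301;                         /* redirect to the slash-terminated URL */
    for (i = 0; i < sizeof index_names / sizeof index_names[0]; i++) {
        snprintf(candidate, sizeof candidate, "%s%s", path, index_names[i]);
        if (stat(candidate, &st) == 0 && S_ISREG(st.st_mode))
            return 200;                     /* index file: send it */
    }
    return autoindex ? 200 : 403;           /* generated listing, or forbidden */
}
```

Keeping the decision separate from the code that sends the reply makes each branch of the chain easy to check in isolation.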

MIME types are identified with the standard file utility, while the last modification time is obtained with the ls program. The locale is set to C so that the time can be given in the HTTP date format. The header is composed with builtin commands, and the file itself is sent in the simplest possible way, as binary data, using the cat utility.
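
The two header values discussed here can be sketched in C. The date formatting relies on strftime(), which uses English day and month names as long as the program stays in the default C locale; this mirrors the reason the script sets the locale to C. The MIME lookup is a stand-in: the script asks the file utility, whereas this sketch uses a small, hypothetical extension table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Format t as an HTTP date (always GMT) into buf.
 * Returns the string length, or 0 if buf is too small. */
size_t http_date(time_t t, char *buf, size_t cap)
{
    struct tm *tm = gmtime(&t);

    assert(buf != NULL);
    if (tm == NULL)
        return 0;
    return strftime(buf, cap, "%a, %d %b %Y %H:%M:%S GMT", tm);
}

/* Guess a MIME type from the file name extension (hypothetical table). */
const char *mime_type(const char *path)
{
    static const char *const table[][2] = {
        { ".html", "text/html" },  { ".htm", "text/html" },
        { ".txt",  "text/plain" }, { ".png", "image/png" },
        { ".gif",  "image/gif" },  { ".jpg", "image/jpeg" },
        { ".jpeg", "image/jpeg" }
    };
    const char *ext;
    size_t i;

    assert(path != NULL);
    ext = strrchr(path, '.');
    if (ext != NULL)
        for (i = 0; i < sizeof table / sizeof table[0]; i++)
            if (strcmp(ext, table[i][0]) == 0)
                return table[i][1];
    return "application/octet-stream";      /* unknown: treat as binary data */
}
```

For example, the timestamp 784111777 is formatted as "Sun, 06 Nov 1994 08:49:37 GMT", the sample date used in the HTTP specifications. A Last-Modified header would pass the st_mtime field obtained from stat() to http_date().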

Named virtual host support is provided by a loadable module, which can be enabled in the configuration file. The actual hostname is extracted from the header and compared with the list of virtual hosts in the configuration file. The appropriate document root is then set according to this data.
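
The lookup performed by the vhost module can be sketched as follows. This is a C illustration under our own assumptions: the struct, the function names, and the example host names and paths are hypothetical, and the table stands in for the list that the script reads from its configuration file. Bracketed IPv6 literals in the Host header are not handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp, strncasecmp (POSIX) */

/* One configured virtual host (hypothetical layout). */
struct vhost {
    const char *name;
    const char *docroot;
};

/* If line is a Host header, copy the host name without the port to out
 * and return 1; otherwise return 0. */
int parse_host_line(const char *line, char *out, size_t cap)
{
    size_t n;

    assert(line != NULL && out != NULL && cap > 0);
    if (strncasecmp(line, "Host:", 5) != 0)
        return 0;
    line += 5;
    while (*line == ' ' || *line == '\t')
        line++;
    n = strcspn(line, ": \t\r\n");          /* stop at port, space, or line end */
    if (n == 0 || n >= cap)
        return 0;
    memcpy(out, line, n);
    out[n] = '\0';
    return 1;
}

/* Return the document root configured for host, or fallback if the host
 * is not listed. Host names are compared case-insensitively. */
const char *docroot_for_host(const struct vhost *table, size_t count,
                             const char *host, const char *fallback)
{
    size_t i;

    assert(host != NULL);
    for (i = 0; i < count; i++)
        if (strcasecmp(table[i].name, host) == 0)
            return table[i].docroot;
    return fallback;
}
```

With a table containing docs.example.org mapped to /srv/docs, the header line "Host: Docs.Example.org:8080" selects /srv/docs, and any unlisted host falls back to the default document root.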

The base script consists of 106 lines altogether. In addition, there is a 6-line module for auto-indexing and an 18-line module for vhosts. This strikingly illustrates the extreme simplicity of the program, especially compared with its capabilities.

4. CONCLUSION

We have presented two functional examples of portable minimal web servers, which can possibly be applied in everyday practice. We believe that these illustrations may serve as a reminder to developers, drawing their attention to certain points. On the one hand, they show that existing high-level programming tools are useful in application protocol testing and development. On the other hand, even though more resources are sometimes needed, it may indeed be worth striving for simple and tractable code. The programs may even be of practical use in their current form. Potential applications include providing local documentation on a non-networked computer, running infrequently used web servers, and serving static HTML pages.

The software introduced in our poster is available under the GNU General Public License at:
ftp://linux.pte.hu/pub/minwebservers/.

ACKNOWLEDGMENTS

The authors thank Mátyás Koniorczyk of the Institute of Physics, University of Pécs, for inspiring discussions.

BIBLIOGRAPHY

  1. M. T. Abzug. BASH-httpd FAQ, April 2002. http://linux.unbc.edu/~mabzug1/bash-httpd.html.
  2. T. Berners-Lee, R. Fielding, and H. Frystyk. Hypertext Transfer Protocol - HTTP/1.0, May 1996. Request for Comments: 1945.
  3. R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol - HTTP/1.1, June 1999. Request for Comments: 2616.
  4. H. F. Nielsen. HTTP/1.0, 1.1 and beyond, an evolutionary perspective on HTTP. Talk at the W3C Mobile Access Workshop, April 7-8, 1998, Tokyo, Japan. http://www.w3.org/Mobile/1998/Workshop/Slide/HTTP.
  5. A. S. Tanenbaum. Modern Operating Systems. Prentice Hall, 2nd edition, 2001.