Web Server Overview



Background Info

Before we jump into the core of HTTP, let's step back and look at what a web server is.

A web server is not just a "big" machine sitting in a server room with lots of processing power and memory.

Whole Web Server = Computer Hardware + Operating System + Web Server Software + Info Content

As software engineers, we will concentrate on the software, so we can narrow the definition to:

Web Server = Operating System + Web Server Software + Info Content



The Three Components

The Operating System determines which platform (Windows, UNIX, AS/400, mainframe) the web server runs on.

The Web Server Software is a continuously looping server program that waits for requests from web clients (browsers or others) for documents over the network. It parses each request, classifies it, and takes the corresponding action. The action might be executing a script, returning a document, accessing a remote database, etc.
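The parse, classify, and act cycle described above can be sketched roughly as follows. This is only an illustration, not how any particular server is built: the function names, the /cgi-bin/ script prefix, and the action labels are all assumptions made for the example, and a list of request lines stands in for the network connection.

```python
# Hypothetical sketch of the parse / classify / act cycle.
# All names and the "/cgi-bin/" convention are assumptions for illustration.

def parse_request_line(line):
    """Split a request line such as 'GET /index.html HTTP/1.0'."""
    parts = line.strip().split()
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % line)
    method, target, version = parts
    return method, target, version


def classify_request(method, target):
    """Decide which kind of action the request calls for."""
    path = target.split("?", 1)[0]  # ignore any query string
    if method not in ("GET", "HEAD", "POST"):
        return "reject"
    if path.startswith("/cgi-bin/"):
        return "run-script"
    return "send-document"


def serve(request_lines):
    """Stand-in for the endless loop: handle each request in turn."""
    actions = []
    for line in request_lines:
        try:
            method, target, _version = parse_request_line(line)
        except ValueError:
            actions.append("reject")
            continue
        actions.append(classify_request(method, target))
    return actions
```

A real server would read the request lines from a network socket and loop forever; here the loop ends when the list of sample requests runs out.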

As you can see, the web server has to understand the requests from its clients. We sometimes say that they speak the same language, or the same protocol. The protocol used by web servers is the Hypertext Transfer Protocol (HTTP). As we mentioned in the background information session, the Web (with HTTP) runs on top of the TCP/IP internet protocol suite. As a result, some of the same terminology is used in a web server setting as well. The most important term is the TCP port.

A port is analogous to a telephone extension. When a client program connects to a server on the network, it asks for the server program's port. This port indicates exactly which server program the originating client wants to connect to. Data written to the outgoing port (on the client's computer) arrives at the server program's incoming port, where it is read. You will rarely see people state which port to connect to when, say, going to http://www.openloop.com. The reason is that web servers have a default port: if the client does not specify a port, it connects to port 80. So the URL http://www.openloop.com:80 is the same as http://www.openloop.com. Another name for a default port is a well-known port. Thus, the well-known port of a web server is 80.
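As a small illustration of the default port rule, the sketch below uses Python's standard urllib.parse module to work out which port a client would connect to for a given URL. The helper name effective_port and the DEFAULT_PORTS table are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Assumed lookup table for this sketch; only HTTP is covered here.
DEFAULT_PORTS = {"http": 80}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # the URL names a port explicitly
    # No port in the URL: fall back to the well-known port.
    # (Raises KeyError for schemes missing from the table.)
    return DEFAULT_PORTS[parts.scheme]
```

With this helper, http://www.openloop.com and http://www.openloop.com:80 both resolve to port 80, while a URL ending in :8080 resolves to 8080.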

Info Content is the information served by the web server. It could come from a local disk, from a database, or from a remote location, and the source is transparent to the clients.



Traditional Web Server

Traditionally, a web server understands how to receive and reply to HTTP requests, serve up local file information, and execute scripts as necessary. A traditional web server does not understand the content of the documents. In other words, every document is served as is, as a stream of data from the server to the client. In particular, the web server does not know about hyperlinks within the documents; from the web server's point of view, the links are just part of the document. When the user clicks on a hyperlink, the browser requests the document the hyperlink points to. Thus, the browser "manages" the hyperlinks instead of the web server.

Also, the web server does not know about inline images, movies, or audio clips. All of these MIME pieces are requested separately by the browser. As far as the web server is concerned, these requests are just like any other request.

(In a way, we could say that a traditional web server cannot understand the content of its documents. In a different tutorial on XML, we will see how XML can help transform a traditional web server into a truly interactive one that serves up meaningful content to different clients. In this tutorial, we will concentrate on the basics. After all, you cannot run before you know how to walk.)
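To make the division of labor concrete, here is a rough sketch of the client side: it is the browser, not the server, that scans a received document for inline images and hyperlinks. The class and function names are assumptions for illustration, and a real browser does far more than this.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collect the references a browser would find in a document."""

    def __init__(self):
        super().__init__()
        self.inline = []  # fetched automatically, one request each
        self.links = []   # fetched only when the user clicks

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.inline.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def collect_references(html_text):
    """Return (inline image URLs, hyperlink URLs) found in html_text."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    collector.close()
    return collector.inline, collector.links
```

Each URL in the first list would become a separate request to the server, which handles it like any other request without knowing it belongs to the page.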



Document Tree

The documents, images, and other information that the web server serves are organized into a tree or some other form of hierarchical structure. The root of the tree is the starting point, with children (also called nodes) below it. Please note that the tree as organized for the web is usually different from the actual local file system tree.

Web Document Tree (Using FrontPage Viewer)

[Image: webviewDoc.jpg]


File System Tree (Using Windows Explorer)

[Image: fileviewDoc.jpg]
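One way to picture the difference between the two trees is a lookup from web paths to disk locations. The sketch below is only an assumption about how such a mapping might look: the directory names in WEB_TO_DISK and the function name resolve are made up for the example.

```python
import posixpath

# Hypothetical mapping: web-tree prefix -> directory on the server's disk.
WEB_TO_DISK = {
    "/": "/home/www/htdocs",
    "/images/": "/data/shared/pictures",
}


def resolve(url_path):
    """Translate a path in the web tree into a path in the file system tree."""
    if not url_path.startswith("/"):
        raise ValueError("URL path must start with '/'")
    # Normalize so that '..' segments cannot climb above the web root.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    # The longest matching prefix wins; "/" always matches as a fallback.
    for prefix in sorted(WEB_TO_DISK, key=len, reverse=True):
        if (clean + "/").startswith(prefix):
            relative = clean[len(prefix):]
            return posixpath.join(WEB_TO_DISK[prefix], relative)
```

Under this made-up mapping, /images/logo.gif lives in a completely different part of the disk than /index.html, even though both sit side by side in the web tree.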



Information Type

Attempting to show an audio file as an image will not work, and neither will treating ASCII text as an image. Likewise, the browser has to know ahead of time that a document is an HTML document, so that it can decode the tags and display the document correctly to the user.

To make this possible, the web server sends out a header before the information itself, telling the client what kind of information is coming:

Content-type: application/postscript
Content-encoding: gzip

We call this Content-type information MIME info; it will be discussed further using the HTTP example.

Before the web server sends out this metadata, it has to find out what kind of data is being sent. This is done by a convention based on the file name extension, as follows:

Extension       Contents
.html, .htm     HTML document
.txt            Unformatted ASCII
.ps             PostScript
.gif            GIF image
.mpeg           MPEG video
.wrl, .vrml     VRML scene description
.class          Java applet
...             etc.
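The extension convention can be sketched as a simple table lookup that produces the header lines shown earlier. The MIME type strings below are the commonly used ones for these extensions; the table is deliberately incomplete, and the function names and the application/octet-stream fallback are assumptions for this example rather than something a particular server is known to do.

```python
import posixpath

# Partial, illustrative table: file extension -> MIME content type.
CONTENT_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".txt": "text/plain",
    ".ps": "application/postscript",
    ".gif": "image/gif",
    ".mpeg": "video/mpeg",
}
DEFAULT_TYPE = "application/octet-stream"  # assumed fallback for unknown types


def content_type_for(filename):
    """Guess the content type from the file name extension."""
    extension = posixpath.splitext(filename)[1].lower()
    return CONTENT_TYPES.get(extension, DEFAULT_TYPE)


def header_lines(filename, encoding=None):
    """Build the header lines sent ahead of the document."""
    lines = ["Content-type: " + content_type_for(filename)]
    if encoding:
        lines.append("Content-encoding: " + encoding)
    return lines
```

For a gzip-compressed PostScript file, header_lines("manual.ps", "gzip") yields the two header lines from the example above.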


Copyright 1996-2001 OpenLoop Computing. All rights reserved.