HTTP defines a simple request-response language.
A web client communicates with a web server by using HTTP.
HTTP defines how to phrase the request correctly and what the response should look like.
HTTP does not define how the network connection is made or managed, nor how the information is actually transmitted; that is done by lower-level protocols such as TCP/IP.
An HTTP request consists of the following:
(The method is to be executed on the object named by the URI. In practice, the web uses a subset of URIs, the Uniform Resource Locator (URL).)
This table shows a sample of valid HTTP methods:
Method | Action |
GET | Return the object; that is, retrieve the information |
HEAD | Return only information about the object, such as how old it is, but not the object itself |
POST | Send information to be stored on the server. Many servers do not allow information to be POSTed except as input to scripts |
PUT | Send a new copy of an existing object to the server. Many servers do not allow documents to be PUT. |
DELETE | Permanently delete the object. Like PUT, this method is not allowed by most servers |
... etc ... | ... etc ... |
For example, to request the document /sinn/index.htm from www.openloop.com, the web client will send the following request:
GET /sinn/index.htm HTTP/1.0
User-Agent: NCSA Mosaic for Windows 95/3.0
Accept: text/plain
Accept: text/html
Accept: application/postscript
Accept: image/gif
Header Field | Description |
User-Agent | What kind of browser is making the request |
If-Modified-Since | Asks that the object be returned only if it is newer than the specified date. This saves the cost of retrieving a document that has already been acquired and has not changed |
Accept | The Multipurpose Internet Mail Extensions (MIME) types and formats of information that the browser is prepared to accept. This may save the cost of transferring documents that the client cannot or will not use. The client will then decode the data according to the rules of MIME. Note that Accept: */* can be used to accept any type |
Authorization | User password or other authentication as required |
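As a rough illustration, the sample request above could be assembled in Python as follows. The helper name build_request is made up for this sketch; the method, path, and header values are the ones from the example.

```python
def build_request(method, path, headers, version="HTTP/1.0"):
    """Return the raw bytes of an HTTP request (hypothetical helper)."""
    # Request line: method, object name (URI path), protocol version.
    lines = [f"{method} {path} {version}"]
    # One "Name: value" line per header field; a name may repeat.
    lines += [f"{name}: {value}" for name, value in headers]
    # HTTP lines end with CRLF, and an empty line ends the header.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "GET",
    "/sinn/index.htm",
    [
        ("User-Agent", "NCSA Mosaic for Windows 95/3.0"),
        ("Accept", "text/plain"),
        ("Accept", "text/html"),
        ("Accept", "application/postscript"),
        ("Accept", "image/gif"),
    ],
)
print(request.decode("ascii"))
```

The bytes returned are what the client would write to the socket once the connection to the server is open.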
An HTTP response consists of the following:
(The server replies to the request with a description of what is being returned, followed by the information requested)
The status line has the form:
HTTP-version Status-code Reason
Field | Description |
HTTP-version | The version of the HTTP |
Status-code | A number that indicates the result of the request |
Reason | A short phrase that explains what the number means |
Metadata (Metainformation) | Indicates to the browser what it must know to interpret and display the information |
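A client has to take the status line apart before it can act on it. A minimal sketch, assuming the three fields are separated by single spaces as in the form shown above (parse_status_line is a hypothetical name):

```python
def parse_status_line(line):
    """Split 'HTTP-version Status-code Reason' into its three fields."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.0 200 Document follows"))
# ('HTTP/1.0', 200, 'Document follows')
```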
Selected HTTP status codes
Code | Reason | Description |
200 | Document follows | The request succeeded. The information requested follows. |
301 | Moved Permanently | The document has moved to a new URL |
302 | Moved Temporarily | The document has moved temporarily to a new URL |
304 | Not Modified | The document has not been modified since the date specified in a GET request with an If-Modified-Since header. |
404 | Not Found | The information could not be found or permission was denied. This error is returned if the requested URL does not exist or was misspelled |
401 | Unauthorized | The information is restricted; please retry with proper authentication. |
402 | Payment Required | The information requires paying a fee; please retry with proper payment (not used often) |
403 | Forbidden | Access is forbidden |
500 | Server Error | The server experienced an error |
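The 304 code works together with the If-Modified-Since request header. The following is only a sketch of how a server might choose between 200 and 304 for a document that exists; the function name choose_status is hypothetical, and a real server would also have to cope with badly formed dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def choose_status(last_modified, if_modified_since=None):
    """Return (code, reason) for a GET on an existing document."""
    if if_modified_since is not None:
        # Parse the HTTP date sent by the client into an aware datetime.
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            # The client's copy is still current: send no body.
            return 304, "Not Modified"
    return 200, "Document follows"


# Modification time taken from the sample response in this document.
last_modified = datetime(1999, 6, 22, 12, 0, 0, tzinfo=timezone.utc)

print(choose_status(last_modified))
print(choose_status(last_modified, "Wed, 23 Jun 1999 18:08:08 GMT"))
print(choose_status(last_modified, "Mon, 21 Jun 1999 00:00:00 GMT"))
```

The first and third calls yield 200 (no condition, or the document changed after the given date); the second yields 304.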
For example, the response for the /sinn/index.htm request might be the following:
HTTP/1.0 200 Document follows
Server: NCSA/2.0
Date: Wed, 23 Jun 1999 18:08:08 GMT
Content-type: text/html
Content-length: 5800
Last-modified: Tue, 22 Jun 1999 12:00:00 GMT

<html>
<head>
<title>Richard P. Sinn</title>
</head>
<body>
<p><br>
<br>
</p>
<a href="subhtml/signature.html">
<p align="center"></a><a href="sinn.htm"><img src="images/homepage_title.gif" ALT WIDTH="554" HEIGHT="131"></a><br>
<a href="sinn.htm"><small><em><strong>Click the above Picture for Personal Information</strong>
... other content of /sinn/index.htm ...
From the point of view of a server, any document is just a stream of bytes delivered over the Internet. A simple ASCII text document is the same as a complicated multimedia presentation. It is up to the web client (browser) to decode and understand the document and present it to the user.
Field | Description |
Server | The type of server software providing the response |
Date | The date and time of the response |
Content-Length | How many bytes of data will be sent to the client |
Content-Type | The MIME type of the information being returned, such as HTML or an image |
Content-Language | The language of the information, such as English or French |
Content-Encoding | Additional encoding, such as data compression |
Last-Modified | The date and time that the information was most recently modified |
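Before the browser can decode the document, it has to separate the status line, the header fields, and the body. A minimal sketch of that split, assuming the whole response is already in memory (parse_response is a hypothetical name, and the short body is invented for the example):

```python
def parse_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    # An empty line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


raw = (
    b"HTTP/1.0 200 Document follows\r\n"
    b"Server: NCSA/2.0\r\n"
    b"Content-type: text/html\r\n"
    b"Content-length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
status_line, headers, body = parse_response(raw)
print(status_line)
print(headers["content-type"], headers["content-length"])
print(body)
```

The Content-type value tells the client how to interpret the bytes in body, and Content-length tells it how many bytes to expect.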
Step 1: Loop and Wait for a new request
The httpd waits for a request to arrive from a web client on the Internet.
Step 2: A Request arrives from a Web Client
When a user clicks on a hyperlink such as http://www.openloop.com/index.htm, the network software (TCP stack) of the web client computer locates the server computer (using DNS or a hosts file) and sets up a bi-directional network (socket) connection from the client to the server www.openloop.com. A request header such as
GET /index.htm HTTP/1.0
is sent.
Step 3: The Request is Parsed by the Web Server
The web server parses the request and recognizes it as a GET for information.
Step 4: Parse the Rest of the Header
The web server now understands that the protocol version is 1.0, that the client is a Netscape browser for NT, and so on. Since this is a normal example, no further action is needed. (Think about what you could do with the header and XML.)
Step 5: Do the method requested
At this stage the httpd will fulfill the request or send back an error message. In this example, the web server will search the file system for /index.htm.
Success: The document is sent
Failure: An error such as the following will be sent:
Step 6: Finish up: close the file; close the network connection
Close the file and the connection (socket). The client will then deal with the data received.
Step 7: Back to Step 1 and Loop and Wait
...
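The steps above can be sketched as a very small single-threaded server. This is an illustration under several assumptions, not how any particular httpd is written: the names handle_request and serve, the port 8080, and the fixed text/html content type are all choices made for the sketch, and the 400 and 501 status codes come from HTTP/1.0 rather than from the table earlier in this document.

```python
import os
import socket


def handle_request(request, docroot):
    """Steps 3-5: parse the request and build the response bytes."""
    # Step 3: the first line holds the method, the path and the version.
    request_line = request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = request_line.split()
    if len(parts) != 3:
        return b"HTTP/1.0 400 Bad Request\r\n\r\n"
    if parts[0] != "GET":
        # This sketch only implements the GET method.
        return b"HTTP/1.0 501 Not Implemented\r\n\r\n"
    # Step 5: look for the document in the file system under docroot.
    root = os.path.realpath(docroot)
    path = os.path.realpath(os.path.join(root, parts[1].lstrip("/")))
    inside = path.startswith(root + os.sep)  # refuse paths outside docroot
    if not inside or not os.path.isfile(path):
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    with open(path, "rb") as f:  # Step 6: the file is closed on exit
        body = f.read()
    header = (
        "HTTP/1.0 200 Document follows\r\n"
        "Content-type: text/html\r\n"
        f"Content-length: {len(body)}\r\n\r\n"
    )
    return header.encode("ascii") + body


def serve(docroot, host="127.0.0.1", port=8080):
    """Steps 1, 2, 6 and 7: accept one connection at a time, forever."""
    with socket.create_server((host, port)) as listener:
        while True:                          # Step 7: loop
            conn, _addr = listener.accept()  # Steps 1-2: wait for a client
            with conn:                       # Step 6: close the socket
                request = conn.recv(65536)   # simplification: a single read
                conn.sendall(handle_request(request, docroot))
```

Calling serve with the path of a directory that contains index.htm starts the loop; it never returns, so it is not invoked here. Keeping the parsing in handle_request, separate from the socket code, makes steps 3 to 5 easy to try without a network.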
It is sometimes desirable to run multiple web servers on the same machine.
We could use different ports for different servers, for example:
http://www.isp.com:8080/companyX/index.htm AND
http://www.isp.com:8081/companyY/index.htm
But we actually want the URLs to look like:
http://www.companyX.com/index.htm AND
http://www.companyY.com/index.htm
We call such servers virtual servers.
Running servers at multiple addresses on the same computer is done by an operating system extension called virtual host support. It requires a separate IP address for each virtual server, just as if they were real computers on the Internet, except that the multiple IP addresses are assigned to one computer. Connections to the different addresses are routed to the appropriate server software. This feature may not be supported by all operating systems, and the implementation might be very different (jobs vs process vs kernel vs ...)
As the web server grows, it is quite easy to migrate from a virtual server to a physical server (a separate machine) without changing any URL or updating the DNS entry in the primary and secondary DNS servers.
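To make the idea of one IP address per virtual server concrete, here is a small sketch of how server software could pick a document root from the local address a client connected to. It assumes a single process listening on all of the machine's addresses, which is only one possible arrangement; the addresses, directory paths, and function names are invented for the example.

```python
# Hypothetical table: one local IP address per virtual server.
# The addresses are from a range reserved for documentation,
# and the directory paths are invented.
VIRTUAL_SERVERS = {
    "192.0.2.10": "/home/companyX/htdocs",
    "192.0.2.11": "/home/companyY/htdocs",
}


def docroot_for(local_ip, table=VIRTUAL_SERVERS):
    """Return the document root for the address the client connected to."""
    return table.get(local_ip)


def docroot_for_connection(conn):
    """Look up the document root for an accepted socket connection."""
    # getsockname() reports the local (server-side) address of the socket,
    # which tells us which of the machine's IP addresses was contacted.
    local_ip = conn.getsockname()[0]
    return docroot_for(local_ip)


print(docroot_for("192.0.2.10"))
print(docroot_for("192.0.2.99"))
```

The first lookup returns the companyX directory; the second returns None because no virtual server is configured for that address.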
How connections are set up (TCP, etc.)
How inline images are retrieved (by the client)
The interaction between the web server and the file system (through the OS interface)
How access control can be done (file system, OS level, user profile, etc.)
ftp://www.jfdkafjdk.something.com/somfile.h will work as well (interaction with the FTP server, not the web server)
Channel-based connection (one for control info, one for real data)
Batch info for the whole page (instead of separate socket connections for images, etc.)
Copyright 1996-2001 OpenLoop Computing. All rights reserved.