Data Model

The data model of WWW is what the HTTP protocol implements.

Until now, WWW had a simple data model consisting of two kinds of data: classical hypertext nodes and indexes.

The following addition is made: indirect nodes. These node types are discussed in more detail below. UDIs in the text are informally represented by a readable phrase.

Classical hypertext node

This consists only of text, with links going out from it or coming in to it. A link may be anchored to a selection of the text rather than to the whole node.

Index

This is to be thought of as a set of links to documents; the link to be followed is chosen by typing a string of text into a search panel. An index is usually implemented by a database.

Indirect node

This is a document which, instead of containing the desired contents, contains the indications necessary to get to the desired document. Thus, when a UDI refers to an indirect node, the document actually returned is the result of processing the contents of the indirect node.

Indirect nodes come in three types: forward nodes, redirected nodes and queries.

Indirect nodes are used to solve the problems of relocated documents, aliases and complex queries. When PUT is in the HTTP protocol, there is no reason why a more sophisticated user could not write his/her own indirect documents to this effect, or even be allowed to store them at the server site.

The three types are discussed in more detail below:

Forward nodes

When document D1 on machine M1 gets relocated to document D2 on machine M2, all links to D1 become invalid. To avoid massive updating, a document D1.f is put on M1. It contains the UDI of D2. When a client wants D1, the server on M1 will try D1, fail, then try D1.f, succeed, and thereby know that D1 has been relocated. It then sends the client the contents of D1.f, preceded by the indication that the document found was a forward. The browser then uses this to find the real document. Note that if there is a chain of forwards, it is the browser that follows this chain, not the server! The browser can then decide to tell the user, or even automatically update the old link to the new location. Once the document D2 is displayed, a link made to it will be a link to D2, not to D1.f.
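The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `MACHINES` table, the representation of a UDI as a `(machine, name)` pair, the reply labels and the `max_hops` limit are all hypothetical and not part of the protocol.

```python
# Hypothetical in-memory "machines": each maps document names to contents.
# A forward file "<name>.f" holds the UDI of the relocated document.
MACHINES = {
    "M1": {"D1.f": ("M2", "D2")},
    "M2": {"D2": "text of D2"},
}

def serve(machine, name):
    """Server side: try the document, then its forward file.
    The server never follows the forward itself."""
    docs = MACHINES[machine]
    if name in docs:
        return ("document", docs[name])
    if name + ".f" in docs:
        return ("forward", docs[name + ".f"])
    return ("not-found", None)

def browse(udi, max_hops=8):
    """Browser side: follow a chain of forwards until a real document appears.
    max_hops is an assumed safety limit against endless chains."""
    for _ in range(max_hops):
        kind, body = serve(*udi)
        if kind == "document":
            return udi, body      # udi is now the real location
        if kind != "forward":
            raise LookupError(f"no document for {udi}")
        udi = body                # the forward names the next UDI to try
    raise LookupError("too many forwards")

final_udi, text = browse(("M1", "D1"))
```

Here `final_udi` ends up naming D2 on M2, which is the UDI a new link would use.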

Redirected nodes

Sometimes a document needs an alias, e.g. because it is one of a growing series: in the case of monthly reports, one may well want to link to "this month's report", and always get what is in the latest report. There exist documents for, say, January, February, March, ... which can also be linked to. A redirecting node works much like a forward node in that it contains the UDI of the aliased document. So, if a document TMR is requested, the server would not find TMR, would look for TMR.r and return that with the indication that it is a redirected document. The browser would then use the UDI (e.g. March.html) to find the current real document. The difference is in the behaviour when a link is made. Suppose March.html is on the screen. A link to this would result in a link to TMR if March.html was found through TMR, and in a link to March.html if it was found directly. Thus it is possible to make new links to "this month's report" rather than to March.html.
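The difference in link-making behaviour can be sketched as follows. This is a minimal illustration, assuming a single hypothetical server store `DOCS`, plain strings as UDIs and at most one level of redirection; none of these names come from the protocol.

```python
# Hypothetical store: "<name>.r" holds the UDI of the aliased document.
DOCS = {
    "TMR.r": "March.html",
    "March.html": "report for March",
}

def serve(name):
    """Server side: try the document, then its redirect file."""
    if name in DOCS:
        return ("document", DOCS[name])
    if name + ".r" in DOCS:
        return ("redirect", DOCS[name + ".r"])
    return ("not-found", None)

def browse(udi):
    """Browser side: return (UDI to use for new links, contents)."""
    link_udi = udi                # a redirect keeps the alias for new links
    kind, body = serve(udi)
    if kind == "redirect":
        kind, body = serve(body)  # fetch the current real document
    if kind != "document":
        raise LookupError(udi)
    return link_udi, body

via_alias = browse("TMR")
direct = browse("March.html")
```

Both calls display the same contents, but only the first leaves the browser with the alias TMR as the UDI for new links.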

Queries

Queries of a complex nature are bound to be in some form of a programming language (e.g. SQL). To ensure independence of these languages, the complete text of the query should not be part of the UDI. The UDI should represent the desired contents. Take the example of a query that returns the current age distribution of staff in a given category as a table. This table could well be called "Physicists & Engineers" but require a quite lengthy SQL program. If the SQL text were part of the UDI and the personnel data were at some time transferred to a new database that does not use SQL, the link to this document would become invalid. It would also be difficult to produce the query in the first place using only search panels.

A solution is to put the entire SQL program into a file PE.q, and give that document the UDI "Physicists & Engineers". The server will again not find this document and search for PE.q. It will return the contents with the indication that it is a query.

There are two solutions for getting the query executed. The first puts the UDI of the query document plus the UDI of the server that is to execute it in the anchor. This is unacceptable because the first server to receive the request may have to follow the links along a potentially infinite chain, servers need to contain browsers, and the query is recursive. Its advantages are that the scheme works with old browsers and that the query document itself does not contain indications of where it is to be executed, making it possible to re-use the same query text on several servers.

The other solution, which is the one adopted, uses the same mechanism as for forwards: the browser gets the query returned and then sends it to the server. The browser also follows the chains, so there is no recursion. Its disadvantages are that the query document now must contain the address of the server and cannot be reused without introducing some type of include-file mechanism, and that the scheme does not work with old browsers.
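The adopted scheme might look like the sketch below. Everything in it is an assumption made for illustration: the `DOCS` and `QUERY_SERVERS` tables, the layout of the query document as a dictionary with `server` and `text` fields, the use of "PE" as the requested name, and the placeholder query text.

```python
# Hypothetical store on the document server. "PE.q" holds the query text plus
# the address of the server that must execute it, as the adopted scheme requires.
DOCS = {
    "PE.q": {"server": "personnel", "text": "<lengthy SQL program>"},
}

# Hypothetical query servers, keyed by address; this one only echoes what it ran.
QUERY_SERVERS = {
    "personnel": lambda text: f"table produced by {text}",
}

def serve(name):
    """Document server: return the document, or the query file marked as a query."""
    if name in DOCS:
        return ("document", DOCS[name])
    if name + ".q" in DOCS:
        return ("query", DOCS[name + ".q"])
    return ("not-found", None)

def browse(udi):
    """Browser: on a query reply, send the query text to the server it names."""
    kind, body = serve(udi)
    if kind == "query":
        return QUERY_SERVERS[body["server"]](body["text"])
    if kind == "document":
        return body
    raise LookupError(udi)

table = browse("PE")
```

The document server only hands back the query; it is the browser that contacts the executing server, so no server needs to act as a browser.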

Linking with indirect nodes

In the following paragraphs, document A contains an anchor that has UDI b in it. Following the link should lead to document B which is identified by b.

Forward:

B is displaced and now identified by b1. A forwarding document F is identified by the old UDI b; it contains b1.

When B appears, the browser knows it through b1. Making a link to B means using b1. In exceptional cases one might want to link using b.

For changing B, the PUT command normally uses b1.

Redirected:

UDI b points to redirecting document R, which contains b1. When B appears, the browser knows it through b. Exceptionally one might prefer to link using b1, namely when one wants to link to the current value of b.

For changing B, the PUT command normally uses b1.

Query:

Links always use b. There is no general way to use PUT on results of queries.
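The normal (non-exceptional) choices in the three cases above can be summarised in a small sketch. The function name `normal_udis` and the string labels for the node kinds are hypothetical; `None` stands for "no general way to use PUT".

```python
def normal_udis(kind, b, b1=None):
    """Return (UDI used for new links, UDI used for PUT) in the normal case.

    b  is the UDI found in the anchor of document A.
    b1 is the UDI stored inside the forward or redirect document.
    """
    if kind == "forward":
        return (b1, b1)    # link to the new location, change it there too
    if kind == "redirect":
        return (b, b1)     # link to the alias, change the current real document
    if kind == "query":
        return (b, None)   # links use b; PUT has no general meaning
    raise ValueError(f"unknown indirect node kind: {kind}")

forward_case = normal_udis("forward", "b", "b1")
redirect_case = normal_udis("redirect", "b", "b1")
query_case = normal_udis("query", "b")
```

The exceptional cases mentioned in the text (linking through b for a forward, or through b1 for a redirect) are deliberately left out of this sketch.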
RC