CHAPTER 1
Describing Data: The Concept
A relation between two objects can be described in many ways; one way is to
draw on a piece of paper two circles representing the objects, and a line
between them representing the relation. This is actually the beginning of a
well-established analysis method for applications, which uses the Unified
Modeling Language (UML) to describe the relations. However, once you have
established a description of the relationships, it is possible to do much more
with them.
A description of a Web site is like the morning promenade of a German
philosopher (and not because neither can whistle). These two far-flung concepts
are connected in that the same theory can be used to relate the description of
one Web site to another.
First, the description of the Web site. There is a seemingly (if not actually)
infinite number of ways to describe a Web site. However, they all have one
thing in common: Each describes a Web site. (If it describes an object that
is not a Web site, the logic is the same.) The descriptions can be broken down
into logical elements, as follows: Instead of writing "this is my really fun site
that talks about lots of great stuff that I found in other places on the Web and
that I really liked," you can write "site has property=fun (according to: you);
property=links (to: stuff from the Web, appreciated by: you)." You are
describing an object (the site) that has properties (fun, links) that have values
(that you think it is fun, that the stuff is from the Web). So far, anyone
familiar with object-oriented analysis will not have discovered anything
remarkable.
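The breakdown above can be sketched in a few lines of Python. This is only an illustration of the object/property/value idea, not a format defined anywhere in this chapter; the dictionary layout and the names description, describe, according_to, and appreciated_by are assumptions made for the example.

```python
# Hypothetical layout: one object (the site) with properties, each of which
# carries its own qualifying values. All names here are illustrative only.
description = {
    "object": "site",
    "properties": {
        "fun": {"according_to": "you"},
        "links": {"to": "stuff from the Web", "appreciated_by": "you"},
    },
}


def describe(desc):
    """Flatten the description into (object, property, qualifier, value) rows."""
    rows = []
    for prop, qualifiers in desc["properties"].items():
        for qualifier, value in qualifiers.items():
            rows.append((desc["object"], prop, qualifier, value))
    return rows


for row in describe(description):
    print(row)
```

Each printed row is one logical element of the description: the object, one of its properties, and one value attached to that property.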
If you draw your description on paper, you can draw it as dots connected with
lines. In mathematics, this is called a graph. Since the description can be
thought of as a graph, the branch of mathematics that includes graph theory
(topology) can be applied to it. I will take this reasoning a little further,
but I will not try to give you a real introduction to graph theory, because that
would go too far.
The simplest graph you can imagine is two dots connected with a line. If the
dots represent something, they are called nodes. If the line connecting them has
a direction, it is an arrow, and the graph is a directed graph. The entire Web
is one big directed graph in which the nodes are documents and the arrows are
the links between them, because a link has a single direction, at least in HTML:
it goes from one document to another. This is changing with the introduction of
bidirectional links in HTML 4.0 and XLink/XPointer, but the Web as you know it
is a one-way street. Links go only forward, not back.
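To make the one-way street concrete, here is a minimal sketch of a directed graph of documents in Python. The file names and the helper functions outgoing and incoming are hypothetical, chosen only for illustration. Following a link forward is a direct lookup, while finding out who links to a document means searching every other document.

```python
# Hypothetical documents: each key is a document, and its list holds the
# documents it links to (the arrows leaving that node).
links = {
    "index.html": ["about.html", "news.html"],
    "about.html": ["index.html"],
    "news.html": [],
}


def outgoing(doc):
    """Documents that doc links to: a direct lookup in the graph."""
    return links.get(doc, [])


def incoming(doc):
    """Documents that link to doc: requires scanning every node."""
    return sorted(src for src, targets in links.items() if doc in targets)


print(outgoing("index.html"))
print(incoming("news.html"))
```

The document news.html has no way of knowing that index.html points at it; that information lives only in the linking document, which is what "forward, not back" means in practice.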
If you add attributes to the arrow (called an arc in mathematician-speak), you
are creating a labeled directed graph. This is the format that RDF uses to