Now that you've seen an XML document and a schema to describe its structure,
let's create an application to use the data. I'll walk you through the two basic
ways to deal with an XML document: as an in-memory tree object and as an object
that generates events as it is processed by the parser and any attached
applications.
The Document Object Model
The Document Object Model, or DOM, exposes an XML document as a tree
structure in memory and provides an easy-to-use environment for the programmer.
The DOM provides an accessible object that you can interrogate and manipulate
like any other object in modern-day programming languages.
The DOM defines a standard set of objects and interfaces that you can use to
manipulate XML, providing access to documents, elements, and attributes. The DOM
lets you express an XML document as an object, so you can work with it as you
can any other object on your system—by using a well-documented application
programming interface (API) with useful properties and methods.
As you learned in earlier chapters, the DOM is a World Wide Web Consortium
(W3C) Recommendation. Because the DOM is a large project, the W3C DOM Working
Group faced and still faces a daunting task. To better manage the project, the
group broke up the work into multiple parts, adopting the first part in October
1998; I expect that the second part will be complete in the first half of
2000.
The W3C recommendation is useful as a blueprint for a common object model,
but it does not go far enough in defining a specific implementation of the DOM.
Each implementation of the DOM, therefore, will probably consist of a different
view of the document object. For example, there are some key interfaces missing
from the W3C version of the DOM that Microsoft felt were important to include. I
use two methods, selectNodes and selectSingleNode, that are not in
the W3C specification. Several parser providers offer implementations of the DOM
in their products. Because the environment in which you implement each parser
has different requirements, each of these implementations is different.
Now I will describe the Microsoft implementation of the DOM, since it has by
far the best documentation and support. The Microsoft DOM is part of the
Microsoft XML parser object. Microsoft includes the Microsoft XML DOM object in
Microsoft Internet Explorer 5, Microsoft Office 2000, and Microsoft Windows
2000. The XML DOM object is also a redeployable object that you can include in
your own applications. The DOM's filename is msxml2.dll (see the sidebar), and
it is registered as a COM object with the name MSXML2.DOMDocument. Since the DOM
is a COM object, you can invoke it wherever you would invoke a COM object in any
COM-enabled application. You can access it as an ActiveX control in scripting
using Microsoft.XMLDOM.
XMLDOM Versions
Because development of the XML DOM is ongoing, you
might find a number of versions of xmldom.dll on your machine. If you program to
MSXML.DOMDocument, you are accessing version 2.5 of the DLL. Using
MSXML2.DOMDocument lets you access version 2.6 of the DLL, and
MSXML2.DOMDocument30 gives you access to version 3 of the DLL. You can run
the file xmlinst.exe after installing the latest version of MSXML to point all
of your registry entries to the latest version of the file. In this chapter
we'll program to MSXML2.DOMDocument.
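Because the object you can create depends on which version of the parser is installed, one defensive pattern is to try several ProgIDs in order and use the first one that works. The following is only a sketch: the helper name createDomDocument and its optional factory argument are assumptions introduced for illustration (the factory exists so the logic can be exercised outside a Microsoft scripting host), and the two ProgIDs in the sample call are the ones mentioned in this chapter.

```javascript
// Hypothetical helper: returns the first DOM document object that can be
// created from a list of ProgIDs, or null if none can be created.
// "factory" is an assumption added for illustration; by default it wraps
// the ActiveXObject constructor available in Microsoft scripting hosts.
function createDomDocument(progIds, factory) {
    var create = factory || function (progId) {
        return new ActiveXObject(progId);
    };
    for (var i = 0; i < progIds.length; i++) {
        try {
            return create(progIds[i]);
        } catch (e) {
            // This ProgID could not be created on this machine; try the next.
        }
    }
    return null;
}

// Typical call in a scripting host:
// var objDocument = createDomDocument(["MSXML2.DOMDocument", "Microsoft.XMLDOM"]);
```

Listing the preferred ProgID first keeps the behavior predictable: the fallback is used only when the preferred object cannot be created.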
The DOM in Action
Think of the DOM as a dynamic hierarchical object with a set of interfaces,
properties, and methods. It is important to note that the computer sees the
object we call an XML document as just a serial collection of bytes. Because
this collection of bytes takes the form of plain text, it's easy to read and
easy to move around our networks and over the Internet.
However, for the computer to interrogate and manipulate the information in an
XML document, you must turn the document into an in-memory object that is better
suited for treatment by high-level programming languages. You do this by
instantiating a copy of the DOM, which invokes a parser to break up the XML
document into pieces. Figure 5-1 shows the operation of the parser in creating
the DOM object.

Figure 5-1. The DOM provides a standard set of interfaces
that allows a programmer to access the hierarchical objects represented by an
XML stream.
At the top of Figure 5-1 is a small XML document representing a joke. You can
see elements for Joke, Setup, and Punchline, and attributes
for author and firstTold. The XML parser reads this document one
character at a time, determining which characters are markup and which are
content. If the parser doesn't find a schema, it follows the rules of
well-formed XML. If the parser does find a schema, it reads the schema and then
ensures that the document adheres to the structure described by the schema.
Once the parser is satisfied that the document is properly defined, it
creates a set of nodes that have certain properties. (I discussed XML nodes in
Chapter 4.) Table 5-1 lists the 12 node types in the Microsoft DOM
implementation.
Table 5-1. Node types defined by the Microsoft implementation of the W3C DOM.

| Value | Name | Description |
|---|---|---|
| 1 | NODE_ELEMENT | The node represents an element. An element node can have the following child node types: Element, Text, Comment, ProcessingInstruction, CDATASection, and EntityReference. An element node can be the child of the Document, DocumentFragment, EntityReference, and Element nodes. |
| 2 | NODE_ATTRIBUTE | The node represents an attribute of an element. An attribute node can have the following child node types: Text and EntityReference. The attribute does not appear as the child node of any other node type; note that it is not considered a child node of an element. |
| 3 | NODE_TEXT | The node represents the text content of a tag. A text node cannot have any child nodes. The text node can appear as the child node of the Attribute, DocumentFragment, Element, and EntityReference nodes. |
| 4 | NODE_CDATA_SECTION | The node represents a CDATA section in the XML source. CDATA sections are used to escape blocks of text that would otherwise be recognized as markup. A CDATA section node cannot have any child nodes. The CDATA section node can appear as the child of the DocumentFragment, EntityReference, and Element nodes. |
| 5 | NODE_ENTITY_REFERENCE | The node represents a reference to an entity in the XML document. This node type applies to all entities, including character entity references. An entity reference node can have the following child node types: Element, ProcessingInstruction, Comment, Text, CDATASection, and EntityReference. The entity reference node can appear as the child of the Attribute, DocumentFragment, Element, and EntityReference nodes. |
| 6 | NODE_ENTITY | The node represents an expanded entity. An entity node can have child nodes that represent the expanded entity (for example, Text and EntityReference nodes). The entity node can appear as the child of the DocumentType node. |
| 7 | NODE_PROCESSING_INSTRUCTION | The node represents a processing instruction (PI) from the XML document. A PI node cannot have any child nodes. The PI node can appear as the child of the Document, DocumentFragment, Element, and EntityReference nodes. |
| 8 | NODE_COMMENT | The node represents a comment in the XML document. A comment node cannot have any child nodes. The comment node can appear as the child of the Document, DocumentFragment, Element, and EntityReference nodes. |
| 9 | NODE_DOCUMENT | The node represents a document object, which, as the root of the document tree, provides access to the entire XML document. It is created by using the ProgID "MSXML2.DOMDocument", or through a data island using <SCRIPT LANGUAGE=XML> or <XML>. The document node can have the following child node types: Element (maximum of one), ProcessingInstruction, Comment, and DocumentType. The document node cannot appear as the child of any node types. |
| 10 | NODE_DOCUMENT_TYPE | The node represents the document type declaration, indicated by the <!DOCTYPE> tag. The document type node can have the following child node types: Notation and Entity. The document type node can appear as the child of the Document node. |
| 11 | NODE_DOCUMENT_FRAGMENT | The node represents a document fragment. The document fragment node associates a node or subtree with a document without actually being contained within the document. The document fragment node can have the following child node types: Element, ProcessingInstruction, Comment, Text, CDATASection, and EntityReference. The DocumentFragment node cannot appear as the child of any node types. |
| 12 | NODE_NOTATION | A node represents a notation in the document type declaration. The notation node cannot have any child nodes. The notation node can appear as the child of the DocumentType node. |
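When you are debugging, a numeric nodeType value is easier to read once it is translated back to the name in Table 5-1. The lookup below is a small sketch of that idea; the names NODE_TYPE_NAMES and nodeTypeName are assumptions made for illustration and are not part of the Microsoft API.

```javascript
// Names taken from Table 5-1, keyed by the numeric nodeType value.
var NODE_TYPE_NAMES = {
    1: "NODE_ELEMENT",
    2: "NODE_ATTRIBUTE",
    3: "NODE_TEXT",
    4: "NODE_CDATA_SECTION",
    5: "NODE_ENTITY_REFERENCE",
    6: "NODE_ENTITY",
    7: "NODE_PROCESSING_INSTRUCTION",
    8: "NODE_COMMENT",
    9: "NODE_DOCUMENT",
    10: "NODE_DOCUMENT_TYPE",
    11: "NODE_DOCUMENT_FRAGMENT",
    12: "NODE_NOTATION"
};

// Hypothetical helper: returns the Table 5-1 name for a node's nodeType,
// or "UNKNOWN(n)" for a value that is not in the table.
function nodeTypeName(node) {
    var name = NODE_TYPE_NAMES[node.nodeType];
    return name ? name : "UNKNOWN(" + node.nodeType + ")";
}
```

The helper only reads the nodeType property, so it works with any node object, whatever its type.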
The properties and methods available for each node depend on the type of node
it is. For example, you can load a document node with a serialized (raw text)
XML document, but you can't load an element or attribute node directly. To
access an element node, you must first successfully read the document into a
document node.
In Figure 5-1, Joke has four nodes. Two are the element nodes Setup and
Punchline; the other two are the attribute nodes author and
firstTold. In the <Joke> start tag, a namespace points to a schema
on an external site. This schema is specified using XML Data Reduced (XDR)
syntax. Notice the nodeDataType property of each node. All are strings
except for the firstTold attribute, which has been declared a
dateTime data type. If you don't specify a namespace, all
nodeDataType properties are strings. When you access the typedValue
property of the firstTold node, the object returns the date
as a date variant, so your application does not need to validate the data type
or convert the value for processing.
You'll find the full Microsoft DOM API at http://msdn.microsoft.com. Like
most APIs, the DOM API is rich, allowing you to do a number of things including
loading and saving, creating elements and nodes, and of course, parsing XML
documents. And as with most APIs, the DOM API has only a couple of methods and
properties that you will use in your day-to-day work. Let's see how to use some
of these more common properties and methods.
Creating a DOM Object
The first requirement when you work with the DOM is to instantiate a copy of
the XML parser/DOM object in your application. In JavaScript, you use the
ActiveXObject function to create the object, as shown in the following
code:
var objDocument = new ActiveXObject("MSXML2.DOMDocument");
objDocument.async = false;
The first line instantiates the object and creates a variant called
objDocument. This object will contain the document node after you've
loaded the document. You can test this by accessing the value of the
objDocument.nodeType property. In this case, the property contains the
value 9, which maps to the NODE_DOCUMENT node type in Table 5-1.
The async property indicates whether the parser should load the entire
document before making it available to the programmer. Setting the async
property to false ensures that no actions will be taken against a
document that is not fully loaded. This is the safe and easy way to program, but
it might make your application work more slowly. When async is set to true (the
default setting), control returns to the caller before the download is
finished. You can then use the readyState property to check the status of
the download. You can also attach an onreadystatechange handler, or
connect to the onreadystatechange event, to be notified when the ready
state changes and the download is complete.
For loading large documents, you will probably want to set the async
property to true so that you can continue to do other processing while
the object loads. In effect, the load is spun off as a separate thread. If you
do set the async property to true, you should check the value of
the readyState property before you try to access the document. Table 5-2
describes the values of readyState.
Table 5-3 describes the two ways to load a stream of XML text into the
object.
Table 5-2. Values returned by the readyState property.

| Value | State | Description |
|---|---|---|
| 1 | LOADING | The object is bootstrapping, which means it is reading any persisted properties, not parsing data. |
| 2 | LOADED | The object is finished bootstrapping and is beginning to read and parse data. |
| 3 | INTERACTIVE | Some data has been read and parsed, and the object model is now available on the partially retrieved data set. |
| 4 | COMPLETED | The document has been loaded, successfully or unsuccessfully. |
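For an asynchronous load, the readyState values in Table 5-2 can be combined with an onreadystatechange handler so that your code runs only once the document reaches COMPLETED. The following is a sketch under stated assumptions: the helper name whenLoaded and the file name in the sample call are hypothetical, and the helper relies only on the readyState property and the onreadystatechange handler described above.

```javascript
// Value 4 (COMPLETED) from Table 5-2.
var READYSTATE_COMPLETED = 4;

// Hypothetical helper: calls onDone(doc) once, the first time the document
// reports COMPLETED. Attach it before you start the load.
function whenLoaded(doc, onDone) {
    var finished = false;
    doc.onreadystatechange = function () {
        if (!finished && doc.readyState === READYSTATE_COMPLETED) {
            finished = true;
            onDone(doc);
        }
    };
}

// Typical use in a scripting host (joke.xml is a placeholder file name):
// objDocument.async = true;
// whenLoaded(objDocument, function (doc) { /* inspect doc.parseError here */ });
// objDocument.load("joke.xml");
```

Because COMPLETED is reported whether the load succeeded or not, the callback is the natural place to check for parse errors before touching the document.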
Table 5-3. Methods for loading an XML document into a DOM object.

| Method | Description |
|---|---|
| load(url) | This method loads an XML document from the location specified by the URL. If the URL cannot be resolved or accessed or does not reference an XML document, the documentElement property is set to null and an error is returned. |
| loadXML(xmlString) | This method loads an XML document using the supplied string. The xmlString argument can be a well-formed or valid document. If the XML within xmlString cannot be loaded (because of parsing errors), the documentElement property is set to null and an error is returned. |
The following code loads an XML string into a DOM object:
objDocument.loadXML("<fact verified='2000-01-24'>Movies are " +
"better than books because you can't " +
"spill coffee on them.</fact>");
Once the document is loaded, check the parseError object. If the document
parsed correctly, the errorCode property of parseError is 0; any other value
indicates an error, as in this example:
if (objDocument.parseError.errorCode != 0)
{
alert("Error: " + objDocument.parseError.reason +
" on line " + objDocument.parseError.line);
}
Table 5-4 describes the properties of this read-only parseError
object.
Table 5-4. Properties of the parseError object.

| Property | Description |
|---|---|
| errorCode | The error code number in decimal format |
| url | The URL of the XML file containing the error |
| reason | The reason for the error in human-readable form |
| srcText | The full text of the line containing the error |
| line | The number of the line containing the error (Note that the line number is relative to the top of the document, so if you have a document type definition (DTD), the numbering will start counting at the point immediately following the DTD, not necessarily at the first line of the document content.) |
| linepos | The character position within the line where the error occurred |
| filepos | The absolute character position in the file where the error occurred |
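The properties in Table 5-4 can be gathered into a single diagnostic message, which is handy when you log errors rather than display them. This is a minimal sketch: the helper name describeParseError is an assumption, and so is the fallback wording used when the url property is empty.

```javascript
// Hypothetical helper: returns null when parseError reports success
// (errorCode 0); otherwise returns a summary built from the Table 5-4
// properties. The "supplied string" wording is an illustrative fallback
// for an empty url property.
function describeParseError(parseError) {
    if (parseError.errorCode === 0) {
        return null;
    }
    return "Error " + parseError.errorCode + ": " + parseError.reason +
        " (line " + parseError.line + ", position " + parseError.linepos +
        ") in " + (parseError.url || "the supplied string") +
        "\nSource: " + parseError.srcText;
}

// Typical use in a scripting host:
// var message = describeParseError(objDocument.parseError);
// if (message !== null) { alert(message); }
```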
Accessing the documentElement
Once we are satisfied that the document has been loaded properly, we can
start to access the contents of the object. We have many ways to do this. The
easiest approach is to access the properties of the document element. The
following code adds the documentElement.nodeName and the
documentElement.text properties to the result string:
var result = "";   // accumulates the text we will display
result += "objDocument.parseError.errorCode: "
    + objDocument.parseError.errorCode + "\n";
result += "objDocument.documentElement.nodeName: "
    + objDocument.documentElement.nodeName + "\n";
result += "objDocument.documentElement.text: "
    + objDocument.documentElement.text + "\n";
alert(result);
The nodeName and text properties work on any element node. The
objDocument object is a document node. The documentElement
property of this node gives us the element node. We can set a variant to this
element to make our coding a little simpler:
var rootElem = objDocument.documentElement;
Attributes are contained in the XMLDOMElement object as a collection
of named items. This collection belongs to the element in which the attributes
are specified. You can think of the attributes collection as an
associative array—that is, a collection of like objects, each keyed by a string
rather than an offset index. To access the value of the verified
attribute, you need to use the getNamedItem method:
result += "rootElem.attributes.getNamedItem('verified').nodeValue: "
+ rootElem.attributes.getNamedItem("verified").nodeValue + "\n";
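In the W3C DOM, getNamedItem returns null when the element has no attribute with the requested name, so reading nodeValue directly fails for a missing attribute. A small wrapper avoids that. The sketch below is illustrative only; the helper name getAttributeValue and its fallback parameter are assumptions.

```javascript
// Hypothetical helper: returns the value of the named attribute, or
// "fallback" when the element has no attribute with that name.
function getAttributeValue(element, name, fallback) {
    var attr = element.attributes.getNamedItem(name);
    return attr ? attr.nodeValue : fallback;
}

// Typical use:
// var verified = getAttributeValue(rootElem, "verified", "(not verified)");
```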
It's easy to access the properties of the document element, but what about
other elements in the document? They are a bit harder to access, but in the next
section I'll show you some methods that help you access elements directly.
Getting Items in the Document
Let's load a more complex document. How about our favorite duck-bar joke from
Chapter 4?
<?xml version="1.0"?>
<joke type="story" keywords="duck bar grapes nails">
<scene number="1">
A duck walks into a bar, goes to the bartender,
and says, "Do you have any grapes?" The
bartender says, "No, this is a bar, of course
we don't have any grapes."
</scene>
<scene number="2">
The next day, the duck walks into the bar, goes
up to the bartender, and says, "Do you have any
grapes?" The bartender says, "I told you
yesterday, 'no, we don't have any grapes.'
If you come in here one more time asking for
grapes, I'm going to nail your beak to that bar!"
</scene>
<scene number="3">
The next day, the duck walks into the bar, goes
up to the bartender, and asks, "Do you have any
nails?" The bartender says, "No, this is a bar,
of course we don't have any nails." Then the
duck says, "Do you have any grapes?"
</scene>
</joke>
Assume this document is loaded into our object and the parser returns an
errorCode of 0. We can easily access the name, text, and attributes of
the document element, as you learned in the previous section, but what about the
scene elements? To get them, we can use the childNodes property.
Then we can interrogate the scene elements to get the information we
need:
result += " rootElem.childNodes.item(1).text: "
    + rootElem.childNodes.item(1).text + "\n";
The item method returns the child node at the given index. The collection of
nodes returned by the childNodes property is zero-based, so the example here
returns the text of the second scene, in which the bartender threatens our little
hero with physical violence.
You can access an array of child nodes through the XMLDOMNodeList
object. Table 5-5 lists the properties and methods available.
Table 5-5. The XMLDOMNodeList interface exposes these properties and methods.

| Property | Description |
|---|---|
| length | Returns the number of nodes in the node list. The length of the list will change dynamically as children or attributes are added and deleted from the parent element. |

| Method | Description |
|---|---|
| item(index) | Returns the node in the node list with the specified index. Index is zero-based. |
| nextNode | Returns the next node in the node list based on the current node. |
| reset | Returns the iterator to the uninstantiated state; that is, before the first node in the node list. |
We can use the length property to iterate through the collection one
member at a time:
for (var i = 0; i < rootElem.childNodes.length; i++)
{
    result += " rootElem.childNodes.item(" + i + ").text: "
        + rootElem.childNodes.item(i).text + "\n";
}
To make the preceding code more readable, we can create a new variant that
contains the collection of items:
var colScenes = rootElem.childNodes;
for (var i = 0; i < colScenes.length; i++)
{
    result += "colScenes.item(" + i + ").text: "
        + colScenes.item(i).text + "\n";
}
The colScenes variant is a DOM NodeList object containing all
the direct child elements of the root element joke. Using this object
makes accessing elements in the DOM very straightforward.
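The loops above use length and item, but Table 5-5 also lists nextNode and reset, which let you walk a node list without an index. The following sketch shows that style. The helper name collectText is hypothetical, and the loop assumes that nextNode returns null once the list is exhausted.

```javascript
// Hypothetical helper: collects the text of every node in a node list by
// using the iterator-style methods from Table 5-5.
function collectText(nodeList) {
    var texts = [];
    var node;
    nodeList.reset();   // position the iterator before the first node
    // Assumption: nextNode() returns null when no nodes remain.
    while ((node = nodeList.nextNode()) != null) {
        texts.push(node.text);
    }
    return texts;
}

// Typical use:
// var sceneTexts = collectText(rootElem.childNodes);
```

Calling reset first makes the helper safe to use on a list that has already been partly iterated.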
What if you want to access one of the scenes, but only if its attribute is a
certain value? Here's one approach:
var colScenes = rootElem.childNodes;
for (var i = 0; i < colScenes.length; i++)
{
    if (colScenes.item(i).attributes.getNamedItem("number").nodeValue == "1")
    {
        result += "colScenes.item(" + i + ").text: "
            + colScenes.item(i).text + "\n";
    }
}
It works, but it's pretty clumsy. Let's take a look at an alternative
approach for accessing nodes, and then I'll show you an easy way to get a
particular node. The Microsoft DOM implements two handy methods for accessing
exactly what you want: selectNodes(query) and
selectSingleNode(query). These methods are described in
Table 5-6.
Table 5-6. You can access an element or set of elements by using the selectNodes and selectSingleNode methods in the Microsoft DOM implementation. The query argument contains a pattern defined by the W3C XPath specification.

| Method | Description |
|---|---|
| selectNodes(query) | Returns a node list containing the results of the query indicated by the query string by using the current node as the query context. If no nodes match the query, an empty node list is returned. If the query string has an error, DOM error reporting is used. |
| selectSingleNode(query) | Returns a single node that is the first node in the node list returned from the query, using the current node as the query context. If no nodes match the query string, null is returned. If the query string has an error, an error is returned. |
The query strings passed to the methods in Table 5-6 are Extensible
Stylesheet Language (XSL) patterns. I'll discuss XSL patterns in Chapter 6. For
now, to access our document, let's take a look at the selectNodes method
as an alternative to childNodes:
var colScenes = rootElem.selectNodes("scene");
for (var i = 0; i < colScenes.length; i++)
{
    result += "colScenes.item(" + i + ").text: "
        + colScenes.item(i).text + "\n";
}
Using the selectNodes method is a little bit of an improvement over
using the childNodes method, and it is clearly more self-explanatory. The
advantages of the selectNodes method become more obvious once you start
delving deeper into a complex document. For example, you can easily access a
collection of line items deep in an invoice document by using a complex XSL
query pattern such as the following:
selectNodes("/invoice/body/items/item")
Accessing an item using the childNodes method to drill down through
the hierarchy would require quite a bit of code. We would need to iterate
through our node list a number of times to select the nodes that we want. To
eliminate most of that code, you can use the selectSingleNode method:
result += rootElem.selectSingleNode("scene[@number='2']").text;
The XSL pattern here returns the first scene element whose number attribute
(written @number in the pattern) has a value of 2.
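As Table 5-6 notes, selectSingleNode returns null when nothing matches, so reading the text property directly fails for a scene number that does not exist. The sketch below guards against that case and builds the pattern from a variable. Both helper names, sceneQuery and textOfFirstMatch, are assumptions introduced for illustration.

```javascript
// Hypothetical helper: builds the pattern used above for a given scene number.
function sceneQuery(number) {
    return "scene[@number='" + number + "']";
}

// Hypothetical helper: returns the text of the first node matching the
// query, or "fallback" when selectSingleNode returns null (no match).
function textOfFirstMatch(contextNode, query, fallback) {
    var node = contextNode.selectSingleNode(query);
    return node ? node.text : fallback;
}

// Typical use:
// result += textOfFirstMatch(rootElem, sceneQuery(2), "(no such scene)");
```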
Now you have enough practical knowledge of the DOM to create an application
that produces actual results.
Exercise: Using the DOM in Visual Basic
The DOM provides an interface into an XML document, allowing you to access
the document's contents. Because the Microsoft XML processor is a COM object and
contains support for the W3C DOM, you can instantiate it in any tool that can
use COM objects. Thus you can use the XML processor in Microsoft Visual Basic,
as in this exercise, and you can use it in a Web browser function written in
JavaScript, in a C++ program, or even in a Java application.
In this section, you will build a Visual Basic program that uses the DOM to
access the contents of an XML document. The program instantiates a DOM object,
loads an XML document into the object, creates a collection of elements, and
iterates through the collection. You'll find all files—including the Architag
XRay XML Editor—on the companion CD. Before you begin this exercise, install the
editor from the XRaySetup.exe file.
The Microsoft implementation of the W3C DOM is contained in a DLL named
msxml2.dll. This object is registered as a COM object as
MSXML2.DOMDocument. You can use it wherever you can instantiate a COM
object.
- Load the mvp.xml document into the XRay XML Editor. You'll find this file in
the \Samples\Ch05\ directory on the companion CD. You'll see the screen shown
in Figure 5-2.
Notice that the file shown in Figure 5-2 is a well-formed XML document containing more than
150 entries. There is one entry for each baseball player who has won
baseball's Most Valuable Player award. (For those of you who are counting,
there have been three such awards: the Chalmers Award, given from 1911-1914;
the League Award, given from 1922-1929; and finally the Baseball Writer's
Award, also known as the MVP Award, first given in 1931.) Each entry in this
example has a unique identifier on the entry element and seven elements
describing the player. Our task is to calculate the batting average of each
player that year and display the best and worst averages in a window.

Figure 5-2. The file mvp.xml in the Architag XRay XML Editor.
- Start Visual Basic as shown in Figure 5-3, click on Standard EXE, and then
click Open.

Figure 5-3. Microsoft Visual Basic 6.0.
- Select Form1 from the Project Explorer window as shown in Figure 5-4. From
the Project menu, choose Remove Form1.

Figure 5-4. The Project Explorer window.
- Add the DOM form to the project by selecting Project, choosing Add File,
and then choosing \Samples\Ch05\DOM.frm.
- Open the form by clicking the plus sign in the Project window.
Double-click on the BaseballStats form, as shown in Figure 5-5.

Figure 5-5. Getting to the BaseballStats form.
- Make this form the startup form by selecting Project1 Properties from the
Project menu. Choose BaseballStats from the Startup Object pull-down menu, as
shown in Figure 5-6. Click OK.

Figure 5-6. Making the BaseballStats form your startup form.
- From the Project menu, choose References. Add the MSXML parser to the
project by choosing Microsoft XML, v3.0 from the Available References list, as
shown in Figure 5-7. Click OK.

Figure 5-7. Adding the MSXML parser to your project.
You should see an environment that looks like Figure 5-8.

Figure 5-8. The BaseballStats form.
- Double-click the form just to the right of the Compute button. This will
bring up the Form_Load subroutine. Two lines are missing. Enter
them as follows:
Set oXMLDoc = CreateObject("MSXML2.DOMDocument")
oXMLDoc.Load ("c:\mvp.xml")
Replace the path c:\ with the path to the mvp.xml file on your machine.
Notice that the IntelliSense processor pops up all the available properties
and methods when you key in the second line, as shown in Figure 5-9.

Figure 5-9. The IntelliSense processor shows all available
properties and methods in a pop-up menu.
Your subroutine should look like Figure 5-10 when you are finished.

Figure 5-10. The Form_Load subroutine.
The first line creates an instance of the Microsoft XML parser by using the
ProgID "MSXML2.DOMDocument". This parser is a COM object that you can use in any
program. Think of XML's processing capabilities as a part of the operating
system.
The second line loads the MVP document into the object to make it available
for scripting.
- Notice that at the top of the BaseballStats(Code) window there is a
drop-down list box that displays Form. Click this, and select Compute. Doing
so reveals the subroutine shown in Figure 5-11. This is the subroutine that
will execute when the user clicks the Compute button.

Figure 5-11. The Compute_Click subroutine.
- Notice the blank line as the first line in the subroutine. Enter the
following code:
Set Entries = oXMLDoc.selectNodes("//Entry")
As you enter code, you will see that the IntelliSense processor is helping
you, as shown in Figure 5-12.

Figure 5-12. The IntelliSense processor helps determine the
parameters of the selectNodes method.
The Compute_Click subroutine creates a collection of element nodes
containing all 154 baseball player entries. Now we can access this collection
one player at a time to find which players have the highest and lowest
averages.
Once we find the highest and lowest averages, the HighPlayer and
LowPlayer objects contain the entire entries for the appropriate
players. These objects are interrogated at the end of the subroutine and
presented in the text box of the application.
- Start the program by pressing F5. You should see the window shown in
Figure 5-13.

Figure 5-13. The main window of the MVP program.
- Click the Compute button, and see your beautiful results, shown in Figure
5-14.

Figure 5-14. The results of the MVP program.
- Close this window, and save the project as DOM.vbp.
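The high and low scan that the Compute_Click subroutine performs in the exercise above can be sketched in a few lines. This sketch is in JavaScript rather than Visual Basic and works on plain objects, because the element names inside mvp.xml are not shown here; the property names hits and atBats and the function name findExtremes are assumptions. A batting average is hits divided by at-bats.

```javascript
// Hypothetical helper: scans a list of player records once and returns the
// records with the highest and lowest batting average (hits / atBats).
// Records without a positive atBats value are skipped to avoid dividing
// by zero.
function findExtremes(players) {
    var high = null;
    var low = null;
    for (var i = 0; i < players.length; i++) {
        var player = players[i];
        if (!(player.atBats > 0)) {
            continue;
        }
        var average = player.hits / player.atBats;
        if (high === null || average > high.average) {
            high = { player: player, average: average };
        }
        if (low === null || average < low.average) {
            low = { player: player, average: average };
        }
    }
    return { high: high, low: low };
}
```

Keeping the whole record alongside the computed average mirrors the exercise, where the HighPlayer and LowPlayer objects hold the entire entry so that any of its fields can be displayed afterward.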
Event-Driven Models
The DOM provides a compact, easy-to-use set of interfaces for processing the
contents of an XML document. However, sometimes using the DOM API is not the
best approach. Suppose, for example, that you have a huge XML document to
process. Because the DOM is a tree-based API, it does not allow you to process
any of the document until the entire document is read successfully into the
object. Event-driven APIs can report parsing events directly to the calling
application, which can save a lot of processing time on large documents.
For example, suppose you need to get just the first few elements at the top
of a document. Loading the entire document if you want to process only the first
few elements wastes cycles. The event-driven model allows you to access the
elements as the parser encounters them during processing. You can access an
element in this manner whether or not an error lurks below. Remember that with
the in-memory DOM API, an error in the last element of the document will render
the entire document in error.
The tools that use this approach generate events that can be captured by
rules. These rules then process the elements as they are encountered in the
document.
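To make the idea concrete, here is a toy scanner that reports events to handler functions as it reads the text and stops as soon as a handler asks it to. It is a sketch for illustration only, not SAX, OmniMark, or a conforming XML parser: the function name scanEvents and the handler names startElement, endElement, and characters are assumptions, and the scanner ignores attributes, entities, and CDATA sections.

```javascript
// Toy event-driven scanner, for illustration only. It recognizes start
// tags, end tags, and text. Declarations, comments, and processing
// instructions are skipped naively (up to the next ">"). Returning false
// from any handler stops the scan immediately.
function scanEvents(xml, handlers) {
    var token = /<[!?][^>]*>|<\/([^\s>]+)\s*>|<([^\s\/>!?]+)[^>]*?(\/?)>|([^<]+)/g;
    var match;
    while ((match = token.exec(xml)) !== null) {
        var keepGoing = true;
        if (match[1]) {
            keepGoing = handlers.endElement(match[1]);
        } else if (match[2]) {
            keepGoing = handlers.startElement(match[2]);
            if (keepGoing !== false && match[3]) {
                // Empty-element tag such as <br/>: report the end as well.
                keepGoing = handlers.endElement(match[2]);
            }
        } else if (match[4] && /\S/.test(match[4])) {
            keepGoing = handlers.characters(match[4]);
        }
        if (keepGoing === false) {
            return false;   // a handler stopped the scan early
        }
    }
    return true;            // reached the end of the input
}
```

A handler that returns false after the first scene element ends never sees the rest of the input, so a problem further down the document does not prevent it from getting the data it wanted.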
One event-driven model that has been competing for the attention of
developers is SAX, the Simple API for XML. SAX was developed by members of the
XML-DEV mailing list, hosted by OASIS. In some ways, SAX is considered a
competitor of DOM. However, I don't see it that way. Sometimes the DOM is more
appropriate for a given situation, and sometimes SAX is more appropriate. In
fact, Microsoft has begun to implement some features of SAX in the Microsoft XML
Parser (MSXML). You can find out more about SAX at http://www.megginson.com/SAX/.
SAX is a great interface, but I want to concentrate on OmniMark, which is a
more mature, event-driven, object model programming language. I'll describe
OmniMark in detail in Appendix A.