built-in methods for doing commonly performed tasks such as configuring the parser
and handling error conditions.
XSL performs a function, supplying human-readable data, that is every bit as important
as XML itself. With the single exception of robotics, all the software in the world is
ultimately used to display data to humans. Nothing is ever stored in a database
without the expectation that it will be extracted and presented in some human-readable
format. In fact, presenting readable data is the entire purpose of the Internet.
Operating systems and computer language compilers exist only to support and create
other programs that, in turn, directly or indirectly present data in a form that can be
understood by humans.
SAX
SAX stands for Simple API for XML. It is a collection of Java methods that can be used
to read an XML document and parse it in such a way that each of the individual pieces
of the input is supplied to your program. It is a very rudimentary form of parsing that
is not much more than a lexical scan: It reads the input, determines the type of things
it encounters (it recognizes the format of the nested tags and separates out the text that
is the data portion of the document), and supplies them to your program in the same
order in which they appear in the document.
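As a rough illustration of the event order described above, the following sketch records each piece of a small document as the parser reports it. The class name SaxEventLister, the method listEvents, and the sample document are invented for this example; only the JAXP and SAX classes themselves are real. Text is trimmed and whitespace-only text is skipped to keep the listing short, which is a simplification.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: lists SAX events in the order they arrive.
public class SaxEventLister {

    /** Parses the XML text and returns one entry per SAX event, in document order. */
    public static List<String> listEvents(String xml) {
        final List<String> events = new ArrayList<>();

        // DefaultHandler supplies empty callbacks; override only the ones of interest.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver one run of text in several calls;
                // this sketch records each non-blank chunk separately.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<note><to>Ann</to><body>Hello</body></note>";
        for (String event : listEvents(xml)) {
            System.out.println(event);
        }
    }
}
```

The handler never sees the document as a whole. It is simply called once for each opening tag, closing tag, and run of text as the parser reaches it.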
The form of the data coming from a SAX parser can be very useful for streaming
operations such as a direct translation of tags or text into another form, with no changes
in order. If your application needs to switch things around, however, it will be
necessary for it to keep copies of the data so it can be reorganized. In many such cases, it
would be easier to parse using the DOM parser. The SAX parser has the advantage of being
fast and small because it doesn't hold anything in memory once it has moved on to the
next input item in the input document.
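A streaming translation of the kind just described might look like the following sketch, which rewrites every tag name in uppercase and passes text through unchanged, writing output as each event arrives. The class SaxTagTranslator and the method upperCaseTags are names made up for this illustration. The sketch deliberately drops attributes and does not re-escape special characters such as & in text, so it is not a general-purpose converter.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: translates tags one event at a time.
public class SaxTagTranslator {

    /** Returns the document with every element name uppercased; order is unchanged. */
    public static String upperCaseTags(String xml) {
        final StringBuilder out = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Simplification: attributes are ignored in this sketch.
                out.append('<').append(qName.toUpperCase(Locale.ROOT)).append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(qName.toUpperCase(Locale.ROOT)).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text is copied straight through as soon as it is reported.
                out.append(ch, start, length);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseTags("<note><to>Ann</to></note>"));
    }
}
```

Each callback uses only the item it was handed and keeps nothing from earlier events, which is why this style of processing stays small. Moving the to element after the body element, by contrast, would require the handler to save pieces of the document until it had seen enough to write them out in the new order.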
There are two versions of SAX. The original version is SAX 1.0 (also called SAX1).
The current version is SAX 2.0 (also called SAX2). SAX2 extends the definitions of
SAX 1.0 to include features such as the ability to specify names using namespaces.
Both SAX1 and SAX2 are a part of JAXP. Because SAX1 is still a part of JAXP,
programs based on it will work, but much of it has been deprecated in the API to
promote use of SAX2 in all newly written programs. Only SAX2 is discussed in the
following chapters because it does everything SAX1 does and more.
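The namespace support mentioned above can be sketched as follows. When the factory is made namespace aware, the SAX2 startElement callback receives the namespace URI and the local name of each element in addition to its prefixed name. The class NamespaceLister, the method expandedNames, and the urn:example:inventory namespace are all invented for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: reports element names with their namespaces.
public class NamespaceLister {

    /** Returns each element name in the form {namespace-uri}local-name. */
    public static List<String> expandedNames(String xml) {
        final List<String> names = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // uri and localName are filled in because the factory is namespace aware.
                names.add("{" + uri + "}" + localName);
            }
        };

        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<inv:order xmlns:inv='urn:example:inventory'><inv:item/></inv:order>";
        for (String name : expandedNames(xml)) {
            System.out.println(name);
        }
    }
}
```

Two documents that use different prefixes for the same namespace URI produce the same expanded names here, which is the practical benefit of working with the URI and local name rather than the prefixed name.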
DOM
DOM stands for Document Object Model. It is a collection of Java methods that enable
your program to parse an input document into a memory-resident tree of nodes that
maintains the relationships found in the original input document. There are also
methods that enable your application to walk freely about the tree and extract the
information stored there.
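As a minimal sketch of the tree described above, the following program parses a small document into a DOM tree and then walks it recursively, indenting each node according to its depth. The class DomTreeWalker, its methods, and the sample document are hypothetical names chosen for this example. Only element and text nodes are reported, and whitespace-only text is skipped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class: builds a DOM tree and walks it.
public class DomTreeWalker {

    /** Parses the XML text and returns an indented outline of its nodes. */
    public static List<String> outline(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole document is read into memory before parse() returns.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            walk(doc.getDocumentElement(), 0, lines);
            return lines;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    /** Visits a node and then each of its children, depth first. */
    private static void walk(Node node, int depth, List<String> lines) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            lines.add(indent + node.getNodeName());
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                lines.add(indent + "#text " + text);
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        for (String line : outline("<note><to>Ann</to><body>Hello</body></note>")) {
            System.out.println(line);
        }
    }
}
```

Because the entire tree stays in memory, the walk could just as easily visit the children in reverse or return to a node already seen, which a SAX handler cannot do without keeping its own copies.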