Introducing XML in Java

In Chapter 13 of Volume I, you have seen the use of property files to describe the configuration of a program. A property file contains a set of name/value pairs, such as

fontname=Times Roman

fontsize=12

windowsize=400 200

color=0 50 100

You can use the Properties class to read in such a file with a single method call. That’s a nice feature, but it doesn’t really go far enough. In many cases, the information you want to describe has more structure than the property file format can comfortably handle. Consider the fontname/fontsize entries in the example. It would be more object-oriented to have a single entry:

font=Times Roman 12

But then, parsing the font description gets ugly as you have to figure out when the font name ends and the font size starts.
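To make the point concrete, here is a minimal sketch of loading such an entry with the Properties class and then pulling the combined font description apart. The class name FontPropertyDemo and its helper methods are hypothetical names chosen for this illustration, and the sketch assumes that the size is always the last space-separated token.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class FontPropertyDemo {
    /** Loads name/value pairs from text in property file format. */
    public static Properties load(String text) {
        Properties props = new Properties();
        try (StringReader in = new StringReader(text)) {
            props.load(in); // the single method call that reads all entries
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the font name: everything before the last space (assumed format). */
    public static String fontName(String description) {
        return description.substring(0, description.lastIndexOf(' ')).trim();
    }

    /** Returns the font size: the token after the last space (assumed format). */
    public static int fontSize(String description) {
        return Integer.parseInt(description.substring(description.lastIndexOf(' ') + 1));
    }

    public static void main(String[] args) {
        Properties props = load("font=Times Roman 12\n");
        String font = props.getProperty("font");
        System.out.println(fontName(font) + " at " + fontSize(font) + " points");
    }
}
```

The parsing only works because of a private convention about where the name ends. A description without a space, or one with trailing text after the size, would make these helpers throw an exception.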

Property files have a single flat hierarchy. You can often see programmers work around that limitation with key names like

title.fontname=Helvetica

title.fontsize=36

body.fontname=Times Roman

body.fontsize=12

Another shortcoming of the property file format is the requirement that keys must be unique. To store a sequence of values, you need another workaround, such as

menu.item.1=Times Roman

menu.item.2=Helvetica

menu.item.3=Goudy Old Style
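Reading such a sequence back requires code that knows about the numbering convention. The following sketch shows one way this might be done; the class MenuItemsDemo and the method readSequence are hypothetical names, and the sketch assumes that numbering starts at 1 and has no gaps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class MenuItemsDemo {
    /** Collects prefix.1, prefix.2, ... and stops at the first missing key. */
    public static List<String> readSequence(Properties props, String prefix) {
        List<String> items = new ArrayList<>();
        for (int i = 1; ; i++) {
            String value = props.getProperty(prefix + "." + i);
            if (value == null) break; // assumption: a gap ends the sequence
            items.add(value);
        }
        return items;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("menu.item.1", "Times Roman");
        props.setProperty("menu.item.2", "Helvetica");
        props.setProperty("menu.item.3", "Goudy Old Style");
        System.out.println(readSequence(props, "menu.item"));
    }
}
```

Nothing in the property file format enforces this convention. If an entry is deleted and the remaining keys are not renumbered, every later item is silently dropped.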

The XML format solves these problems. It can express hierarchical structures and is thus more flexible than the flat table structure of a property file.

An XML file for describing a program configuration might look like this:

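<config>
   <entry id="title">
      <font>
         <name>Helvetica</name>
         <size>36</size>
      </font>
   </entry>
   <entry id="body">
      <font>
         <name>Times Roman</name>
         <size>12</size>
      </font>
   </entry>
   <entry id="background">
      <color>
         <red>0</red>
         <green>50</green>
         <blue>100</blue>
      </color>
   </entry>
</config>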

The XML format allows you to express the hierarchy and record repeated elements without contortions.
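As a rough illustration of how a program might consume such a file, the sketch below reads a small configuration with the DOM parser from the standard javax.xml.parsers package. The class ConfigXmlDemo, the method describeFonts, and the XML text embedded in the code (including the id attribute and the size element) are assumptions made for this example, not a fixed format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ConfigXmlDemo {
    // Illustrative configuration; element and attribute names are assumptions.
    public static final String CONFIG =
        "<config>"
        + "<entry id=\"title\"><font><name>Helvetica</name><size>36</size></font></entry>"
        + "<entry id=\"body\"><font><name>Times Roman</name><size>12</size></font></entry>"
        + "<entry id=\"background\"><color><red>0</red><green>50</green><blue>100</blue></color></entry>"
        + "</config>";

    /** Returns one "id: name size" string for each entry that contains a font. */
    public static List<String> describeFonts(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> result = new ArrayList<>();
            NodeList entries = doc.getElementsByTagName("entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                NodeList fonts = entry.getElementsByTagName("font");
                if (fonts.getLength() == 0) continue; // skip entries without a font
                Element font = (Element) fonts.item(0);
                String name = font.getElementsByTagName("name").item(0).getTextContent();
                String size = font.getElementsByTagName("size").item(0).getTextContent();
                result.add(entry.getAttribute("id") + ": " + name + " " + size);
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        describeFonts(CONFIG).forEach(System.out::println);
    }
}
```

The font name and size are separate elements, so no string splitting is needed, and repeated entry elements need no numbering scheme.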

The format of an XML file is straightforward. It looks similar to an HTML file. There is a good reason for that—both XML and HTML are descendants of the venerable Standard Generalized Markup Language (SGML).

SGML has been around since the 1970s for describing the structure of complex documents. It has been used with success in some industries that require ongoing maintenance of massive documentation—in particular, the aircraft industry. However, SGML is quite complex, so it has never caught on in a big way. Much of that complexity arises because SGML has two conflicting goals. SGML wants to make sure that documents are formed according to the rules for their document type, but it also wants to make data entry easy by allowing shortcuts that reduce typing. XML was designed as a simplified version of SGML for use on the Internet. As is often true, simpler is better, and XML has enjoyed the immediate and enthusiastic reception that has eluded SGML for so long.

Even though XML and HTML have common roots, there are important differences between the two.

Unlike HTML, XML is case-sensitive. For example, <H1> and <h1> are different XML tags.
In HTML, you can omit end tags, such as </p> or </li>, if it is clear from the context where a paragraph or list item ends. In XML, you can never omit an end tag.
In XML, elements that have a single tag without a matching end tag must end in a /, as in <img src="coffeecup.png"/>. That way, the parser knows not to look for a </img> tag.
In XML, attribute values must be enclosed in quotation marks. In HTML, quotation marks are optional. For example, <applet code="MyApplet.class" width=300 height=300> is legal HTML but not legal XML. In XML, you have to use quotation marks: width="300".
In HTML, you can have attribute names without values, such as <input type="radio" name="language" value="Java" checked>. In XML, all attributes must have values, such as checked="true" or (ugh) checked="checked".
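The rules above can be checked with a small experiment. The sketch below feeds a few fragments to the standard DOM parser and reports whether each one is accepted. The class WellFormedDemo and the method isWellFormed are hypothetical names for this illustration; the parser only checks well-formedness here, not validity against any document type.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedDemo {
    /** Returns true if the parser accepts the text as a well-formed XML document. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to System.err
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<H1>Title</h1>",                                       // tag names differ in case
            "<ul><li>one<li>two</ul>",                              // omitted end tags
            "<img src=\"coffeecup.png\">",                          // no end tag and no trailing /
            "<img src=\"coffeecup.png\"/>",                         // empty element, properly closed
            "<applet code=\"MyApplet.class\" width=300 height=300/>", // unquoted attribute values
            "<input type=\"radio\" checked/>",                      // attribute without a value
            "<input type=\"radio\" checked=\"checked\"/>"           // attribute with a value
        };
        for (String sample : samples) {
            System.out.println(isWellFormed(sample) + "  " + sample);
        }
    }
}
```

Only the fourth and the last sample satisfy all of the rules; the others each break one of them.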

There are XML formulations for HTML versions 4 and 5 that are known as XHTML.

Source: Horstmann Cay S. (2019), Core Java. Volume II – Advanced Features, Pearson; 11th edition.
