XSL Transformations in Java

The XSL Transformations (XSLT) mechanism allows you to specify rules for transforming XML documents into other formats, such as plain text, XHTML, or any other XML format. XSLT is commonly used to translate from one machine-readable XML format to another, or to translate XML into a presentation format for human consumption.

You need to provide an XSLT stylesheet that describes the conversion of XML documents into some other format. An XSLT processor reads an XML document and the stylesheet and produces the desired output (see Figure 3.4).

The XSLT specification is quite complex, and entire books have been written on the subject. We can’t possibly discuss all the features of XSLT, so we will just work through a representative example. You can find more information in the book Essential XML by Don Box et al. The XSLT specification is available at www.w3.org/TR/xslt.

Suppose we want to transform XML files with employee records into HTML documents. Consider this input file:

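<staff>
   <employee>
      <name>Carl Cracker</name>
      <salary>75000.0</salary>
      <hiredate year="1987" month="12" day="15"/>
   </employee>
   <employee>
      <name>Harry Hacker</name>
      <salary>50000.0</salary>
      <hiredate year="1989" month="10" day="1"/>
   </employee>
   <employee>
      <name>Tony Tester</name>
      <salary>40000.0</salary>
      <hiredate year="1990" month="3" day="15"/>
   </employee>
</staff>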

The desired output is an HTML table:

<table border="1">
<tr>
<td>Carl Cracker</td><td>$75000.0</td><td>1987-12-15</td>
</tr>
<tr>
<td>Harry Hacker</td><td>$50000.0</td><td>1989-10-1</td>
</tr>
<tr>
<td>Tony Tester</td><td>$40000.0</td><td>1990-3-15</td>
</tr>
</table>

A stylesheet with transformation templates has this form:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">
   <xsl:output method="html"/>
   template1
   template2
   …
</xsl:stylesheet>

In our example, the xsl:output element specifies the method as HTML. Other valid method settings are xml and text.

Here is a typical template:

<xsl:template match="/staff/employee">
   <tr><xsl:apply-templates/></tr>
</xsl:template>

The value of the match attribute is an XPath expression. The template states: Whenever you see a node in the XPath set /staff/employee, do the following:

Emit the string <tr>.
Keep applying templates as you process its children.
Emit the string </tr> after you are done with all children.

In other words, this template generates the HTML table row markers around every employee record.

The XSLT processor starts processing by examining the root element. Whenever a node matches one of the templates, it applies the template. (If multiple templates match, the best matching one is used; see the specification at www.w3.org/TR/xslt for the gory details.) If no template matches, the processor carries out a default action. For text nodes, the default is to include the contents in the output. For elements, the default action is to create no output but to keep processing the children.
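The interplay between explicit templates and the default actions can be observed with a small experiment. The following is a minimal sketch, not code from the book: the class name BuiltInRulesDemo, the bracket-emitting template, and the two-field sample input are assumptions made for illustration. Only employee elements have a template; the staff, name, and salary elements and the text nodes fall through to the default actions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BuiltInRulesDemo {
    // Hypothetical stylesheet: a single template that wraps each employee in
    // brackets. Every other node is handled by the processor's default rules.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='employee'>[<xsl:apply-templates/>]</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the output.
    static String transform(String stylesheet, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<staff><employee><name>Carl Cracker</name>"
            + "<salary>75000.0</salary></employee></staff>";
        // The elements without templates produce no output of their own,
        // but their text children are copied: [Carl Cracker75000.0]
        System.out.println(transform(STYLESHEET, xml));
    }
}
```

The name and salary text runs together because no template inserts a separator between them, which is exactly why the stylesheet in this section adds templates for those elements.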

Here is a template for transforming name nodes in an employee file:

<xsl:template match="/staff/employee/name">
   <td><xsl:apply-templates/></td>
</xsl:template>

As you can see, the template produces the <td>. . .</td> delimiters, and it asks the processor to recursively visit the children of the name element. There is just one child—the text node. When the processor visits that node, it emits the text contents (provided, of course, that there is no other matching template).

You have to work a little harder if you want to copy attribute values into the output. Here is an example:

<xsl:template match="/staff/employee/hiredate">
   <td><xsl:value-of select="@year"/>-<xsl:value-of
      select="@month"/>-<xsl:value-of select="@day"/></td>
</xsl:template>

When processing a hiredate node, this template emits

The string <td>
The value of the year attribute
A hyphen
The value of the month attribute
A hyphen
The value of the day attribute
The string </td>

The xsl:value-of statement computes the string value of a node set. The node set is specified by the XPath value of the select attribute. In this case, the path is relative to the currently processed node. In XSLT 1.0, a node set is converted to a string by taking the string value of its first node in document order. The string value of an attribute node is its value. The string value of a text node is its contents. The string value of an element node is the concatenation of the string values of all its descendant text nodes (but not its attributes).
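To see how string values come out in practice, here is a small sketch that assumes the XSLT 1.0 processor bundled with the JDK. The class name ValueOfDemo, the stylesheet, and the sample data are illustrative assumptions, not part of the book's listings. It prints the string value of an element node set, an attribute node, and a whole employee element, separated by vertical bars. Note that under XSLT 1.0 rules a node set with several nodes contributes only its first node.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ValueOfDemo {
    // Hypothetical stylesheet that emits three string values, separated by '|'.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/staff'>"
        +   "<xsl:value-of select='employee/name'/>"            // two nodes selected
        +   "<xsl:text>|</xsl:text>"
        +   "<xsl:value-of select='employee/hiredate/@year'/>"  // attribute nodes
        +   "<xsl:text>|</xsl:text>"
        +   "<xsl:value-of select='employee'/>"                 // element nodes
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static final String SAMPLE =
        "<staff>"
        + "<employee><name>Carl Cracker</name><salary>75000.0</salary>"
        +   "<hiredate year='1987' month='12' day='15'/></employee>"
        + "<employee><name>Harry Hacker</name><salary>50000.0</salary>"
        +   "<hiredate year='1989' month='10' day='1'/></employee>"
        + "</staff>";

    static String stringValues(String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // With an XSLT 1.0 processor: Carl Cracker|1987|Carl Cracker75000.0
        System.out.println(stringValues(SAMPLE));
    }
}
```

The last field shows that the attributes of hiredate do not contribute to the string value of the enclosing employee element.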

Listing 3.10 contains the stylesheet for turning an XML file with employee records into an HTML table.
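For readers without the listing at hand, the following sketch assembles a stylesheet along the lines described in this section and runs it from Java. It is not a reproduction of Listing 3.10: the class name EmployeeTableDemo, the templates for /staff and salary, and the indent='no' setting are assumptions that follow the pattern of the employee, name, and hiredate templates.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EmployeeTableDemo {
    // Assumed stylesheet, one template per element of the employee file.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/staff'>"
        +   "<table border='1'><xsl:apply-templates/></table>"
        + "</xsl:template>"
        + "<xsl:template match='/staff/employee'>"
        +   "<tr><xsl:apply-templates/></tr>"
        + "</xsl:template>"
        + "<xsl:template match='/staff/employee/name'>"
        +   "<td><xsl:apply-templates/></td>"
        + "</xsl:template>"
        + "<xsl:template match='/staff/employee/salary'>"
        +   "<td>$<xsl:apply-templates/></td>"
        + "</xsl:template>"
        + "<xsl:template match='/staff/employee/hiredate'>"
        +   "<td><xsl:value-of select='@year'/>-<xsl:value-of select='@month'/>"
        +   "-<xsl:value-of select='@day'/></td>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Sample input without whitespace between elements, so that no stray
    // whitespace text nodes are copied into the table.
    static final String SAMPLE =
        "<staff><employee><name>Carl Cracker</name><salary>75000.0</salary>"
        + "<hiredate year='1987' month='12' day='15'/></employee></staff>";

    static String toHtml(String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(toHtml(SAMPLE));
    }
}
```

If the input file is indented, the whitespace between elements is copied to the output by the default rule for text nodes. That is harmless in HTML, and an xsl:strip-space declaration can suppress it.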

Listing 3.11 shows a different set of transformations. The input is the same XML file, and the output is plain text in the familiar property file format:

employee.1.name=Carl Cracker

employee.1.salary=75000.0

employee.1.hiredate=1987-12-15

employee.2.name=Harry Hacker

employee.2.salary=50000.0

employee.2.hiredate=1989-10-1

employee.3.name=Tony Tester

employee.3.salary=40000.0

employee.3.hiredate=1990-3-15

That example uses the position() function, which yields the position of the current node as seen from its parent. We thus get an entirely different output simply by switching the stylesheet. This means you can safely use XML to describe your data; if some applications need the data in another format, just use XSLT to generate the alternative format.
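A stylesheet of this kind might look like the following sketch. It is an assumption for illustration, not the text of Listing 3.11, and the class name PropertyFormatDemo is hypothetical. The /staff template selects only the employee children, so position() counts employees and is not thrown off by whitespace text nodes between them. For brevity, only the name and salary lines are produced.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PropertyFormatDemo {
    // Assumed stylesheet producing "employee.N.key=value" lines.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/staff'>"
        +   "<xsl:apply-templates select='employee'/>"
        + "</xsl:template>"
        + "<xsl:template match='employee'>"
        +   "<xsl:text>employee.</xsl:text><xsl:value-of select='position()'/>"
        +   "<xsl:text>.name=</xsl:text><xsl:value-of select='name'/>"
        +   "<xsl:text>&#10;</xsl:text>"
        +   "<xsl:text>employee.</xsl:text><xsl:value-of select='position()'/>"
        +   "<xsl:text>.salary=</xsl:text><xsl:value-of select='salary'/>"
        +   "<xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static final String SAMPLE =
        "<staff>"
        + "<employee><name>Carl Cracker</name><salary>75000.0</salary></employee>"
        + "<employee><name>Harry Hacker</name><salary>50000.0</salary></employee>"
        + "</staff>";

    static String toProperties(String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Prints one line per property, starting with employee.1.name=Carl Cracker
        System.out.print(toProperties(SAMPLE));
    }
}
```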

It is simple to apply XSL transformations on the Java platform. Obtain a transformer factory, ask it for a transformer object for your stylesheet, and tell the transformer to transform a source to a result:

var styleSheet = new File(filename);
var styleSource = new StreamSource(styleSheet);
Transformer t = TransformerFactory.newInstance().newTransformer(styleSource);
t.transform(source, result);

The parameters of the transform method are objects of classes that implement the Source and Result interfaces. Several classes implement the Source interface:

DOMSource

SAXSource

StAXSource

StreamSource

You can construct a StreamSource from a file, stream, reader, or URL, and a DOMSource from the node of a DOM tree. For example, in the preceding section, we invoked the identity transformation as

t.transform(new DOMSource(doc), result);
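As a reminder of how the identity transformation is used, here is a short self-contained sketch. The class name IdentityTransformDemo and the tiny sample document are assumptions. A transformer obtained without a stylesheet copies its source to its result unchanged, which makes it a convenient way to serialize a DOM tree.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class IdentityTransformDemo {
    // Builds a small DOM tree: <staff><name>Carl Cracker</name></staff>
    static Document sampleDocument() throws ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element staff = doc.createElement("staff");
        Element name = doc.createElement("name");
        name.setTextContent("Carl Cracker");
        staff.appendChild(name);
        doc.appendChild(staff);
        return doc;
    }

    // Serializes a DOM tree with the identity transformation.
    static String serialize(Document doc) throws TransformerException {
        // newTransformer() without arguments yields the identity transformer
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```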

In our example program, we do something slightly more interesting. Instead of starting out with an existing XML file, we produce a SAX XML reader that gives the illusion of parsing an XML file by emitting appropriate SAX events. Actually, our XML reader reads a flat file, as described in Chapter 1. The input file looks like this:

Carl Cracker|75000.0|1987|12|15

Harry Hacker|50000.0|1989|10|1

Tony Tester|40000.0|1990|3|15

Our XML reader generates SAX events as it processes the input. Here is a part of the parse method of the EmployeeReader class that implements the XMLReader interface:

var attributes = new AttributesImpl(); // empty attribute list, reused for every element
handler.startDocument();
handler.startElement("", "staff", "staff", attributes);
while ((line = in.readLine()) != null)
{
   handler.startElement("", "employee", "employee", attributes);
   var tokenizer = new StringTokenizer(line, "|");
   handler.startElement("", "name", "name", attributes);
   String s = tokenizer.nextToken();
   handler.characters(s.toCharArray(), 0, s.length());
   handler.endElement("", "name", "name");
   …
   handler.endElement("", "employee", "employee");
}
handler.endElement("", "staff", "staff"); // close the root element opened above
handler.endDocument();

The SAXSource for the transformer is constructed from the XML reader:

t.transform(new SAXSource(new EmployeeReader(),

new InputSource(new FileInputStream(filename))), result);
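The following sketch shows how such a reader can be wired up end to end. It is a simplified stand-in for the EmployeeReader class, not its actual source: the names FlatFileReaderDemo and FlatFileReader are hypothetical, the input comes from a string instead of a file, and the identity transformation is used so that the generated XML can be inspected directly. Apart from parse and the content handler accessors, the XMLReader methods are do-nothing stubs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;

class FlatFileReaderDemo {
    /** Hypothetical reader that turns lines of name|salary|year|month|day into SAX events. */
    static class FlatFileReader implements XMLReader {
        private ContentHandler handler;

        @Override public void parse(InputSource source) throws IOException, SAXException {
            // Assumes the InputSource was constructed from a Reader
            BufferedReader in = new BufferedReader(source.getCharacterStream());
            AttributesImpl none = new AttributesImpl();
            handler.startDocument();
            handler.startElement("", "staff", "staff", none);
            String line;
            while ((line = in.readLine()) != null) {
                String[] f = line.split("\\|");
                handler.startElement("", "employee", "employee", none);
                textElement("name", f[0], none);
                textElement("salary", f[1], none);
                AttributesImpl date = new AttributesImpl();
                date.addAttribute("", "year", "year", "CDATA", f[2]);
                date.addAttribute("", "month", "month", "CDATA", f[3]);
                date.addAttribute("", "day", "day", "CDATA", f[4]);
                handler.startElement("", "hiredate", "hiredate", date);
                handler.endElement("", "hiredate", "hiredate");
                handler.endElement("", "employee", "employee");
            }
            handler.endElement("", "staff", "staff");
            handler.endDocument();
        }

        // Emits <name>text</name> as three SAX events.
        private void textElement(String name, String text, AttributesImpl atts)
                throws SAXException {
            handler.startElement("", name, name, atts);
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", name, name);
        }

        @Override public void setContentHandler(ContentHandler h) { handler = h; }
        @Override public ContentHandler getContentHandler() { return handler; }

        // The remaining methods are do-nothing implementations.
        @Override public void parse(String systemId) {}
        @Override public void setErrorHandler(ErrorHandler h) {}
        @Override public ErrorHandler getErrorHandler() { return null; }
        @Override public void setDTDHandler(DTDHandler h) {}
        @Override public DTDHandler getDTDHandler() { return null; }
        @Override public void setEntityResolver(EntityResolver r) {}
        @Override public EntityResolver getEntityResolver() { return null; }
        @Override public void setProperty(String name, Object value) {}
        @Override public Object getProperty(String name) { return null; }
        @Override public void setFeature(String name, boolean value) {}
        @Override public boolean getFeature(String name) { return false; }
    }

    // Feeds the flat data through the reader and returns the resulting XML text.
    static String toXml(String flatData) throws TransformerException {
        Transformer t = TransformerFactory.newInstance().newTransformer(); // identity
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new SAXSource(new FlatFileReader(),
            new InputSource(new StringReader(flatData))), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(toXml("Carl Cracker|75000.0|1987|12|15\n"));
    }
}
```

Passing a stylesheet source to newTransformer instead of using the identity transformer turns the same pipeline into a direct conversion from the flat file to HTML or any other target format.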

This is an ingenious trick to convert non-XML legacy data into XML. Of course, most XSLT applications will already have XML input data, and you can simply invoke the transform method on a StreamSource:

t.transform(new StreamSource(file), result);

The transformation result is an object of a class that implements the Result interface. The Java library supplies several such classes, including these three:

DOMResult

SAXResult

StreamResult

To store the result in a DOM tree, use a DocumentBuilder to generate a new document node and wrap it into a DOMResult:

Document doc = builder.newDocument();

t.transform(source, new DOMResult(doc));
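A complete round trip might look like this sketch, in which the class name DomResultDemo and the sample input are assumptions. The identity transformation is used for simplicity; with a stylesheet, the DOM tree would hold the transformed document instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

class DomResultDemo {
    // Transforms XML text into a DOM tree by way of a DOMResult.
    static Document toDom(String xml)
            throws ParserConfigurationException, TransformerException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument(); // empty document that receives the result
        Transformer t = TransformerFactory.newInstance().newTransformer(); // identity
        t.transform(new StreamSource(new StringReader(xml)), new DOMResult(doc));
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = toDom("<staff><employee><name>Carl Cracker</name></employee></staff>");
        // The tree can now be inspected with the usual DOM methods
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("name").item(0).getTextContent());
    }
}
```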

To save the output in a file, use a StreamResult:

t.transform(source, new StreamResult(file));

Listing 3.12 contains the complete source code.

This example concludes our discussion of XML support in the Java library. You should now have a good perspective on the major strengths of XML—in particular, its automated parsing and validation as well as its powerful transformation mechanism. Of course, all this technology is only going to work for you if you design your XML formats well. You need to make sure that the formats are rich enough to express all your business needs, that they are stable over time, and that your business partners are willing to accept your XML documents. Those issues can be far more challenging than dealing with parsers, DTDs, or transformations.

In the next chapter, we will discuss network programming on the Java platform, starting with the basics of network sockets and moving on to higher-level protocols for e-mail and the World Wide Web.

Source: Horstmann Cay S. (2019), Core Java. Volume II – Advanced Features, Pearson; 11th edition.
