XML Tutorial
Volume 8 : The XSLT Stylesheet and XPath

Tomoya Suzuki

Index

Node Selection and Pattern Matching

Node Selection

XPath Notation Method

Review Questions

Node Selection and Pattern Matching

In XSLT stylesheets, node selection is performed through the select attribute of the xsl:apply-templates command, and pattern matching through the match attribute of the xsl:template element. The XSLT specification also defines how to proceed when multiple template rules are applicable, and when no template rule is applicable at all.

Node Selection

With the select attribute of the xsl:apply-templates command, an XPath expression can be used to either (1) select multiple nodes with identical names, or (2) select multiple nodes with differing names. Under scenario (1), using XPath to designate "ProductList/Product" results in the selection of two Product element nodes.

scenario 1

Under scenario (2), designating "ProductList/Product/*" results in selecting the ProductName element node and UnitPrice element node.

scenario 2

The "*" represents all element nodes. A set of multiple nodes selected in this manner is called a "node set."
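
The two scenarios can be sketched as follows. This is only an illustration under assumptions: the ProductList document shown in the comment and the bracketed marker text are hypothetical and are not one of the article's listings.

```xml
<?xml version="1.0"?>
<!-- Assumed (hypothetical) source document:
     <ProductList>
       <Product>
         <ProductName>XML Pen</ProductName>
         <UnitPrice>200</UnitPrice>
       </Product>
       <Product>
         <ProductName>Mousepad</ProductName>
         <UnitPrice>500</UnitPrice>
       </Product>
     </ProductList>
-->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <result>
      <!-- Scenario (1): the node set holds the two Product element nodes -->
      <scenario1>
        <xsl:apply-templates select="ProductList/Product"/>
      </scenario1>
      <!-- Scenario (2): the node set holds the ProductName and UnitPrice
           element nodes found under each Product -->
      <scenario2>
        <xsl:apply-templates select="ProductList/Product/*"/>
      </scenario2>
    </result>
  </xsl:template>
  <!-- Marker templates that only show which node was selected -->
  <xsl:template match="Product">[Product]</xsl:template>
  <xsl:template match="ProductName">[ProductName]</xsl:template>
  <xsl:template match="UnitPrice">[UnitPrice]</xsl:template>
</xsl:stylesheet>
```

With the assumed input, scenario1 receives the [Product] marker once per Product element, while scenario2 receives one marker for each ProductName and UnitPrice element.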

If the xsl:apply-templates command selects, or the xsl:template element matches, a node that does not exist within the XML document, no template rule is applied for that node.

Applying Templates

Let's look at what happens when we apply the LIST2 XSLT stylesheet to the LIST1 XML document:

LIST1:XML Document(list1.xml)

1 <?xml version="1.0" ?>
2 <?xml-stylesheet type="text/xsl" href="list2.xsl"?>
3 <SalesReport>
4  <Header>
5   <InputDate>2006/09/09</InputDate>
6   <PropertyName>ABC Services Co., Ltd.</PropertyName>
7   <SalesPerson>Taro Yamada</SalesPerson>
8  </Header>
9  <Body>
10   <Results>
11    <ProductName>XML Database</ProductName>
12    <UnitPrice Units="US$">1230</UnitPrice>
13    <Volume>1</Volume>
14   </Results>
15   <Results>
16    <ProductName>XML Editor</ProductName>
17    <UnitPrice Units="US$">15</UnitPrice>
18    <Volume>10</Volume>
19   </Results>
20  </Body>
21 </SalesReport>

LIST2:XSLT Stylesheet(list2.xsl)

1 <?xml version="1.0"?>
2 <xsl:stylesheet version="1.0"
3      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
4   <xsl:template match="/">
5     <html>
6       <body>
7         <h1>Sales Report</h1>
8         <xsl:apply-templates select="SalesReport/Header/*"/>
9         <xsl:apply-templates select="SalesReport/Body"/>
10       </body>
11     </html>
12   </xsl:template>
13   <xsl:template match="InputDate">
14     - InputDate:<xsl:value-of select="." /><br/>
15   </xsl:template>
16   <xsl:template match="PropertyName">
17     - PropertyName:<xsl:value-of select="." /><br/>
18   </xsl:template>
19   <xsl:template match="SalesPerson">
20     - SalesPerson:<xsl:value-of select="." /><br/>
21   </xsl:template>
22   <xsl:template match="Body">
23     <table border="1" width="300">
24       <tr><th>ProductName</th><th>Price</th><th>Volume</th></tr>
25       <xsl:apply-templates select="Results"/>
26     </table>
27   </xsl:template>
28   <xsl:template match="Results">
29     <tr>
30       <td><xsl:value-of select="ProductName" /></td>
31       <td align="right"><xsl:value-of select="UnitPrice" />
32         (<xsl:value-of select="UnitPrice/@Units" />)</td>
33       <td align="right"><xsl:value-of select="Volume" /></td>
34     </tr>
35   </xsl:template>
36 </xsl:stylesheet>

When a node set has been selected using the select attribute of the xsl:apply-templates command, templates are applied individually to each node. The xsl:apply-templates command at line 25 of LIST2 selects the Results elements at lines 10 and 15 of LIST1. The template rule at line 28 of LIST2 is applied to each of the Results element nodes.

The xsl:apply-templates command at line 8 of LIST2 selects the InputDate element node, the PropertyName element node and the SalesPerson element node from lines 5 through 7 of LIST1, to which the template rules from lines 13, 16 and 19 of LIST2 are applied.

Pattern Matching

So which template rule is applied to each individual node within the node set selected by the select attribute of the xsl:apply-templates command? That is determined by checking each node in the node set against the pattern designated in the match attribute of each xsl:template element. In the diagram below, two ProductName element nodes are selected, for which there are corresponding "Product/ProductName" and "Auxiliary/ProductName" pattern template rules. For each of the selected nodes, the template rule whose pattern matches that node is applied.

Pattern Matching
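
As a rough sketch of the situation described above, assume a hypothetical ProductList document that contains one Product element and one Auxiliary element, each with a ProductName child. The document structure and the output element names are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!-- Assumed (hypothetical) source document:
     <ProductList>
       <Product><ProductName>XML Pen</ProductName></Product>
       <Auxiliary><ProductName>Refill Ink</ProductName></Auxiliary>
     </ProductList>
-->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <names>
      <!-- Selects both ProductName element nodes -->
      <xsl:apply-templates select="ProductList/*/ProductName"/>
    </names>
  </xsl:template>
  <!-- Matches only a ProductName whose parent is a Product element -->
  <xsl:template match="Product/ProductName">
    <product><xsl:value-of select="."/></product>
  </xsl:template>
  <!-- Matches only a ProductName whose parent is an Auxiliary element -->
  <xsl:template match="Auxiliary/ProductName">
    <auxiliary><xsl:value-of select="."/></auxiliary>
  </xsl:template>
</xsl:stylesheet>
```

Both ProductName nodes are in the same node set, but each one is handled by the template rule whose pattern it matches.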

Resolving Conflicting Template Rules

What do we do when there is more than one applicable template rule? Rules can be assigned priorities (such as with the following table), allowing the rule with higher priority to be applied.

Resolving Template Rule Conflicts

Priority  Pattern Example
-0.5      "*", "@*", "text()" and others for which a specific name is not designated
-0.25     "prefix:*", "@prefix:*" and others for which a namespace is designated, but not a specific name
0         "Product", "@Units" and others where element and attribute names are specified
0.5       "Header/SalesPerson", "Body/Results/Product", "UnitPrice/@Units" and others for which a hierarchy has been designated

*Note: When multiple template rules have the same priority, either an error is thrown, or the template rule occurring last is selected.

In other words, the more specific rule has the higher priority. According to the specification, when multiple applicable template rules have the same priority, the processor either signals an error or applies the template rule occurring last in the stylesheet. Most XSLT processors apply the template rule written last in such cases so that processing can continue.

The priority of a template rule is affected according to whether only a certain node is designated, or whether a hierarchy is designated. For example, compared to "ProductName" (priority 0), "Product/ProductName" (Priority 0.5) has a higher priority. However, since "ProductList/Product/ProductName" and "Product/ProductName" are both designated in hierarchical fashion, both have a priority of 0.5. In this case, the template rule written last is the one that is applied.
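
The priority comparison can be sketched with a small stylesheet. The ProductList/Product/ProductName structure is taken from the patterns quoted above, while the output element names by-name and by-path are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <names>
      <xsl:apply-templates select="ProductList/Product/ProductName"/>
    </names>
  </xsl:template>
  <!-- Only an element name is given: default priority 0 -->
  <xsl:template match="ProductName">
    <by-name><xsl:value-of select="."/></by-name>
  </xsl:template>
  <!-- A hierarchy is given: default priority 0.5, so this rule is
       chosen for every ProductName whose parent is Product -->
  <xsl:template match="Product/ProductName">
    <by-path><xsl:value-of select="."/></by-path>
  </xsl:template>
</xsl:stylesheet>
```

Both rules match the selected ProductName nodes, but only the rule with the higher priority is applied, regardless of the order in which the two rules are written.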

Built-In Template Rule

What should we do when there is no applicable template rule? Under XSLT, several template rules are already provided in order to allow for transformation in this type of situation. This is called a "built-in template rule."

Built-In Template Rule

(1) In the event that a corresponding element node/root node pattern does not exist, apply template rules to the child nodes

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

(2) In the event that a corresponding text node/attribute node pattern does not exist, output the node value

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

(3) In the event that a corresponding processing instruction node/comment node pattern does not exist, do nothing

<xsl:template match="processing-instruction()|comment()"/>

Rule (1) above applies to the root node and all element nodes. In this case, all child nodes of the current node are selected, and an attempt is made to apply template rules to them. Through this repeated cycle, the child nodes are traced in order, searching for a matching template rule. During the course of this tracing process, a text node may be encountered. In this case, rule (2) is applied, and the value of the text node is output.

If no template rule matches an attribute node selected by the select attribute of the xsl:apply-templates command, then rule (2) is applied. If no template rule matches a selected processing instruction node or comment node, then rule (3) is applied.

If there is no matching template rule, then the child nodes are traced in order; the text string of any text node subsequently encountered is then output.
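
The effect of the built-in template rules can be sketched with a stylesheet that is meant to be applied to LIST1 and defines almost no rules of its own. The person output element is hypothetical, and overriding the rule for text nodes with an empty template is shown only as one possible way to suppress unwanted text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- No rule matches "/", SalesReport or Header, so built-in rule (1)
       walks down through their child nodes until SalesPerson is reached -->
  <xsl:template match="SalesPerson">
    <person><xsl:value-of select="."/></person>
  </xsl:template>
  <!-- Takes the place of built-in rule (2) for text nodes. Without this
       empty rule, the text of InputDate, ProductName and every other
       unmatched element would also be copied to the output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Applied to LIST1, the result tree of this sketch should contain only a person element holding the SalesPerson text, because every other text node is handled by the empty rule.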

XPath Notation Method

XPath is used to designate node selection and template rule patterns. The skillful use of XPath allows for more flexible stylesheets. Node selection can be further refined or can be designated by tracing the hierarchy in order.

In addition to navigating node hierarchies, XPath provides other capabilities, including value calculations and the extraction of portions of text strings. Here, we introduce the location path notation and the functions available through XPath:

Location Path and Location Steps

Under XPath, the XML data structure is considered to be a hierarchy tree structure, and a node in any certain hierarchical position can be designated. The "location path" is used to describe the path through this hierarchy.

The location path can be expressed as an "absolute path" or "relative path." The node matching the template rule is the current node, while the description of the path leading to the target node (using the current node as the origin) is the "relative path." The description of the path leading to the target node from the root node (as the origin) is the "absolute path."

The location path is designated by separating the path segments using the "/" character, with each individual segment called a "location step." The location step is described using "axis," "node test," and "predicate" (LIST3).

LIST3: Location Path and Location Steps

Location Path = Absolute Path or Relative Path
Absolute Path = /Relative Path
Relative Path = Location Step/Location Step/…
Location Step = axis::node test[predicate]

(Notation Example)
Absolute Path: "/" "/ProductList/Product"
Relative Path: "ProductList/Product" "Product" "UnitPrice/@Units"
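
The difference between the two kinds of path can be sketched against the LIST1 document. The output element names line, name, units and seller are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <lines>
      <!-- Absolute path: always starts from the root node -->
      <xsl:apply-templates select="/SalesReport/Body/Results"/>
    </lines>
  </xsl:template>
  <xsl:template match="Results">
    <line>
      <!-- Relative paths: start from the current node (a Results element) -->
      <name><xsl:value-of select="ProductName"/></name>
      <units><xsl:value-of select="UnitPrice/@Units"/></units>
      <!-- Absolute path: reaches the Header no matter what the current node is -->
      <seller><xsl:value-of select="/SalesReport/Header/SalesPerson"/></seller>
    </line>
  </xsl:template>
</xsl:stylesheet>
```

Inside the Results rule, the relative paths give a different result for each Results element, while the absolute path gives the same SalesPerson value every time.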

To this point we have only defined the node test, tracing through the hierarchy in order. By designating the axis, we can move up through the hierarchy, or use the predicate to filter nodes.

Current Node and Context Node

The node serving as the base point when processing template rules is called the "current node." When the relative path is expressed in XPath, the current node is considered to be the base point. The node for each location step along the location path is called the "context node."

Axis

The "axis" is used to designate the relative positional relationship from the context node to the next location step.

Table : Location Step Axis

Axis Name           Meaning
self                Context node
parent              Parent node
ancestor            Ancestor nodes
ancestor-or-self    Ancestor nodes and context node
child               Child nodes
descendant          Descendant nodes
descendant-or-self  Descendant nodes and context node
preceding-sibling   Sibling nodes preceding the context node
preceding           Nodes preceding the context node, excluding ancestor, attribute and namespace nodes
following-sibling   Sibling nodes following the context node
following           Nodes following the context node, excluding descendant, attribute and namespace nodes
attribute           Attribute nodes
namespace           Namespace nodes

The axis can be used to designate a parent node, or a preceding/following node in the same hierarchy.

If the axis is omitted, then the child axis is designated by default. In the examples above, we did not designate an axis, so the child axis, or in other words the child node, was designated by default. The axis is designated by "axisname::".

Symbols are available to simplify the axis designation.

Table : Axis Simplification

Symbol          Axis                           Simplified Expression   Expression Using an Axis
(axis omitted)  child::                        Product/ProductName     child::Product/child::ProductName
@               attribute::                    UnitPrice/@Units        child::UnitPrice/attribute::Units
//              /descendant-or-self::node()/   //PrefecturalFlower     /descendant-or-self::node()/child::PrefecturalFlower
.               self::node()                   ./PrefectureName        self::node()/child::PrefectureName
..              parent::node()                 ../PrefecturalFlower    parent::node()/child::PrefecturalFlower

In most cases, the axis may be expressed using a simplified symbol; accordingly, the axis is generally designated specifically only in those special cases where a simplified method cannot be used.
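
As an illustration of explicit axis notation, the following sketch is written against the LIST1 document and processes each UnitPrice element. The output element names are hypothetical, and following-sibling is used as an example of an axis for which no simplified symbol exists.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <prices>
      <xsl:apply-templates select="SalesReport/Body/Results/UnitPrice"/>
    </prices>
  </xsl:template>
  <xsl:template match="UnitPrice">
    <price>
      <!-- parent axis: up to Results, then down to ProductName
           (same as ../ProductName) -->
      <name><xsl:value-of select="parent::Results/child::ProductName"/></name>
      <!-- attribute axis (same as @Units) -->
      <units><xsl:value-of select="attribute::Units"/></units>
      <!-- following-sibling axis: the Volume element after this UnitPrice -->
      <volume><xsl:value-of select="following-sibling::Volume"/></volume>
      <!-- ancestor axis: up to SalesReport, then down to the Header -->
      <seller><xsl:value-of select="ancestor::SalesReport/child::Header/child::SalesPerson"/></seller>
    </price>
  </xsl:template>
</xsl:stylesheet>
```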

Node Test

The node test designates the name or type of the nodes to be selected.

Table : Node Test

Node Test Description  Example
Element Name           "Product" "ns:SalesPerson"
All Element Nodes      "*" "ns:*"
All Nodes              node()
Text Nodes             text()
Comment Nodes          comment()

The axis and node test are used to designate the position and node related to the context node. The axis and node test are written as "axisname::nodename".

For example, "parent::Product" represents the Product element node located one hierarchical level up, and "descendant::ProductName" represents the ProductName element nodes found among the descendants of the context node.

Predicate

Conditions described in a predicate filter the node set. A predicate is written as an XPath expression within brackets ("[]"). Predicate expressions can be categorized as follows:

  • (1) Conditions represented by Boolean value (true/false)
  • (2) Conditions that are numerical values
  • (3) Conditions that are node sets
  • (4) Conditions that are text strings

(1) Conditions represented by Boolean value (true/false)

When the condition is a Boolean value, only those nodes for which it evaluates to "true" are selected. XPath provides relational operators and logical operators for describing conditions, and these can be used to obtain Boolean results.

Table : XPath Relational Operators and Logical Operators

●Relational Operators

Symbol  Explanation
=       Equals
!=      Does not equal
<       Less than (noted as &lt;)
>       Greater than (may also be noted as &gt;)
<=      Less than or equal to (noted as &lt;=)
>=      Greater than or equal to (may also be noted as &gt;=)

●Logical Operators

Symbol  Explanation
or      OR operation (when left side is true, right side is not evaluated)
and     AND operation (when left side is false, right side is not evaluated)

Here, the "<" character cannot be written directly within an attribute value or element content (it would be interpreted as the start of a tag), so we use the predefined entity reference "&lt;" (LIST4(1)).

LIST4 : XPath Predicate Notation Example

●XML Document
<SalesChart>
  <Product Code="XML001">
    <ProductName>Mousepad</ProductName>
    <UnitPrice>500</UnitPrice>
    <Volume>2</Volume>
  </Product>
  <Product Code="XML002">
    <ProductName>XML Pen</ProductName>
    <UnitPrice>150</UnitPrice>
    <Volume>20</Volume>
  </Product>
</SalesChart>

●Notation Example
<xsl:apply-templates select="SalesChart/Product[Volume&lt;5]" /> …… (1)
<xsl:apply-templates select="SalesChart/Product[@Code='XML001']" /> …… (2)
<xsl:apply-templates select="SalesChart/Product[UnitPrice*Volume&lt;2000]" /> …… (3)
<xsl:apply-templates select="SalesChart/Product[1]" /> …… (4)
<xsl:apply-templates select="SalesChart/Product[Auxiliary]" /> …… (5)

When comparing text strings, the text strings are enclosed in quotes (LIST4(2)). When the node is a numerical value, it can be used as a basis for calculations via arithmetic operators (see the table below):

Table : Arithmetic Operators in XPath

Symbol  Explanation
+       Add
-       Subtract
*       Multiply
div     Divide
mod     Modulus (remainder)

Because the "/" character often associated with the division operation is the same as the separation character used in XPath, the characters "div" are used for arithmetical operations. The computational results can be described as conditions using a relational operator (LIST4(3)).

(2) Conditions that are Numerical Values

Indicates the position with respect to the context node. Positions are counted beginning from "1" (LIST4 (4)).

(3) Conditions that are Node Sets

The value is "true" when the node set is not empty, and "false" when the node set is empty. In other words, the condition tests whether that node exists (LIST4 (5)).

(4) Conditions that are Text Strings

The value is "true" for text strings of one character or longer, or "false" when the text string contains no characters.

A predicate can be written in each location step. For example, the notation "ProductList[@SaleStart='October']/Product[@Code='XML001']" designates the Product element with a Code attribute value of "XML001" that is a child element of the ProductList element having a SaleStart attribute value of "October".

Multiple predicates can be attached to a single location step. For example, the notation "Product[Volume>10][UnitPrice>1500]" designates the Product elements whose Volume child element has a value greater than 10 and, among those, whose UnitPrice child element has a value greater than 1500.
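
The predicate forms can be sketched together in one stylesheet. It assumes a SalesChart document like the one in LIST4, in which each Product element carries a Code attribute; the output element names are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <report>
      <!-- Boolean condition; "<" must be written as &lt; -->
      <low-volume>
        <xsl:apply-templates select="SalesChart/Product[Volume &lt; 5]"/>
      </low-volume>
      <!-- Numerical condition: the first Product (positions start at 1) -->
      <first>
        <xsl:apply-templates select="SalesChart/Product[1]"/>
      </first>
      <!-- Two predicates on one location step, applied one after the other -->
      <chained>
        <xsl:apply-templates select="SalesChart/Product[Volume > 10][UnitPrice &lt; 200]"/>
      </chained>
      <!-- One predicate that combines two conditions with "and" -->
      <combined>
        <xsl:apply-templates select="SalesChart/Product[Volume > 10 and UnitPrice &lt; 200]"/>
      </combined>
    </report>
  </xsl:template>
  <xsl:template match="Product">
    <name><xsl:value-of select="ProductName"/></name>
  </xsl:template>
</xsl:stylesheet>
```

For conditions like these, which do not depend on position, the chained form and the combined form select the same nodes.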

XPath Functions

XPath provides functions that can be used to perform various processes. The following table shows the more common XPath functions:

Table : Common XPath Functions

Function Explanation
count(node-set) Returns number of nodes included in node set
position() Returns position for context. Starts with "1"
last() Returns the last position for context
string(object) Returns the text string of the object designated in arguments
starts-with(string,string) "True" if the text string of the first argument starts with the text string of the second argument; otherwise "False"
substring-before(string,string) Returns the substring of the first argument string that precedes the position of matching text string in the second argument
substring-after(string,string) Returns the substring of the first argument string that comes after the position of matching text string in the second argument
substring(string,num,num) Returns a substring of the first argument string, beginning at the position designated in the second argument and having the length designated in the third argument. Returns all text to the end of the string if the third argument is omitted.
normalize-space(string) Removes whitespaces before and after a designated string, and returns a string for which a sequence of whitespaces is converted into a single whitespace
not(boolean) Returns "False" when the argument value is true, and vice-versa
sum(node-set) Returns a sum of node set string values transformed into a numerical value

For functions in which the node-set is defined within the function's arguments, the number of node-set elements can be counted (or totaled) by having the node set selected through the location path.

For functions using text strings as arguments, the string value can be defined directly, or the text node value of an element can be defined through the location path. For functions where "num" is the prescribed argument, a numerical value can be defined, or the value of the text node of the element can be defined as a numerical value through the location path.

For example, sum(shopping/sub-total) calculates the total of all sub-total elements that are child elements of the shopping element. When data is output as XML from a database, extraneous whitespace is sometimes added to values taken from fixed-length fields. In this situation, use normalize-space(Product/ProductName) to remove the extraneous whitespace before and after the ProductName value, leaving only the desired value.
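
A few of the functions from the table can be sketched as follows, again assuming a SalesChart document like the one in LIST4 with a Code attribute on each Product element. The output element names are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <summary>
      <!-- Number of Product elements in the node set -->
      <products><xsl:value-of select="count(SalesChart/Product)"/></products>
      <!-- Total of all Volume values -->
      <total-volume><xsl:value-of select="sum(SalesChart/Product/Volume)"/></total-volume>
      <!-- ProductName with leading and trailing whitespace removed -->
      <first-name><xsl:value-of select="normalize-space(SalesChart/Product[1]/ProductName)"/></first-name>
      <!-- Part of the Code attribute that follows the text "XML" -->
      <code-number><xsl:value-of select="substring-after(SalesChart/Product[1]/@Code, 'XML')"/></code-number>
      <!-- position() and last() used in a predicate: the last Product -->
      <last-name><xsl:value-of select="SalesChart/Product[position() = last()]/ProductName"/></last-name>
    </summary>
  </xsl:template>
</xsl:stylesheet>
```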

Data Output

With an XSLT stylesheet, the xsl:value-of command or an attribute value template can be used to extract data from an XML document and output the results in a separate XML document.

Data Output using xsl:value-of

Use the xsl:value-of command to output the value defined in the select attribute. If a node is designated, the text string value of that node is output.

Table : Node Text String Value

Node Type                    Text String Value
Root Node                    Value of all descendant text nodes arranged in order
Element Node                 Value of all descendant text nodes arranged in order
Attribute Node               Normalized attribute value
Namespace Node               Namespace URI
Processing Instruction Node  Text string from the character immediately following the target name and whitespace to the character before the ?> character combination
Comment Node                 Comment string data
Text Node                    Text string data

The output value is the same as the return value of the string() function.

If a node set containing multiple nodes is selected, only the first node is output. The XPath arithmetic operators (discussed above) and XPath functions can also be used in the select attribute to output a calculation result or a function return value.
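
Both points can be sketched with a SalesChart document like the one in LIST4. The output element names first-only and amount are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <report>
      <!-- The path selects one ProductName per Product, but xsl:value-of
           outputs only the first node of the node set -->
      <first-only><xsl:value-of select="SalesChart/Product/ProductName"/></first-only>
      <!-- To output a value for every node, apply a template to each one -->
      <xsl:apply-templates select="SalesChart/Product"/>
    </report>
  </xsl:template>
  <xsl:template match="Product">
    <!-- A calculation result can be output directly -->
    <amount><xsl:value-of select="UnitPrice * Volume"/></amount>
  </xsl:template>
</xsl:stylesheet>
```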

Data Output via Attribute Value Template

The xsl:value-of command can be used to output data as the content of an element; however, the attribute value template is used when outputting data as an attribute value. By defining the attribute value as "{XPath}", the value of the XPath expression inside the braces will be output.

Data Output via Attribute Value Template
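
A minimal sketch of an attribute value template, written against the LIST1 document, might look like this. The span element and its title attribute are illustrative choices, not part of the original listings.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="SalesReport/Body/Results"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Results">
    <!-- The expression in braces is evaluated and its value becomes part
         of the title attribute; the text outside the braces is kept as-is -->
    <span title="Units: {UnitPrice/@Units}">
      <xsl:value-of select="UnitPrice"/>
    </span>
    <br/>
  </xsl:template>
</xsl:stylesheet>
```

Here the element content is still produced with xsl:value-of, while the attribute value is produced with the attribute value template.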

Review Questions

Question 1

Select which of the following is the appropriate transformation result when the following XML Document is transformed through the XSLT stylesheet.

[XML Document]
<?xml version="1.0" ?>
<ProductList>
  <Title>XML Series Commemorative Goods</Title>
  <Product>
    <ProductName>XML Pen</ProductName>
    <UnitPrice>200</UnitPrice>
  </Product>
</ProductList>

[XSLT Stylesheet]
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="ProductList/Product"/>
        <xsl:apply-templates select="ProductList/Auxiliary"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="/ProductList/Product">
    - ProductName:<xsl:value-of select="ProductName" /><br/>
    - UnitPrice:<xsl:value-of select="UnitPrice" />$<br/>
  </xsl:template>
  <xsl:template match="ProductList/Product">
    § ProductName:<xsl:value-of select="ProductName" /><br/>
    § UnitPrice:<xsl:value-of select="UnitPrice" />USD<br/>
  </xsl:template>
  <xsl:template match="Auxiliary">
    Auxiliary:<br/>
    <xsl:value-of select="." /><br/>
  </xsl:template>
</xsl:stylesheet>

  1. <html>
      <body>
        - ProductName:XML Pen<br>
        - UnitPrice:200USD<br>
      </body>
    </html>
  2. <html>
      <body>
        § ProductName:XML Pen<br>
        § UnitPrice:200USD<br>
      </body>
    </html>
  3. <html>
      <body>
        - ProductName:XML Pen<br>
        - UnitPrice:200USD<br>
        Auxiliary:<br>
      </body>
    </html>
  4. <html>
      <body>
        § ProductName:XML Pen<br>
        § UnitPrice:200USD<br>
        Auxiliary:<br>
      </body>
    </html>

Comments

Since "/ProductList/Product" and "ProductList/Product" have the same priority, the last template rule is the one that is applied. In addition, no Auxiliary element node exists in the XML document, so the template rule for the Auxiliary element node is not applied. Accordingly, the correct answer is 2.

Question 2

Select which of the following is the correct XPath notation that applies to (1) for outputting "XML Pen" when the following XML Document is transformed by the XSLT Stylesheet.

[XML Document]
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="inventory.xsl"?>
<SalesChart>
  <Product Code="XML001">
    <ProductName>XML Pen</ProductName>
    <UnitPrice>150</UnitPrice>
    <Volume>20</Volume>
  </Product>
  <Product Code="XML002">
    <ProductName>Mousepad</ProductName>
    <UnitPrice>500</UnitPrice>
    <Volume>2</Volume>
  </Product>
</SalesChart>

[XSLT Stylesheet]
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="

[  (1)  ]

        " />
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Product">
    <xsl:value-of select="ProductName" /><br/>
  </xsl:template>
</xsl:stylesheet>

  1. SalesChart/Product[Volume<5]
  2. SalesChart/Product[@Code=XML001]
  3. SalesChart/Product[UnitPrice* Volume>1500]
  4. SalesChart/Product[0]

Comments

Since the "<" character cannot be directly input in an attribute value, the "&lt;" entity reference must be used. In addition, text strings to be compared are enclosed in quotes. When quotes are not used, the notation is treated as an XPath expression representing an element node.

In an XPath expression, calculations can be made using "*" and other operators, and the ">" character can be input directly. The position related to a context node begins with "1." Accordingly, the correct answer is 3.

Question 3

Which notation can correctly be placed in (1) below to produce the desired Transformation Result when the XSLT Stylesheet is used to transform the XML Document below?

[XML Document]
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="test03.xsl"?>
<ProductList>
  <Title>XML Series Commemorative Goods</Title>
  <Product>
    <ProductName>XML Pen</ProductName>
    <UnitPrice>200</UnitPrice>
    <Image>XMLMasterPen.jpg</Image>
  </Product>
</ProductList>

[XSLT Stylesheet]
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h1><xsl:value-of select="ProductList/Title" /></h1>
        Featured Product<br/>
        <table border="1" width="400">
          <tr><th>Product Image</th>
          <th>ProductName</th><th>Price</th></tr>
          <xsl:apply-templates select="ProductList/Product"/>
        </table>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Product">
    <tr>
      <td>

[    (1)    ]

        </td>
        <td><xsl:value-of select="ProductName" /></td>
        <td><xsl:value-of select="UnitPrice" /></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

[Transformation Result]
<img src="XMLMasterPen.jpg">

  1. <img src="<xsl:value-of select='Image'/>" />
  2. <img src="string(Image)" />
  3. <img src="XPath(Image)" />
  4. <img src="{Image}" />

Comments

An element cannot be written within an attribute value. In addition, an XPath expression written directly in an attribute of an element that is not an XSLT command is not evaluated; it is output as a literal text string.

The attribute value template is used to output XML document data to an attribute of an element that is not an XSLT command. Accordingly, the correct answer is 4.


Tomoya Suzuki

Toshiba OA Consultant, Ltd. Training Solutions Engineering Department. Mr. Suzuki works mainly as an instructor in XML and programming language seminars. He laments that, for some reason, all of the rainy days during rainy season were on the weekends this year, preventing him from playing tennis, and causing him to suffer from a lack of exercise. He does predict, however, that he will have a deep, dark tan by the time this article is published. Don't forget the sun block, Tomoya.


The content presented here is an HTML version of an article that originally appeared in the October 2006 issue of DB Magazine published by Shoeisya.
