Writing Your Own Functions in XSLT 2.0
September 03, 2003
Most XSLT 1.0 processors, particularly the ones written in Java, let you write extension functions in the processor’s host language, link them in, and then call those functions from stylesheets. The XSLT 1.0 spec spells out specific ways to check whether a particular extension function is available and how to recover gracefully if it is not. In the September 2001 “Transforming XML” column, I presented examples of extension elements and functions.
If you wanted to write your own functions within a stylesheet, there were ways to fake it with named templates, but faking it won’t be necessary with XSLT 2.0, which lets you write your own functions using XSLT syntax. These functions return values that can be used all over your stylesheet, even in XPath expressions.
Let’s look at a simple example. The following stylesheet creates a result tree upon seeing the root of any document, so you can run it with itself as input. It declares a function called foo:compareCI, which does a case-insensitive comparison of two strings and returns the same values as the XSLT 2.0 compare() function described in last month’s column.
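A minimal sketch of such a stylesheet follows. The namespace URI http://whatever, the parameter names string1 and string2, and the four sample calls are illustrative assumptions, not the original listing.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- Case-insensitive comparison: upper-case both strings, then
       hand them to compare(). Parameter names are assumptions. -->
  <xsl:function name="foo:compareCI">
    <xsl:param name="string1"/>
    <xsl:param name="string2"/>
    <xsl:value-of select="compare(upper-case($string1), upper-case($string2))"/>
  </xsl:function>

  <!-- Fires on the root of any input document, including this stylesheet. -->
  <xsl:template match="/">
    <xsl:text>compareCI red,blue: </xsl:text>
    <xsl:value-of select="foo:compareCI('red','blue')"/>
    <xsl:text>&#10;compareCI red,red: </xsl:text>
    <xsl:value-of select="foo:compareCI('red','red')"/>
    <xsl:text>&#10;compareCI red,Red: </xsl:text>
    <xsl:value-of select="foo:compareCI('red','Red')"/>
    <xsl:text>&#10;compareCI red,yellow: </xsl:text>
    <xsl:value-of select="foo:compareCI('red','yellow')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Because both arguments are upper-cased before the comparison, the third call compares "RED" with "RED", so it should report the two strings as equal.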
The first thing to notice is that the declared function must come from a namespace outside of the XSLT namespace. In the example I assigned a namespace prefix of foo to the whatever URL to make it clear that you can use any namespace, as long as it’s not the XSLT namespace. The URL I specified wasn’t serious, but works anyway. You’ll probably want to pick a URL associated with your company or project.
The actual function declaration in the sample stylesheet is in an xsl:function element. Its structure is pretty straightforward: a name attribute stores the function’s name, and optional xsl:param child elements name parameters that can be passed to the function, just like xsl:param elements do in XSLT 1.0’s xsl:template elements.
In the example above, the two parameters passed are the two strings to be compared.
The function’s only remaining line is an xsl:value-of instruction, which uses XPath 2.0’s compare() and upper-case() functions to perform its comparison and output the result. The return value of the function is the sequence of nodes that it outputs. If you want, you can add an as attribute to the xsl:function element to indicate a specific data type that the function returns. Because my foo:compareCI() function returns the integer returned by its call to the compare() function, I could have added an as="xs:integer" attribute value to the xsl:function element (which would have required declaration of the http://www.w3.org/2001/XMLSchema namespace to go with that “xs” prefix), but I wanted to keep my first example function as simple as possible.
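To make the optional return type concrete, here is a sketch of the same kind of function with an as attribute and the xs namespace declared. The namespace URI http://whatever, the parameter names, and the single test call are assumptions for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- The as attribute declares that the function returns an integer;
       the xs prefix must be bound to the XML Schema namespace above. -->
  <xsl:function name="foo:compareCI" as="xs:integer">
    <xsl:param name="string1"/>
    <xsl:param name="string2"/>
    <xsl:value-of select="compare(upper-case($string1), upper-case($string2))"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select="foo:compareCI('red','Red')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the type declared, the processor converts the function's result to xs:integer, so a caller can use it in arithmetic or numeric comparisons without an explicit conversion.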
When run with Saxon 7’s experimental XSLT 2.0 support, this stylesheet creates the following output:
The third line is the most important here because it shows that the function considers “red” and “Red” to be equal. (See last month’s column for the meaning of the various return values.)
XSLT 2.0 functions can be recursive. The following stylesheet includes a substring function that expects you to pass it a string (inString) and the length of a substring to pull from that string (length), starting at its first character.
Instead of always breaking after length characters, though, this function only breaks there if it finds a word boundary character. Otherwise, it breaks at the last word boundary before that. It does this by calling itself with the same inString value and a length value of length - 1. Before making each recursive call, the function’s xsl:choose element’s first xsl:when element checks whether $length is less than or equal to 0 and returns the entire string if so, because if $length was decremented that far, there’s no point in continuing. The second xsl:when element checks whether the passed string is already shorter than the requested length, in which case it just returns the whole string. The third and last xsl:when element checks whether character number $length in $inString is a member of the list of delimiter characters defined near the beginning of the stylesheet, and if so, returns the string up to that point, because its job is done. If none of these conditions are true, the xsl:otherwise element makes the recursive call.
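The logic described above can be sketched roughly as follows. The function name foo:substringWB, the delimiter list, the namespace URI, and the four test strings are hypothetical choices made for this sketch, not the original listing.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- Characters treated as word boundaries (an assumed list). -->
  <xsl:variable name="delimiters" select="' ,.;:!?'"/>

  <xsl:function name="foo:substringWB">
    <xsl:param name="inString"/>
    <xsl:param name="length"/>
    <xsl:choose>
      <!-- Decremented all the way down: give up and return everything. -->
      <xsl:when test="$length &lt;= 0">
        <xsl:value-of select="$inString"/>
      </xsl:when>
      <!-- String is already shorter than the requested length. -->
      <xsl:when test="string-length($inString) &lt; $length">
        <xsl:value-of select="$inString"/>
      </xsl:when>
      <!-- Character number $length is a delimiter: break here. -->
      <xsl:when test="contains($delimiters, substring($inString, $length, 1))">
        <xsl:value-of select="substring($inString, 1, $length)"/>
      </xsl:when>
      <!-- Otherwise try again one character earlier. -->
      <xsl:otherwise>
        <xsl:value-of select="foo:substringWB($inString, $length - 1)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select="foo:substringWB('this is a test', 6)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="foo:substringWB('this is a test', 8)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="foo:substringWB('short', 20)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="foo:substringWB('nodelimiters', 5)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

In this sketch the first call finds "i" at position 6, steps back to the space at position 5, and returns "this " with the trailing delimiter included. The last call finds no delimiter at all, counts down to 0, and returns the whole string.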
The four strings passed to the function test several possible outcomes. With any source document, the stylesheet creates this result:
What happens if we pass a bad parameter to the function? For example, what if we added this new line after the “no boundaries” line, passing the string “five” instead of a numeric digit as the second parameter?
Without executing the function on any of the legitimate input, Saxon 7 immediately tells us about the following problem:
The stronger typing offered by XSLT 2.0 lets us plan for this a little better. By adding an as attribute to the function’s declaration for the length parameter, like this,
we tell the XSLT processor to check the types of the parameters when they’re passed, instead of waiting for the bad data to blow up in some line of the stylesheet that doesn’t know what to do with it. (Don’t forget to add xmlns:xs="http://www.w3.org/2001/XMLSchema" to the other namespace declarations in the stylesheet’s start-tag.) With length declared using this typing, Saxon 7 catches the error sooner and delivers a more informative error message:
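For reference, a typed parameter declaration of the kind discussed here might look like the following sketch. The function foo:firstChars is a hypothetical stand-in, kept deliberately small so the as attribute and the xs namespace declaration are easy to spot.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- Hypothetical function: returns the first $length characters. -->
  <xsl:function name="foo:firstChars">
    <xsl:param name="inString"/>
    <!-- The as attribute asks the processor to check this argument's type. -->
    <xsl:param name="length" as="xs:integer"/>
    <xsl:value-of select="substring($inString, 1, $length)"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- A valid call: the second argument is an integer. -->
    <xsl:value-of select="foo:firstChars('no boundaries', 5)"/>
    <!-- Replacing 5 with the string 'five' should be rejected as a type
         error at the call, before the function body is evaluated. -->
  </xsl:template>

</xsl:stylesheet>
```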
Nearly all serious programming languages offer the ability to declare and use your own functions; most programmers have become accustomed to the modularity and scalability advantages that this gives them. Now XSLT 2.0 developers will have these advantages as well.
After the following stylesheet declares these two functions, it outputs the sample input list delimited by pipe characters. It then tests the functions individually and combines them into a more complex expression to extract the third member of the list sequence:
The output shows that it works. It may not look particularly useful, but it should provoke a smirk from some of the grayer-haired developers out there:
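The two functions are not named in the text above. As a loosely hedged sketch, assuming they are first-item and rest-of-list functions in the Lisp tradition (called foo:car and foo:cdr here purely for illustration, with an invented sample list and namespace URI), such a stylesheet could look like this:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- Assumed name: returns the first item of a sequence. -->
  <xsl:function name="foo:car">
    <xsl:param name="list"/>
    <xsl:sequence select="$list[1]"/>
  </xsl:function>

  <!-- Assumed name: returns everything after the first item. -->
  <xsl:function name="foo:cdr">
    <xsl:param name="list"/>
    <xsl:sequence select="subsequence($list, 2)"/>
  </xsl:function>

  <!-- Invented sample list. -->
  <xsl:variable name="list" select="('a', 'b', 'c', 'd')"/>

  <xsl:template match="/">
    <xsl:text>list: </xsl:text>
    <xsl:value-of select="string-join($list, '|')"/>
    <xsl:text>&#10;car: </xsl:text>
    <xsl:value-of select="foo:car($list)"/>
    <xsl:text>&#10;cdr: </xsl:text>
    <xsl:value-of select="string-join(foo:cdr($list), '|')"/>
    <xsl:text>&#10;third item: </xsl:text>
    <xsl:value-of select="foo:car(foo:cdr(foo:cdr($list)))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Nesting two rest-of-list calls inside a first-item call drops "a" and "b" and then picks the head of what remains, which is the third member of the original sequence.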