A collection of code for/with/in (E)XSLT. This software is free software; nevertheless, it is copyrighted material. Every piece of code has its own permissions for use and copying. Please read the particular license information and respect the rights of the copyright holders.
A program that enables you to do literate programming in XSLT. The manual page is currently available in German only. Translation in progress...
...available on the xsltdoc page
java org.apache.xalan.xslt.Process -param xpathExpression "your_xpath_expression" -in your_xml_instance -xsl xpath-query.xslt
Query the XML specification for its author names:
$ java org.apache.xalan.xslt.Process \
    -param xpathExpression '//author/name' \
    -in http://www.w3.org/TR/REC-xml/REC-xml-20040204.xml \
    -xsl http://www.linkwerk.com/pub/xslt/lib/xpath-query.xslt
<?xml version="1.0" encoding="utf-8"?>
<lw:query-result xmlns:lw="http://www.linkwerk.com/namespaces/xpath-query">
  <name>Tim Bray</name>
  <name>Jean Paoli</name>
  <name>C. M. Sperberg-McQueen</name>
  <name>Eve Maler</name>
  <name>François Yergeau</name>
</lw:query-result>
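For readers who want to see the idea behind such a query outside of XSLT, here is a minimal Java sketch that evaluates an XPath expression against an XML document using the JDK's javax.xml.xpath API. It is not how xpath-query.xslt is implemented; the class name XPathQuerySketch, the method query and the small inline sample document are assumptions made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of xpath-query.xslt or Xalan.
class XPathQuerySketch {

    // Parses the XML string, evaluates the XPath expression against it and
    // returns the text content of every selected node.
    static List<String> query(String xml, String xpathExpression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpathExpression, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Invented sample input, loosely shaped like the author list queried above.
        String xml = "<spec><author><name>Tim Bray</name></author>"
                   + "<author><name>Jean Paoli</name></author></spec>";
        System.out.println(query(xml, "//author/name"));
    }
}
```

The stylesheet has the advantage that it runs inside any XSLT processor and returns the selected nodes themselves, wrapped in a result element, whereas this sketch only returns their text.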
xsltproc, version information:
> xsltproc -V
Using libxml 20602, libxslt 10100 and libexslt 800
xsltproc was compiled against libxml 20602, libxslt 10100 and libexslt 800
libxslt 10100 was compiled against libxml 20602
libexslt 800 was compiled against libxml 20602
BSD-style license; check source code for detailed information.
Please read the exslt.org regular expression description for details first; then return here.
To use the com.linkwerk.util.Regexp extension in your stylesheets, follow these three steps:
Make your XSLT program look like this:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:regexp="com.linkwerk.util.Regexp"
    extension-element-prefixes="regexp">
  ...
</xsl:stylesheet>
Add lw-regexp-util-1.0.0.jar to the classpath, e.g. java -cp /path/to/lw-regexp-util-1.0.0.jar org.apache.xalan.xslt.Process ...
Have fun :-)
You'll find usage examples on the exslt.org site as well, but the examples there are pretty simple and don't reveal the power of this extension. They focus on the regexp extension alone, which is fine for explaining what it does, so you should start by reading exslt.org. The real power, however, comes from combining XPath with regular expressions:
The main strength of the regular expression extension is that it extends the ability of
XPath to deal with structured information by adding a means to deal with unstructured
information (a.k.a. text ;-). You will need this power if you have to transform XML documents
that have only a flat structure. It is well suited for up-translating flat XHTML documents or
word processor documents that have been converted to a flat XML representation.
If all the documents contain repeating phrases (which are implicit structure),
you can now transform such documents easily. BTW: legal or official documents often
use repeating phrases.
I'm sorry that the following example is in German. Is anyone using the regexp extension for English documents? Please send me your examples.
<xsl:template match="/">
  <ausschreibung>
    <gegenstand>
      <xsl:apply-templates select="/html/body/table[(preceding-sibling::table[.//td/h2[regexp:test(text(), 'Ausschreibung de[sr]', 'i')]]
                                     or self::table[.//td/h2[regexp:test(text(), 'Ausschreibung de[sr]', 'i')]])
                                    and (following-sibling::table[.//tr/td[regexp:test(string(.), 'den Bewerber:\s*$')]]
                                     or self::table[.//tr/td[regexp:test(string(.), 'den Bewerber:\s*$')]])]"/>
    </gegenstand>
    <xsl:apply-templates select="..."/>
    ...
  </ausschreibung>
</xsl:template>
This (real-world) example originates from an XSLT program that up-translates a given
flat XHTML file into a well-structured XML document. The template shown
generates a gegenstand element from certain /html/body/table elements. For each table element,
both of the following conditions must hold. First, the table element (self::table) or one of
its preceding siblings must contain a td/h2 element whose text contains the words
Ausschreibung des or
Ausschreibung der (regular expression; case-insensitive).
Second, the table element (self::table) or one of its following siblings must contain a td element whose text ends
(case-sensitive) with the words
den Bewerber:, optionally followed by white space (\s).
As you can guess by looking at this example, the source document uses many tables for
layout purposes, but not for structuring the content. Thus the template collects all the
text of the source document within
Ausschreibung de[rs] ... den Bewerber: and places
it within a gegenstand element.
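The two regular expressions used as start and end markers can be tried out in isolation. The following is a minimal Java sketch, assuming that regexp:test reports a match anywhere in the string and that the 'i' flag means case-insensitive, as described on exslt.org. It uses java.util.regex and is not the source code of com.linkwerk.util.Regexp; the class RegexpTestSketch, its test method and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; this is not the code of com.linkwerk.util.Regexp.
class RegexpTestSketch {

    // Assumed behaviour of regexp:test: true if the pattern occurs anywhere
    // in the input. Only the 'i' (case-insensitive) flag is handled here.
    static boolean test(String input, String regex, String flags) {
        int options = flags.contains("i") ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(regex, options).matcher(input).find();
    }

    public static void main(String[] args) {
        // Start marker: heading text, matched case-insensitively.
        System.out.println(test("AUSSCHREIBUNG DER Stadt", "Ausschreibung de[sr]", "i"));
        // End marker: cell text ending in the phrase plus optional white space.
        System.out.println(test("Fragen an den Bewerber: \n", "den Bewerber:\\s*$", ""));
        // The end marker is case-sensitive, so a capitalised "Den" does not match.
        System.out.println(test("Den Bewerber:", "den Bewerber:\\s*$", ""));
    }
}
```

Checking the patterns this way before embedding them in an XPath predicate can save some debugging, since a predicate that selects nothing gives little indication of which of the two markers failed to match.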
The zip-file contains a license text; read it carefully. It's BSD-style.