XML-XPathScript

Version 1.45. Authors: Yanick Champoux, Dominique Quatravaux, Matt Sergeant.

XML::XPathScript - a Perl framework for XML stylesheets

XPathScript is a stylesheet language similar in many ways to XSLT (in concept, not in appearance), for transforming XML from one format to another (possibly HTML, but XPathScript also shines for non-XML-like output).

Like XSLT, XPathScript offers a dialect to mix verbatim portions of documents and code. Also like XSLT, it leverages the powerful "templates/apply-templates" and "cascading stylesheets" design patterns, which greatly simplify the design of stylesheets for programmers. The availability of the XPath query language inside stylesheets promotes the use of a purely document-dependent, side-effect-free coding style. But unlike XSLT, which uses its own dedicated control language with an XML-compliant syntax, XPathScript uses Perl, which is terse and highly extensible.

The result of the merge is an extremely powerful tool for rendering complex XML documents into other formats. Stylesheets written in XPathScript are very easy to create, extend and reuse, even if they manage hundreds of different XML tags.

SYNOPSIS

    use XML::XPathScript;

    my $xps = XML::XPathScript->new;
    my $transformed = $xps->transform( $xml, $stylesheet );

    # having the output piped to STDOUT directly
    my $xps = XML::XPathScript->new( xml => $xml, stylesheet => $stylesheet );
    $xps->process;

    # caching the compiled stylesheet for reuse and
    # outputting to multiple files
    my $xps = XML::XPathScript->new( stylesheetfile => $filename );
    foreach my $xml (@xmlfiles) {
        my $transformed = $xps->transform( $xml );

        # do stuff with $transformed ...
    };

    # Making extra variables available to the stylesheet dialect:
    my $xps = XML::XPathScript->new;
    $xps->compile( qw/ $foo $bar / );

    # in stylesheet, $foo will be set to 'a'
    # and $bar to 'b'
    $xps->transform( $xml, $stylesheet, [ 'a', 'b' ] );

STYLESHEET WRITER DOCUMENTATION

If you are interested in writing stylesheets, refer to the XML::XPathScript::Stylesheet manpage. You might also want to take a peek at the manpage of xpathscript, a program bundled with this module to perform XPathScript transformations via the command line.

STYLESHEET UTILITY METHODS

These methods are meant to be used from within a stylesheet.

current

    XML::XPathScript->current

This class method returns the stylesheet object currently being applied. It can be called from anywhere within the stylesheet, except a BEGIN or END block or similar. Beware though that using the return value for altering (as opposed to reading) stuff from anywhere except the stylesheet's top level is unwise.

interpolation

    $interpolate = $XML::XPathScript::current->interpolation
    $interpolate = $XML::XPathScript::current->interpolation( $boolean )

Gets (first call form) or sets (second form) the XPath interpolation boolean flag. If true, values set in pre and post may contain expressions within curly braces, which will be interpreted as XPath expressions and substituted in place.

For example, when interpolation is on, the following code

    $template->set( link => { pre => '<a href="{@url}">', post => '</a>' } );

is enough for rendering a <link> element as an HTML hyperlink. The interpolation-less version is slightly more complex as it requires a testcode:

    sub link_testcode {
        my ( $node, $t ) = @_;
        my $url = $node->findvalue('@url');
        $t->set({ pre => "<a href='$url'>", post => "</a>" });
        return DO_SELF_AND_KIDS();
    };

Interpolation is on by default.

interpolation_regex

    $regex = $XML::XPathScript::current->interpolation_regex
    $XML::XPathScript::current->interpolation_regex( $regex )

Gets or sets the regex to use for interpolation. The value to be interpolated must be captured by $1. By default, the interpolation regex is qr/{(.*?)}/.

Example:

    $XML::XPathScript::current->interpolation_regex( qr#\|(.*?)\|# );

    $template->set( bird => { pre => '|@name| |@gender| |@type|' } );
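To make the role of the $1 capture concrete, here is a minimal sketch in core Perl. It is not the module's implementation: the evaluate() helper and the %node hash are hypothetical stand-ins for real XPath evaluation against the current node, and the braces of the default regex are escaped here only to stay quiet on recent Perls.

```perl
use strict;
use warnings;

# Hypothetical stand-in for evaluating an XPath expression on the current node.
my %node = ( '@name' => 'Tweety', '@gender' => 'male', '@type' => 'canary' );

sub evaluate {
    my ($expr) = @_;
    return exists $node{$expr} ? $node{$expr} : '';
}

# Replace every match of $regex by the value of the expression captured in $1.
sub interpolate {
    my ( $string, $regex ) = @_;
    $string =~ s/$regex/evaluate($1)/ge;
    return $string;
}

my $default = interpolate( 'name: {@name}', qr/\{(.*?)\}/ );
my $custom  = interpolate( '|@name| |@gender| |@type|', qr#\|(.*?)\|# );

print "$default\n";    # name: Tweety
print "$custom\n";     # Tweety male canary
```

Whatever delimiters are chosen, the only contract is that the expression to evaluate ends up in the first capture group.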

binmode

Declares that the stylesheet output is not in UTF-8, but instead in an (unspecified) character encoding embedded in the stylesheet source that neither Perl nor XPathScript should have any business dealing with. Calling XML::XPathScript->current()->binmode() is an irreversible operation with the consequences outlined in THE UNICODE MESS.

TECHNICAL DOCUMENTATION

The rest of this POD documentation is not useful to programmers who just want to write stylesheets; it is of use only to people wanting to call existing stylesheets or, more generally, embed the XPathScript engine into some wider framework.

XML::XPathScript is an object-oriented class with the following features:

  * an embedded Perl dialect that allows the merging of the stylesheet code with snippets of the output document. Don't be afraid, this is exactly the same kind of stuff as in Text::Template, HTML::Mason or other similar packages: instead of having text inside Perl (that one print()s), we have Perl inside text, with a special escaping form that a preprocessor interprets and extracts. For XPathScript, this preprocessor is embodied by the xpathscript shell tool (see xpathscript Invocation) and also available through this package's API;

  * a templating engine, that does the apply-templates loop, starting from the top XML node and applying templates to it and its subnodes as directed by the stylesheet.

When run, the stylesheet is expected to fill in the template object $template, which is a lexically-scoped variable made available to it at preprocess time.

METHODS

new

    $xps = XML::XPathScript->new( %arguments )

Creates a new XPathScript translator. The recognized named arguments are:

xml => $xml
    $xml is a scalar containing XML text, or a reference to a filehandle from which XML input is available, or an XML::XPath or XML::LibXML object. An XML::XPathScript object without an xml argument to the constructor is only able to compile stylesheets (see SYNOPSIS).

stylesheet => $stylesheet
    $stylesheet is a scalar containing the stylesheet text, or a reference to a filehandle from which the stylesheet text is available. The stylesheet text may contain unresolved <!--#include --> constructs, which will be resolved relative to ".".

stylesheetfile => $filename
    Same as stylesheet, but lets XML::XPathScript do the loading itself. Using this form, relative <!--#include -->s in the stylesheet file will be honored with respect to the dirname of $filename instead of "."; this provides SGML-style behaviour for inclusion (it does not depend on the current directory), which is usually what you want.

compiledstylesheet => $function
    Re-uses a previous return value of compile() (see SYNOPSIS and compile), typically to apply the same stylesheet to several XML documents in a row.

interpolation_regex => $regex
    Sets the interpolation regex. Whatever is captured in $1 will be used as the XPath expression. Defaults to qr/{(.*?)}/.

transform

    $xps->transform( $xml, $stylesheet, \@args )

Transforms the document $xml with the $stylesheet (optionally passing to the stylesheet the argument array @args) and returns the result. If the passed $xml or $stylesheet is undefined, the previously loaded XML document or stylesheet is used.

E.g.,

    my $xml = '<doc>...</doc>';
    my $stylesheet = '<% ... %>';
    my $transformed = $xps->transform( $xml, $stylesheet );

    # transform many documents
    $xps->set_stylesheet( $stylesheet );
    for my $xml ( @xml_documents ) {
        my $transformed = $xps->transform( $xml );
        # do stuff with $transformed ...
    }

    # do many transformations of a document
    $xps->set_xml( $xml );
    for my $stylesheet ( @stylesheets ) {
        my $transformed = $xps->transform( undef, $stylesheet );
        # do stuff with $transformed ...
    }

set_xml

    $xps->set_xml( $xml )

Sets the XML document to $xml. $xml can be a file, a filehandle reference, a string, or an XML::LibXML or XML::XPath node.

set_stylesheet

    $xps->set_stylesheet( $stylesheet )

Sets the processor's stylesheet to $stylesheet.

process

    $xps->process
    $xps->process( $printer )
    $xps->process( $printer, @varvalues )

Processes the document and stylesheet set at construction time, and prints the result to STDOUT by default. If $printer is set, it must be either a reference to a filehandle open for output, or a reference to a string, or a reference to a subroutine which does the output, as in

    open my $fh, '>', 'transformed.txt'
        or die "can't open file transformed.txt: $!";
    $xps->process( $fh );

    my $transformed;
    $xps->process( \$transformed );

    $xps->process( sub {
        my $output = shift;
        $output =~ y/<>/%%/;
        print $output;
    } );

If the stylesheet was compile()d with extra varnames, then the calling code should call process() with a corresponding number of @varvalues. The corresponding lexical variables will be set accordingly, so that the stylesheet code can get at them (looking at SYNOPSIS is the easiest way of getting the meaning of this sentence).
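The three kinds of $printer accepted by process() can be unified behind a single callback. The following is only a sketch in core Perl of how a caller-side wrapper might dispatch on the reference type; make_output_callback() is a hypothetical helper and says nothing about how XML::XPathScript does it internally.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::XPathScript: turns any of the
# accepted $printer kinds into a uniform output callback.
sub make_output_callback {
    my ($printer) = @_;
    return sub { print @_ }                 if !defined $printer;          # default: STDOUT
    return $printer                         if ref $printer eq 'CODE';     # subroutine reference
    return sub { $$printer .= join '', @_ } if ref $printer eq 'SCALAR';   # string reference
    return sub { print {$printer} @_ };                                    # filehandle reference
}

# Output appended to a string
my $buffer = '';
make_output_callback( \$buffer )->('chunk 1, ');
make_output_callback( \$buffer )->('chunk 2');

# Output handed to a subroutine
my @seen;
make_output_callback( sub { push @seen, @_ } )->('via sub');

print "$buffer\n";    # chunk 1, chunk 2
print "@seen\n";      # via sub
```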

extract

    $xps->extract( $stylesheet )
    $xps->extract( $stylesheet, $filename )
    $xps->extract( $stylesheet, @includestack ) # from include_file() only

The embedded dialect parser. Given $stylesheet, which is either a filehandle reference or a string, returns a string that holds all the code in real Perl. Unquoted text and <%= %> constructs in the stylesheet dialect are converted into invocations of XML::XPathScript->current()->print(), while <% %> constructs are transcribed verbatim.

<!--#include --> constructs are expanded by passing their filename argument to include_file along with @includestack (if any) like this:

    $self->include_file($includefilename,@includestack);

@includestack is not interpreted by extract() (except for the first entry, to create line tags for the debugger). It is only a bandaid for include_file() to pass the inclusion stack to itself across the mutual recursion existing between the two methods (see include_file). If extract() is invoked from outside include_file(), the last invocation form should not be used.

This method does a purely syntactic job. No special framework declaration is prepended for isolating the code in its own package, defining $t or the like (compile does that). It may be overridden in subclasses to provide different escape forms in the stylesheet dialect.

read_stylesheet

    $xps->read_stylesheet( $stylesheet )

Reads the $stylesheet (which can be a filehandle or a string). Used by extract; it exists so that it can be overridden in Apache::AxKit::Language::YPathScript.

include_file

    $xps->include_file( $filename )
    $xps->include_file( $filename, @includestack )

Resolves a <!--#include --> directive on behalf of extract(), that is, returns the script contents of $filename. The return value must be de-embedded too, which means that extract() has to be called recursively to expand the contents of $filename (which may contain more <!--#include -->s etc.)

$filename has to be slash-separated, whatever OS you are using (this is the XML way of things). If $filename is relative (i.e. does not begin with "/" or "./"), it is resolved relative to the directory of the stylesheet that includes it (that is, $includestack[0], see below) or "." if we are in the topmost stylesheet. Filenames beginning with "./" are considered absolute; this gives stylesheet writers a way to specify that they really really want a stylesheet that lies in the system's current working directory.

@includestack is the include stack currently in use, made up of all values of $filename through the stack, lastly added (innermost) entries first. The toplevel stylesheet is not in @includestack (that is, the outermost call does not specify an @includestack).

This method may be overridden in subclasses to provide support for alternate namespaces (e.g. "axkit://" URIs).

compile

    $xps->compile( varname1, varname2, ... )

Compiles the stylesheet set at new() time and returns an anonymous CODE reference.

varname1, varname2, etc. are extraneous arguments that will be made available to the stylesheet dialect as lexically scoped variables. SYNOPSIS shows how to use this feature to pass variables to AxKit XPathScript stylesheets, which explains this feature better than a lengthy paragraph would do.

The return value is an opaque token that encapsulates a compiled stylesheet. It should not be used, except as the compiledstylesheet argument to new() to initiate new objects and amortize the compilation time. Subclasses may alter the type of the return value, but will need to override process() accordingly of course.

The compile() method is idempotent. Subsequent calls to it will return the very same token, and calls to it when a compiledstylesheet argument was set at new() time will return said argument.
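The path resolution rules given for include_file can be sketched in a few lines of core Perl. This is an illustration under the assumption that a relative name is joined to the directory of the including stylesheet; resolve_include() is a hypothetical helper, not a method of the module, and the file names are made up.

```perl
use strict;
use warnings;
use File::Basename qw( dirname );

# Hypothetical helper illustrating the resolution rules.
sub resolve_include {
    my ( $filename, @includestack ) = @_;

    # Names starting with "/" or "./" are taken as they are.
    return $filename if $filename =~ m{^\.?/};

    # Otherwise resolve against the directory of the including stylesheet,
    # or "." when called from the topmost stylesheet.
    my $base = @includestack ? dirname( $includestack[0] ) : '.';
    return "$base/$filename";
}

my $top    = resolve_include('common.xps');
my $nested = resolve_include( 'common.xps', '/var/xps/main.xps' );
my $cwd    = resolve_include( './local.xps', '/var/xps/main.xps' );

print "$top\n";       # ./common.xps
print "$nested\n";    # /var/xps/common.xps
print "$cwd\n";       # ./local.xps
```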

print

    $xps->print($text)

Outputs a chunk of text on behalf of the stylesheet. The default implementation is to use the second argument to process. Overriding this method in a subclass provides yet another way to redirect output.

get_stylesheet_dependencies

    $xps->get_stylesheet_dependencies

Returns the files the loaded stylesheet depends on (i.e., the files included by the stylesheet or by one of its includes). The order in which files are returned has no special significance.

FUNCTIONS

document

    $xps->document( $uri )

Reads XML given in $uri, parses it and returns it in a nodeset.

LICENSE This is free software. You may distribute it under the same terms as Perl itself.

SEE ALSO

XML::XPathScript::Stylesheet, XML::XPathScript::Processor, XML::XPathScript::Template, XML::XPathScript::Template::Tag

Guide of the original AxKit XPathScript: http://axkit.org/wiki/view/AxKit/XPathScriptGuide

XPath documentation from W3C: http://www.w3.org/TR/xpath

Unicode character table: http://www.unicode.org/charts/charindex.html

XML::XPathScript::Stylesheet - XPathScript's Stylesheet Writer Guide

STYLESHEET SYNTAX

An XPathScript stylesheet is written in an ASP-like format; everything that is not enclosed within special delimiters is printed verbatim.

Delimiters

<% %>

Evaluates the enclosed code without printing anything.

Example:

    <% $template->set( 'foo' => { pre => 'bar' } ); %>

<%= %>

Evaluates the enclosed code and prints out its result.

<%# %> Comments out the code enclosed. The code will not be executed, nor show in the transformed document.

<%~ %>

A shorthand for <%= apply_templates( ) %>, where the content of the delimiter is the XPath expression handed to apply_templates, as in <%~ /doc/title %>.

<%- -%>, <%-= -%>, <%-~ -%>, <%-# -%>

If a dash is added to a delimiter, all whitespace (including carriage returns) preceding or following the delimiter is removed from the transformed document. This is useful to keep a stylesheet readable without generating a transformed document with many whitespace gaps. The dash can be added independently to the left and right delimiters.

Example:

    <%-~ /doc/title -%>

<!--#include file="/path/to/file" -->

Inserts the content of the file into the stylesheet. The path is relative to the stylesheet, not the processed document.

PRE-DEFINED VARIABLES

This section describes pre-defined variables accessible from within an XPathScript stylesheet.

$template, $t, $XML::XPathScript::trans
    All three variables point to the stylesheet's template. See section TRANSFORMATION TEMPLATE.

$XML::XPathScript::xp
    The DOM of the XML document to which the stylesheet is applied.

$XML::XPathScript::current
    The XML::XPathScript object from which the stylesheet has been invoked. See the XML::XPathScript manpage for a list of utility methods that can be called from within the stylesheet.

TRANSFORMATION TEMPLATE

The transformation template defines the modifications that are automatically applied to document elements when apply_templates is called. See the XML::XPathScript::Template manpage for details on how to configure the template.

Special tags

In addition to regular tag names, three special tags can be used in the template: text() and comment(), which match the corresponding nodes in the document, and '*', a catch-all tag.

text(), #text

Matches text nodes. Note that text nodes can be assigned a special action. See section action of this manpage.

Example:

    <% $template->set( 'text()' => {
           pre  => '\begin{comment}',
           post => '\end{comment}',
       } );
    %>

comment() Matches comment nodes.

'*' Matches any regular tag (that is, not comments nor text) that isn't explicitly matched.

Tag Attributes The tags' attributes define how the associated nodes are transformed by the template.

pre, intro, prechildren, prechild, postchild, postchildren, extro, post

Define the text to be printed around a node. All defined attributes are output in the following order:

    pre
    <tag>               # if showtag == 1
    intro
    prechildren         # if has children
    prechild            # for each child
    [ child node ]
    postchild           # for each child
    postchildren        # if has children
    extro
    </tag>              # if showtag == 1
    post

If interpolation is enabled, XPath expressions delimited by curly braces can be embedded in any of these attributes.

    $template->set( 'movie' => { pre => 'title: {./@title}, year: {./year}' } );

Interpolation is enabled via the XML::XPathScript object's method interpolation. The expressions' delimiter can be modified via the XML::XPathScript object's method interpolation_regex.
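The output order can be made concrete with a small core-Perl sketch. It is not the module's rendering code: render() is a hypothetical helper, the children are assumed to be already-rendered strings, and the sketch assumes that the opening tag is emitted right after pre and the closing tag right before post when showtag is set.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::XPathScript.
# $tag is a hash of tag attributes, @children are already-rendered strings.
sub render {
    my ( $name, $tag, @children ) = @_;
    my $attr = sub { defined $tag->{ $_[0] } ? $tag->{ $_[0] } : '' };

    my $out = $attr->('pre');
    $out .= "<$name>" if $tag->{showtag};
    $out .= $attr->('intro');
    $out .= $attr->('prechildren') if @children;
    $out .= $attr->('prechild') . $_ . $attr->('postchild') for @children;
    $out .= $attr->('postchildren') if @children;
    $out .= $attr->('extro');
    $out .= "</$name>" if $tag->{showtag};
    $out .= $attr->('post');
    return $out;
}

my $result = render(
    'list',
    { pre => '[', post => ']', showtag => 1, prechild => '(', postchild => ')' },
    'a', 'b',
);
print "$result\n";    # [<list>(a)(b)</list>]
```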

showtag If set to true, the original tag is printed out.

action

Dictates how the node and its children are processed. The allowed values are:

DO_SELF_AND_KIDS
    Process the current node and its children.

DO_SELF_ONLY
    Process the current node, but not its children.

DO_NOT_PROCESS
    Do not process either the current node or any of its children.

DO_TEXT_AS_CHILD
    Only meaningful for text nodes. When this value is given, the processor pretends that the text is a child of the node, which basically means that $t->{pre} and $t->{post} will frame the text instead of replacing it.

    Example:

        $template->set( 'text()' => { pre => 'replacement text' } );
        # will transform blah
        # into replacement text

        $template->set( 'text()' => { action => DO_TEXT_AS_CHILD, pre => 'text: ' } );
        # will transform blah
        # into text: blah

xpath expression
    Process the current node and all its children that match the XPath expression. The XPath expression is anchored on the current node.

    Example:

        $template->set( 'foo' => { action => './*[@process = "yes"]' } );

testcode

A reference to a subroutine that will be executed upon visiting the tag. When invoked, the subroutine is passed two parameters: the current node's object and a tag object holding all the attributes of the visited tag. Modifications to the tag object only affect the transformation of the current node. To change the transformation of all subsequent tags of the same type, use the stylesheet $template instead.

Also, the return value of the subroutine overrides the value of the 'action' attribute.

Example:

    <%
        $template->set( '*' => { testcode => \&uppercase_tag } );

        sub uppercase_tag {
            my( $n, $tag ) = @_;
            my $name = uc $n->getName;
            $tag->set({
                pre  => "<$name>",
                post => "</$name>",
            });
            return DO_SELF_AND_KIDS;
        }
    %>

rename

Renames the tag to the given value. Implicitly sets 'showtag' to true.

Example:

    <%
        # renames <foo>...</foo> to <bar>...</bar>
        $t->set( foo => { rename => 'bar' } );
    %>

STYLESHEET WRITING GUIDELINES Here are a few things to watch out for when coding stylesheets.

XPath scalar return values considered harmful

XML::XPath calls such as findvalue() return objects in an object class designed to map one of the types mandated by the XPath spec (see XML::XPath for details). This is often not what a Perl programmer comes to expect (e.g. strings and numbers cannot be treated the same). There are some work-arounds built in XML::XPath, using operator overloading: when using those objects as strings (by concatenating them, using them in regular expressions etc.), they become strings, through a transparent call to one of their methods such as ->value(). However, we do not support this for a variety of reasons (from limitations in overload to stylesheet compatibility between XML::XPath and XML::LibXML to Unicode considerations), and that is why our findvalue and friends return a real Perl scalar, in violation of the XPath specification.

On the other hand, findnodes does return a list of objects in list context, and an XML::XPath::NodeSet or XML::LibXML::NodeList instance in scalar context, obeying the XPath specification in full. Therefore you most likely do not want to call findnodes() in scalar context, ever: call it in list context instead, for example by assigning its result to an array or to a parenthesized list of variables.
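The difference between the two calling contexts is plain Perl and can be shown without any XML at all. In this sketch find_items() is a hypothetical stand-in for a findnodes-like function: it returns a list in list context and a container (here an array reference rather than a NodeSet object) in scalar context.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a context-sensitive function.
sub find_items {
    my @items = ( 'first', 'second' );
    return wantarray ? @items : \@items;
}

my $container = find_items();    # scalar context: the container, rarely what you want
my ($first)   = find_items();    # list context: the first item
my @all       = find_items();    # list context: every item

print ref($container), "\n";    # ARRAY
print "$first\n";               # first
print scalar(@all), "\n";       # 2
```

The parentheses around $first are what force list context; dropping them silently switches to the scalar form.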

Do not use DOM method calls, for they make stylesheets non-portable

findvalue() and the similar functions described in XML::XPathScript::Processor are not the only way of extracting bits from the XML document. Objects passed as the first argument to the testcode tag attribute and returned by findnodes() in array context are of one of the XML::XPath::Node::* or XML::LibXML::* classes, and they feature some data extraction methods by themselves, conforming to the DOM specification.

However, the names of those methods are not standardized even among DOM parsers (the accessor to one and the same property may be named differently in XML::LibXML and in XML::XPath!). In order to write a stylesheet that is portable between XML::LibXML and XML::XPath used as back-ends to XML::XPathScript, one should refrain from calling them. The exact same data is available through appropriate XPath formulae, albeit more slowly, and there are also type-checking accessors such as is_element_node in XML::XPathScript::Processor.

THE UNICODE MESS Unicode is a balucitherian character numbering standard, that strives to be a superset of all character sets currently in use by humans and computers. Going Unicode is therefore the way of the future, as it will guarantee compatibility of your applications with every character set on planet Earth: for this reason, all XML-compliant APIs (XML::XPathScript being no exception) should return Unicode strings in all their calls, regardless of the charset used to encode the XML document to begin with. The gotcha is, the brave Unicode world sells itself in much the same way as XML when it promises that you'll still be able to read your data back in 30 years: that will probably turn out to be true, but until then, you can't :-) Therefore, you as a stylesheet author will more likely than not need to do some wrestling with Unicode in Perl, XML::XPathScript or not. Here is a primer on how.

Unicode, UTF-8 and Perl

Unicode is not a text file format: UTF-8 is. Perl, when doing Unicode, prefers to use UTF-8 internally.

Unicode is a character numbering standard: that is, an abstract registry that associates unique integer numbers to a cast of thousands of characters. For example the "smiling face" is character number 0x263a, and the thin space is 0x2009 (there is a URL to a Unicode character table in SEE ALSO). Of course, this means that the 8-bits- (or even, Heaven forbid, 7-bits-?)-per-character idea goes out the window this instant. Coding every character on 16 bits in memory is an option (called UTF-16), but not as simple an idea as it sounds: one would have to rewrite nearly every piece of C code for starters, and even then the Chinese aren't quite happy with "only" 65536 character code points.

Introducing UTF-8, which is a way of encoding Unicode character numbers (of any size) in an ASCII- and C-friendly way: all 128 ASCII characters (such as "A" or "/" or ".", but not the ISO-8859-1 8-bit extensions) have the same encoding in both ASCII and UTF-8, including the null character (which is good for strcpy() and friends). Of course, this means that the other characters are rendered using several bytes: for example "é" is encoded as the two bytes 0xC3 0xA9, which show up as "Ã©" when misread as ISO-8859-1. The result is therefore vaguely intelligible for a Western reader.
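The byte counts discussed here are easy to check with the core Encode module. The snippet below is a small illustration only: it encodes an ASCII letter, an accented letter and the smiling face into UTF-8 and prints the resulting bytes in hexadecimal. The hex_bytes() helper is introduced for the example.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hexadecimal dump of the UTF-8 encoding of a character string.
sub hex_bytes {
    my ($chars) = @_;
    my $bytes = encode( 'UTF-8', $chars );
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $ascii  = hex_bytes('A');           # same single byte as in ASCII
my $eacute = hex_bytes("\x{e9}");      # U+00E9, e with acute accent
my $smiley = hex_bytes("\x{263a}");    # U+263A, the smiling face

print "$ascii\n";     # 41
print "$eacute\n";    # C3 A9
print "$smiley\n";    # E2 98 BA
```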

Output to UTF-8 with XPathScript The programmer- and C-friendly characteristics of UTF-8 have made it the choice for dealing with Unicode in Perl. The interpreter maintains an "UTF8-tainted" bit on every string scalar it handles (much like what perlsec does for untrusted data). Every function in XML::XPathScript returns a string with such bit set to true: therefore, producing UTF-8 output is straightforward and one does not have to take any special precautions in XPathScript.

Output to a non-UTF-8 character set with XPathScript

When binmode is invoked from the stylesheet body, it signals that the stylesheet output should not be UTF-8, but instead some user-chosen character encoding that XML::XPathScript cannot and will not know or care about. Calling XML::XPathScript->current()->binmode() has the following consequences:

  * presence of this "UTF-8 taint" in the stylesheet output is now a fatal error. That is, whenever the result of a template evaluation is marked internally in Perl with the "this string is UTF-8" flag (as opposed to being treated by Perl as binary data without character meaning, see perlunicode), translate_node in XML::XPathScript::Processor will croak;

  * the stylesheet therefore needs to build a "Unicode firewall". That is, testcode blocks have to take input in UTF-8 (as per the XML standard, UTF-8 indeed is what will be returned by findvalue in XML::XPathScript::Processor and such) and provide output in binary (in whatever character set is intended for the output), lest translate_node() croak as explained above. The Unicode::String module comes in handy to the stylesheet writer to cast from UTF-8 to an 8-bit-per-character charset such as ISO 8859-1, while laundering Perl's internal UTF-8-string bit at the same time;

  * the appropriate voodoo is performed on the output filehandle(s) so that a spurious, final charset conversion will not happen at print() time under any locales, versions of Perl, or phases of moon.
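As an illustration of the "firewall" idea, the sketch below converts a UTF-8-flagged string into ISO 8859-1 bytes and checks that the flag is gone afterwards. Note the swap: it uses the core Encode module rather than the Unicode::String module named in the text, and the to_latin1() helper is hypothetical; a real stylesheet would apply such a conversion inside its testcode and text() handling.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical helper: character string in, ISO 8859-1 byte string out.
sub to_latin1 {
    my ($chars) = @_;
    return encode( 'ISO-8859-1', $chars );
}

my $text = "caf\x{e9}";
utf8::upgrade($text);    # mimic a string coming back flagged from the XML layer

my $bytes = to_latin1($text);

print utf8::is_utf8($text)  ? "flagged\n" : "bytes\n";    # flagged
print utf8::is_utf8($bytes) ? "flagged\n" : "bytes\n";    # bytes
print join( ' ', map { sprintf '%02X', ord } split //, $bytes ), "\n";    # 63 61 66 E9
```

Characters that have no ISO 8859-1 equivalent are replaced by a substitution character under Encode's default settings, so a stylesheet targeting such a charset has to decide what to do with them.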

XML::XPathScript::Processor - the XML transformation engine in XML::XPathScript

The XML::XPathScript distribution offers an XML parser glue, an embedded stylesheet language, and a way of processing an XML document into a text output. This package implements the latter part: it takes an already filled out template object and an already parsed XML document (which come from XML::XPathScript behind the scenes), and provides a simple API to implement stylesheets. In particular, the apply_templates function triggers the recursive expansion of the whole XML document when used as shown in SYNOPSIS.

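SYNOPSIS

In a stylesheet ->{testcode} sub for e.g. Docbook's <ulink> tag:

    my ( $node, $t ) = @_;
    my $url = findvalue( '@url', $node );
    if ( findnodes( 'node()', $node ) ) {
        $t->set({ pre => "<a href='$url'>", post => '</a>' });
        return DO_SELF_AND_KIDS;
    }
    else {
        $t->set({ pre => "<a href='$url'>$url</a>", post => '' });
        return DO_SELF_ONLY;
    };

At the stylesheet's top-level one often finds:

    <%= apply_templates() %>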

XPATHSCRIPT LANGUAGE FUNCTIONS

All of these functions are intended to be called solely from within the ->{testcode} templates or <% %> or <%= %> blocks in XPathScript stylesheets. They are automatically exported to both these contexts.

findnodes Returns a list of nodes found by XPath expression $path, optionally using $context as the context node (default is the root node of the current document). In scalar context returns a NodeSet object (but you do not want to do that, see XPath scalar return values considered harmful in XML::XPathScript ).

findvalue Evaluates XPath expression $path and returns the resulting value. If the path returns one of the "Literal", "Numeric" or "NodeList" XPath types, the stringification is done automatically for you using xpath_to_string.

xpath_to_string Converts any XPath data type, such as "Literal", "Numeric", "NodeList", text nodes, etc. into a pure Perl string (UTF-8 tainted too - see is_utf8_tainted). Scalar XPath types are interpreted in the straightforward way, DOM nodes are stringified into conforming XML, and NodeLists are stringified by concatenating the stringification of their members (in the latter case, the result obviously is not guaranteed to be valid XML). See XPath scalar return values considered harmful in XML::XPathScript on why this is useful.

findvalues Evaluates XPath expression $path as a nodeset expression, just like findnodes would, but returns a list of UTF8-encoded XML strings instead of node objects or node sets. See also XPath scalar return values considered harmful in XML::XPathScript .

findnodes_as_string Similar to findvalues but concatenates the XML snippets. The result obviously is not guaranteed to be valid XML.

matches Returns true if the node matches the path (optionally in context $context)

apply_templates

    $transformed = apply_templates()
    $transformed = apply_templates( $xpath )
    $transformed = apply_templates( $xpath, $context )
    $transformed = apply_templates( @nodes )

This is where the whole magic in XPathScript resides: recursively applies the stylesheet templates to the nodes provided either literally (last invocation form) or through an XPath expression (second and third invocation forms), and returns a string concatenation of all results. If called without arguments at all, renders the whole document (same as apply_templates("/")).

Calls to apply_templates() may occur both implicitly (at the top of the document, and for rendering subnodes when the templates choose to handle that by themselves), and explicitly (because testcode routines require the XML::XPathScript::Processor to DO_SELF_AND_KIDS).

If appropriate care is taken in all templates (especially the testcode routines and the text() template), the string result of apply_templates need not be UTF-8 (see binmode in XML::XPathScript): it is thus possible to use XPathScript to produce output in any character set without an extra translation pass.

call_template EXPERIMENTAL - allows testcode routines to invoke a template by name, even if the selectors do not fit (e.g. one can apply template B to an element node of type A). Returns the stylesheeted string computed out of $node just like apply_templates would.

is_element_node Returns true if $object is an element node, false otherwise.

is_text_node Returns true if $object is a "true" text node (not a comment node), false otherwise.

is_comment_node Returns true if $object is an XML comment node, false otherwise.

is_pi_node Returns true iff $object is a processing instruction node.

is_nodelist Returns true if $node is a node list (as returned by findnodes in scalar context), false otherwise.

is_utf8_tainted Returns true if Perl thinks that $string is a string of characters (in UTF-8 internal representation), and false if Perl treats $string as a meaningless string of bytes. The dangerous part of the story is when concatenating a non-tainted string with a tainted one, as it causes the whole string to be re-interpreted into UTF-8, even the part that was supposedly meaningless character-wise, and that happens in a nonportable fashion (depends on locale and Perl version). So don't do that - and use this function to prevent that from happening.

get_xpath_of_node Returns an XPath string that points to $node, from the root. Useful to create error messages that point at some location in the original XML document.

XML::XPathScript::Template - XML::XPathScript transformation template

A stylesheet's template defines the transformations and actions that are performed on the tags of a document as they are processed. The template of a stylesheet can be accessed via variables $t, $template and $XML::XPathScript::trans.

SYNOPSIS

    <%
        $t->set( 'important' => {
            'pre'       => '',
            'post'      => '',
            'prechild'  => '',
            'postchild' => '',
        } );

        # urgent and annoying share the 'pre' and 'post'
        # of important
        $t->copy( 'important' => [ qw/ urgent annoying / ],
                  [ qw/ pre post / ],
        );

        # redHot is a synonym of important
        $t->alias( 'important' => 'redHot' );
    %>
    <%= apply_templates() %>

METHODS

new

    $template = XML::XPathScript::Template->new

Creates and returns a new, empty template.

set

    $template->set( $tag, \%attributes )
    $template->set_template( \@tags , \%attributes )

Updates the $tag or @tags in the template with the given %attributes.

Example:

    $template->set( 'foo' => { pre => '', post => '' } );

copy

    $template->copy( $original_tag, $copy_tag );
    $template->copy( $original_tag, $copy_tag, \@attributes );
    $template->copy( $original_tag, \@copy_tags );
    $template->copy( $original_tag, \@copy_tags, \@attributes );

Copies all attributes (or a subset of them if @attributes is given) of $original_tag to $copy_tag. Note that subsequent modifications of the original tag will not affect the copies. To bind several tags to the same behavior, see alias.

Example:

    $template->copy( 'important' => [ qw/ urgent redHot / ], [ qw/ pre post / ] );

alias

    $template->alias( $original_tag => $alias_tag )
    $template->alias( $original_tag => \@alias_tags )

Makes the target tags aliases of the original tag. Further modifications made to any of these tags will be reflected on all the others.

Example:

    $template->alias( 'foo' => 'bar' );

    # also modifies 'foo'
    $template->set( 'bar' => { pre => '' } );
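The difference between copy and alias mirrors the difference between copying a hash and sharing a reference to it. The sketch below uses plain hashes, in the spirit of the direct hash manipulation shown under BACKWARD COMPATIBILITY; it is an analogy in core Perl, not a description of how the template object stores its tags, and the attribute values are made up.

```perl
use strict;
use warnings;

my %template = ( important => { pre => '<b>', post => '</b>' } );

# copy: a new hash holding the values as they are at copy time
$template{urgent} = { %{ $template{important} } };

# alias: a second name for the very same hash
$template{redHot} = $template{important};

# a later change to the original tag ...
$template{important}{pre} = '<strong>';

# ... leaves the copy alone but shows through the alias
print "urgent: $template{urgent}{pre}\n";    # urgent: <b>
print "redHot: $template{redHot}{pre}\n";    # redHot: <strong>
```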

is_alias

    @aliases = $template->is_alias( $tag )

Returns all tags that are aliases to $tag.

unalias

    $template->unalias( $tag )

Detaches $tag from its aliases, if it has any. Further modifications to $tag will not affect the erstwhile aliases, and vice versa.

Example:

    $template->alias( 'foo' => [ qw/ bar baz / ] );
    $template->set( 'foo' => { pre => '' } );    # affects foo, bar and baz
    $template->unalias( 'bar' );
    $template->set( 'bar' => { pre => '' } );    # affects only bar
    $template->set( 'baz' => { pre => '' } );    # affects foo and baz

clear

    $template->clear()
    $template->clear( \@tags )

Deletes all tags, or those given by @tags, from the template.

Example:

    $template->clear([ 'foo', 'bar' ]);

dump

    $template->dump()
    $template->dump( @tags )

Returns a pretty-printed dump of the templates. If @tags are specified, only their templates are returned.

Example:

    <%= $template->dump( 'foo' ) %>

    # will yield something like
    #
    # $template = {
    #     foo => {
    #         post => '',
    #         pre => '',
    #     }
    # };

namespace

    $template->namespace( $uri );

Returns the sub-template associated with the namespace defined by $uri.

Example:

    $template->set( 'foo' => { 'pre' => 'within default namespace' } );
    my $subtemplate = $template->namespace( 'http://www.www3c.org/blah/' );
    $subtemplate->set( 'foo' => { 'pre' => "within 'blah' namespace" } );

resolve

    $tag = $template->resolve( $namespace, $tagname );
    $tag = $template->resolve( $tagname );

Returns the tag object within $template that best matches $namespace and $tagname. The returned match is the first one found in the following list:

    $namespace:$tagname
    $namespace:*
    $tagname
    *
    undef

Example:

    $template->set( foo => { pre => 'a' } );
    $template->set( '*' => { pre => 'b' } );
    $template->namespace( 'http://blah' )->set( foo => { pre => 'c' } );
    $template->namespace( 'http://blah' )->set( '*' => { pre => 'd' } );

    $template->resolve( 'foo' )->get( 'pre' );                 # returns 'a'
    $template->resolve( 'baz' )->get( 'pre' );                 # returns 'b'
    $template->resolve( 'http://meeh', 'foo' )->get( 'pre' );  # returns 'a'
    $template->resolve( 'http://blah', 'foo' )->get( 'pre' );  # returns 'c'
    $template->resolve( 'http://blah', 'baz' )->get( 'pre' );  # returns 'd'
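The lookup order described for resolve can be sketched as a first-match search over candidate keys. This is an illustration only: the flat %entries hash, its "namespace:tag" key format and the resolve_sketch function are hypothetical and say nothing about how the module stores its sub-templates.

```perl
use strict;
use warnings;

# Hypothetical flat lookup table, keyed the way the documented list is written.
my %entries = (
    'foo'             => { pre => 'a' },
    '*'               => { pre => 'b' },
    'http://blah:foo' => { pre => 'c' },
    'http://blah:*'   => { pre => 'd' },
);

# resolve_sketch( $tagname ) or resolve_sketch( $namespace, $tagname )
sub resolve_sketch {
    my $tagname   = pop;      # last argument is always the tag name
    my $namespace = shift;    # undef when called with a single argument

    my @candidates;
    push @candidates, "$namespace:$tagname", "$namespace:*"
        if defined $namespace;
    push @candidates, $tagname, '*';

    for my $key (@candidates) {
        return $entries{$key} if exists $entries{$key};
    }
    return;                   # nothing matched: undef in scalar context
}

print resolve_sketch('foo')->{pre}, "\n";                   # prints "a"
print resolve_sketch('baz')->{pre}, "\n";                   # prints "b"
print resolve_sketch( 'http://meeh', 'foo' )->{pre}, "\n";  # prints "a"
print resolve_sketch( 'http://blah', 'foo' )->{pre}, "\n";  # prints "c"
print resolve_sketch( 'http://blah', 'baz' )->{pre}, "\n";  # prints "d"
```

The sketch reproduces the five results of the documented example, including the fall-back to the default namespace when the requested namespace has no entries at all.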

BACKWARD COMPATIBILITY

Prior to version 1.0 of XML::XPathScript, the template of a stylesheet was not an object but a simple hash reference. Modifications to the template were done by manipulating the hash directly.

    <%
        $t->{important}{pre}  = '';
        $t->{important}{post} = '';

        for my $tag ( qw/ urgent redHot / ) {
            for my $attr ( qw/ pre post / ) {
                $t->{$tag}{$attr} = $t->{important}{$attr};
            }
        }

        $t->{ alert } = $t->{ important };
    %>

Don't tell anyone, but as an XML::XPathScript::Template is a blessed hash reference, this way of doing things will still work. However, direct manipulation of the template's hash is deprecated. Instead, it is recommended to use the object's access methods.

    <%
        $t->set( important => { pre => '', post => '', showtag => 1 } );
        $t->copy( important => [ qw/ urgent redHot / ], [ qw/ pre post / ] );
        $t->alias( important => 'alert' );
    %>
XML::XPathScript::Template

The XML::XPathScript::Template::Tag class is used to represent tags within an XPathScript template.

XML::XPathScript::Template::Tag - XPathScript Template Element
SYNOPSIS

    <%
        $template->set( 'foo' => { testcode => \&frumble } );

        sub frumble {
            my( $n, $t ) = @_;

            $t->set({ 'pre' => '' });

            return DO_SELF_AND_CHILDREN();
        }
    %>
    <%= apply_templates() %>

CALLED AS ARGUMENT TO THE TESTCODE FUNCTIONS

Typically, the only time you'll be exposed to those objects is via the testcode functions, which receive as arguments a reference to the current node and its associated template entry. Note that changing any of the tag's attributes only impacts the current node and doesn't change the tag entry in the template. To modify the template, you'll have to access $template directly.

Example:

    <%
        $template->set( 'foo' => { testcode => \&frumble } );

        sub frumble {
            my( $n, $t ) = @_;

            if( $n->findvalue( './@bar' ) eq 'whonk' ) {
                # we've been whonk'ed! This foo must
                # blink
                $t->set({ 'pre' => '', 'post' => '' });

                # and the next foos will be in italic
                $template->set( foo => { pre => '', post => '' } );
            }
            return DO_SELF_AND_CHILDREN();
        }
    %>
METHODS
new

    $tag = XML::XPathScript::Template::Tag->new

Creates a new, empty tag.

set

    $tag->set( \%attributes )

Updates the tag's attributes with the values given in \%attributes.

Example:

    $tag->set({ pre => '', post => '' });

get

    $tag->get( @attributes )

Returns the values of @attributes.

Example:

    $tag->get( 'pre', 'post' );

BACKWARD COMPATIBILITY

As for XML::XPathScript::Template, prior to release 1.0 of XPathScript, the tags within the template of a stylesheet were not objects but simple hash references. Modifications to the tag attributes were done by manipulating the hash directly.

    <%
        $template->{foo}{testcode} = sub {
            my( $n, $t ) = @_;

            $t->{pre}  = '';
            $t->{post} = '';

            return DO_SELF_AND_CHILDREN;
        };
    %>

Don't tell anyone, but as an XML::XPathScript::Template::Tag is a blessed hash reference, this way of doing things will still work. However, direct manipulation of the tag's hash is deprecated. Instead, it is recommended to use the object's access methods.

    <%
        $template->set( foo => { testcode => \&tc_foo } );

        sub tc_foo {
            my( $n, $t ) = @_;

            $t->set({ pre => '', post => '' });

            return DO_SELF_AND_CHILDREN;
        }
    %>
XML::XPathScript::Template::Tag

Apache2::TomKit::Processor::XPathScript - XPathScript Processor for TomKit
SYNOPSIS

    Apache2::TomKit::Processor::XPathScript"

    PerlFixupHandler Apache2::TomKit
    PerlSetVar AxAddProcessorDef "text/xps=>stylesheet.xps"
Apache2::TomKit::Processor::XPathScript

YPathScript is a fork of the original AxKit XPathScript that uses XML::XPathScript as its transforming engine. As it is mostly backward compatible with the classic AxKit XPathScript module, the definitive reference for XPathScript, located at http://axkit.org/docs/xpathscript/guide.dkb, also applies to YPathScript, except for the differences listed in the sections below.

Apache::AxKit::Language::YPathScript - An XML Stylesheet Language
SYNOPSIS \ Apache::AxKit::Language::YPathScript"

PRE-DEFINED STYLESHEET VARIABLES AND FUNCTIONS
VARIABLES

$r

A copy of the Apache::AxKit::request object -- which is itself a wrapper around the Apache::request object -- tied to the current document.

    <% %args = $r->args() %>
    args: <%= join ' : ', map "$_ => $args{$_}", keys %args %>

FUNCTIONS

$node = XML::XPathScript::current->document( $uri )

Fetches the XML document located at $uri and returns it as a DOM node.

Functions

$xps = new Apache::AxKit::Language::YPathScript($xml_provider, $style_provider)

Constructs a new YPathScript language interpreter out of the provided providers.

$rc = handler( $class, $request, $xml_provider, $style_provider )

$file_content = include_file( $filename )
$file_content = include_file( $filename, @includestack )

Overloaded from XML::XPathScript in order to provide URI-based stylesheet inclusions: $filename may now be any AxKit URI. The AxKit language class drops the support for plain filenames that exists in the ancestor class. This means that include directives in existing stylesheets that name plain files must be rewritten to use AxKit URIs in order to work with AxKit.

$doc = get_source_tree( $xml_provider )

Reads an XML document from the provider and returns it as a string.

$string = read_stylesheet( $stylesheet )

Retrieves and returns the $stylesheet (which can be a filehandle or a string) as a string.

$self->debug( $level, $message )

Prints $message if the requested debug level is equal to or smaller than $level.

$self->die( $suicide_note )

Prints the $suicide_note and exits.

$nodeset = $self->document( $uri )

Reads the XML document located at $uri, parses it and returns it as a node object. The $uri can be specified using the regular schemes ('http://foo.org/bar.xml', 'ftp://foo.org/bar.xml'), the AxKit scheme ('axkit://baz.xml'), or as a local file ('/home/web/foo.xml', './foo.xml').

Apache::AxKit::Language::YPathScript

If the second type of call is used, xpathscript assumes that the XML source file and the XPathScript stylesheet are named <name>.xml and <name>.xps.
ARGUMENTS

-i

Enable interpolation.

-q=<query_string>

query_string is passed as if it were a query string. For example, passing page=3&images=no when processing doc.xml makes xpathscript act as if the document had been requested from the web server with the URL 'http://your.server.org/doc.xml?page=3&images=no'.
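To make the -q option more concrete, here is a small sketch of how a query string such as the one above breaks down into named arguments. The parse_query_sketch helper is hypothetical and is not part of xpathscript; it also skips URL decoding, which a real query-string parser would perform.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of xpathscript: split a query string
# into key/value pairs. No URL decoding is attempted.
sub parse_query_sketch {
    my ($query) = @_;
    my %args;
    for my $pair ( split /[&;]/, $query ) {
        my ( $key, $value ) = split /=/, $pair, 2;
        $args{$key} = defined $value ? $value : '';
    }
    return %args;
}

my %args = parse_query_sketch('page=3&images=no');
print join( ' : ', map { "$_ => $args{$_}" } sort keys %args ), "\n";
# prints "images => no : page => 3"
```

A stylesheet would then see these values much as it would see the arguments of a real web request.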
xpathscript - XPathScript command-line utility
SYNOPSIS

    xpathscript [-i] [-q=<query_string>] <xml_file> <stylesheet_file>
    xpathscript [-i] [-q=<query_string>] <name>

SEE ALSO XML::XPathScript
xpathscript