<norman.walsh@marklogic.com>
<alex@milowski.org>
<ht@inf.ed.ac.uk>
This document is also available in these non-normative formats: XML, automatic change markup from the previous draft courtesy of DeltaXML.
Copyright © 2014 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification describes the standard step vocabulary of XProc 2.0: An XML Pipeline Language.
This document is an editor's draft that has no official standing.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document is a product of the XML Processing Model Working Group as part of the W3C XML Activity. This draft is a first attempt to address some of the requirements of [XProc V2.0 Requirements]. It is in many ways substantially incomplete. The Working Group is publishing it in order to establish an intended direction and to provide an official opportunity for comment.
Please report errors in this document by raising issues on the specification repository. Alternatively, you may report errors in this document to the public mailing list public-xml-processing-model-comments@w3.org (public archives are available).
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 14 October 2005 W3C Process Document.
This specification describes the standard, atomic XProc steps of [XProc 2.0].
Some aspects of documents are generally unchanged by steps:
When a step in this library produces an output document, the base URI of the output is the base URI of the step's primary input document unless the step's process explicitly sets an xml:base attribute or the step's description explicitly states how the base URI is constructed.
Unless otherwise specified, steps in this library do not modify the document propertiesXP of the documents that flow through them.
Also, in this specification, several steps use this element for result information:
<c:result>
string
</c:result>
When a step uses an XPath to compute an option value, the XPath context is as defined in Section 2.7, “XPaths in XProc”XP.
When a step specifies a particular version of a technology, implementations must implement that version or a subsequent version that is backwards compatible with that version. At user-option, they may implement other non-backwards compatible versions.
This section describes standard steps that must be supported by any conforming processor.
The p:add-attribute step adds a single attribute to a set of matching elements. The input document specified on the source port is processed for matches specified by the match pattern in the match option. For each of these matches, the attribute whose name is specified by the attribute-name option is set to the attribute value specified by the attribute-value option.
The resulting document is produced on the result output port and consists of an exact copy of the input with the exception of the matched elements. Each of the matched elements is copied to the output with the addition of the specified attribute with the specified value.
<p:declare-step type="p:add-attribute">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
  <p:option name="attribute-name" required="true" as="xs:QName"/>
  <p:option name="attribute-prefix" as="xs:NCName"/>
  <p:option name="attribute-namespace" as="xs:anyURI"/>
  <p:option name="attribute-value" required="true" as="xs:string"/>
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if the match pattern does not match an element.
The value of the attribute-name option must be a QName.
If the lexical value does not contain a colon, then the attribute-namespace may be used to specify the namespace of the attribute. In that case, the attribute-prefix may be specified to suggest a prefix for the attribute name. It is a dynamic error (err:XD0034XP) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
The corresponding expanded name is used to construct the attribute.
The value of the attribute-value option must be a legal attribute value according to XML.
If an attribute with the same name as the expanded name from the attribute-name option exists on the matched element, the value specified in the attribute-value option is used to set the value of that existing attribute. That is, the value of the existing attribute is changed to the attribute-value value.
If multiple attributes need to be set on the same element(s), the p:set-attributes step can be used to set them all at once.
This step cannot be used to add namespace declarations. It is a dynamic error (err:XC0059) if the QName value in the attribute-name option uses the prefix “xmlns” or any other prefix that resolves to the namespace name http://www.w3.org/2000/xmlns/.
Note, however, that while namespace declarations cannot be added explicitly by this step, adding an attribute whose name is in a namespace for which there is no namespace declaration in scope on the matched element may result in a namespace binding being added by Section 2.5.1, “Namespace Fixup on XML Outputs”XP.
If an attribute named xml:base is added or changed, the base URI of the element must also be amended accordingly.
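As a rough illustration of the behaviour described for p:add-attribute, the following Python sketch sets one attribute on every selected element and overwrites an existing attribute of the same name. It is not an XProc implementation: the function name add_attribute is hypothetical, ElementTree's limited XPath subset stands in for a real XSLTMatchPattern, and namespace handling, err:XC0023 and err:XC0059 are left out.

```python
import xml.etree.ElementTree as ET


def add_attribute(xml_text, match, name, value):
    """Set attribute `name` to `value` on every element selected by `match`.

    Assumption: `match` is an ElementTree path such as ".//para", used here
    in place of an XSLT match pattern such as "para".
    """
    root = ET.fromstring(xml_text)
    for element in root.findall(match):
        # set() adds the attribute, or replaces the value if it already exists.
        element.set(name, value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_attribute('<doc><para/><para role="old"/></doc>',
                        ".//para", "role", "new"))
```

Everything outside the matched elements is copied through unchanged, which mirrors the "exact copy of the input" wording above.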
The p:add-xml-base step exposes the base URI via explicit xml:base attributes. The input document from the source port is replicated to the result port with xml:base attributes added to or corrected on each element as specified by the options on this step.
<p:declare-step type="p:add-xml-base">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="all" select="'false'" as="xs:boolean"/>
  <p:option name="relative" select="'true'" as="xs:boolean"/>
</p:declare-step>
The value of the all option must be a boolean.
The value of the relative option must be a boolean.
It is a dynamic error (err:XC0058) if the all and relative options are both true.
The p:add-xml-base step modifies its input as follows:
For the document element: force the element to have an xml:base attribute with the document's [base URI] property's value as its value.
For other elements:
If the all option has the value true, force the element to have an xml:base attribute with the element's [base URI] value as its value.
If the element's [base URI] is different from its parent's [base URI], force the element to have an xml:base attribute with the following value: if the value of the relative option is true, a string which, when resolved against the parent's [base URI], will give the element's [base URI], otherwise the element's [base URI].
Otherwise, if there is an xml:base attribute present, remove it.
The p:cast-content-type step changes the media type of its input.
<p:declare-step type="p:cast-content-type">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result"/>
  <p:option name="content-type" as="xs:string"/>
</p:declare-step>
The input document is transformed from one media type to another.
It is a dynamic error (err:XC1002) if the supplied content-type is not a valid media type of the form “type/subtype+ext”.
Casting from one XML media type to another simply changes the “content-type” document propertyXP.
Casting from a non-XML media type to an XML media type produces an XML document with a c:data document element. The original media type will be preserved in the content-type attribute on the c:data element.
<c:data
content-type = ContentType
charset? = string
encoding? = string>
string
</c:data>
The content of the c:data element is the base64 encoded representation of the non-XML content.
Casting from an XML media type to a non-XML media type must support the case where the input document is a c:data document. The resulting document will have the specified media type and a representationXP that is the content of the c:data element after decoding the base64 encoded content.
It is a dynamic error (err:XC1004) if the c:data element contains content that is not a valid base64 string.
It is a dynamic error (err:XC1005) if the c:data element does not have a content-type attribute.
It is a dynamic error (err:XC1006) if the content-type is supplied and is not the same as the content-type specified on the c:data element.
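The c:data round trip described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names wrap_as_c_data and unwrap_c_data are hypothetical, errors are reported as ValueError carrying the error code in the message, and base64 validation is strict (embedded whitespace is rejected), which a real processor might relax.

```python
import base64
import binascii
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def wrap_as_c_data(payload, content_type):
    """Non-XML to XML: wrap bytes in a c:data element, base64 encoded."""
    data = ET.Element(f"{{{C_NS}}}data", {"content-type": content_type})
    data.text = base64.b64encode(payload).decode("ascii")
    return data


def unwrap_c_data(data, content_type=None):
    """XML to non-XML: decode the content of a c:data element back to bytes."""
    original = data.get("content-type")
    if original is None:
        raise ValueError("err:XC1005: c:data has no content-type attribute")
    if content_type is not None and content_type != original:
        raise ValueError("err:XC1006: content-type differs from c:data")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(data.text or "", validate=True)
    except binascii.Error as exc:
        raise ValueError("err:XC1004: content is not valid base64") from exc
```

For example, unwrap_c_data(wrap_as_c_data(b"abc", "text/plain")) gives back the original bytes.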
Casting from an XML media type to a non-XML media type when the input document is not a c:data document is implementation-definedXP.
What happens when one non-XML media type is cast to another non-XML media type is implementation-definedXP.
It is a dynamic error (err:XC1003) if the p:cast-content-type step cannot perform the requested cast.
In all cases except when the input document is a c:data element, it is a dynamic error (err:XC1007) if the content-type is not supplied.
The p:compare step compares two documents for equality.
<p:declare-step type="p:compare">
  <p:input port="source" primary="true"/>
  <p:input port="alternate"/>
  <p:output port="result" as="xs:boolean"/>
  <p:option name="fail-if-not-equal" select="'false'" as="xs:boolean"/>
</p:declare-step>
The value of the fail-if-not-equal option must be a boolean.
This step takes single documents on each of two ports and compares them using fn:deep-equal (as defined in [XPath 2.0 Functions and Operators]). It is a dynamic error (err:XC0019) if the documents are not equal, and the value of the fail-if-not-equal option is true. If the documents are equal, or if the value of the fail-if-not-equal option is false, a c:result document is produced with contents true if the documents are equal, otherwise false.
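The control flow of p:compare (fail, or report true/false in a c:result) can be sketched as below. Note the loud assumption: Python's standard library has no fn:deep-equal, so the sketch compares canonical XML strings produced by xml.etree.ElementTree.canonicalize instead. That is a stand-in, not the same relation as fn:deep-equal, and the function name compare is hypothetical.

```python
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def compare(source, alternate, fail_if_not_equal=False):
    """Return a c:result document (as text) saying whether two documents match.

    Assumption: equality is approximated by comparing canonical XML, which
    ignores attribute order and comments but is NOT fn:deep-equal.
    """
    equal = ET.canonicalize(source) == ET.canonicalize(alternate)
    if not equal and fail_if_not_equal:
        raise ValueError("err:XC0019: documents are not equal")
    text = "true" if equal else "false"
    return f'<c:result xmlns:c="{C_NS}">{text}</c:result>'
```

With fail_if_not_equal left at its default, unequal documents produce a c:result containing false rather than an error.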
The p:count step counts the number of documents in the source input sequence and returns a single document on result containing that number. The generated document contains a single c:result element whose content is the string representation of the number of documents in the sequence.
<p:declare-step type="p:count">
  <p:input port="source" content-types="*/*" sequence="true"/>
  <p:output port="result"/>
  <p:option name="limit" select="0" as="xs:integer"/>
</p:declare-step>
If the limit option is specified and is greater than zero, the p:count step will count at most that many documents. This provides a convenient mechanism to discover, for example, if a sequence consists of more than 1 document, without requiring every single document to be buffered before processing can continue.
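The effect of the limit option can be illustrated with a lazy sequence in Python. The sketch below assumes the documents arrive as an iterator; the function name count_documents is hypothetical. With a positive limit, it stops pulling documents once the limit is reached, so the rest of the sequence is never read.

```python
from itertools import islice


def count_documents(documents, limit=0):
    """Count documents in an iterable, reading at most `limit` if limit > 0."""
    if limit > 0:
        # islice stops after `limit` items and does not consume any more.
        documents = islice(documents, limit)
    return sum(1 for _ in documents)


if __name__ == "__main__":
    stream = iter(["<a/>", "<b/>", "<c/>", "<d/>"])
    # A result of 2 with limit=2 means "more than one document".
    print(count_documents(stream, limit=2))
    print(next(stream))  # the third document is still unread
```

In the step itself, the number would be wrapped in a c:result element as a string.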
The p:delete step deletes items specified by a match pattern from the source input document and produces the resulting document, with the deleted items removed, on the result port.
<p:declare-step type="p:delete">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. A match pattern may match multiple items to be deleted.
If an element is selected by the match option, the entire subtree rooted at that element is deleted.
This step cannot be used to remove namespaces. It is a dynamic error (err:XC0062) if the match option matches a namespace node.
Also, note that deleting an attribute named xml:base does not change the base URI of the element on which it occurred.
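Subtree deletion for matched elements can be sketched in Python as follows. As before, this is an illustration only: delete_matching is a hypothetical name, an ElementTree path stands in for the XSLTMatchPattern, and only element matches are handled (attributes, text and other node kinds are not). Text that follows a deleted element is kept, since it is not part of the deleted subtree.

```python
import xml.etree.ElementTree as ET


def delete_matching(xml_text, match):
    """Remove every element selected by `match`, together with its subtree."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for element in root.findall(match):
        parent = parent_of[element]
        index = list(parent).index(element)
        if element.tail:
            # The tail is text *after* the element; move it so it survives.
            if index == 0:
                parent.text = (parent.text or "") + element.tail
            else:
                previous = parent[index - 1]
                previous.tail = (previous.tail or "") + element.tail
        parent.remove(element)
    return ET.tostring(root, encoding="unicode")
```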
The p:directory-list step produces a list of the contents of a specified directory.
<p:declare-step type="p:directory-list">
  <p:output port="result"/>
  <p:option name="path" required="true" as="xs:anyURI"/>
  <p:option name="include-filter" as="xs:string"/> <!-- RegularExpression -->
  <p:option name="exclude-filter" as="xs:string"/> <!-- RegularExpression -->
</p:declare-step>
The value of the path option must be an anyURI. It is interpreted as an IRI reference. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-optionXP or p:directory-list in the case of a syntactic shortcutXP value).
It is a dynamic error (err:XC0017) if the absolute path does not identify a directory. It is a dynamic error (err:XC0012) if the contents of the directory path are not available to the step due to access restrictions in the environment in which the pipeline is run.
Conformant processors must support directory paths whose scheme is file. It is implementation-definedXP what other schemes are supported by p:directory-list, and what the interpretation of 'directory', 'file' and 'contents' is for those schemes.
If present, the value of the include-filter or exclude-filter option must be a regular expression as specified in [XPath 2.0 Functions and Operators], section 7.6.1 “Regular Expression Syntax”.
If the include-filter pattern matches a directory entry's name, the entry is included in the output. If the exclude-filter pattern matches a directory entry's name, the entry is excluded from the output. If both options are provided, the include filter is processed first, then the exclude filter.
The result document produced for the specified directory path has a c:directory document element whose base URI is the directory path and whose name attribute is the last segment of the directory path (that is, the directory's (local) name).
<c:directory
name = string>
(c:file |
c:directory |
c:other)*
</c:directory>
Its contents are determined as follows, based on the entries in the directory identified by the directory path. For each entry in the directory, if either no filter was specified, or the (local) name of the entry matches the filter pattern, a c:file, a c:directory, or a c:other element is produced, as follows:
A c:directory is produced for each subdirectory not determined to be special.
A c:file is produced for each file not determined to be special.
<c:file
name = string />
Any file or directory determined to be special by the p:directory-list step may be output using a c:other element but the criteria for marking a file as special are implementation-definedXP.
<c:other
name = string />
When a directory entry is a subdirectory, that directory's entries are not output as part of that entry's c:directory. A user must apply this step again to the subdirectory to list subdirectory contents.
Each of the elements c:file, c:directory, and c:other has a name attribute when it appears within the top-level c:directory element, whose value is a relative IRI reference, giving the (local) file or directory name.
Any attributes other than name on c:file, c:directory, or c:other are implementation-definedXP.
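A non-recursive listing with include and exclude filters might look like the following Python sketch for local file system paths. Several things here are assumptions rather than anything the text above fixes: the function name directory_list is hypothetical, Python's re syntax is used in place of XPath regular expressions, a pattern is treated as matching only when it matches the entire entry name, an entry is treated as "special" when it is neither a regular file nor a directory, and entries are sorted by name for predictable output. The base URI of the c:directory element is not modelled.

```python
import os
import re
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def directory_list(path, include_filter=None, exclude_filter=None):
    """Build a c:directory element for the immediate entries of `path`."""
    if not os.path.isdir(path):
        raise ValueError("err:XC0017: path does not identify a directory")
    try:
        entries = sorted(os.scandir(path), key=lambda entry: entry.name)
    except PermissionError as exc:
        raise ValueError("err:XC0012: directory is not accessible") from exc
    name = os.path.basename(os.path.normpath(path))
    result = ET.Element(f"{{{C_NS}}}directory", {"name": name})
    for entry in entries:
        # The include filter is applied first, then the exclude filter.
        if include_filter is not None and not re.fullmatch(include_filter, entry.name):
            continue
        if exclude_filter is not None and re.fullmatch(exclude_filter, entry.name):
            continue
        if entry.is_dir():
            kind = "directory"  # subdirectory contents are not listed
        elif entry.is_file():
            kind = "file"
        else:
            kind = "other"  # assumption: anything else counts as special
        ET.SubElement(result, f"{{{C_NS}}}{kind}", {"name": entry.name})
    return result
```

Listing a subdirectory's contents requires calling the function again on that subdirectory, as the text above requires of the step.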
The p:error step generates a dynamic error using the input provided to the step.
<p:declare-step type="p:error">
  <p:input port="source" primary="false"/>
  <p:output port="result" sequence="true"/>
  <p:option name="code" required="true" as="xs:QName"/>
  <p:option name="code-prefix" as="xs:NCName"/>
  <p:option name="code-namespace" as="xs:anyURI"/>
</p:declare-step>
The value of the code option must be a QName.
If the lexical value does not contain a colon, then the code-namespace may be used to specify the namespace of the code. In that case, the code-prefix may be specified to suggest a prefix for the code. It is a dynamic error (err:XD0034XP) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
This step uses the document provided on its input as the content of the error raised. An instance of the c:errorsXP element will be produced on the error output port, as is always the case for dynamic errors. The error generated can be caught by a p:tryXP just like any other dynamic error.
For authoring convenience, the p:error step is declared with a single, primary output port. With respect to connectionsXP, this port behaves like any other output port even though nothing can ever appear on it since the step always fails.
For example, given the following invocation:
<p:error xmlns:my="http://www.example.org/error"
name="bad-document" code="my:unk12">
<p:input port="source">
<p:inline>
<message>The document element is unknown.</message>
</p:inline>
</p:input>
</p:error>
The error vocabulary element (and document) generated on the error output port would be:
<c:errors xmlns:c="http://www.w3.org/ns/xproc-step"
xmlns:p="http://www.w3.org/ns/xproc"
xmlns:my="http://www.example.org/error">
<c:error name="bad-document" type="p:error"
code="my:unk12"><message>The document element is unknown.</message>
</c:error>
</c:errors>
The href, line and column, or offset attributes might also be present on the c:error to identify the location of the p:error element in the pipeline.
The p:escape-markup step applies XML serialization to the children of the document element and replaces those children with their serialization. The outcome is a single element with text content that represents the "escaped" syntax of the children as they were serialized.
<p:declare-step type="p:escape-markup">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="cdata-section-elements" select="''" as="xs:string"/> <!-- ListOfQNames -->
  <p:option name="doctype-public" as="xs:string"/>
  <p:option name="doctype-system" as="xs:anyURI"/>
  <p:option name="escape-uri-attributes" select="'false'" as="xs:boolean"/>
  <p:option name="include-content-type" select="'true'" as="xs:boolean"/>
  <p:option name="indent" select="'false'" as="xs:boolean"/>
  <p:option name="media-type" as="xs:string"/>
  <p:option name="method" select="'xml'" as="xs:QName"/>
  <p:option name="omit-xml-declaration" select="'true'" as="xs:boolean"/>
  <p:option name="standalone" select="'omit'" as="xs:token"/> <!-- "true" | "false" | "omit" -->
  <p:option name="undeclare-prefixes" as="xs:boolean"/>
  <p:option name="version" select="'1.0'" as="xs:string"/>
</p:declare-step>
This step supports the standard serialization options as specified in Section 1.3, “Serialization Options”. These options control how the output markup is produced before it is escaped.
For example, the input:
<description>
<div xmlns="http://www.w3.org/1999/xhtml">
<p>This is a chunk of XHTML.</p>
</div>
</description>
produces:
<description>
&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;p&gt;This is a chunk of XHTML.&lt;/p&gt;
&lt;/div&gt;
</description>
The result of this step is an XML document that contains the Unicode characters that result from escaping the input. It is not a sequence of encoded characters in a serialized octet stream; therefore, the serialization options related to encoding characters (byte-order-mark, encoding, and normalization-form) do not apply. They are omitted from the standard serialization options on this step.
By default, this step must not generate an XML declaration in the escaped result.
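The escaping described above can be sketched in Python: serialize each child of the document element, then store the concatenated serialization as the element's text so that it is escaped when the result is written out. The function name escape_markup is hypothetical, none of the serialization options are modelled, and ElementTree's own serializer (which may rewrite namespace prefixes) is assumed in place of a conformant XML serializer.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_markup(xml_text):
    """Replace the children of the document element with their serialization."""
    root = ET.fromstring(xml_text)
    # Leading text is serialized (escaped once) like any other text node.
    parts = [escape(root.text or "")]
    for child in list(root):
        # tostring() includes the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
        root.remove(child)
    root.text = "".join(parts)
    # Writing the element escapes the markup characters now held as text.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(escape_markup("<description><p>This is a chunk.</p></description>"))
```

No XML declaration is produced in the escaped text, in line with the default stated above.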
The p:filter step selects portions of the source document based on a (possibly dynamically constructed) XPath select expression.
<p:declare-step type="p:filter">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result" sequence="true"/>
  <p:option name="select" required="true" as="xs:string"/> <!-- XPathExpression -->
</p:declare-step>
This step behaves just like a p:inputXP with a select expression except that the select expression is computed dynamically.
The p:http-request step provides for interaction with resources over HTTP or related protocols. The input document provided on the source port specifies a request by a single c:request element. This element specifies the method, resource, and other request properties as well as possibly including an entity body (content) for the request.
<p:declare-step type="p:http-request">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result" sequence="true"/>
  <p:option name="byte-order-mark" as="xs:boolean"/>
  <p:option name="cdata-section-elements" select="''" as="xs:string"/> <!-- ListOfQNames -->
  <p:option name="doctype-public" as="xs:string"/>
  <p:option name="doctype-system" as="xs:anyURI"/>
  <p:option name="encoding" as="xs:string"/>
  <p:option name="escape-uri-attributes" select="'false'" as="xs:boolean"/>
  <p:option name="include-content-type" select="'true'" as="xs:boolean"/>
  <p:option name="indent" select="'false'" as="xs:boolean"/>
  <p:option name="media-type" as="xs:string"/>
  <p:option name="method" select="'xml'" as="xs:QName"/>
  <p:option name="normalization-form" select="'none'" as="xs:token"/> <!-- NormalizationForm -->
  <p:option name="omit-xml-declaration" select="'true'" as="xs:boolean"/>
  <p:option name="standalone" select="'omit'" as="xs:token"/> <!-- "true" | "false" | "omit" -->
  <p:option name="undeclare-prefixes" as="xs:boolean"/>
  <p:option name="version" select="'1.0'" as="xs:string"/>
</p:declare-step>
The standard serialization options are provided to control the serialization of any XML content which is sent as part of the request. The effect of these options is as specified in Section 1.3, “Serialization Options”. See Section 1.1.11.3, “Request Entity body conversion” for a discussion of when serialization occurs in constructing a request.
It is a dynamic error (err:XC0040) if the document element of the document that arrives on the source port is not c:request.
Can the input document be JSON?
An HTTP request is represented by a c:request element.
<c:request
method = NCName
href? = anyURI
detailed? = boolean
status-only? = boolean
username? = string
password? = string
auth-method? = string
send-authorization? = boolean
override-content-type? = ContentType>
(c:header*,
(c:multipart |
c:body)?)
</c:request>
It is a dynamic error (err:XC0006) if the method is not specified on a c:request.
It is a dynamic error (err:XC0005) if the request contains a c:body or c:multipart but the method does not allow for an entity body being sent with the request.
It is a dynamic error (err:XC0004) if the status-only attribute has the value true and the detailed attribute does not have the value true.
The method attribute specifies the method to be used against the IRI specified by the href attribute, e.g. GET or POST (the value is not case-sensitive). If the href attribute is not absolute, it will be resolved against the base URI of the element on which it occurs.
In the case of simple “GET” requests, implementors are encouraged to support as many protocols as practical. In particular, pipeline authors may attempt to use p:http-request to load documents with computed URIs using the file: scheme.
If the username attribute is specified, the username, password, auth-method, and send-authorization attributes are used to handle authentication according to the selected authentication method.
For the purposes of avoiding an authentication challenge, if the send-authorization attribute has the value true and the authentication method specified by the auth-method supports generation of an Authorization header without a challenge, then an Authorization header is generated and sent on the first request. If the send-authorization attribute is absent or has the value false, then the first request is sent without an Authorization header.
If the initial response to the request is an authentication challenge, the auth-method, username, password and any relevant data from the challenge are used to generate an Authorization header and the request is sent again. If that authorization fails, the request is not retried.
Appropriate values for the auth-method attribute are “Basic” or “Digest” but other values are allowed. If the authentication method is “Basic” or “Digest”, authentication is handled as per [RFC 2617]. The interpretation of auth-method values on c:request other than “Basic” or “Digest” is implementation-definedXP.
It is a dynamic error (err:XC0003) if a username or password is specified without specifying an auth-method, if the requested auth-method isn't supported, or the authentication challenge contains an authentication method that isn't supported. All implementations are required to support "Basic" and "Digest" authentication per [RFC 2617].
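The send-authorization behaviour for the “Basic” method can be sketched in Python as below. Basic credentials are the base64 encoding of the username and password joined by a colon, so the header can be generated without first receiving a challenge; Digest needs data from the challenge and is not shown. The function names are hypothetical, and the UTF-8 encoding of the credentials and the case-insensitive comparison of the method name are assumptions.

```python
import base64


def basic_authorization(username, password):
    """Build the value of an Authorization header for the Basic method."""
    credentials = f"{username}:{password}".encode("utf-8")  # assumed encoding
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def first_request_headers(username=None, password=None,
                          auth_method=None, send_authorization=False):
    """Return the authentication headers to send on the first request."""
    if (username is not None or password is not None) and auth_method is None:
        raise ValueError("err:XC0003: credentials given without auth-method")
    headers = {}
    if send_authorization and username is not None \
            and auth_method.lower() == "basic":
        headers["Authorization"] = basic_authorization(username, password or "")
    # Otherwise the first request goes out without an Authorization header
    # and credentials are only used if the server issues a challenge.
    return headers
```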
The c:header element specifies a header name and value, either for inclusion in a request, or as received in a response.
<c:header
name = string
value = string />
The request is formulated from the attribute values on the c:request element and its c:header and c:multipart or c:body children, if present, and transmitted to the host (and port, if present) specified by the href attribute. The details of how the request entity body, if any, is constructed are given in Section 1.1.11.3, “Request Entity body conversion”.
When the request is formulated, the step and/or protocol implementation may add headers as necessary to either complete the request or as appropriate for the content specified (e.g. transfer encodings). A user of this step is guaranteed that their requested headers and content will be sent with the exception of any conflicts with protocol-related headers.
The p:http-request step allows users to specify independently values that are not always independent. For example, some combinations of c:header values (e.g., Content-Type) may be inconsistent with values that the step and/or protocol implementation must set. In a few cases, the step provides more than one mechanism to specify what is actually a single value (e.g., the boundary string in multipart messages). It is a dynamic error (err:XC0020) if the user specifies a value or values that are inconsistent with each other or with the requirements of the step or protocol.
Implementations that support file: URIs should support “globbing”. For example, the URI file:///path/to/dir/*.xml should return all of the XML documents in the directory /path/to/dir.
Must define the globbing rules!
The c:multipart element specifies a multi-part body, per [RFC 1521], either for inclusion in a request or as received in a response.
<c:multipart
content-type = ContentType
boundary = string>
c:body+
</c:multipart>
In the context of a request, the media type of the c:multipart must be a multipart media type (i.e. have a main type of 'multipart'). If the content-type attribute is not specified, a value of “multipart/mixed” will be assumed. (Whether or not, and to what extent, “multipart/byte-ranges” responses are supported is implementation-definedXP.)
The boundary attribute is required and is used to provide a multipart boundary marker. The implementation must use this boundary marker and must prefix the value with the string “--” when formulating the multipart message. It is a dynamic error (err:XC0002) if the value starts with the string “--”.
If the boundary is also specified as a parameter in the content-type attribute, then that parameter value and the value of the boundary attribute must be the same.
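The boundary rules can be sketched as a small Python check. The function name multipart_delimiter is hypothetical. Reporting a mismatch between the two boundary values as err:XC0020 is an assumption based on the general rule about inconsistent values; the text does not name an error code for that case, nor for a non-multipart media type.

```python
from email.message import Message


def multipart_delimiter(boundary, content_type="multipart/mixed"):
    """Validate a c:multipart boundary and return the delimiter line to use."""
    if boundary.startswith("--"):
        raise ValueError("err:XC0002: boundary must not start with '--'")
    header = Message()
    header["Content-Type"] = content_type
    if header.get_content_maintype() != "multipart":
        raise ValueError("content-type must be a multipart media type")
    declared = header.get_param("boundary")  # None when no parameter is given
    if declared is not None and declared != boundary:
        # Assumption: treated as the general inconsistency error.
        raise ValueError("err:XC0020: boundary values are inconsistent")
    # The implementation, not the pipeline author, adds the "--" prefix.
    return "--" + boundary
```

For example, multipart_delimiter("frontier", 'multipart/mixed; boundary="frontier"') accepts the pair and yields the delimiter with the prefix added.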
The c:body element holds the body or body part of the message. Each of the attributes controls some aspect of encoding the request body or decoding the body element's content when the request is formulated. These are specified as follows:
<c:body
content-type = ContentType
encoding? = string
id? = string
description? = string
disposition? = string>
anyElement*
</c:body>
The content-type attribute specifies the media type of the body or body part, that is, the value of its Content-Type header. If the media type is not an XML type or text, the content must already be base64-encoded.
The encoding attribute controls the decoding of the element content for formulating the body. A value of base64 indicates the element's content is a base64 encoded string whose byte stream should be sent as the message body. An implementation may support encodings other than base64 but these encodings and their names are implementation-definedXP. It is a dynamic error (err:XC0052) if the encoding specified is not supported by the implementation.
The p:http-request step provides only a single set of serialization options for XML media types. There's no direct support for sending a multipart message with two XML parts encoded differently.
For each body or body part, the id attribute specifies the value of the Content-ID header; the description attribute specifies the value of the Content-Description header; and the disposition attribute specifies the value of the Content-Disposition header.
If an entity body is to be sent as part of a request (e.g. a POST), either a c:body element, specifying the request entity body, or a c:multipart element, specifying multiple entity body parts, may be used. When c:multipart is used it may contain multiple c:body children. A c:body specifies the construction of a body or body part as follows:
If the content-type attribute does not specify an XML media type, or the encoding attribute is “base64”, then it is a dynamic error (err:XC0028) if the content of the c:body element does not consist entirely of characters, and the entity body or body part will consist of exactly those characters.
Otherwise (the content-type attribute does specify an XML media type and the encoding attribute is not 'base64'), it is a dynamic error (err:XC0022) if the content of the c:body element does not consist of exactly one element, optionally preceded and/or followed by any number of processing instructions, comments or whitespace characters, and the entity body or body part will consist of the serialization of a document node containing that content. The serialization of that document is controlled by the serialization options on the p:http-request step itself.
For example, the following input to a p:http-request step will POST a small XML document:
<c:request xmlns:c="http://www.w3.org/ns/xproc-step" method="POST" href="http://example.com/someservice">
  <c:body content-type="application/xml">
    <doc>
      <title>My document</title>
    </doc>
  </c:body>
</c:request>
The corresponding request should look something like this:
POST http://example.com/someservice HTTP/1.1
Host: example.com
Content-Type: application/xml; charset="utf-8"

<?xml version='1.0'?>
<doc>
<title>My document</title>
</doc>
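The two c:body construction rules given before this example can be sketched in Python. The names is_xml_media_type and body_payload are hypothetical. XML media types are assumed to be application/xml, text/xml and anything ending in +xml. ElementTree's default parser discards comments and processing instructions, so only elements and text are checked, and the step's serialization options are not modelled.

```python
import xml.etree.ElementTree as ET


def is_xml_media_type(content_type):
    """Assumed test: application/xml, text/xml, or any type ending in +xml."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in ("application/xml", "text/xml") or media_type.endswith("+xml")


def body_payload(body):
    """Return the characters that make up the entity body for a c:body element."""
    children = list(body)
    if not is_xml_media_type(body.get("content-type", "")) \
            or body.get("encoding") == "base64":
        if children:
            raise ValueError("err:XC0028: content must be characters only")
        return body.text or ""
    stray_text = (body.text or "").strip() or \
        any((child.tail or "").strip() for child in children)
    if len(children) != 1 or stray_text:
        raise ValueError("err:XC0022: content must be exactly one element")
    # strip() only removes the whitespace that follows the single element.
    return ET.tostring(children[0], encoding="unicode").strip()
```

When the encoding attribute is base64, the returned characters would still have to be decoded to obtain the byte stream that is sent; that step is omitted here.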
Where do we say that for URI schemes (such as file: and ftp:) where a content type is not provided by the underlying request, the content type is implementation-dependentXP?
The handling of the response to the request and the generation
of the step's result document is controlled by the
status-only
, override-content-type
and
detailed
attributes on the c:request
input.
The override-content-type attribute controls interpretation of the response's Content-Type header. If this attribute is present, the response will be treated as if it returned the Content-Type given by its value. The original Content-Type header will, however, be reflected unchanged as a c:header in the result document. It is a dynamic error (err:XC0030) if the override-content-type value cannot be used (e.g. text/plain to override image/png).
If the override-content-type includes an encoding parameter, then that encoding must be used to read the document.
If the status-only attribute has the value true, the result document will contain only header information. The entity of the response will not be processed to produce a c:body or c:multipart element.
The c:response element represents an HTTP response. The response's status code is encoded in the status attribute and the headers and entity body are processed into c:header and c:multipart or c:body content.
<c:response
  status? = integer>
    (c:header*,
     (c:multipart |
      c:body)?)
</c:response>
The value of the detailed attribute determines the content of the result document. If it is true, the response to the request is handled as follows:
A single c:response element is produced with the status attribute containing the status of the response received.
Each response header is translated into a c:header element.
Unless the status-only attribute has a value true, the entity body of the response is converted into a c:body or c:multipart element via the rules given in Section 1.1.11.5, “Converting Response Entity Bodies”.
Otherwise (the detailed attribute is not specified or its value is false), the response to the request is handled as follows:
If the media type (as determined by the override-content-type attribute or the Content-Type response header) is an XML media type, the entity is decoded if necessary, then parsed as an XML document:
The parser which p:http-request employs must process the external subset; all general and external parsed entities must be fully expanded.
The requirement to process the external subset comes from p:load, we probably don't want to impose that on all p:http-request calls. Need a way to control it?
It may perform xml:id processing, but it must not perform any other processing, such as expanding XIncludes.
The parser must be conformant to Namespaces in XML.
Parsing the document must not fail due to validation errors.
The resulting XML document is produced on the result output port as the entire output of the step.
Otherwise, the entity body of the response is converted into a c:body or c:multipart element via the rules given in Section 1.1.11.5, “Converting Response Entity Bodies”.
In either case the base URI of the output document is the resolved value of the href attribute from the input c:request.
One possible response from an HTTP request is a redirect, indicated by a status code in the three-hundred range. The precise semantics of the 3xx return codes are laid out by section 10.3 Redirection 3xx in [RFC 2616].
The p:http-request step should follow redirect requests (in a manner consistent with [RFC 2616]) if they are returned by the server.
The entity of a response may be multipart per [RFC 1521]. In those situations, the result document will be a c:multipart element that contains multiple c:body elements inside.
Although it is technically possible for any of the individual parts of a multipart message to also be multipart, XProc does not provide a standard representation for such messages. The interpretation of a multipart message inside another multipart message is implementation-dependentXP.
The result of the p:http-request step is an XML document. For media types (images, binaries, etc.) that can't be represented as a sequence of Unicode characters, the response is encoded as base64 and then returned as text children of the c:body element. If the content is base64-encoded, the encoding attribute on c:body must be set to “base64”.
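The base64 rule above can be illustrated with a short, non-normative Python sketch. The function name binary_body is hypothetical; only the c:body element, its content-type and encoding attributes, and the namespace URI come from this document.

```python
import base64
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def binary_body(payload: bytes, content_type: str) -> ET.Element:
    """Wrap a non-textual response entity in a c:body element."""
    body = ET.Element(f"{{{C_NS}}}body")
    body.set("content-type", content_type)
    # The encoding attribute must be "base64" when the content is base64-encoded.
    body.set("encoding", "base64")
    # The encoded bytes become the text child of c:body.
    body.text = base64.b64encode(payload).decode("ascii")
    return body
```

A consumer can recover the original bytes by base64-decoding the text content of the element.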
This section hasn't been updated to reflect the fact that non-XML documents are now possible. It should probably say something like:
If the document identified has a non-XML content type, no extra processing is mandated. The number and variety of media types that an implementation can load is implementation-definedXP.
If the media type of the response is a text type with a charset parameter that is a Unicode character encoding (per [Unicode TR#17]) or is recognized as a non-XML media type whose contents are encoded as a sequence of Unicode characters (e.g. it has a charset parameter or the definition of the media type is such that it requires Unicode), the content of the constructed c:body element is the translation of the text into a sequence of Unicode characters.
If the response is an XML media type, the content of the constructed c:body element is the result of decoding the body as necessary, then parsing it with an XML parser.
The parser which p:http-request employs must process the external subset; all general and external parsed entities must be fully expanded.
The requirement to process the external subset comes from p:load, we probably don't want to impose that on all p:http-request calls. Need a way to control it?
It may perform xml:id processing, but it must not perform any other processing, such as expanding XIncludes.
The parser must be conformant to Namespaces in XML.
Parsing the document must not fail due to validation errors.
If the content is not well-formed, the step fails.
This prose should be consolidated into a single place.
In a c:body in a response, the content-type attribute must be an exact copy of the value returned in the Content-Type header. That is, it must reflect the content type actually returned, not any override value that may have been specified, and it must include any parameters returned by the server.
In the case of a multipart response, the same rules apply when constructing a c:body element for each body part encountered.
Given the above description, any content identified as text/html will be encoded as (escaped) text or base64-encoded in the c:body element, as HTML isn't always well-formed XML. A user can attempt to convert such content into XML using the p:unescape-markup step.
A simple form might be posted as follows:
<c:request method="POST" href="http://www.example.com/form-action" xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:body content-type="application/x-www-form-urlencoded">
    name=W3C&amp;spec=XProc
  </c:body>
</c:request>
and if the response was an XHTML document, the result document would be:
<c:response status="200" xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:header name="Date" value=" Wed, 09 May 2007 23:12:24 GMT"/>
  <c:header name="Server" value="Apache/1.3.37 (Unix) PHP/4.4.5"/>
  <c:header name="Vary" value="negotiate,accept"/>
  <c:header name="TCN" value="choice"/>
  <c:header name="P3P" value="policyref='http://www.w3.org/2001/05/P3P/p3p.xml'"/>
  <c:header name="Cache-Control" value="max-age=600"/>
  <c:header name="Expires" value="Wed, 09 May 2007 23:22:24 GMT"/>
  <c:header name="Last-Modified" value="Tue, 08 May 2007 16:10:49 GMT"/>
  <c:header name="ETag" value="'4640a109;42380ddc'"/>
  <c:header name="Accept-Ranges" value="bytes"/>
  <c:header name="Keep-Alive" value="timeout=2, max=100"/>
  <c:header name="Connection" value="Keep-Alive"/>
  <c:body content-type="application/xhtml+xml">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>OK</title></head>
      <body><p>OK!</p></body>
    </html>
  </c:body>
</c:response>
The p:identity step makes a verbatim copy of its input available on its output.
<p:declare-step type="p:identity">
  <p:input port="source" content-types="*/*" sequence="true"/>
  <p:output port="result" sequence="true"/>
</p:declare-step>
If the implementation supports passing PSVI annotations between steps, the p:identity step must preserve any annotations that appear in the input.
The p:insert step inserts the insertion port's document into the source port's document relative to the matching elements in the source port's document.
<p:declare-step type="p:insert">
  <p:input port="source" primary="true"/>
  <p:input port="insertion" sequence="true"/>
  <p:output port="result"/>
  <p:option name="match" select="'/*'" as="xs:string"/>  <!-- XSLTMatchPattern -->
  <p:option name="position" required="true" as="xs:token"/>  <!-- "first-child" | "last-child" | "before" | "after" -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if that pattern matches anything other than element, text, processing-instruction, or comment nodes. Multiple matches are allowed, in which case multiple copies of the insertion documents will occur. If no elements match, then the document is unchanged.
The value of the position option must be an NMTOKEN in the following list:
“first-child” - the insertion is made as the first child of the match;
“last-child” - the insertion is made as the last child of the match;
“before” - the insertion is made as the immediate preceding sibling of the match;
“after” - the insertion is made as the immediate following sibling of the match.
It is a dynamic error (err:XC0025) if the match pattern matches anything other than an element node and the value of the position option is “first-child” or “last-child”.
As the inserted elements are part of the output of the step they are not considered in determining matching elements. If an empty sequence appears on the insertion port, the result will be the same as the source.
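The four position values can be illustrated with a non-normative Python sketch over ElementTree. It is a simplification under stated assumptions: the caller supplies the already-matched elements (pattern matching is not modelled), only element matches are handled, and mixed-content text is ignored. The function name insert is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def insert(source_root, matches, insertion, position):
    """Insert a copy of `insertion` relative to each element in `matches`."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    # The matches were chosen before any insertion happens, so inserted
    # elements are never themselves considered as matches.
    parent_of = {child: parent for parent in source_root.iter() for child in parent}
    for match in matches:
        new = copy.deepcopy(insertion)  # one copy per match
        if position == "first-child":
            match.insert(0, new)
        elif position == "last-child":
            match.append(new)
        elif position in ("before", "after"):
            parent = parent_of[match]  # KeyError if the match is the root element
            index = list(parent).index(match)
            parent.insert(index if position == "before" else index + 1, new)
        else:
            raise ValueError(f"unknown position: {position}")
    return source_root
```

Passing an empty list of matches leaves the source unchanged, mirroring the rule that a document with no matches is returned as is.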
The p:label-elements step generates a label for each matched element and stores that label in the specified attribute.
<p:declare-step type="p:label-elements">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="attribute" select="'xml:id'" as="xs:QName"/>
  <p:option name="attribute-prefix" as="xs:NCName"/>
  <p:option name="attribute-namespace" as="xs:anyURI"/>
  <p:option name="label" select="'concat(&quot;_&quot;,$p:index)'" as="xs:string"/>  <!-- XPathExpression -->
  <p:option name="match" select="'*'" as="xs:string"/>  <!-- XSLTMatchPattern -->
  <p:option name="replace" select="'true'" as="xs:boolean"/>
</p:declare-step>
The value of the attribute option must be a QName. If the lexical value does not contain a colon, then the attribute-namespace may be used to specify the namespace of the attribute name. In that case, the attribute-prefix may be specified to suggest a prefix for the attribute name. It is a dynamic error (err:XD0034 XP) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
The value of the label option is an XPath expression used to generate the value of the attribute label.
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if that pattern matches anything other than element nodes.
The value of the replace option must be a boolean value and is used to indicate whether existing attribute values are replaced.
This step operates by generating attribute labels for each element matched. For every matched element, the expression is evaluated with the context node set to the matched element. An attribute is added to the matched element using the attribute name specified in the attribute option and the string value of the result of evaluating the expression. If the attribute already exists on the matched element, the value is replaced with the string value only if the replace option has the value of true.
If this step is used to add or change the value of an attribute named “xml:base”, the base URI of the element must also be amended accordingly.
An implementation must bind the variable “p:index” in the static context of each evaluation of the XPath expression to the position of the element in the sequence of matched elements. In other words, the first element (in document order) matched gets the value “1”, the second gets the value “2”, the third, “3”, etc.
The result of the p:label-elements step is the input document with the attribute labels associated with matched elements. All other non-matching content remains the same.
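The interaction between p:index and the replace option can be shown with a non-normative Python sketch. It assumes the matched elements are supplied in document order, and it stands in for the XPath label expression with a Python callable; the function name and the callable are hypothetical.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for xml:id, the default value of the attribute option.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def label_elements(matched, attribute=XML_ID, replace=True,
                   label=lambda index: f"_{index}"):
    """Label each matched element; `label` plays the role of the label option."""
    # p:index is the 1-based position in the sequence of matched elements,
    # whether or not a given element ends up being relabelled.
    for index, element in enumerate(matched, start=1):
        if replace or attribute not in element.attrib:
            element.set(attribute, label(index))
```

With replace set to false, an element that already carries the attribute keeps its value, but it still consumes an index, so later labels are not renumbered.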
The p:load step has no inputs but produces as its result a document (or documents) specified by an IRI.
<p:declare-step type="p:load">
  <p:output port="result" sequence="true"/>
  <p:option name="href" required="true" as="xs:anyURI"/>
  <p:option name="dtd-validate" select="'false'" as="xs:boolean"/>
  <p:option name="override-content-type" as="xs:string"/>
</p:declare-step>
The value of the href option must be an anyURI. It is interpreted as an IRI reference. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-optionXP or p:load in the case of a syntactic shortcutXP value).
The document or documents identified by the URI are loaded and returned. If the URI protocol supports redirection, then redirects must be followed.
If dtd-validate is false, the p:load step is equivalent to performing the following p:http-request:
<p:http-request>
  <p:input port="source">
    <p:inline>
      <c:request method="GET"
                 href="{HREF}"
                 detailed="false"
                 status-only="false"
                 override-content-type="{OVERRIDE}"/>
    </p:inline>
  </p:input>
</p:http-request>
Where the “{HREF}” value is the value of the href option made absolute and the “{OVERRIDE}” value is the value of the override-content-type option. If no value is provided for the override-content-type option, then the override-content-type attribute is not present on the c:request.
If dtd-validate is true, the p:load step is equivalent to performing the following pipeline:
<p:declare-step>
  <p:output port="result" sequence="false"/>
  <p:option name="href" required="true"/>
  <p:option name="override-content-type"/>
  <p:http-request>
    <p:input port="source">
      <p:inline expand-text="true">
        <c:request method="GET"
                   href="{$href}"
                   detailed="false"
                   status-only="false"
                   override-content-type="text/plain"/>
      </p:inline>
    </p:input>
  </p:http-request>
  <p:xml-parse dtd-validate="true"/>
  <p:choose>
    <p:when test="p:value-available('override-content-type')">
      <p:cast-content-type content-type="{$override-content-type}"/>
    </p:when>
    <p:otherwise>
      <p:identity/>
    </p:otherwise>
  </p:choose>
</p:declare-step>
The retrieved document or documents are produced on the result port. For single part responses, the base URI of the result is the (absolute) IRI used to retrieve it. For multipart responses, the base URI of each part is the (absolute) IRI used to retrieve it unless the content-disposition header indicates a URI. If the content-disposition header indicates a relative URI, it is made absolute against the (absolute) IRI used to retrieve it.
How does the preceding paragraph jibe with what p:http-request says about multipart responses?
The p:make-absolute-uris step makes an element or attribute's value in the source document an absolute IRI value in the result document.
<p:declare-step type="p:make-absolute-uris">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTMatchPattern -->
  <p:option name="base-uri" as="xs:anyURI"/>
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if the pattern matches anything other than element or attribute nodes.
The value of the base-uri option must be an anyURI. It is interpreted as an IRI reference. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-optionXP or p:make-absolute-uris in the case of a syntactic shortcutXP value).
For every element or attribute in the input document which matches the specified pattern, its XPath string-value is resolved against the specified base URI and the resulting absolute IRI is used as the matched node's entire contents in the output.
The base URI used for resolution defaults to the matched attribute's element or the matched element's base URI unless the base-uri option is specified. When the base-uri option is specified, the option value is used as the base URI regardless of any contextual base URI value in the document. This option value is resolved against the base URI of the p:optionXP element used to set the option.
If the IRI reference specified by the base-uri option on p:make-absolute-uris is not valid, or if it is absent and the input document has no base URI, the results are implementation-dependentXP.
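The resolution rule for a matched element can be sketched, non-normatively, in Python. The sketch assumes the base-uri option is supplied and uses urllib.parse.urljoin, which implements URI reference resolution rather than full IRI handling; the function name make_absolute is hypothetical.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


def make_absolute(element, base_uri):
    """Replace the element's entire contents with its resolved string-value."""
    # XPath string-value of an element: all descendant text, concatenated.
    value = "".join(element.itertext())
    # The absolute IRI becomes the node's entire contents, so drop any children.
    for child in list(element):
        element.remove(child)
    element.text = urljoin(base_uri, value)
    return element
```

For example, resolving ../img/logo.png against http://example.com/docs/guide/index.xml yields http://example.com/docs/img/logo.png.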
The p:namespace-rename step renames any namespace declaration or use of a namespace in a document to a new IRI value.
<p:declare-step type="p:namespace-rename">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="from" as="xs:anyURI"/>
  <p:option name="to" as="xs:anyURI"/>
  <p:option name="apply-to" select="'all'" as="xs:token"/>  <!-- "all" | "elements" | "attributes" -->
</p:declare-step>
The value of the from option must be an anyURI. It should be either empty or absolute, but will not be resolved in any case.
The value of the to option must be an anyURI. It should be empty or absolute, but will not be resolved in any case.
The value of the apply-to option must be one of “all”, “elements”, or “attributes”. If the value is “elements”, only elements will be renamed; if the value is “attributes”, only attributes will be renamed; if the value is “all”, both elements and attributes will be renamed.
It is a dynamic error (err:XC0014) if the XML namespace (http://www.w3.org/XML/1998/namespace) or the XMLNS namespace (http://www.w3.org/2000/xmlns/) is the value of either the from option or the to option.
If the value of the from option is the same as the value of the to option, the input is reproduced unchanged on the output. Otherwise, namespace bindings, namespace attributes and element and attribute names are changed as follows:
Namespace bindings: If the from option is present and its value is not the empty string, then every binding of a prefix (or the default namespace) in the input document whose value is the same as the value of the from option is replaced in the output with a binding to the value of the to option, provided it is present and not the empty string; otherwise (the to option is not specified or has an empty string as its value) the binding is absent from the output. If the from option is absent, or its value is the empty string, then no bindings are changed or removed.
Elements and attributes: If the from option is present and its value is not the empty string, for every element and attribute, as appropriate, in the input whose namespace name is the same as the value of the from option, in the output its namespace name is replaced with the value of the to option, provided it is present and not the empty string; otherwise (the to option is not specified or has an empty string as its value) its namespace name is changed to have no value. If the from option is absent, or its value is the empty string, then for every element and attribute, as appropriate, whose namespace name has no value, in the output its namespace name is set to the value of the to option.
Namespace attributes: If the from option is present and its value is not the empty string, for every namespace attribute in the input whose value is the same as the value of the from option, in the output the namespace attribute's value is replaced with the value of the to option, provided it is present and not the empty string; otherwise (the to option is not specified or has an empty string as its value) the namespace attribute is absent.
The apply-to option is primarily intended to make it possible to avoid renaming attributes when the from option specifies no namespace, since many attributes are in no namespace.
Care should be taken when specifying no namespace with the to option. Prefixed names in content, for example QNames and XPath expressions, may end up with no appropriate namespace binding.
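The "Elements and attributes" rule and the apply-to option can be illustrated with a non-normative Python sketch. It relies on ElementTree's Clark notation ({namespace}local), treats an absent option as the empty string, and does not model namespace bindings or namespace attributes, which ElementTree does not expose. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

RESERVED = {"http://www.w3.org/XML/1998/namespace", "http://www.w3.org/2000/xmlns/"}


def _rename(name, from_ns, to_ns):
    """Move one Clark-notation name from from_ns to to_ns ('' means no namespace)."""
    ns, local = name[1:].split("}", 1) if name.startswith("{") else ("", name)
    if ns != from_ns:
        return name
    return f"{{{to_ns}}}{local}" if to_ns else local


def namespace_rename(root, from_ns="", to_ns="", apply_to="all"):
    if from_ns in RESERVED or to_ns in RESERVED:
        raise ValueError("err:XC0014: the XML and XMLNS namespaces cannot be renamed")
    if from_ns == to_ns:
        return root  # input is reproduced unchanged
    for element in root.iter():
        if not isinstance(element.tag, str):
            continue  # skip comments and processing instructions
        if apply_to in ("all", "elements"):
            element.tag = _rename(element.tag, from_ns, to_ns)
        if apply_to in ("all", "attributes"):
            items = list(element.attrib.items())
            element.attrib.clear()
            for name, value in items:
                element.set(_rename(name, from_ns, to_ns), value)
    return root
```

Calling the sketch with apply_to="elements" and an empty from value moves only no-namespace elements into the target namespace, which is the situation the apply-to option is primarily intended for.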
The p:pack step merges two document sequences in a pair-wise fashion.
<p:declare-step type="p:pack">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true" primary="true"/>
  <p:input port="alternate" sequence="true"/>
  <p:output port="result" sequence="true"/>
  <p:option name="wrapper" required="true" as="xs:QName"/>
  <p:option name="wrapper-prefix" as="xs:NCName"/>
  <p:option name="wrapper-namespace" as="xs:anyURI"/>
</p:declare-step>
The value of the wrapper option must be a QName. If the lexical value does not contain a colon, then the wrapper-namespace may be used to specify the namespace of the wrapper. In that case, the wrapper-prefix may be specified to suggest a prefix for the wrapper element. It is a dynamic error (err:XD0034 XP) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
The step takes each pair of documents, in order, one from the source port and one from the alternate port, wraps them with a new element node whose QName is the value specified in the wrapper option, and writes that element to the result port as a document.
If the step reaches the end of one input sequence before the other, then it simply wraps each of the remaining documents in the longer sequence.
In the common case, where the document element of a document in the result sequence has two element children, any comments, processing instructions, or white space text nodes that occur between them may have come from either of the input documents; this step does not attempt to distinguish which one.
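The pair-wise wrapping, including the behaviour when one sequence is longer than the other, can be sketched non-normatively in Python. Documents are represented here by their document elements only, and the function name pack is hypothetical.

```python
from itertools import zip_longest
import xml.etree.ElementTree as ET


def pack(source, alternate, wrapper):
    """Yield one wrapper element per pair of documents, in order."""
    # zip_longest pads the shorter sequence with None, so the remaining
    # documents of the longer sequence are each wrapped on their own.
    for left, right in zip_longest(source, alternate):
        wrapped = ET.Element(wrapper)
        for doc in (left, right):
            if doc is not None:
                wrapped.append(doc)
        yield wrapped
```

With three source documents and one alternate document, the sketch produces three wrappers: the first holds a pair and the other two hold a single document each.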
The p:parameters step exposes a set of parameters as a c:param-set document.
<p:declare-step type="p:parameters">
  <p:output port="result"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
</p:declare-step>
Each parameter in the parameters map is converted into a c:param element. The resulting c:param elements are wrapped in a c:param-set and the parameter set document is written to the result port. The order in which c:param elements occur in the c:param-set is implementation-dependentXP.
For consistency and user convenience, if any of the parameters have names that are in a namespace, the namespace attribute on the c:param element must be used. Each name must be an NCName.
The base URI of the output document is the URI of the pipeline document that contains the step.
A c:param represents a parameter on a parameter input.
<c:param
  name = QName
  namespace? = anyURI
  value = string />
The name attribute of the c:param must have the lexical form of a QName.
If the namespace attribute is specified, then the expanded name of the parameter is constructed from the specified namespace and the name value. It is a dynamic error (err:XD0025 XP) if the namespace attribute is specified, the name contains a colon, and the specified namespace is not the same as the in-scope namespace binding for the specified prefix.
If the namespace attribute is not specified, and the name contains a colon, then the expanded name of the parameter is constructed using the name value and the namespace declarations in-scope on the c:param element.
If the namespace attribute is not specified, and the name does not contain a colon, then the expanded name of the parameter is in no namespace.
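The three cases above for computing a parameter's expanded name can be sketched, non-normatively, as a small Python function. The function name, the (namespace, local-name) tuple representation, and the in_scope mapping of prefixes to namespace names are all hypothetical conveniences.

```python
def expanded_name(name, namespace=None, in_scope=None):
    """Return (namespace, local_name) for a c:param; None means no namespace."""
    in_scope = in_scope or {}
    prefix, _, local = name.rpartition(":")  # prefix is "" when there is no colon
    if namespace is not None:
        # A prefix, if present, must agree with the namespace attribute.
        if prefix and in_scope.get(prefix) != namespace:
            raise ValueError("err:XD0025: namespace does not match in-scope binding")
        return (namespace, local)
    if prefix:
        # Resolve the prefix with the declarations in scope on c:param.
        # An unbound prefix raises KeyError in this sketch.
        return (in_scope[prefix], local)
    return (None, local)
```

A name without a colon and without a namespace attribute therefore ends up in no namespace, as the last case requires.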
Any namespace-qualified attribute names that appear on the c:param element are ignored. It is a dynamic error (err:XD0014 XP) for any unqualified attribute names other than “name”, “namespace”, or “value” to appear on a c:param element.
A c:param-set represents a set of parameters on a parameter input.
<c:param-set>
  c:param*
</c:param-set>
The c:param-set contains zero or more c:param elements. It is a dynamic error (err:XD0018 XP) if the parameter list contains any elements other than c:param.
Any namespace-qualified attribute names that appear on the c:param-set element are ignored. It is a dynamic error (err:XD0014 XP) for any unqualified attribute names to appear on a c:param-set element.
The p:rename step renames elements, attributes, or processing-instruction targets in a document.
<p:declare-step type="p:rename">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTMatchPattern -->
  <p:option name="new-name" required="true" as="xs:QName"/>
  <p:option name="new-prefix" as="xs:NCName"/>
  <p:option name="new-namespace" as="xs:anyURI"/>
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if the pattern matches anything other than element, attribute or processing instruction nodes.
The value of the new-name option must be a QName. If the lexical value does not contain a colon, then the new-namespace may be used to specify the namespace of the new name. In that case, the new-prefix may be specified to suggest a prefix for the new name. It is a dynamic error (err:XD0034 XP) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
Each element, attribute, or processing-instruction in the input matched by the match pattern specified in the match option is renamed in the output to the name specified by the new-name option.
If the match option matches an attribute and if the element on which it occurs already has an attribute whose expanded name is the same as the expanded name of the specified new-name, then the result is as if the current attribute named “new-name” was deleted before renaming the matched attribute.
With respect to attributes named “xml:base”, the following semantics apply: renaming an attribute from “xml:base” to something else has no effect on the underlying base URI of the element; however, if an attribute is renamed from something else to “xml:base”, the base URI of the element must also be amended accordingly.
If the pattern matches processing instructions, then it is the processing instruction target that is renamed. It is a dynamic error (err:XC0013) if the pattern matches a processing instruction and the new name has a non-null namespace.
The p:replace step replaces matching nodes in its primary input with the document element of the replacement port's document.
<p:declare-step type="p:replace">
  <p:input port="source" primary="true"/>
  <p:input port="replacement"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTMatchPattern -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if that pattern matches anything other than element, text, processing-instruction, or comment nodes. Multiple matches are allowed, in which case multiple copies of the replacement document will occur.
Every node in the primary input matching the specified pattern is replaced in the output by the document element of the replacement document. Only non-nested matches are replaced. That is, once a node is replaced, its descendants cannot be matched.
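The "non-nested matches" rule can be illustrated with a non-normative Python sketch. It assumes element matches only, uses a Python predicate in place of the match pattern, and does not handle a match on the document element itself; the function name replace_matches is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def replace_matches(parent, is_match, replacement):
    """Replace matching child elements, top-down, without descending into them."""
    for index, child in enumerate(list(parent)):
        if is_match(child):
            new = copy.deepcopy(replacement)  # one copy per match
            new.tail = child.tail             # keep the text that followed the node
            parent[index] = new
            # Do not recurse: descendants of a replaced node cannot be matched.
        else:
            replace_matches(child, is_match, replacement)
    return parent
```

In the sketch, an old element nested inside another old element disappears together with its ancestor rather than being replaced separately.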
The p:set-attributes step sets attributes on matching elements.
<p:declare-step type="p:set-attributes">
  <p:input port="source" primary="true"/>
  <p:input port="attributes"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/>  <!-- XSLTMatchPattern -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if that pattern matches anything other than element nodes.
Each attribute on the document element of the document that appears on the attributes port is copied to each element that matches the match expression.
If an attribute with the same name as one of the attributes to be copied already exists, the value specified on the attributes port's document is used. The result port of this step produces a copy of the source port's document with the matching elements' attributes modified.
The matching elements are specified by the match pattern in the match option. All matching elements are processed. If no elements match, the step will not change any elements.
This step must not copy namespace declarations. If the attributes copied from the attributes port use namespaces, prefixes, or prefixes bound to different namespaces, the document produced on the result output port will require Section 2.5.1, “Namespace Fixup on XML Outputs”XP.
If an attribute named xml:base is added or changed, the base URI of the element must also be amended accordingly.
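The copy-and-overwrite behaviour of this step can be shown with a minimal, non-normative Python sketch. It assumes the matched elements are supplied by the caller and leaves namespace fixup and base URI adjustment aside; the function name set_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET


def set_attributes(matched, attributes_root):
    """Copy every attribute of the attributes document element to each match."""
    for element in matched:
        for name, value in attributes_root.attrib.items():
            # An existing attribute of the same name is overwritten.
            element.set(name, value)
```

Attributes already present on a matched element but absent from the attributes document are left untouched.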
The p:set-properties step sets document propertiesXP on the source document.
<p:declare-step type="p:set-properties">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result"/>
  <p:option name="properties" required="true" as="map(xs:string,xs:string)"/>
</p:declare-step>
The document propertiesXP of the document on the source port are augmented with the values specified in the properties option. The document produced on the result port has the same representation but the adjusted property values.
It is a dynamic error (err:XC1001) if the properties map contains a key equal to the string “content-type”.
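A non-normative Python sketch of the property update follows. It models document properties as a plain dictionary and assumes that a value in the properties option overwrites an existing property of the same name, which the text above implies but does not state outright; the function name is hypothetical.

```python
def set_properties(existing, properties):
    """Return the document properties augmented with the given values."""
    if "content-type" in properties:
        raise ValueError("err:XC1001: the content-type property cannot be set")
    # Assumption: new values win over existing ones with the same key.
    return {**existing, **properties}
```

The input dictionary is left unmodified, reflecting that the step produces a new document with adjusted properties rather than mutating its input.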
The p:sink step accepts a sequence of documents and discards them. It has no output.
<p:declare-step type="p:sink">
  <p:input port="source" content-types="*/*" sequence="true"/>
</p:declare-step>
The p:split-sequence step accepts a sequence of documents and divides it into two sequences.
<p:declare-step type="p:split-sequence">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true"/>
  <p:output port="matched" sequence="true" primary="true"/>
  <p:output port="not-matched" sequence="true"/>
  <p:option name="initial-only" select="'false'" as="xs:boolean"/>
  <p:option name="test" required="true" as="xs:string"/>  <!-- XPathExpression -->
</p:declare-step>
The value of the test option must be an XPathExpression.
The XPath expression in the test option is applied to each document in the input sequence. If the effective boolean value of the expression is true, the document is copied to the matched port; otherwise it is copied to the not-matched port.
If the initial-only option is true, then when the first document that does not satisfy the test expression is encountered, it and all the documents that follow it are written to the not-matched port. In other words, it only writes the initial series of matched documents (which may be empty) to the matched port. All other documents are written to the not-matched port, irrespective of whether or not they match.
The XPath contextXP for the test option changes over time. For each document that appears on the source port, the expression is evaluated with that document as the context document. The context position (position()) is the position of that document within the sequence and the context size (last()) is the total number of documents in the sequence.
In principle, this component cannot stream because it must buffer all of the input sequence in order to find the context size. In practice, if the test expression does not use the last() function, the implementation can stream and ignore the context size.
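The splitting rules, including initial-only and the context position and size, can be sketched non-normatively in Python. The test option is represented by a hypothetical callable taking (document, position, last), and the sketch buffers the whole sequence so that the context size is always available.

```python
def split_sequence(documents, test, initial_only=False):
    """Return (matched, not_matched) lists for a sequence of documents."""
    documents = list(documents)  # buffer so that last() is known up front
    last = len(documents)
    matched, not_matched = [], []
    still_initial = True  # becomes False after the first non-matching document
    for position, doc in enumerate(documents, start=1):
        # With initial_only, nothing after the first failure can match.
        ok = (still_initial or not initial_only) and bool(test(doc, position, last))
        if ok:
            matched.append(doc)
        else:
            not_matched.append(doc)
            still_initial = False
    return matched, not_matched
```

A streaming variant could drop the buffering step whenever the test is known not to depend on the context size, as the paragraph above notes.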
If the implementation supports passing PSVI annotations between steps, the p:split-sequence step must preserve any annotations that appear in the input.
The p:store step stores (a possibly serialized version of) its input to a URI. This step outputs a reference to the location of the stored document.
<p:declare-step type="p:store">
  <p:input port="source" content-types="*/*"/>
  <p:output port="result"/>
  <p:option name="href" required="true" as="xs:anyURI"/>
  <p:option name="byte-order-mark" as="xs:boolean"/>
  <p:option name="cdata-section-elements" select="''" as="xs:string"/>  <!-- ListOfQNames -->
  <p:option name="doctype-public" as="xs:string"/>
  <p:option name="doctype-system" as="xs:anyURI"/>
  <p:option name="encoding" as="xs:string"/>
  <p:option name="escape-uri-attributes" select="'false'" as="xs:boolean"/>
  <p:option name="include-content-type" select="'true'" as="xs:boolean"/>
  <p:option name="indent" select="'false'" as="xs:boolean"/>
  <p:option name="media-type" as="xs:string"/>
  <p:option name="method" select="'xml'" as="xs:QName"/>
  <p:option name="normalization-form" select="'none'" as="xs:token"/>  <!-- NormalizationForm -->
  <p:option name="omit-xml-declaration" select="'true'" as="xs:boolean"/>
  <p:option name="standalone" select="'omit'" as="xs:token"/>  <!-- "true" | "false" | "omit" -->
  <p:option name="undeclare-prefixes" as="xs:boolean"/>
  <p:option name="version" select="'1.0'" as="xs:string"/>
</p:declare-step>
The value of the href option must be an anyURI. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option, or p:store in the case of a syntactic shortcut value).
The step attempts to store the XML document to the specified URI. It is a dynamic error (err:XC0050) if the URI scheme is not supported or the step cannot store to the specified location.
The output of this step is a document containing a single c:result element whose content is the absolute URI of the document stored by the step.
The standard serialization options are provided to control the serialization of XML content when it is stored. These options are as specified in Section 1.3, “Serialization Options”.
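A minimal sketch of a p:store invocation follows. The href value, the serialization settings, and the base URI mentioned in the comment are illustrative assumptions, not values taken from this specification.

```xml
<!-- The href value and serialization settings are illustrative only -->
<p:store href="out/report.xml" indent="true" omit-xml-declaration="false"/>

<!-- Possible document on the result port, if the base URI of the
     p:store element were file:/home/user/pipeline.xpl
     (a hypothetical location) -->
<c:result xmlns:c="http://www.w3.org/ns/xproc-step">file:/home/user/out/report.xml</c:result>
```

The relative href is resolved against the base URI of the step element, so the c:result content is always absolute.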
The p:string-replace
step matches nodes in the
document provided on the source
port and replaces them
with the string result of evaluating an XPath expression.
<p:declare-step type="p:string-replace">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
  <p:option name="replace" required="true" as="xs:string"/> <!-- XPathExpression -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern.
The value of the replace option must be an XPathExpression.
The matched nodes are specified with the match pattern in the match option. For each matching node, the XPath expression provided by the replace option is evaluated with the matching node as the XPath context node. The string value of the result is used in the output. Nodes that do not match are copied without change.
If the expression given in the match option matches an attribute, the string value of the replace expression is used as the new value of the attribute in the output. If the attribute is named “xml:base”, the base URI of the element must also be amended accordingly.
If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the string value of the replace expression.
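The two cases (attribute versus any other node) can be sketched as follows. The input document and both invocations are hypothetical and are shown as independent alternatives applied to the same input.

```xml
<!-- Hypothetical input: <doc><title>Draft</title><item status="new"/></doc> -->

<!-- Attribute match: only the attribute value changes.
     Result: <doc><title>Draft</title><item status="new-checked"/></doc> -->
<p:string-replace match="item/@status" replace="concat(., '-checked')"/>

<!-- Element match: the whole title element is replaced by a text node.
     Result: <doc>Final<item status="new"/></doc> -->
<p:string-replace match="title" replace="'Final'"/>
```

Because replace is an XPath expression rather than a literal string, a constant replacement needs its own quotation marks inside the attribute value, as in the second invocation.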
The p:unescape-markup step takes the string value of the document element and parses the content as if it were a Unicode character stream containing serialized XML. The output consists of the same document element with children that result from the parse. This is the reverse of the p:escape-markup step.
<p:declare-step type="p:unescape-markup">
  <p:input port="source" content-types="application/xml text/xml */*+xml text/*"/>
  <p:output port="result"/>
  <p:option name="namespace" as="xs:anyURI"/>
  <p:option name="content-type" select="'application/xml'" as="xs:string"/>
  <p:option name="encoding" as="xs:string"/>
  <p:option name="charset" as="xs:string"/>
</p:declare-step>
The value of the namespace option must be an anyURI. It should be absolute, but will not be resolved.
When the string value is parsed, the original document element is preserved so that the result will be well-formed XML even if the content consists of multiple, sibling elements.
The namespace
option specifies a default
namespace. Elements that are in no namespace in the unescaped content
will be placed into this namespace unless there is an in-scope namespace
declaration that specifies a different namespace (or explicitly undeclares
the default namespace).
The content-type option may be used to specify an alternate content type for the string value. An implementation may use a different parser to produce XML content depending on the specified content-type. For example, an implementation might provide an HTML to XHTML parser (e.g. [HTML Tidy] or [TagSoup]) for the content type 'text/html'.
All implementations must support the content type application/xml, and must use a standard XML parser for it. It is a dynamic error (err:XC0051) if the content-type specified is not supported by the implementation.
Behavior of p:unescape-markup for content-types other than application/xml is implementation-defined.
The encoding option specifies how the data is encoded. All implementations must support the base64 encoding (and the absence of an encoding option, which implies that the content is plain Unicode text). It is a dynamic error (err:XC0052) if the encoding specified is not supported by the implementation.
If an encoding is specified, a charset may also be specified. The character set may be specified as a parameter on the content-type or via the separate charset option. If it is specified in both places, the value of the charset option must be used.
If the specified encoding is base64, then the character set must be specified. It is a dynamic error (err:XC0010) if an encoding of base64 is specified and the character set is not specified or if the specified character set is not supported by the implementation.
The octet-stream that results from decoding the text must be interpreted using the character encoding named by the value of the charset option to produce a sequence of Unicode characters to parse.
If no encoding is specified, the character set is ignored, irrespective of where it was specified.
For example, with the 'namespace' option set to the XHTML namespace, the following input:
<description>
&lt;p&gt;This is a chunk.&lt;/p&gt;
&lt;p&gt;This is another chunk.&lt;/p&gt;
</description>
would produce:
<description>
<p xmlns="http://www.w3.org/1999/xhtml">This is a chunk.</p>
<p xmlns="http://www.w3.org/1999/xhtml">This is another chunk.</p>
</description>
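The example above corresponds to a step invocation along the following lines. The second invocation is a sketch of the base64 case; its input document is hypothetical, and the choice of UTF-8 as the character set is an assumption made for the example.

```xml
<!-- Invocation matching the namespace example above -->
<p:unescape-markup namespace="http://www.w3.org/1999/xhtml"/>

<!-- Base64 variant; a character set is required in this case.
     Hypothetical input: <description>PHA+SGk8L3A+</description>
     (the base64 form of the UTF-8 text <p>Hi</p>)
     Expected result:    <description><p>Hi</p></description> -->
<p:unescape-markup encoding="base64" charset="UTF-8"/>
```

Omitting the charset option in the second invocation would raise err:XC0010, since no character set is supplied on the content-type either.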
The p:unwrap
step replaces matched elements with their
children.
<p:declare-step type="p:unwrap">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if that pattern matches anything other than element nodes.
Every element in the source document that matches the specified match pattern is replaced by its children, effectively “unwrapping” the children from their parent. Non-element nodes and unmatched elements are passed through unchanged.
The matching applies to the entire document, not just the “top-most” matches. A pattern of the form h:div will replace all h:div elements, not just the top-most ones.
This step produces a single document; if the document element is unwrapped, the result might not be well-formed XML.
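The following sketch shows nested matches being unwrapped at every level. The input document is invented for illustration, and it assumes that the h prefix is bound to the XHTML namespace on the step.

```xml
<!-- Hypothetical input, with one h:div nested inside another -->
<h:body xmlns:h="http://www.w3.org/1999/xhtml"><h:div><h:p>One</h:p><h:div><h:p>Two</h:p></h:div></h:div></h:body>

<!-- Both h:div elements match, not just the outer one -->
<p:unwrap xmlns:h="http://www.w3.org/1999/xhtml" match="h:div"/>

<!-- Expected result -->
<h:body xmlns:h="http://www.w3.org/1999/xhtml"><h:p>One</h:p><h:p>Two</h:p></h:body>
```

Had the pattern been h:body, the document element itself would be unwrapped, which can leave a result with no single root element.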
The p:wrap
step wraps matching nodes in the
source
document with a new parent element.
<p:declare-step type="p:wrap">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="wrapper" required="true" as="xs:QName"/>
  <p:option name="wrapper-prefix" as="xs:NCName"/>
  <p:option name="wrapper-namespace" as="xs:anyURI"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
  <p:option name="group-adjacent" as="xs:string"/> <!-- XPathExpression -->
</p:declare-step>
The value of the wrapper option must be a QName. If the lexical value does not contain a colon, then the wrapper-namespace may be used to specify the namespace of the wrapper. In that case, the wrapper-prefix may be specified to suggest a prefix for the wrapper element. It is a dynamic error (err:XD0034) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
The value of the match option must be an XSLTMatchPattern. It is a dynamic error (err:XC0023) if the pattern matches anything other than document, element, text, processing instruction, and comment nodes.
The value of the group-adjacent option must be an XPathExpression.
If the node matched is the document node (match="/"), the result is a new document where the document element is a new element node whose QName is the value specified in the wrapper option. That new element contains copies of all of the children of the original document node.
When the match pattern does not match the document node, every node that matches the specified match pattern is replaced with a new element node whose QName is the value specified in the wrapper option. The content of that new element is a copy of the original, matching node. The p:wrap step performs a "deep" wrapping: the children of the matching node and their descendants are processed and wrappers are added to all matching nodes.
The group-adjacent
option can be used to group
adjacent matching nodes in a single wrapper element. The specified
XPath expression is evaluated for each matching node with that node
as the XPath context node. Whenever two or more adjacent matching nodes
have the same “group adjacent” value, they are wrapped together in
a single wrapper element.
Two matching nodes are considered adjacent if and only if they are siblings and either there are no nodes between them or all intervening, non-matching nodes are whitespace text, comment, or processing instruction nodes.
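As a sketch of grouping, the following example wraps adjacent items that share an attribute value. The element names, the kind attribute, and the input document are assumptions made for this example; the input is written without whitespace between items so that whitespace handling does not affect the result.

```xml
<!-- Hypothetical input -->
<list><item kind="a">1</item><item kind="a">2</item><item kind="b">3</item></list>

<!-- Group adjacent item elements by the value of their kind attribute -->
<p:wrap wrapper="group" match="item" group-adjacent="@kind"/>

<!-- Expected result: the first two items share a wrapper,
     the third is wrapped on its own -->
<list><group><item kind="a">1</item><item kind="a">2</item></group><group><item kind="b">3</item></group></list>
```

Without the group-adjacent option, each of the three items would receive its own group wrapper.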
The p:wrap-sequence
step accepts a sequence of
documents and produces either a single document or a new sequence of
documents.
<p:declare-step type="p:wrap-sequence">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true"/>
  <p:output port="result" sequence="true"/>
  <p:option name="wrapper" required="true" as="xs:QName"/>
  <p:option name="wrapper-prefix" as="xs:NCName"/>
  <p:option name="wrapper-namespace" as="xs:anyURI"/>
  <p:option name="group-adjacent" as="xs:string"/> <!-- XPathExpression -->
</p:declare-step>
The value of the wrapper option must be a QName. If the lexical value does not contain a colon, then the wrapper-namespace may be used to specify the namespace of the wrapper. In that case, the wrapper-prefix may be specified to suggest a prefix for the wrapper element. It is a dynamic error (err:XD0034) to specify a new namespace or prefix if the lexical value of the specified name contains a colon.
The value of the group-adjacent option must be an XPathExpression.
In its simplest form, p:wrap-sequence takes a sequence of documents and produces a single, new document by placing each document in the source sequence inside a new document element as sequential siblings. The name of the document element is the value specified in the wrapper option.
The group-adjacent option can be used to group adjacent documents. The XPath context for the group-adjacent option changes over time. For each document that appears on the source port, the expression is evaluated with that document as the context document. The context position (position()) is the position of that document within the sequence and the context size (last()) is the total number of documents in the sequence.
Whenever two or more sequentially adjacent documents have the same “group adjacent” value, they are wrapped together in a single wrapper element.
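The two modes of operation can be sketched as follows, assuming a hypothetical input sequence of three small documents. The wrapper name and the grouping expression are invented for the example.

```xml
<!-- Hypothetical input sequence on the source port: <a/>, <a/>, <b/> -->

<!-- Simplest form: a single result document.
     Result: <batch><a/><a/><b/></batch> -->
<p:wrap-sequence wrapper="batch"/>

<!-- Grouped by the name of each document element: two result documents.
     Result 1: <batch><a/><a/></batch>
     Result 2: <batch><b/></batch> -->
<p:wrap-sequence wrapper="batch" group-adjacent="local-name(/*)"/>
```

Grouping only considers adjacent documents, so a sequence such as a, b, a would yield three result documents rather than two.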
The p:xinclude
step applies [XInclude] processing to the source
document.
<p:declare-step type="p:xinclude">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="fixup-xml-base" select="'false'" as="xs:boolean"/>
  <p:option name="fixup-xml-lang" select="'false'" as="xs:boolean"/>
</p:declare-step>
The value of the fixup-xml-base
option must be a
boolean. If it is true, base URI fixup will be performed as per
[XInclude].
The value of the fixup-xml-lang
option must be a
boolean. If it is true, language fixup will be performed as per
[XInclude].
The included documents are located with the base URI of the input document and are not provided as input to the step.
It is a dynamic error (err:XC0029) if an XInclude error occurs during processing.
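A minimal sketch of the step in use follows. The source document and the file name chapter1.xml are hypothetical.

```xml
<!-- Hypothetical source document; chapter1.xml is resolved against
     the base URI of this document -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
</book>

<!-- Invocation requesting base URI fixup on the included content -->
<p:xinclude fixup-xml-base="true"/>
```

If chapter1.xml cannot be retrieved and the xi:include element has no fallback, the XInclude processor reports an error and the step fails with err:XC0029.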
The p:xslt
step applies an
[XSLT 1.0] or
[XSLT 2.0] stylesheet to a document.
<p:declare-step type="p:xslt">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true" primary="true"/>
  <p:input port="stylesheet"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
  <p:output port="result" primary="true" sequence="true"/>
  <p:output port="secondary" sequence="true"/>
  <p:option name="initial-mode" as="xs:QName"/>
  <p:option name="template-name" as="xs:QName"/>
  <p:option name="output-base-uri" as="xs:anyURI"/>
  <p:option name="version" as="xs:string"/>
</p:declare-step>
If present, the value of the initial-mode option must be a QName.
If present, the value of the template-name option must be a QName.
If present, the value of the output-base-uri option must be an anyURI. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-option, or p:xslt in the case of a syntactic shortcut value).
If the step specifies a version, then that version of XSLT must be used to process the transformation. It is a dynamic error (err:XC0038) if the specified version is not available. If the step does not specify a version, the implementation may use any version it has available and may use any means to determine what version to use, including, but not limited to, examining the version of the stylesheet.
The XSLT stylesheet provided on the stylesheet port is applied to the document on the source port. Any parameters passed in the parameters option are used to define top-level stylesheet parameters. The primary result document of the transformation, if there is one, appears on the result port. At most one document can appear on the result port. All other result documents appear on the secondary port. The order in which result documents appear on the secondary port is implementation-dependent. If XSLT 1.0 is used, an empty sequence of documents must appear on the secondary port.
If a sequence of documents is provided on the
source
port, the first document is used as the
primary input document. The whole sequence is also the default
collection.
If no documents are provided on the source
port,
the primary input document is undefined and the default collection
is empty.
It is a dynamic error (err:XC0039) if a sequence of documents (including an empty sequence) is provided to an XSLT 1.0 step.
A dynamic error occurs if the XSLT processor signals a fatal error. This includes the case where the transformation terminates due to an xsl:message instruction with a terminate attribute value of “yes”. How XSLT message termination errors are reported to the XProc processor is implementation-dependent.
The invocation of the transformation is controlled by the initial-mode and template-name options that set the initial mode and/or named template in the XSLT transformation where processing begins. It is a dynamic error (err:XC0056) if the specified initial mode or named template cannot be applied to the specified stylesheet.
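The following sketch shows one way the step might be invoked, written in the XProc 1.0 style used by the other examples in this document. The stylesheet name style.xsl, the mode name toc, and the step names are all hypothetical.

```xml
<!-- style.xsl, the mode name "toc", and the step names are hypothetical -->
<p:xslt name="transform" version="2.0" initial-mode="toc">
  <p:input port="stylesheet">
    <p:document href="style.xsl"/>
  </p:input>
</p:xslt>

<!-- Documents created with xsl:result-document appear on the
     secondary port and can be read from there -->
<p:for-each name="extras">
  <p:iteration-source>
    <p:pipe step="transform" port="secondary"/>
  </p:iteration-source>
  <p:identity/>
</p:for-each>
```

If the stylesheet defines no templates in the toc mode that can be applied, the step fails with err:XC0056 instead of silently falling back to the default mode.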
The output-base-uri option sets the context's output base URI per the XSLT 2.0 specification; if it is not specified, the base URI of the result document is the base URI of the first document in the source port's sequence. If the value of the output-base-uri option is not absolute, it will be resolved using the base URI of its p:option element. An XSLT 1.0 step should use the value of the output-base-uri as the base URI of its output, if the option is specified.
If XSLT 2.0 is used, the outputs of this step may include PSVI annotations.
The static and initial dynamic contexts of the XSLT processor are the contexts defined in Section 2.7.2, “Step XPath Context” with the following adjustments.
The dynamic context is augmented as follows:
Context item: the first document that appears on the source port.
Variable bindings: any parameters passed in the parameters option are available as variable bindings to the XSLT processor.
Function implementations: the function implementations provided by the XSLT processor.
Default collection: the sequence of documents provided on the source port.
The following steps are optional. If they are supported by a processor, they must conform to the semantics outlined here, but a conformant processor is not required to support all (or any) of these steps.
The p:exec
step runs an external command passing the
input that arrives on its source
port as standard input,
reading result
from standard output, and errors
from standard error.
<p:declare-step type="p:exec">
  <p:input port="source" primary="true" sequence="true"/>
  <p:output port="result" primary="true"/>
  <p:output port="errors"/>
  <p:output port="exit-status"/>
  <p:option name="command" required="true" as="xs:string"/>
  <p:option name="args" select="''" as="xs:string"/>
  <p:option name="cwd" as="xs:string"/>
  <p:option name="source-is-xml" select="'true'" as="xs:boolean"/>
  <p:option name="result-is-xml" select="'true'" as="xs:boolean"/>
  <p:option name="wrap-result-lines" select="'false'" as="xs:boolean"/>
  <p:option name="errors-is-xml" select="'false'" as="xs:boolean"/>
  <p:option name="wrap-error-lines" select="'false'" as="xs:boolean"/>
  <p:option name="path-separator" as="xs:string"/>
  <p:option name="failure-threshold" as="xs:integer"/>
  <p:option name="arg-separator" select="' '" as="xs:string"/>
  <p:option name="byte-order-mark" as="xs:boolean"/>
  <p:option name="cdata-section-elements" select="''" as="xs:string"/> <!-- ListOfQNames -->
  <p:option name="doctype-public" as="xs:string"/>
  <p:option name="doctype-system" as="xs:anyURI"/>
  <p:option name="encoding" as="xs:string"/>
  <p:option name="escape-uri-attributes" select="'false'" as="xs:boolean"/>
  <p:option name="include-content-type" select="'true'" as="xs:boolean"/>
  <p:option name="indent" select="'false'" as="xs:boolean"/>
  <p:option name="media-type" as="xs:string"/>
  <p:option name="method" select="'xml'" as="xs:QName"/>
  <p:option name="normalization-form" select="'none'" as="xs:token"/> <!-- NormalizationForm -->
  <p:option name="omit-xml-declaration" select="'true'" as="xs:boolean"/>
  <p:option name="standalone" select="'omit'" as="xs:token"/> <!-- "true" | "false" | "omit" -->
  <p:option name="undeclare-prefixes" as="xs:boolean"/>
  <p:option name="version" select="'1.0'" as="xs:string"/>
</p:declare-step>
The values of the command, args, cwd, path-separator, and arg-separator options must be strings.
The values of the source-is-xml, result-is-xml, wrap-result-lines, errors-is-xml, and wrap-error-lines options must be boolean.
The p:exec step executes the command passed on command with the arguments passed on args. The processor does not interpolate the values of the command or args (for example, expanding references to environment variables). It is a dynamic error (err:XC0033) if the command cannot be run.
If cwd is specified, then the current working directory is changed to the value of that option before execution begins. It is a dynamic error (err:XC0034) if the current working directory cannot be changed to the value of the cwd option. If cwd is not specified, the current working directory is implementation-defined.
If the path-separator option is specified, every occurrence of the character identified as the path-separator character that occurs in the command, args, or cwd will be replaced by the platform-specific path separator character. It is a dynamic error (err:XC0063) if the path-separator option is specified and is not exactly one character long.
The value of the args option is a string. In order to support passing more than one argument to a command, the args string is broken into a sequence of values. The arg-separator option specifies the character that is used to separate values; by default it is a single space. It is a dynamic error (err:XC0066) if the arg-separator option is specified and is not exactly one character long.
The following examples of p:exec
are equivalent. The
first uses the default arg-separator
:
<p:exec command="someCommand" args="arg1 arg2 arg3"/>
The second specifies an alternate separator:
<p:exec command="someCommand" args="arg1,arg2,arg3"
arg-separator=","/>
If one of the arguments contains a space (e.g., a filename that contains a space), then you must specify an alternate separator.
The source port is declared to accept a sequence so that it can be empty. If no document appears on the source port, then the command receives nothing on standard input. If a document does arrive on the source port, it will be passed to the command as its standard input. It is a dynamic error (err:XD0006) if more than one document appears on the source port of the p:exec step. If source-is-xml is true, the serialization options are used to convert the input into serialized XML which is passed to the command, otherwise the XPath string-value of the document is passed.
The standard output of the command is read and returned on
result
; the standard error output is read and returned on
errors
. In order to assure that the result will be an
XML document, each of the results will be wrapped in a c:result
element.
If result-is-xml
is true, the standard output of
the program is assumed to be XML and will be parsed as a single document.
If it is false, the output is assumed not to be XML
and will be returned as escaped text.
If wrap-result-lines
is
true, a c:line
element will be wrapped around each line of output.
<c:line>
string
</c:line>
It is a dynamic error (err:XC0035) to specify both result-is-xml and wrap-result-lines. The same rules apply to the standard error output of the program, with the errors-is-xml and wrap-error-lines options, respectively.
If either of the results is XML, it must be parsed with namespaces enabled and validation turned off, just like p:document.
The exit-status port always returns a single c:result element which contains the system exit status that the process returned. The specific exit status values returned by a process invoked with p:exec are implementation-dependent.
If a failure-threshold value is supplied, and the exit status is greater than that threshold, then the p:exec step must fail. It is a dynamic error (err:XC0064) if the exit code from the command is greater than the specified failure-threshold value. This failure, like any step failure, can be captured with a p:try.
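The options described above can be combined as in the following sketch. The command name, its arguments, and the working directory are placeholders; the layout of the result document is shown modulo insignificant whitespace.

```xml
<!-- someCommand, its arguments, and /tmp/work are placeholders -->
<p:exec command="someCommand"
        args="--input;my file.xml;--verbose"
        arg-separator=";"
        cwd="/tmp/work"
        result-is-xml="false"
        wrap-result-lines="true"
        failure-threshold="0">
  <!-- No document is supplied, so the command gets no standard input -->
  <p:input port="source">
    <p:empty/>
  </p:input>
</p:exec>

<!-- Shape of the result port for two lines of standard output -->
<c:result xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:line>first line</c:line>
  <c:line>second line</c:line>
</c:result>
```

Because result-is-xml defaults to true, it has to be set to false explicitly before wrap-result-lines can be used; otherwise the step would raise err:XC0035. With failure-threshold set to 0, any non-zero exit status causes the step to fail with err:XC0064.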
The p:hash
step generates a hash, or digital “fingerprint”,
for some value and injects it into the source
document.
<p:declare-step type="p:hash">
  <p:input port="source" primary="true"/>
  <p:output port="result"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
  <p:option name="value" required="true" as="xs:string"/>
  <p:option name="algorithm" required="true" as="xs:QName"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
  <p:option name="version" as="xs:string"/>
</p:declare-step>
The value of the algorithm option must be a QName. If it does not have a prefix, then it must be one of the following values: “crc”, “md”, or “sha”.
If a version is not specified, the default version is algorithm-defined. For “crc” it is 32, for “md” it is 5, for “sha” it is 1.
A hash is constructed from the string specified in the value option using the specified algorithm and version. Implementations must support [CRC32], [MD5], and [SHA1] hashes. It is implementation-defined what other algorithms are supported. The resulting hash should be returned as a string of hexadecimal characters.
The value of the match option must be an XSLTMatchPattern.
The hash of the specified value is computed using the algorithm and parameters specified. It is a dynamic error (err:XC0036) if the requested hash algorithm is not one that the processor understands or if the value or parameters are not appropriate for that algorithm.
The matched nodes are specified with the match pattern in the match option. For each matching node, the string value of the computed hash is used in the output (if more than one node matches, the same hash value is used in each match). Nodes that do not match are copied without change.
If the expression given in the match option matches an attribute, the hash is used as the new value of the attribute in the output. If the attribute is named “xml:base”, the base URI of the element must also be amended accordingly.
If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the hash.
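As a sketch, the following invocation injects an MD5 hash into an attribute. The input document is hypothetical; the result assumes that the implementation hashes the UTF-8 bytes of the value and returns lowercase hexadecimal digits, neither of which is mandated by the text above.

```xml
<!-- Hypothetical input: <doc checksum=""/> -->
<p:hash value="hello" algorithm="md" match="/doc/@checksum"/>

<!-- Expected result, given that the MD5 digest of the string "hello"
     is 5d41402abc4b2a76b9719d911017c592:
     <doc checksum="5d41402abc4b2a76b9719d911017c592"/> -->
```

No version is given, so the default for “md” (version 5) applies.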
The p:in-scope-names
step exposes all of the
in-scope variables and options as a set of parameters in a
c:param-set
document.
<p:declare-step type="p:in-scope-names">
  <p:output port="result" primary="false"/>
</p:declare-step>
Each in-scope variable and option is converted into a
c:param
element.
The resulting c:param
elements are wrapped in a
c:param-set
and the parameter set document is written
to the result
port.
The order in which c:param elements occur in the c:param-set is implementation-dependent.
For consistency and user convenience, if any of the variables or options have names that are in a namespace, the namespace attribute on the c:param element must be used. Each name must be an NCName.
The base URI of the output document is the URI of the pipeline document that contains the step.
For consistency with the p:parameters
step, the
result
port is not primary.
This unlikely pipeline demonstrates the behavior of p:in-scope-names:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
name="main" version="1.0">
<p:output port="result">
<p:pipe step="vars" port="result"/>
</p:output>
<p:option name="username" required="true"/>
<p:option name="password" required="true"/>
<p:variable name="host" select="'http://example.com/'"/>
<p:in-scope-names name="vars"/>
</p:declare-step>
Assuming the values supplied for the username and password options are “user” and “pass”, respectively, the output would be:
<c:param-set xmlns:c="http://www.w3.org/ns/xproc-step">
<c:param name="username" namespace="" value="user"/>
<c:param name="host" namespace="" value="http://example.com/"/>
<c:param name="password" namespace="" value="pass"/>
</c:param-set>
The p:template step replaces each XPath expression, delimited with curly braces, in the template document with the result of evaluating that expression.
<p:declare-step type="p:template">
  <p:input port="template"/>
  <p:input port="source" sequence="true" primary="true"/>
  <p:output port="result"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
</p:declare-step>
While evaluating each expression, the names of any parameters passed to the step are available as variable values in the XPath dynamic context.
The step searches for XPath expressions in attribute values, text content (adjacent text nodes, if they occur in the data model, must be coalesced; this step always processes maximal length text nodes), processing instruction data, and comments. XPath expressions are identified by curly braces, similar to attribute value templates in XSLT or enclosed expressions in XQuery.
In order to allow curly braces to appear literally in content, they can be escaped by doubling them. In other words, where “{” would start an XPath expression, “{{” is simply a single, literal opening curly brace. The same applies for closing curly braces.
Inside an XPath expression, strings quoted by single (') or double (") quotes are treated literally. Outside of quoted text, it is an error for an opening curly brace to occur. A closing curly brace ends the XPath expression (whether or not it is followed immediately by another closing curly brace).
These parsing rules can be described by the following algorithm, though implementations are by no means required to implement the parsing in exactly this way, provided that they achieve the same results.
The parser begins in regular-mode at the start of each unit of content where expansion may occur. In regular-mode:
“{{” is replaced by a single “{”.
“}}” is replaced by a single “}”.
Note: It is a dynamic error (err:XC0067) to encounter a single closing curly brace “}” that is not immediately followed by another closing curly brace.
A single opening curly brace “{” (not immediately followed by another opening curly brace) is discarded and the parser moves into xpath-mode. The initial expression is empty.
All other characters are copied without change.
In xpath-mode:
It is a dynamic error (err:XC0067) to encounter an opening curly brace “{”.
A closing curly brace “}” is discarded and ends the expression. The expression is evaluated and the result of that evaluation is copied to the output. The parser returns to regular-mode.
Note: Braces cannot be escaped by doubling them in xpath-mode.
A single quote (') is added to the current expression and the parser moves to single-quote-mode.
A double quote (") is added to the current expression and the parser moves to double-quote-mode.
All other characters are appended to the current expression.
In single-quote-mode:
A single quote (') is added to the current expression and the parser moves to xpath-mode.
All other characters are appended to the current expression.
In double-quote-mode:
A double quote (") is added to the current expression and the parser moves to xpath-mode.
All other characters are appended to the current expression.
It is a dynamic error (err:XC0067) if the parser reaches the end of the unit of content and it is not in regular-mode.
The context node used for each expression is the document passed on the source port. It is a dynamic error (err:XC0068) if more than one document appears on the source port. In an XPath 1.0 implementation, if p:empty is given or implied on the source port, an empty document node is used as the context node. In an XPath 2.0 implementation, the context item is undefined. It is a dynamic error (err:XC0026) if any XPath expression makes reference to the context node, size, or position when the context item is undefined.
In an attribute value, processing instruction, or comment, the string value of the XPath expression is used. In text content, an expression that selects nodes will cause those nodes to be copied into the template document.
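The escaping and quoting rules can be seen together in the following sketch. The template, the source document, and all element and attribute names are invented for this example.

```xml
<p:template>
  <p:input port="template">
    <p:inline>
      <!-- note: doubled braces are literal;
           ref: braces inside quoted strings are part of the expression;
           text content: a plain expression -->
      <msg note="{{literal}}" ref="{concat('{', /order/@id, '}')}">Total: {count(/order/item)} items</msg>
    </p:inline>
  </p:input>
  <p:input port="source">
    <p:inline>
      <order id="42"><item/><item/></order>
    </p:inline>
  </p:input>
</p:template>

<!-- Expected msg element in the result document -->
<msg note="{literal}" ref="{42}">Total: 2 items</msg>
```

In the ref attribute the opening brace inside the quoted string does not trigger err:XC0067, because the parser is in single-quote-mode when it is read; the same character outside the quotes would be an error.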
Depending on which version of XPath an implementation supports, and possibly on the xpath-version setting on the p:template, implementations may report errors or produce different results from one another in those cases where the interpretation of an XPath expression differs between the versions of XPath.
It's quite common to construct documents using values computed by the pipeline. This is particularly (but not exclusively) the case when the pipeline uses the p:http-request step. The input to p:http-request is a c:request document; attributes on the c:request element control most of the request parameters; the body of the document forms the body of the request.
A typical example looks like this:
<c:request method="POST" href="http://example.com/post"
           username="user" password="password">
  <c:body>
    <computed-content/>
  </c:body>
</c:request>
If we assume that the href value and the computed content come from an input document, and the username and password are options, then a typical pipeline to compute the request becomes quite complex.
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:c="http://www.w3.org/ns/xproc-step"
            name="main" version="1.0">
  <p:option name="username" required="true"/>
  <p:option name="password" required="true"/>
  <p:identity>
    <p:input port="source">
      <p:inline>
        <c:request method="POST"/>
      </p:inline>
    </p:input>
  </p:identity>
  <p:add-attribute match="/c:request" attribute-name="href">
    <p:with-option name="attribute-value" select="/doc/request/@uri">
      <p:pipe step="main" port="source"/>
    </p:with-option>
  </p:add-attribute>
  <p:add-attribute match="/c:request" attribute-name="username">
    <p:with-option name="attribute-value" select="$username"/>
  </p:add-attribute>
  <p:add-attribute match="/c:request" attribute-name="password">
    <p:with-option name="attribute-value" select="$password"/>
  </p:add-attribute>
  <p:insert position="first-child" match="/c:request">
    <p:input port="insertion" select="/doc/request">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:insert>
  <p:unwrap match="/c:request/request"/>
</p:pipeline>
There's nothing wrong with this pipeline, but it requires several steps to accomplish what the pipeline author probably considers a single operation. What's more, the result of these steps is not immediately obvious on casual inspection.
In order to make this simple construction case both literally and conceptually simpler, this note introduces two new XProc steps in the XProc namespace. Support for these steps is optional, but we strongly encourage implementors to provide them.
The new steps are p:in-scope-names and p:template. Taken together, they greatly simplify the pipeline:
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:c="http://www.w3.org/ns/xproc-step"
            name="main" version="1.0">
  <p:option name="username" required="true"/>
  <p:option name="password" required="true"/>
  <p:in-scope-names name="vars"/>
  <p:template>
    <p:input port="template">
      <p:inline>
        <c:request method="POST" href="{/doc/request/@uri}"
                   username="{$username}" password="{$password}">
          { /doc/request/node() }
        </c:request>
      </p:inline>
    </p:input>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="parameters">
      <p:pipe step="vars" port="result"/>
    </p:input>
  </p:template>
</p:pipeline>
The p:in-scope-names step provides all of the in-scope options and variables in a c:param-set (this operation is exactly analogous to what the p:parameters step does, except that it operates on the options and variables instead of on parameters).
The p:template step searches for XPath expressions, delimited by curly braces, in a template document and replaces each with the result of evaluating the expression. All of the parameters passed to the p:template step are available as in-scope variable names when evaluating each XPath expression.
Where the expressions occur in attribute values, their string value is used. Where they appear in text content, their node values are used.
The p:uuid step generates a [UUID] and injects it into the source document.
<p:declare-step type="p:uuid">
  <p:input port="source" primary="true"/>
  <p:output port="result"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
  <p:option name="version" as="xs:integer"/>
</p:declare-step>
The value of the match option must be an XSLTMatchPattern. The value of the version option must be an integer.
If the version is specified, that version of UUID must be computed. It is a dynamic error (err:XC0060) if the processor does not support the specified version of the UUID algorithm. If the version is not specified, the version of UUID computed is implementation-definedXP.
Implementations must support version 4 UUIDs. Support for other versions of UUID, and the mechanism by which the necessary inputs are made available for computing other versions, is implementation-definedXP.
The matched nodes are specified with the match pattern in the match option. For each matching node, the generated UUID is used in the output (if more than one node matches, the same UUID is used in each match). Nodes that do not match are copied without change.
If the expression given in the match option matches an attribute, the UUID is used as the new value of the attribute in the output. If the attribute is named “xml:base”, the base URI of the element must also be amended accordingly.
If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the UUID.
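As a rough illustration of the attribute case, the following Python sketch generates one version 4 UUID and writes it into every matching attribute. It is not an XProc implementation. The function name inject_uuid is hypothetical, an ElementTree path stands in for a real XSLT match pattern, and this sketch chooses to support only version 4.

```python
import uuid
import xml.etree.ElementTree as ET


def inject_uuid(root, element_path, attribute, version=None):
    """Set `attribute` on each matching element to one shared UUID."""
    if version not in (None, 4):
        # Stands in for err:XC0060; only version 4 is handled in this sketch.
        raise ValueError("err:XC0060: unsupported UUID version %r" % version)
    value = str(uuid.uuid4())       # computed once, reused for every match
    for element in root.findall(element_path):
        if attribute in element.attrib:
            element.set(attribute, value)
    return value


doc = ET.fromstring('<doc><item id="x"/><item id="y"/><item/></doc>')
generated = inject_uuid(doc, "item", "id")
```

After the call, both id attributes carry the same value, and the third item, which has no id attribute, is left unchanged.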
The p:validate-with-relax-ng step applies [RELAX NG] validation to the source document.
<p:declare-step type="p:validate-with-relax-ng">
  <p:input port="source" primary="true"/>
  <p:input port="schema" content-types="application/xml */*+xml text/*"/>
  <p:output port="result"/>
  <p:option name="dtd-attribute-values" select="'false'" as="xs:boolean"/>
  <p:option name="dtd-id-idref-warnings" select="'false'" as="xs:boolean"/>
  <p:option name="assert-valid" select="'true'" as="xs:boolean"/>
</p:declare-step>
The values of the dtd-attribute-values and dtd-id-idref-warnings options must be booleans.
If the schema document has an XML media type, then it must be interpreted as a RELAX NG Grammar. If the media type has a “text” type, then it must be interpreted as a [RELAX NG Compact Syntax] document for validation.
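The media type dispatch described above might be sketched as follows. The function name relax_ng_syntax is hypothetical, and treating text/xml as an XML media type (checked before the generic “text” case) is an assumption of this sketch.

```python
def relax_ng_syntax(media_type):
    """Return 'xml' or 'compact' for the media type of a schema document."""
    base = media_type.split(";")[0].strip().lower()   # drop media type parameters
    major, _, minor = base.partition("/")
    if base in ("application/xml", "text/xml") or minor.endswith("+xml"):
        return "xml"        # interpret as a RELAX NG grammar in XML syntax
    if major == "text":
        return "compact"    # interpret as RELAX NG Compact Syntax
    raise ValueError("unsupported schema media type: %s" % media_type)
```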
If the dtd-attribute-values option is true, then the attribute value defaulting conventions of [RELAX NG DTD Compatibility] are also applied.
If the dtd-id-idref-warnings option is true, then the validator should treat a schema that is incompatible with the ID/IDREF/IDREFs feature of [RELAX NG DTD Compatibility] as if the document were invalid.
It is a dynamic error (err:XC0053) if the assert-valid option is true and the input document is not valid.
The output from this step is a copy of the input, possibly augmented by application of the [RELAX NG DTD Compatibility]. The output of this step may include PSVI annotations.
Support for [RELAX NG DTD Compatibility] is implementation definedXP.
The p:validate-with-schematron step applies [Schematron] processing to the source document.
<p:declare-step type="p:validate-with-schematron">
  <p:input port="source" primary="true"/>
  <p:input port="schema"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
  <p:output port="result" primary="true"/>
  <p:output port="report" sequence="true"/>
  <p:option name="phase" select="'#ALL'" as="xs:string"/>
  <p:option name="assert-valid" select="'true'" as="xs:boolean"/>
</p:declare-step>
It is a dynamic error (err:XC0054) if the assert-valid option is true and any Schematron assertions fail.
The value of the phase option identifies the Schematron validation phase with which validation begins.
The parameters option provides name/value pairs which correspond to Schematron external variables.
The result output from this step is a copy of the input.
Schematron assertions and reports, if any, must appear on the report port. The output should be in Schematron Validation Report Language (SVRL).
The output of this step may include PSVI annotations.
The p:validate-with-xml-schema step applies [W3C XML Schema: Part 1] validity assessment to the source input.
<p:declare-step type="p:validate-with-xml-schema">
  <p:input port="source" primary="true"/>
  <p:input port="schema" sequence="true"/>
  <p:output port="result"/>
  <p:option name="use-location-hints" select="'false'" as="xs:boolean"/>
  <p:option name="try-namespaces" select="'false'" as="xs:boolean"/>
  <p:option name="assert-valid" select="'true'" as="xs:boolean"/>
  <p:option name="mode" select="'strict'" as="xs:token"/> <!-- "strict" | "lax" -->
  <p:option name="version" as="xs:string"/>
</p:declare-step>
The values of the use-location-hints, try-namespaces, and assert-valid options must be boolean.
The value of the mode option must be an NMTOKEN whose value is either “strict” or “lax”.
Validation is performed against the set of schemas represented by the documents on the schema port. These schemas must be used in preference to any schema locations provided by schema location hints encountered during schema validation, that is, schema locations supplied for xs:import or xsi:schemaLocation, or determined by schema-processor-defined namespace-based strategies, for the namespaces covered by the documents available on the schema port.
If xs:include elements occur within the supplied schema documents, they are treated like any other external documentsXP. It is implementation-definedXP if the documents supplied on the schema port are considered when resolving xs:include elements in the schema documents provided.
The use-location-hints and try-namespaces options allow the pipeline author to control how the schema processor should attempt to locate schema documents necessary but not provided on the schema port. Any schema documents provided on the schema port must be used in preference to schema documents located by other means.
If the use-location-hints option is “true”, the processor should make use of schema location hints to locate schema documents. If the option is “false”, the processor should ignore any such hints.
If the try-namespaces option is “true”, the processor should attempt to dereference the namespace URI to locate schema documents. If the option is “false”, the processor should not dereference namespace URIs.
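One way to picture the lookup policy is the Python sketch below. The function name schema_candidates and its dictionary arguments are hypothetical, and trying location hints before namespace dereferencing is an ordering assumed here; the text above does not fix it.

```python
def schema_candidates(namespace, supplied, location_hints,
                      use_location_hints=False, try_namespaces=False):
    """List where to look for a schema for `namespace`, in preference order."""
    if namespace in supplied:
        # Documents on the schema port always take precedence.
        return [supplied[namespace]]
    candidates = []
    if use_location_hints and namespace in location_hints:
        candidates.append(location_hints[namespace])
    if try_namespaces:
        candidates.append(namespace)    # dereference the namespace URI itself
    return candidates
```

With both options left at their default of false, a namespace that is not covered by the schema port yields no candidates at all.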
The mode option allows the pipeline author to control how schema validation begins. The “strict” mode means that the document element must be declared and schema-valid, otherwise it will be treated as invalid. The “lax” mode means that the absence of a declaration for the document element does not itself count as an unsuccessful outcome of validation.
If the step specifies a version, then that version of XML Schema must be used to process the validation. It is a dynamic error (err:XC0038) if the specified version is not available. If the step does not specify a version, the implementation may use any version it has available and may use any means to determine what version to use, including, but not limited to, examining the version of the schema(s).
It is a dynamic error (err:XC0053) if the assert-valid option is true and the input document is not valid. If the assert-valid option is false, it is not an error for the document to be invalid. In this case, if the implementation does not support the PSVI, p:validate-with-xml-schema is essentially just an “identity” step, but if the implementation does support the PSVI, then the resulting document will have additional type information (at least for the subtrees that are valid).
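The effect of assert-valid can be summarized in a few lines of Python. This is a sketch only: the names ValidationError and finish_validation are hypothetical, and the validity verdict is assumed to come from a separate schema processor.

```python
class ValidationError(Exception):
    """Stands in for err:XC0053 in this sketch."""


def finish_validation(document, is_valid, assert_valid=True):
    """Fail on an invalid document only when assert-valid is true."""
    if assert_valid and not is_valid:
        raise ValidationError("err:XC0053: the input document is not valid")
    # Without PSVI support the document passes through unchanged,
    # much as it would through an identity step.
    return document
```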
When XML Schema validation assessment is performed, the processor is invoked in the mode specified by the mode option. It is a dynamic error (err:XC0055) if the implementation does not support the specified mode.
The result of the assessment is a document with the Post-Schema-Validation-Infoset (PSVI) ([W3C XML Schema: Part 1]) annotations, if the pipeline implementation supports such annotations. If not, the input document is reproduced with any defaulting of attributes and elements performed as specified by the XML Schema recommendation.
The p:www-form-urldecode step decodes an x-www-form-urlencoded string into an XML representation.
<p:declare-step type="p:www-form-urldecode">
  <p:output port="result"/>
  <p:option name="value" required="true" as="xs:string"/>
</p:declare-step>
The value option is interpreted as a string of parameter values encoded using the x-www-form-urlencoded algorithm. Each name/value pair is written in a c:param element.
The entire set of parameters is written (as a c:param-set) on the result output port.
It is a dynamic error (err:XC0037) if the value provided is not a properly x-www-form-urlencoded value. It is a dynamic error (err:XC0061) if the name of any encoded parameter is not a valid xs:NCName. In other words, this step can only decode simple name/value pairs where the names do not contain colons or any characters that cannot be used in XML names.
The order of the c:param elements in the result is the same as the order of the encoded parameters, reading from left to right.
If any parameter name occurs more than once in the encoded string, the resulting parameter set will contain a c:param for each instance.
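The decoding behaviour can be approximated in Python with the standard library. This is a sketch under stated assumptions: the function name urldecode_to_param_set is hypothetical, the NCName test is an ASCII-only approximation, the name and value attributes on c:param are assumed, and Python's strict parsing is used as a stand-in for “properly encoded”.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl

C_NS = "http://www.w3.org/ns/xproc-step"
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")   # ASCII approximation only


def urldecode_to_param_set(value):
    """Decode an x-www-form-urlencoded string into a c:param-set element."""
    try:
        pairs = parse_qsl(value, keep_blank_values=True, strict_parsing=True)
    except ValueError as exc:
        raise ValueError("err:XC0037: not properly urlencoded") from exc
    param_set = ET.Element("{%s}param-set" % C_NS)
    for name, val in pairs:             # left-to-right order, repeats kept
        if not NCNAME.match(name):
            raise ValueError("err:XC0061: %r is not an NCName" % name)
        ET.SubElement(param_set, "{%s}param" % C_NS, name=name, value=val)
    return param_set
```

Calling urldecode_to_param_set("a=1&b=two+words&a=3") yields three c:param children, in the order a, b, a.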
The p:www-form-urlencode step encodes a set of parameter values as an x-www-form-urlencoded string and injects it into the source document.
<p:declare-step type="p:www-form-urlencode">
  <p:input port="source" primary="true"/>
  <p:output port="result"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
  <p:option name="match" required="true" as="xs:string"/> <!-- XSLTMatchPattern -->
</p:declare-step>
The value of the match option must be an XSLTMatchPattern.
The set of parameters is encoded as a single x-www-form-urlencoded string of name/value pairs. When parameters are encoded into name/value pairs, only the local name of each parameter is used. The namespace name is ignored and no prefix or colon appears in the name. The order of the parameters is implementation-dependentXP.
The matched nodes are specified with the match pattern in the match option. For each matching node, the encoded string is used in the output. Nodes that do not match are copied without change.
If the expression given in the match option matches an attribute, the encoded string is used as the new value of the attribute in the output. If the expression matches any other kind of node, the entire node (and not just its contents) is replaced by the encoded string.
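The encoding half can be sketched with urllib. The function names are hypothetical, parameter names are assumed to be given in Clark notation ("{namespace}local"), and the insertion order of the Python dictionary is simply one possible implementation-dependent order.

```python
from urllib.parse import urlencode


def local_name(qname):
    """Drop the namespace from a Clark-notation name such as '{ns}local'."""
    return qname.rsplit("}", 1)[-1]


def encode_parameters(parameters):
    """Encode a mapping of parameters using only the local part of each name."""
    return urlencode([(local_name(name), str(value))
                      for name, value in parameters.items()])
```

The resulting string would then replace each node matched by the match option.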
The p:xquery step applies an [XQuery 1.0] query to the sequence of documents provided on the source port.
<p:declare-step type="p:xquery">
  <p:input port="source" content-types="application/xml text/xml */*+xml" sequence="true" primary="true"/>
  <p:input port="query" content-types="application/xml */*+xml text/*"/>
  <p:output port="result" sequence="true"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
  <p:option name="version" as="xs:string"/>
</p:declare-step>
If a sequence of documents is provided on the source port, the first document is used as the initial context item. The whole sequence is also the default collection. If no documents are provided on the source port, the initial context item is undefined and the default collection is empty.
The query port must receive a single document:
If the document root element is c:query, the text descendants of this element are considered the query.
<c:query>
string
</c:query>
If the document root element is in the XQueryX namespace, the document is treated as an XQueryX-encoded query. Support for XQueryX is implementation-definedXP.
If the query document has an XML media type, then the string value of the document must be treated as the query. If the media type has a “text” type, then it must be interpreted as the query.
Otherwise, the interpretation of the query is implementation-definedXP.
If the step specifies a version, then that version of XQuery must be used to process the query. It is a dynamic error (err:XC0038) if the specified version is not available. If the step does not specify a version, the implementation may use any version it has available and may use any means to determine what version to use, including, but not limited to, examining the version of the query.
The result of the p:xquery step must be a sequence of documents. It is a dynamic error (err:XC0057) if the sequence that results from evaluating the XQuery contains items other than documents and elements. Any elements that appear in the result sequence will be treated as documents with the element as their document element.
For example:
<c:query>
declare namespace atom="http://www.w3.org/2005/Atom";
/atom:feed/atom:entry
</c:query>
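The rule for turning a query result into a sequence of documents can be sketched in Python. Here ElementTree objects stand in for document nodes and Element objects for element nodes; the function name to_documents is hypothetical.

```python
import xml.etree.ElementTree as ET


def to_documents(items):
    """Normalize a query result: documents pass through, elements are wrapped."""
    documents = []
    for item in items:
        if isinstance(item, ET.ElementTree):
            documents.append(item)
        elif isinstance(item, ET.Element):
            # The element becomes the document element of a new document.
            documents.append(ET.ElementTree(item))
        else:
            raise ValueError("err:XC0057: result contains %r" % (item,))
    return documents
```

In the Atom example above, each selected atom:entry element would therefore appear on the result port as its own document.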
The output of this step may include PSVI annotations.
The static context of the XQuery processor is augmented in the following way:
Statically known default collection type
document()*
Statically known namespaces
Unchanged from the implementation defaults. No namespace declarations in the XProc pipeline are automatically exposed in the static context.
The dynamic context of the XQuery processor is augmented in the following way:
Context item
The first document that appears on the source port.
Context position
1
Context size
1
Variable values
Any parameters passed in the parameters option augment any implementation-defined variable bindings known to the XQuery processor.
Function implementations
The function implementations provided by the XQuery processor.
Current dateTime
The point in time returned as the current dateTime is implementation-definedXP.
Implicit timezone
The implicit timezone is implementation-definedXP.
Available documents
The set of available documents (those that may be retrieved with a URI) is implementation-dependentXP.
Available collections
The set of available collections is implementation-dependentXP.
Default collection
The sequence of documents provided on the source port.
The following pipeline applies XInclude processing and schema validation before using XQuery:
Where countp.xq might contain:
<count>{count(.//p)}</count>
The p:xsl-formatter step receives an [XSL 1.1] document and renders the content. The result of rendering is stored to the URI provided via the href option. A reference to that result is produced on the output port.
<p:declare-step type="p:xsl-formatter">
  <p:input port="source" content-types="application/xml text/xml */*+xml"/>
  <p:output port="result"/>
  <p:option name="parameters" as="map(xs:QName,item())"/>
  <p:option name="href" required="true" as="xs:anyURI"/>
  <p:option name="content-type" as="xs:string"/>
</p:declare-step>
The value of the href option must be an anyURI. If it is relative, it is made absolute against the base URI of the element on which it is specified (p:with-optionXP or p:xsl-formatter in the case of a syntactic shortcutXP value).
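Resolving a relative href against a base URI follows ordinary URI resolution, which Python's urljoin can illustrate. The URIs below are made up for the example, and the function name resolve_href is hypothetical.

```python
from urllib.parse import urljoin


def resolve_href(base_uri, href):
    """Make href absolute against the base URI of the element that carries it."""
    return urljoin(base_uri, href)


# Hypothetical pipeline location and output name.
absolute = resolve_href("http://example.com/pipelines/main.xpl", "out/result.pdf")
```

An href that is already absolute is returned unchanged by the same call.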
The content-type of the output is controlled by the content-type option. This option specifies a media type as defined by [IANA Media Types]. The option may include media type parameters as well (e.g. "application/someformat; charset=UTF-8"). The use of media type parameters on the content-type option is implementation-definedXP.
If the content-type option is not specified, the output type is implementation-definedXP. The default should be PDF.
A formatter may take any number of optional rendering parameters via the step's parameters; such parameters are defined by the XSL implementation used and are implementation-definedXP.
The output of this step is a document containing a single c:result element whose content is the absolute URI of the document stored by the step.
Several steps in this step library require serialization options to control the serialization of XML. These options are used to control serialization as in the [Serialization] specification.
The following options may be present on steps that perform serialization:
byte-order-mark
The value of this option must be a boolean. If it's not specified, the default varies by encoding: for UTF-16 it's true, for all others, it's false.
cdata-section-elements
The value of this option must be a list of QNames. They are interpreted as element names.
doctype-public
The value of this option must be a string. The public identifier of the doctype.
doctype-system
The value of this option must be an anyURI. The system identifier of the doctype. It need not be absolute, and is not resolved.
encoding
A character set name. If no encoding is specified, the encoding used is implementation definedXP. If the method is “xml” or “xhtml”, the implementation defined encoding must be either UTF-8 or UTF-16.
escape-uri-attributes
The value of this option must be a boolean. It is ignored unless the specified method is “xhtml” or “html”.
include-content-type
The value of this option must be a boolean. It is ignored unless the specified method is “xhtml” or “html”.
indent
The value of this option must be a boolean.
media-type
The value of this option must be a string. It specifies the media type (MIME content type). If not specified, the default varies according to the method:
xml: application/xml
html: text/html
xhtml: application/xhtml+xml
text: text/plain
For methods other than xml, html, xhtml, and text, the media-type is implementation definedXP.
method
The value of this option must be a QName. It specifies the serialization method.
normalization-form
The value of this option must be an NMTOKEN, one of the enumerated values NFC, NFD, NFKC, NFKD, fully-normalized, none or an implementation-defined value.
omit-xml-declaration
The value of this option must be a boolean.
standalone
The value of this option must be an NMTOKEN, one of the enumerated values true, false, or omit.
undeclare-prefixes
The value of this option must be a boolean.
version
The value of this option must be a string.
In order to be consistent with the rest of this specification, boolean values for the serialization parameters must use one of the XML Schema lexical forms for boolean: "true", "false", "1", or "0". This is different from the [Serialization] specification which uses “yes” and “no”. No change in semantics is implied by this different spelling.
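The boolean lexical forms and the option defaults described above can be collected in a small Python sketch. All names here are hypothetical, and returning None for an unknown method is this sketch's way of saying that the default is implementation-defined.

```python
DEFAULT_MEDIA_TYPES = {
    "xml": "application/xml",
    "html": "text/html",
    "xhtml": "application/xhtml+xml",
    "text": "text/plain",
}


def parse_boolean(lexical):
    """Accept only the XML Schema lexical forms for boolean."""
    value = lexical.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError("not an XML Schema boolean: %r" % lexical)


def default_media_type(method):
    """Default media-type for a method; None means implementation-defined."""
    return DEFAULT_MEDIA_TYPES.get(method)


def default_byte_order_mark(encoding):
    """Default for byte-order-mark: true for UTF-16, false otherwise."""
    return encoding.upper() == "UTF-16"
```

Note that “yes” and “no”, the spellings used by the [Serialization] specification, are rejected by parse_boolean.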
The method option controls the serialization method used by this component with standard values of 'html', 'xml', 'xhtml', and 'text' but only the 'xml' value is required to be supported. The interpretation of the remaining options is as specified in [Serialization].
Implementations may support other method values but their results are implementation-definedXP.
A minimally conforming implementation must support the xml output method with the following option values:
The version must support the value 1.0.
The encoding must support the value UTF-8.
The omit-xml-declaration must be supported. If the value is not specified or has the value false, an XML declaration must be produced.
All other option values may be ignored for the xml output method.
If a processor chooses to implement an option for serialization, it must conform to the semantics defined in the [Serialization] specification.
The use-character-maps parameter in the [Serialization] specification has not been provided in the standard serialization options provided by this specification.
Errors in a pipeline can be divided into two classes: static errors and dynamic errors.
[Definition: A static error is one which can be detected before pipeline evaluation is even attempted.] Examples of static errors include cycles and incorrect specification of inputs and outputs.
Static errors are fatal and must be detected before any steps are evaluated.
For a complete list of static errors, see Section 1, “Static Errors”XP.
[Definition: A dynamic error is one which occurs while a pipeline is being evaluated.] Examples of dynamic errors include references to URIs that cannot be resolved, steps which fail, and pipelines that exhaust the capacity of an implementation (such as memory or disk space).
If a step fails due to a dynamic error, failure propagates upwards until either a p:tryXP is encountered or the entire pipeline fails. In other words, outside of a p:tryXP, step failure causes the entire pipeline to fail.
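Failure propagation behaves much like exception handling in a general-purpose language. The Python sketch below is only an analogy: DynamicError, failing_step, and run_with_try are hypothetical names, with try/except standing in for p:try.

```python
class DynamicError(Exception):
    """Carries an error code such as err:XC0053."""

    def __init__(self, code, message):
        super().__init__("%s: %s" % (code, message))
        self.code = code


def failing_step():
    raise DynamicError("err:XC0053", "the input document is not valid")


def run_with_try(step, recover):
    """Analogue of p:try: a failure inside the step is handled by `recover`."""
    try:
        return step()
    except DynamicError as error:
        return recover(error)


# Outside of run_with_try, the DynamicError would propagate upwards and
# the whole pipeline would fail.
outcome = run_with_try(failing_step, lambda error: "recovered from " + error.code)
```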
For a complete list of dynamic errors, see Section 2, “Dynamic Errors”XP.
Several of the steps in the standard and optional step libraries can generate dynamic errors.
For a complete list of the dynamic errors raised by builtin pipeline steps, see Appendix A, Step Errors.
The following dynamic errors can be raised by steps in this specification:
err:XC0002
It is a dynamic error if the value starts with the string “--”.
err:XC0003
It is a dynamic error if a username or password is specified without specifying an auth-method, if the requested auth-method isn't supported, or the authentication challenge contains an authentication method that isn't supported.
See: Specifying a request
err:XC0004
It is a dynamic error if the status-only attribute has the value true and the detailed attribute does not have the value true.
See: Specifying a request
err:XC0005
It is a dynamic error if the request contains a c:body or c:multipart but the method does not allow for an entity body being sent with the request.
See: Specifying a request
err:XC0006
It is a dynamic error if the method is not specified on a c:request.
See: Specifying a request
err:XC0010
It is a dynamic error if an encoding of base64 is specified and the character set is not specified or if the specified character set is not supported by the implementation.
See: p:unescape-markup
err:XC0012
It is a dynamic error if the contents of the directory path are not available to the step due to access restrictions in the environment in which the pipeline is run.
See: p:directory-list
err:XC0013
It is a dynamic error if the pattern matches a processing instruction and the new name has a non-null namespace.
See: p:rename
err:XC0014
It is a dynamic error if the XML namespace (http://www.w3.org/XML/1998/namespace) or the XMLNS namespace (http://www.w3.org/2000/xmlns/) is the value of either the from option or the to option.
See: p:namespace-rename
err:XC0017
It is a dynamic error if the absolute path does not identify a directory.
See: p:directory-list
err:XC0019
It is a dynamic error if the documents are not equal, and the value of the fail-if-not-equal option is true.
See: p:compare
err:XC0020
It is a dynamic error if the user specifies a value or values that are inconsistent with each other or with the requirements of the step or protocol.
See: Specifying a request
err:XC0022
It is a dynamic error if the content of the c:body element does not consist of exactly one element, optionally preceded and/or followed by any number of processing instructions, comments or whitespace characters.
err:XC0023
It is a dynamic error if the match pattern does not match an element.
See: p:add-attribute, p:insert, p:label-elements, p:make-absolute-uris, p:rename, p:replace, p:set-attributes, p:unwrap, p:wrap
err:XC0025
It is a dynamic error if the match pattern matches anything other than an element node and the value of the position option is “first-child” or “last-child”.
See: p:insert
err:XC0026
It is a dynamic error if any XPath expression makes reference to the context node, size, or position when the context item is undefined.
See: p:template
err:XC0028
It is a dynamic error if the content of the c:body element does not consist entirely of characters.
err:XC0029
It is a dynamic error if an XInclude error occurs during processing.
See: p:xinclude
err:XC0030
It is a dynamic error if the override-content-type value cannot be used (e.g. text/plain to override image/png).
err:XC0033
It is a dynamic error if the command cannot be run.
See: p:exec
err:XC0034
It is a dynamic error if the current working directory cannot be changed to the value of the cwd option.
See: p:exec
err:XC0035
It is a dynamic error to specify both result-is-xml and wrap-result-lines.
See: p:exec
err:XC0036
It is a dynamic error if the requested hash algorithm is not one that the processor understands or if the value or parameters are not appropriate for that algorithm.
See: p:hash
err:XC0037
It is a dynamic error if the value provided is not a properly x-www-form-urlencoded value.
See: p:www-form-urldecode
err:XC0038
It is a dynamic error if the specified version is not available.
err:XC0039
It is a dynamic error if a sequence of documents (including an empty sequence) is provided to an XSLT 1.0 step.
See: p:xslt
err:XC0040
It is a dynamic error if the document element of the document that arrives on the source port is not c:request.
See: p:http-request
err:XC0050
It is a dynamic error if the URI scheme is not supported or the step cannot store to the specified location.
See: p:store
err:XC0051
It is a dynamic error if the content-type specified is not supported by the implementation.
See: p:unescape-markup
err:XC0052
It is a dynamic error if the encoding specified is not supported by the implementation.
err:XC0053
It is a dynamic error if the assert-valid option is true and the input document is not valid.
err:XC0054
It is a dynamic error if the assert-valid option is true and any Schematron assertions fail.
err:XC0055
It is a dynamic error if the implementation does not support the specified mode.
err:XC0056
It is a dynamic error if the specified initial mode or named template cannot be applied to the specified stylesheet.
See: p:xslt
err:XC0057
It is a dynamic error if the sequence that results from evaluating the XQuery contains items other than documents and elements.
See: p:xquery
err:XC0058
It is a dynamic error if the all and relative options are both true.
See: p:add-xml-base
err:XC0059
It is a dynamic error if the QName value in the attribute-name option uses the prefix “xmlns” or any other prefix that resolves to the namespace name http://www.w3.org/2000/xmlns/.
See: p:add-attribute
err:XC0060
It is a dynamic error if the processor does not support the specified version of the UUID algorithm.
See: p:uuid
err:XC0061
It is a dynamic error if the name of any encoded parameter is not a valid xs:NCName.
See: p:www-form-urldecode
err:XC0062
It is a dynamic error if the match option matches a namespace node.
See: p:delete
err:XC0063
It is a dynamic error if the path-separator option is specified and is not exactly one character long.
See: p:exec
err:XC0064
It is a dynamic error if the exit code from the command is greater than the specified failure-threshold value.
See: p:exec
err:XC0066
It is a dynamic error if the arg-separator option is specified and is not exactly one character long.
See: p:exec
err:XC0067
It is a dynamic error to encounter a single closing curly brace “}” that is not immediately followed by another closing curly brace.
See: p:template
err:XC0068
It is a dynamic error if more than one document appears on the source port.
See: p:template
err:XC1001
It is a dynamic error if the properties map contains a key equal to the string “content-type”.
See: p:set-properties
err:XC1002
It is a dynamic error if the supplied content-type is not a valid media type of the form “type/subtype+ext”.
See: p:cast-content-type
err:XC1003
It is a dynamic error if the p:cast-content-type step cannot perform the requested cast.
See: p:cast-content-type
err:XC1004
It is a dynamic error if the c:data element contains content that is not a valid base64 string.
See: p:cast-content-type
err:XC1005
It is a dynamic error if the c:data element does not have a content-type attribute.
See: p:cast-content-type
err:XC1006
It is a dynamic error if the content-type is supplied and is not the same as the content-type specified on the c:data element.
See: p:cast-content-type
err:XC1007
In all cases except when the input document is a c:data element, it is a dynamic error if the content-type is not supplied.
See: p:cast-content-type
[XProc V2.0 Requirements] XProc V2.0 Requirements. Alex Milowski, James Fuller, and Norman Walsh editors. W3C Working Draft 5 November 2013.
[XProc 2.0] XProc 2.0: An XML Pipeline Language. Norman Walsh, Alex Milowski, and Henry Thompson, editors. W3C Working Draft 15 December 2014.
[XSLT 1.0] XSL Transformations (XSLT) Version 1.0. James Clark, editor. W3C Recommendation. 16 November 1999.
[XPath 2.0 Functions and Operators] XQuery 1.0 and XPath 2.0 Functions and Operators. Ashok Malhotra, Jim Melton, and Norman Walsh, editors. W3C Recommendation. 23 January 2007.
[XSLT 2.0] XSL Transformations (XSLT) Version 2.0. Michael Kay, editor. W3C Recommendation. 23 January 2007.
[XSL 1.1] Extensible Stylesheet Language (XSL) Version 1.1. Anders Berglund, editor. W3C Recommendation. 5 December 2006.
[XQuery 1.0] XQuery 1.0: An XML Query Language. Scott Boag, Don Chamberlin, Mary Fernández, et al., editors. W3C Recommendation. 23 January 2007.
[RELAX NG] ISO/IEC JTC 1/SC 34. ISO/IEC 19757-2:2008(E) Document Schema Definition Languages (DSDL) -- Part 2: Regular-grammar-based validation -- RELAX NG 2008.
[RELAX NG Compact Syntax] ISO/IEC JTC 1/SC 34. ISO/IEC 19757-2:2003/Amd 1:2006 Document Schema Definition Languages (DSDL) — Part 2: Grammar-based validation — RELAX NG AMENDMENT 1 Compact Syntax 2006.
[RELAX NG DTD Compatibility] RELAX NG DTD Compatibility. OASIS Committee Specification. 3 December 2001.
[Schematron] ISO/IEC JTC 1/SC 34. ISO/IEC 19757-3:2006(E) Document Schema Definition Languages (DSDL) — Part 3: Rule-based validation — Schematron 2006.
[W3C XML Schema: Part 1] XML Schema Part 1: Structures Second Edition. Henry S. Thompson, David Beech, Murray Maloney, et al., editors. World Wide Web Consortium, 28 October 2004.
[XInclude] XML Inclusions (XInclude) Version 1.0 (Second Edition). Jonathan Marsh, David Orchard, and Daniel Veillard, editors. W3C Recommendation. 15 November 2006.
[Serialization] XSLT 2.0 and XQuery 1.0 Serialization. Scott Boag, Michael Kay, Joanne Tong, Norman Walsh, and Henry Zongaro, editors. W3C Recommendation. 23 January 2007.
[MD5] RFC 1321: The MD5 Message-Digest Algorithm. R. Rivest. Network Working Group, IETF, April 1992.
[RFC 1521] RFC 1521: MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. N. Borenstein, N. Freed, editors. Internet Engineering Task Force. September, 1993.
[RFC 2616] RFC 2616: Hypertext Transfer Protocol — HTTP/1.1. R. Fielding, J. Gettys, J. Mogul, et al., editors. Internet Engineering Task Force. June, 1999.
[RFC 2617] RFC 2617: HTTP Authentication: Basic and Digest Access Authentication. J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen, L. Stewart. June, 1999.
[Unicode TR#17] Unicode Technical Report #17: Character Encoding Model. Ken Whistler, Mark Davis, and Asmus Freytag, authors. The Unicode Consortium. 11 November 2008.
[IANA Media Types] IANA MIME Media Types. Internet Engineering Task Force.
[HTML Tidy] HTML Tidy Library Project. SourceForge project.
[CRC32] “32-Bit Cyclic Redundancy Codes for Internet Applications”, The International Conference on Dependable Systems and Networks: 459. 10.1109/DSN.2002.1028931. P. Koopman. June 2002.