Paul Kiddie

Escaping single quotes from all attribute and element text node values via XSL transform

January 24, 2013

I recently needed to escape every single quote in the attribute values and text nodes of an XML document as a first pass, before handing the result to a second transform that produces HTML. That HTML would end up inside a JavaScript object, so any unescaped single quote would have produced invalid JavaScript.
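To make the failure concrete, here is a small Python sketch (Python is only used for illustration, it was not part of the pipeline). The `unescaped_quotes` helper and the `var title = ...` statement are hypothetical: the helper counts single quotes that are not preceded by a backslash, and a well-formed single-quoted literal should contain exactly two of them, the delimiters.

```python
def unescaped_quotes(source: str) -> int:
    """Count single quotes that are not escaped with a backslash."""
    count = 0
    i = 0
    while i < len(source):
        if source[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if source[i] == "'":
            count += 1
        i += 1
    return count


raw = "The Pragmatic Programmer's Guide"
escaped = raw.replace("'", "\\'")

# Hypothetical JavaScript statements built from the raw and escaped values.
broken = "var title = '" + raw + "';"
working = "var title = '" + escaped + "';"

print(broken)   # the apostrophe in the title ends the literal early
print(working)  # only the two delimiting quotes are left unescaped
print(unescaped_quotes(broken), unescaped_quotes(working))
```

This sketch only looks at single quotes; values that already contain backslashes or line breaks would need their own handling.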

Turns out it’s pretty intuitive to do so. Let’s say I’ve got the following XML document:

    <?xml version="1.0" encoding="utf-8" ?>
    <Pages>
      <Page Title="Lorem ipsum">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus venenatis.</Page>
      <Page Title="XSLT Programmer's Reference 2nd Edition">This compact, relevant, updated version reflects recent changes in the XSLT specification and developments in XSLT parsers.</Page>
      <Page Title="The Pragmatic Programmer's Guide">This book is a tutorial and reference for the Ruby programming language... As Pragmatic Programmers we've tried many, many languages in our search for tools to make our lives easier, for tools to help us do our jobs better.</Page>
    </Pages>

With some inspiration for a recursive template to escape single quotes, I just needed a way to iterate over all attributes and text nodes.
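The recursive idea is easier to see outside XSLT. Here is a rough Python sketch of it, assuming the usual XSLT 1.0 approach built on `contains()`, `substring-before()` and `substring-after()`: split at the first quote, emit the escaped quote, then recurse on the remainder. The function name and its default arguments are my own choices for illustration.

```python
def replace_single_quote(text: str, replace: str = "'", by: str = "\\'") -> str:
    """Recursively replace every occurrence of `replace` in `text` with `by`."""
    if replace not in text:
        # Base case: nothing left to escape.
        return text
    # Split at the first quote, like substring-before() / substring-after().
    before, _, after = text.partition(replace)
    return before + by + replace_single_quote(after, replace, by)


print(replace_single_quote("The Pragmatic Programmer's Guide"))
# prints: The Pragmatic Programmer\'s Guide
```

In Python you would simply call `str.replace`; the recursion is only needed because XSLT 1.0 has no built-in string replacement function.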

With the identity template as a basis and some generic match patterns for attributes and text nodes, the following XSL achieved exactly what I wanted:

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        exclude-result-prefixes="msxsl">

      <xsl:output method="xml" indent="yes"/>

      <!-- identity template -->
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- escape attribute values -->
      <xsl:template match="@*">
        <xsl:attribute name="{local-name()}">
          <xsl:call-template name="replace-single-quote">
            <xsl:with-param name="text">
              <xsl:value-of select="."/>
            </xsl:with-param>
          </xsl:call-template>
        </xsl:attribute>
      </xsl:template>

      <!-- escape text nodes (on elements) -->
      <xsl:template match="node()/text()">
        <xsl:call-template name="replace-single-quote">
          <xsl:with-param name="text" select="." />
        </xsl:call-template>
      </xsl:template>

      <!-- recursive template to escape single quotes -->
      <xsl:template name="replace-single-quote">
        <xsl:param name="text" />
        <xsl:variable name="replace" select="&quot;'&quot;" />
        <xsl:variable name="by" select="&quot;\'&quot;" />
        <xsl:choose>
          <xsl:when test="contains($text, $replace)">
            <xsl:value-of select="substring-before($text,$replace)" />
            <xsl:value-of select="$by" />
            <xsl:call-template name="replace-single-quote">
              <xsl:with-param name="text" select="substring-after($text,$replace)" />
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="$text" disable-output-escaping="yes"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>

    </xsl:stylesheet>

Essentially, the stylesheet escapes the text nodes in place and recreates each attribute with the same name but an escaped value. Because it uses the local-name() function, the recreated attribute ignores any namespace the original had. When applied to the above XML, this produces the following output.

    <?xml version="1.0" encoding="utf-8"?>
    <Pages>
      <Page Title="Lorem ipsum">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus venenatis.</Page>
      <Page Title="XSLT Programmer\'s Reference 2nd Edition">This compact, relevant, updated version reflects recent changes in the XSLT specification and developments in XSLT parsers.</Page>
      <Page Title="The Pragmatic Programmer\'s Guide">This book is a tutorial and reference for the Ruby programming language... As Pragmatic Programmers we\'ve tried many, many languages in our search for tools to make our lives easier, for tools to help us do our jobs better.</Page>
    </Pages>

Perfect for the next step: converting to HTML before stuffing it into a JavaScript variable.
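As a cross-check on what the stylesheet does, the same pass can be sketched with Python's standard-library `xml.etree.ElementTree`. This is an illustration rather than a replacement for the XSL: the `SAMPLE` document is a shortened, hypothetical version of the XML above, the `escape` and `escape_tree` names are mine, and I'm assuming the intended escape is a single backslash before each quote.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of the sample document.
SAMPLE = """<Pages>
  <Page Title="Lorem ipsum">Lorem ipsum dolor sit amet.</Page>
  <Page Title="The Pragmatic Programmer's Guide">we've tried many languages</Page>
</Pages>"""


def escape(text: str) -> str:
    # Assumed replacement: ' becomes \'
    return text.replace("'", "\\'")


def escape_tree(root: ET.Element) -> ET.Element:
    """Escape single quotes in every attribute value and text node, in place."""
    for element in root.iter():
        for name, value in list(element.attrib.items()):
            element.set(name, escape(value))
        # ElementTree keeps text nodes in .text (before the first child)
        # and .tail (after the element's end tag); either may be None.
        if element.text:
            element.text = escape(element.text)
        if element.tail:
            element.tail = escape(element.tail)
    return root


root = escape_tree(ET.fromstring(SAMPLE))
for page in root.findall("Page"):
    print(page.get("Title"), "|", page.text)
```

Like the attribute template in the stylesheet, this keeps attribute names as they are and only rewrites the values.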

👋 I'm Paul Kiddie, a software engineer working in London. I'm currently working as a Principal Engineer at trainline.