using automatic character conversion in XML to XML transformation

If I transform my XML document to HTML using the HTML output method, oXygen automatically turns special characters into HTML character entities (e.g., a copyright symbol gets transformed into "&copy;"). I want to use that functionality in a transformation from XML to XML using XML as the output method, but I can't figure out how to do it. Is there a way?

Thanks,
--Paul
___________________________
Paul Dever :: Manager, Electronic Workflows :: EPD-US
(T) +1 314-995-3291 :: (E) p.dever@elsevier.com
11830 Westline Industrial Drive, St. Louis, MO 63146
ELSEVIER

Hi Paul,

When the output method is html, the XSLT processor uses a different output serializer than the one used when the output method is xml, and the XML serializer follows different rules from the HTML one.

In XSLT 1.0 you can set the output encoding to ASCII, for instance, or to some other encoding that cannot represent the characters you want written as entities. Those characters will then be output as character references: the copyright symbol will appear as &#169;.

In XSLT 2.0 you can use character maps to output &copy;. Below is a sample stylesheet that copies the source to the output, representing copyright characters as &copy;:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:character-map name="test">
        <xsl:output-character character="&#169;" string="&amp;copy;"/>
      </xsl:character-map>
      <xsl:output use-character-maps="test"/>
      <xsl:template match="node() | @*">
        <xsl:copy>
          <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

Applied to a document like:

    <test>
      <a> © </a>
      <b> © </b>
      <c> © </c>
    </test>

this stylesheet produces:

    <?xml version="1.0" encoding="UTF-8"?><test>
      <a> &copy; </a>
      <b> &copy; </b>
      <c> &copy; </c>
    </test>

Note, however, that the result document is not well-formed, because the copy entity is used but not declared. To get a well-formed result you need to create a DTD like the one below (test.dtd):

    <?xml version="1.0" encoding="UTF-8"?>
    <!ENTITY copy "&#169;">

and change xsl:output to refer to this DTD:

    <xsl:output doctype-system="test.dtd" use-character-maps="test"/>

The result will now be:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE test SYSTEM "test.dtd">
    <test>
      <a> &copy; </a>
      <b> &copy; </b>
      <c> &copy; </c>
    </test>

which is well-formed (but not valid against the DTD).
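
For comparison, the XSLT 1.0 approach described earlier in this reply (choosing an output encoding that cannot represent the character) could be sketched roughly as follows. This is a minimal identity stylesheet that assumes US-ASCII as the output encoding; any other encoding that lacks the copyright character should behave the same way. Whether the reference is written in decimal (&#169;) or hexadecimal (&#xA9;) form depends on the processor's serializer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- US-ASCII cannot represent the copyright character, so the XML
       serializer has to write it as a numeric character reference. -->
  <xsl:output method="xml" encoding="US-ASCII"/>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because numeric character references need no declaration, output produced this way stays well-formed without any DTD; the trade-off is that you get a numeric reference rather than the named &copy; entity.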
If you also want the output to be valid, you need to update the DTD so that it contains the element and attribute declarations. For the example above that will be:

    <?xml version="1.0" encoding="UTF-8"?>
    <!ENTITY copy "&#169;">
    <!ELEMENT test (a,b,c)>
    <!ELEMENT a (#PCDATA)>
    <!ELEMENT b (#PCDATA)>
    <!ELEMENT c (#PCDATA)>

Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com