tokenization problem

Hi,

We want to tokenize natural language into XML. Each word and each punctuation mark needs to be put into an attribute of an XML element. But when we do this, oXygen reports the error "not closing an xml tag". When I checked the output, it seemed that oXygen transforms &quot; into a literal ", and even another escaped form of the quote seems to be transformed that way.

Is there a way to prevent oXygen from behaving this way?

Kind regards,
Roderik Dernison

This message shall not constitute any obligations. This message is intended solely for the addressee. If you have received this message in error, please inform us and delete the message.

At 2013-02-04 14:38 +0000, Roderik Dernison wrote:
We want to tokenize natural language into XML. Each word and each punctuation mark needs to be put into an attribute of an XML element. But when we do this, oXygen reports the error "not closing an xml tag". When I checked the output, it seemed that oXygen transforms &quot; into a literal ", and even another escaped form of the quote seems to be transformed that way. Is there a way to prevent oXygen from behaving this way?
How are you creating your markup? You say that oXygen is the culprit, but you don't show us the steps of what is happening.

Can you show an example of your data and your output, and tell us the steps that you take? Even something simple like this, which has punctuation and quotes in it:

  Did you see? This "phrase" isn't working!

Then we will be in a better position to help you.

. . . . . . . . Ken

--
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/z/
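
For readers following the thread, here is a minimal sketch of the kind of test case Ken asks for, written in Python rather than XSLT so it can be run without oXygen. The element names `tokens` and `w`, the attribute name `form`, and the tokenizing regular expression are assumptions made for illustration; they do not come from the original poster. The point it shows is that when the output is produced by an XML serializer rather than by string concatenation, a double quote inside an attribute value is escaped automatically and the result stays well-formed.

```python
import re
import xml.etree.ElementTree as ET


def tokenize_to_xml(text):
    """Return an XML string with one <w> element per word or punctuation mark."""
    root = ET.Element("tokens")  # hypothetical wrapper element
    # A word is a run of word characters or apostrophes; any other
    # non-whitespace character becomes a punctuation token of its own.
    for token in re.findall(r"[\w']+|[^\w\s]", text):
        ET.SubElement(root, "w", form=token)  # hypothetical names "w" and "form"
    # The serializer escapes ", & and < inside attribute values.
    return ET.tostring(root, encoding="unicode")


sample = 'Did you see? This "phrase" isn\'t working!'
xml_text = tokenize_to_xml(sample)
print(xml_text)

# Reading the result back gives the original tokens, quotes included.
forms = [w.get("form") for w in ET.fromstring(xml_text)]
print(forms)
```

If the output of the real transformation is not well-formed, comparing it against a round trip like this one can help narrow down whether the markup is being built as text somewhere in the pipeline.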

Hi,

I'm guessing you have an XSLT transformation that performs the tokenization. Entities and character references are always resolved when the input is parsed, before the XSLT engine sees it, so you cannot preserve them through the transformation. If you want specific characters to be escaped in the XSLT output, you have to explicitly declare a character map for them (this requires XSLT 2.0 or later), e.g.:

  <xsl:character-map name="specialchars">
    <xsl:output-character character="&quot;" string="&amp;quot;"/>
  </xsl:character-map>
  <xsl:output use-character-maps="specialchars" method="xml"/>

Regards,
Adrian

Adrian Buza
oXygen XML Editor and Author Support
Tel: +1-650-352-1250 ext.202
Fax: +40-251-461482
support@oxygenxml.com
http://www.oxygenxml.com

Roderik Dernison wrote:
Hi,
We want to tokenize natural language into XML. Each word and each punctuation mark needs to be put into an attribute of an XML element. But when we do this, oXygen reports the error "not closing an xml tag". When I checked the output, it seemed that oXygen transforms &quot; into a literal ", and even another escaped form of the quote seems to be transformed that way.
Is there a way to prevent oXygen from behaving this way?
Kind regards,
Roderik Dernison
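
The behaviour Adrian describes is not specific to oXygen or to XSLT: any conforming XML parser resolves `&quot;` and numeric character references before the application sees the data. The following is a small sketch in Python's standard `xml.etree.ElementTree` module, used here only as a convenient stand-in for an XML parser and serializer; the element name `w` and attribute name `form` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same character in the source document.
named = ET.fromstring('<w form="&quot;"/>')    # predefined entity
numeric = ET.fromstring('<w form="&#34;"/>')   # numeric character reference

# After parsing, both attribute values are the plain one-character string ".
parsed_named = named.get("form")
parsed_numeric = numeric.get("form")
print(repr(parsed_named), repr(parsed_numeric))

# On output the serializer decides how to escape the character again;
# the original spelling is not remembered.
reserialized_named = ET.tostring(named, encoding="unicode")
reserialized_numeric = ET.tostring(numeric, encoding="unicode")
print(reserialized_named)
print(reserialized_numeric)
```

Both inputs serialize identically, which is why a character map (or some other explicit output rule) is needed when a particular escaped spelling must appear in the result.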
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com http://www.oxygenxml.com/mailman/listinfo/oxygen-user
participants (3)
- G. Ken Holman
- Oxygen XML Editor Support
- Roderik Dernison