p:unescape-markup - missing htmlparser library ?

I'm trying to use the XProc step p:unescape-markup to parse some HTML, but didn't tagsoup get replaced with htmlparser[1] in the newer versions of Calabash[2]? At least, I can find tagsoup-1.2.jar but not htmlparser-1.3.1.jar in my oXygen directory.
Or have I just misconfigured something? (quite likely...)
(using <oXygen/> XML Editor 13.2, build 2012011017)
Regards Jostein
[1] http://about.validator.nu/htmlparser/
[2] http://lists.w3.org/Archives/Public/xproc-dev/2011Oct/0010.html
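
For readers following along, a minimal pipeline that exercises this step might look like the sketch below. The wrapper element and the sample HTML are illustrative assumptions, not taken from the original message; the options used (content-type, namespace) are the ones defined for p:unescape-markup in XProc 1.0, and support for text/html depends on the processor having an HTML parser available.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- Hypothetical input: escaped, not well-formed HTML inside a wrapper element -->
  <p:input port="source">
    <p:inline>
      <wrapper>&lt;p&gt;Hello &lt;b&gt;world&lt;/p&gt;</wrapper>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- Parse the wrapper's string value as HTML and place the result in the XHTML namespace -->
  <p:unescape-markup content-type="text/html"
                     namespace="http://www.w3.org/1999/xhtml"/>
</p:declare-step>
```

With content-type="text/html" the processor has to hand the string to an HTML parser, which is where the tagsoup or htmlparser jar comes into play.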

Hi Jostein,
Calabash's main documentation page lists only tagsoup as required for this step: http://xmlcalabash.com/docs/
However, looking into this, it seems that Calabash defaults to the HTML parser that you mentioned. There are two options now:
1. Edit [oXygen]/lib/xproc/calabash/engine.xml and add the line <system-property name="com.xmlcalabash.html-parser" value="tagsoup"/> inside the runtime element.
2. Add htmlparser-1.3.1.jar to [oXygen]/lib/xproc/calabash/ and edit the same engine.xml file to add a library entry pointing to this jar inside the runtime element: <library name="htmlparser-1.3.1.jar"/>
Option 1 is easier and is what I tested, but option 2 should work as well.
Best Regards, George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
On 1/18/12 2:23 PM, Jostein Austvik Jacobsen wrote:
I'm trying to use the XProc step p:unescape-markup to parse some HTML, but I don't think tagsoup got replaced with htmlparser[1] with the newer versions of calabash[2]? At least I can find tagsoup-1.2.jar but not htmlparser-1.3.1.jar in my oXygen directory.
Or have I just misconfigured something? (quite likely...)
(using <oXygen/> XML Editor 13.2, build 2012011017)
Regards Jostein
[1] http://about.validator.nu/htmlparser/ [2] http://lists.w3.org/Archives/Public/xproc-dev/2011Oct/0010.html
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com http://www.oxygenxml.com/mailman/listinfo/oxygen-user
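
The two options from George's reply can be sketched as a fragment of engine.xml. Only the runtime element, the system-property line, and the library line come from the thread; whatever else the real file contains (other entries, attributes on runtime, surrounding elements) is not shown here and should be left as shipped. The two options are alternatives, so add one or the other.

```xml
<!-- [oXygen]/lib/xproc/calabash/engine.xml (fragment only; the rest of the file is omitted) -->
<runtime>
  <!-- existing entries stay unchanged -->

  <!-- Option 1: tell Calabash to use the tagsoup parser that oXygen already bundles -->
  <system-property name="com.xmlcalabash.html-parser" value="tagsoup"/>

  <!-- Option 2 (instead of option 1): copy htmlparser-1.3.1.jar into
       [oXygen]/lib/xproc/calabash/ and register it as a library -->
  <library name="htmlparser-1.3.1.jar"/>
</runtime>
```

Option 1 needs no extra download, while option 2 keeps the parser that Calabash uses by default.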

Tried option 2 and it works. Thanks!
Jostein
2012/1/19 George Cristian Bina <george@oxygenxml.com>
Hi Jostein,
Calabash documents only tagsoup on its main documentation page as required for this step http://xmlcalabash.com/docs/
However, looking into this it seems that it defaults to the HTML parser that you mentioned. There are two options now:
1. Edit the engine.xml file from [oXygen]/lib/xproc/calabash/engine.xml and add a line <system-property name="com.xmlcalabash.html-parser" value="tagsoup"/> inside the runtime element.
2. Add the htmlparser-1.3.1.jar inside [oXygen]/lib/xproc/calabash/ and edit again the [oXygen]/lib/xproc/calabash/engine.xml file to add a library entry pointing to this jar in the runtime element <library name="htmlparser-1.3.1.jar"/>
Option 1 is easier and it is what I tested but option 2 should work as well.
Best Regards, George -- George Cristian Bina <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger http://www.oxygenxml.com
participants (2)
- George Cristian Bina
- Jostein Austvik Jacobsen