Is Tamil among the languages for which a specific collation is available?

Greetings,

The title of this message says it all: Is Tamil among the languages for which a specific collation is available when using Oxygen?

I have to sort items in Tamil alphabetical order, but when I specify the lang="ta" attribute in my sort command, the order I obtain is the one based on the Unicode codepoint collation, which is not what one expects when sorting Tamil words.

The same thing happens if I define a parameter such as

<xsl:param name="sorting-collation" select="'http://saxon.sf.net/collation?lang=ta'"/>

and then use it in a sort command:

<xsl:sort select="." collation="{$sorting-collation}"/>

To give a specific example, the following short list is extracted from a much longer list, which is part of an HTML file created by applying an XSLT file (containing sort commands) to an XML file:

<ul>
<li>அத்தத்தின் பெயர் [head-word ABOVE 7 items]</li>
<li>அனந்தன் பெயர் [head-word ABOVE 1 items]</li>
<li>அனற்பொறியின் பெயர் [head-word ABOVE 1 items]</li>
<li>அனற்றிரளின் பெயர் [head-word ABOVE 1 items]</li>
<li>அனுடத்தின் பெயர் [head-word ABOVE 7 items]</li>
<li>அமரமாதர் பெயர் [head-word ABOVE 2 items]</li>
<li>அரக்கர் பெயர் [head-word ABOVE 7 items]</li>
<li>அருகன் பெயர் [head-word ABOVE 43 items]</li>
</ul>

However, this is not the proper Tamil dictionary order, which should be:

<ul>
<li>அத்தத்தின் பெயர் [head-word ABOVE 7 items]</li>
<li>அமரமாதர் பெயர் [head-word ABOVE 2 items]</li>
<li>அரக்கர் பெயர் [head-word ABOVE 7 items]</li>
<li>அருகன் பெயர் [head-word ABOVE 43 items]</li>
<li>அனந்தன் பெயர் [head-word ABOVE 1 items]</li>
<li>அனற்பொறியின் பெயர் [head-word ABOVE 1 items]</li>
<li>அனற்றிரளின் பெயர் [head-word ABOVE 1 items]</li>
<li>அனுடத்தின் பெயர் [head-word ABOVE 7 items]</li>
</ul>

Any suggestions would be appreciated.

-- Jean-Luc Chevillard (currently in Pondicherry, India)
https://univ-paris-diderot.academia.edu/JeanLucChevillard
https://twitter.com/JLC1956

Hi Jean-Luc,

I tried working a little with your samples on my side, and I think I managed to make this work.

In the Oxygen libraries directory "OXYGEN_INSTALL_DIR\lib" there is a JAR library called "icu4j.jar". It is an incomplete version of the larger ICU4J library, which can be downloaded from:

http://site.icu-project.org/download/59#TOC-ICU4J-Download

Once you have "icu4j-59_1.jar", move the original "icu4j.jar" out of the Oxygen library folder to some other place and put this larger JAR library in its place.

Also, the xsl:sort in the XSLT worked only when I used this syntax:
<xsl:sort select="." collation="http://www.w3.org/2013/collation/UCA?lang=ta"/>
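For context, here is a minimal sketch of a complete stylesheet built around that xsl:sort instruction. The input structure (a <list> element containing <item> children) is purely an assumption made for illustration and is not taken from the original files. The sketch also assumes the full ICU4J library has been put in place as described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed (hypothetical) input: <list><item>...</item><item>...</item></list> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/list">
    <ul>
      <xsl:for-each select="item">
        <!-- Sort with the UCA collation tailored for Tamil (lang=ta),
             using the same collation URI as in the line above -->
        <xsl:sort select="."
            collation="http://www.w3.org/2013/collation/UCA?lang=ta"/>
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a list holding the eight head-words from the question, this should yield the second (dictionary) order rather than the codepoint order, provided the processor can actually load the Tamil collation data.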
I do not know much about the values that the collation attribute takes, but this collation URI appears in one of the examples on the Saxonica documentation page:

http://www.saxonica.com/html/documentation/xsl-elements/sort.html

Regards,
Radu

Radu Coravu
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

(In reply to Jean-Luc Chevillard's message of 9/19/2017, 5:15 PM.)

_______________________________________________
oXygen-user mailing list
oXygen-user@oxygenxml.com
https://www.oxygenxml.com/mailman/listinfo/oxygen-user

Hello Radu,

Thanks a lot for this. I shall try it.

This gives me courage! :-)

-- Jean-Luc (in Pondy)
https://univ-paris-diderot.academia.edu/JeanLucChevillard
https://twitter.com/JLC1956

(In reply to the message of 20/09/2017 18:14 from Oxygen XML Editor Support (Radu Coravu).)

Dear Radu,

This morning in Pondicherry I tried to locate "icu4j.jar". I was successful on my Ubuntu laptop (running Ubuntu 14.04) but unsuccessful on my Windows laptop (running Windows 7). I am part of a research team where both types of computers are represented, and I have to be conversant with both.

I usually run Oxygen on the Windows laptop (where I have a bigger screen :-), and my version there is <oXygen/> XML Editor 19.0, build 2017042020.

I occasionally run Oxygen on my Ubuntu laptop, where I have not yet upgraded to Oxygen 19; my version there is <oXygen/> XML Editor 18.0, build 2016051118.

Since I could not locate "icu4j.jar" on the Windows laptop, I could apply your solution only on the Ubuntu laptop, where it worked perfectly. Thanks a lot for your timely and efficient help.

All that remains for me to be happy is for you to tell me how to handle the Windows 7 laptop. As far as I can see, inside the "Program Files" folder there is a folder called "Oxygen XML Editor 19". That folder contains a folder called ".install4j", which contains several .jar files, but none is called "icu4j.jar", and a search of the computer's hard disk does not reveal anything.

Thanks in advance for any additional pointers.

Of course, I could use the Ubuntu machine as my main machine ;-) but I can't expect everyone else to do the same ... ;-)

Best wishes

-- Jean-Luc (in Pondy)
https://univ-paris-diderot.academia.edu/JeanLucChevillard
https://twitter.com/JLC1956

(In reply to the message of 20/09/2017 18:14 from Oxygen XML Editor Support (Radu Coravu).)

POST-SCRIPTUM

I have to apologize for sending an unnecessary request.

As pointed out just now by Thilak Bhaskaran, who works in our EFEO research center, the file "icu4j.jar" does exist on the Windows laptop as well. It is located in the "lib" directory of the folder called "Oxygen XML Editor 19".

I have replaced it with "icu4j-59_1.jar" and restarted Oxygen, and now the Tamil alphabetical order is obtained perfectly.

That is great!

-- Jean-Luc (in Pondy)
https://univ-paris-diderot.academia.edu/JeanLucChevillard
https://twitter.com/JLC1956
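As a follow-up sketch, the parameterised form from the original question can be combined with the collation URI that worked in this thread. The <list>/<item> input structure is hypothetical. The ";fallback=no" option is part of the UCA collation URI syntax in the W3C XPath functions specification and is meant to make the processor report an error instead of silently substituting another collation; whether a given Saxon or Oxygen build honours it in this situation is an assumption that should be checked locally.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed (hypothetical) input: <list><item>...</item><item>...</item></list> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Collation URI held in a parameter, as in the original question,
       but pointing at the UCA collation for Tamil.
       fallback=no asks for an error if the collation is not supported
       (assumption: the processor in use honours this option). -->
  <xsl:param name="sorting-collation"
      select="'http://www.w3.org/2013/collation/UCA?lang=ta;fallback=no'"/>

  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/list">
    <ul>
      <xsl:for-each select="item">
        <!-- The collation attribute is an attribute value template,
             so the parameter is expanded inside curly braces -->
        <xsl:sort select="." collation="{$sorting-collation}"/>
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the URI in one parameter means the collation can be switched for the whole stylesheet from a single place, or overridden when the transformation is invoked.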