Re: [oXygen-user] Search for Characters in Unicode range

24 Jun 2016

Hello Tobias,

Note that the \u Unicode code point escape in the Java/Oxygen regex 
engine accepts exactly 4 hex digits.
If you write 5 digits, the 5th digit is interpreted separately as a 
literal character, which causes unwanted side effects.

For example, [\u0100-\u1F9FF] is interpreted as [\u0100-\u1F9F]|[F], 
so you are inadvertently also matching "F".
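
A minimal sketch of this behaviour in plain java.util.regex, outside of Oxygen. The class name and the helpers cp() and found() are hypothetical and exist only for illustration. The third pattern uses the braced \x{...} escape of java.util.regex (Java 7 and later), which is not mentioned in this thread; whether the Oxygen Find dialog accepts it is an assumption I have not checked here.

```java
import java.util.regex.Pattern;

class UnicodeRangeDemo {
    // The five-digit form from the thread. The u escape consumes only four
    // hex digits, so this class is the range U+0100..U+1F9F plus a literal 'F'.
    static final Pattern FIVE_DIGIT = Pattern.compile("[\\u0100-\\u1F9FF]");

    // The negated class suggested in the thread: anything outside U+0000..U+00FF.
    static final Pattern NEGATED = Pattern.compile("[^\\u0000-\\u00FF]");

    // Assumption: the braced x escape, which takes a full code point of any length.
    static final Pattern BRACED = Pattern.compile("[\\x{100}-\\x{1F9FF}]");

    // Hypothetical helper: builds a string holding the single given code point.
    static String cp(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Hypothetical helper: true if the pattern matches anywhere in the text.
    static boolean found(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found(FIVE_DIGIT, "F"));       // true (the unintended match)
        System.out.println(found(FIVE_DIGIT, "E"));       // false
        System.out.println(found(NEGATED, "F"));          // false
        System.out.println(found(NEGATED, cp(0x0100)));   // true
        System.out.println(found(NEGATED, cp(0x1F600)));  // true
        System.out.println(found(BRACED, "F"));           // false
        System.out.println(found(BRACED, cp(0x1F600)));   // true
    }
}
```

The negated class avoids the problem entirely because both of its endpoints fit in 4 hex digits.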

Regards,
Adrian

Adrian Buza
oXygen XML Editor and Author Support

Tel: +1-650-352-1250 ext.2020
Fax: +40-251-461482

On 24.06.2016 11:17, Tobias Fischer | pagina GmbH wrote:
...
Hi Andreas,
sure, this can be done with a basic regex query: [\u00D8-\u00F6]

And for your example: [\u0100-\u1F9FF]
Unfortunately, oXygen 18 seems to have a bug with this query (more 
precisely, with 5-digit hex codes), as it also matches characters below 
\u0100 (the code point that follows \u00FF).
However, you can also work with negation: [^\u0000-\u00FF]
And this seems to work fine :)

Regards,
Tobias

Tobias Fischer
XML and E-Book Development
Phone: +49 (0)7071 9876-44 · Fax: -22
Mail: tobias.fischer@pagina-tuebingen.de
pagina GmbH - Publikationstechnologien
Herrenberger Straße 51 | D-72070 Tübingen
www.pagina-online.de | www.parsx.de
Commercial Register Stuttgart - HRB 380249
Managing Director: Tobias Ott
On 24.06.2016 at 09:50, Andreas Wagner wrote:
...
Dear all,
In order to make sure that we have caught all special characters in 
an externally transcribed TEI/XML file, I would like to search for all 
characters above Unicode code point 0x00ff. Can this be done in the 
Regular Expression Find box? (I found the search for single Unicode 
code points with \u, \x etc., but can't figure out if this can be used 
to search for characters (not) in code point ranges.)
Thanks for any suggestion,
Andreas
_______________________________________________
oXygen-user mailing list
oXygen-user@oxygenxml.com
https://www.oxygenxml.com/mailman/listinfo/oxygen-user
