How to type a UTF-8 symbol in text as well as in author mode

When I soaked my keyboard with tea, I had to buy another one. I settled on the Logitech G910, which has some additional function keys. I already have three of them in use: for "text mode", "author mode", and a macro for citations. Now I want to use further keys for the non-breaking space (U+00A0) and the non-breaking hyphen (U+2011). For this I need to know how to enter those codes (and any other UTF-8 code) with a normal keyboard. Please advise.
I have tested Alt + 194 160 (on the keypad) but ended up with r; Alt + 160 gives á. It should be a "space". Using entities would differ between text and author mode, which I would like to avoid.
Bernhard
--
spitzhalde9 D-79853 lenzkirch bernhard.kleine@gmx.net www.b-kleine.com, www.urseetal.net - thunderbird with enigmail GPG key: D5257409 fingerprint: 08 B7 F8 70 22 7A FC C1 15 49 CA A6 C7 6F A0 2E D5 25 74 09

Hi Bernhard,
It seems that for "nbsp", which has the decimal equivalent 160, you need to press "ALT" and then type "0160"; the leading "0" seems to be important. The same probably applies to all other characters: type their decimal equivalent, but as four digits.
Regards, Radu
Radu Coravu <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger http://www.oxygenxml.com
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com https://www.oxygenxml.com/mailman/listinfo/oxygen-user
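The difference between Alt + 160 and Alt + 0160 reported above can be reproduced with a small sketch. It assumes a Western European Windows setup, where Alt codes without a leading zero are looked up in the OEM code page (CP850 or CP437) and codes with a leading zero in the ANSI code page (Windows-1252). The function name `alt_code` and its parameters are made up for this illustration; other locales use other code pages.

```python
# Sketch: why Alt+160 and Alt+0160 give different characters on Windows.
# Assumption: OEM code page CP850 (or CP437) and ANSI code page Windows-1252,
# as on a typical Western European system.

def alt_code(digits: str, oem: str = "cp850", ansi: str = "cp1252") -> str:
    """Return the character a single-byte numeric-keypad Alt code would produce."""
    value = int(digits)
    if not 0 <= value <= 255:
        raise ValueError("this sketch only covers single-byte Alt codes")
    # A leading zero selects the ANSI code page; without it the OEM page is used.
    codepage = ansi if digits.startswith("0") else oem
    return bytes([value]).decode(codepage)


print(repr(alt_code("160")))   # 'á'
print(repr(alt_code("0160")))  # '\xa0'
```

Under these assumptions byte 160 is á in the OEM code page and the non-breaking space in Windows-1252, which matches both observations in the thread.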

On Mon, Feb 19, 2018 at 09:33:28AM +0200, Oxygen XML Editor Support (Radu Coravu) wrote:
Hi Bernhard,
It seems that for "nbsp" which has the decimal equivalent "160" you would need to type "ALT" and then "0160", that leading "0" seems to be important. The same probably for all other characters, type their decimal equivalent but it needs to be four typed figures.
Oh, how quickly we forget certain things. :)
oXygen has had the ability to enter Unicode characters in the first plane by their four-character hexadecimal code point value since version 17.1. I can't recall what the default hotkey is for invoking it because I changed mine (back) to F8 as soon as I installed that version. I believe I've still got the plugin you guys provided me during my trial period for 17.0.
Anyway, if Bernhard is happy with using hex instead of int, that's the solution instead of the Windows alt sequences (or the Mac alt/option sequences either, for that matter).
Accessing characters in planes beyond the first is difficult in most programs, including oXygen XML. Obviously XML can handle it, but the accessing problems are twofold:
1. Entering a code point made up of five or six hex digits on the remaining 16 planes (i.e. 0x10000 to 0x10ffff).
2. Rendering characters which can only be displayed using multiple fonts and guaranteeing font fallback capabilities.
I have only one program which can handle both of these natively for editing and that's GNU Emacs, but in those cases where I need to delve into the upper planes I can open a file from oXygen in Emacs and that'll do for now.
It might be worth having a look at extending the hex entry feature to enable a way to enter a hex value of greater than 2 bytes (4 hex digits), but oXygen takes that input differently to other programs and so it might be trickier. Emacs, LibreOffice and other programs work by activating the hex input function (it's "M-x insert-char" in Emacs) and then entering the code point hex value. In oXygen you enter the hex value as four characters in the document and then press the hotkey, which reads the preceding four characters and transforms them.
As for font fallback, there are pretty much no options for handling that in oXygen, but there are effective workarounds by doing sneaky things with CSS in the source files as well as the output formats.
I've got my own little Unicode cheat sheet which has been gradually growing over the last decade or so and covers most of this in more detail. Bear in mind two things: first, it's a personal cheat sheet that I only share because it often answers frequent questions I hear elsewhere; and second, it's a "living document" that gets updated frequently.
That said, it's here:
https://www.dropbox.com/s/8jifzcc8qks5cef/UnicodeNotes.pdf?dl=0
Or to download it:
https://www.dropbox.com/s/8jifzcc8qks5cef/UnicodeNotes.pdf?dl=1
It's only ever released as a PDF because of all the font/glyph embedding. It claims or attempts to export as PDF/A-1, but only to ensure font embedding, and it probably won't pass preflight checks (nor does it need to).
For those few readers of this list who also use Emacs, the last three pages of that file include those portions of my Emacs init file which specify the fallback fonts using fontset default. I've got coverage from 0x0000 to 0x2ffff and where things occasionally misbehave, they're easy to identify with the aid of the binding on F16 (i.e. M-x describe-char).
Finally, my current favourite code point checking tool, for any system with Perl installed, is unum.pl, available here:
https://www.fourmilab.ch/webtools/unum/
The current version of the cheat sheet discusses it on page 23, but here's a nice example of what it does:
bash-4.4$ unum.pl 0x1f926
   Octal  Decimal      Hex        HTML    Character   Unicode
 0374446   129318  0x1F926   &#129318;    "🤦"         FACE PALM
bash-4.4$
Obviously some of us can see that character properly and some can't, but you all know which it is.
Regards, Ben
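For readers without Perl at hand, the same kind of lookup can be approximated with Python's standard `unicodedata` module. This is only a rough sketch of the columns shown in the unum.pl output above, not a reimplementation of that tool; the function name `describe` and the dictionary keys are chosen for this illustration, and the character name returned depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def describe(code_point: int) -> dict:
    """Return octal, decimal, hex, HTML entity, character and name for a code point."""
    ch = chr(code_point)
    return {
        "octal": f"0{code_point:o}",
        "decimal": str(code_point),
        "hex": f"0x{code_point:X}",
        "html": f"&#{code_point};",
        "character": ch,
        # Falls back to a placeholder if this Python build does not know the name.
        "name": unicodedata.name(ch, "<unnamed>"),
    }


for cp in (0x00A0, 0x2011, 0x2192, 0x1F926):
    print(describe(cp))
```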

Hi,
Thanks for the reminder, Ben. Indeed, I forgot about this feature in Oxygen:
https://www.oxygenxml.com/doc/versions/19.1/ug-editor/topics/text-mode-actio...
which basically allows you to type the hex digits in Oxygen and then invoke the special "Convert Hexadecimal Sequence to Character" action.
Regards, Radu

Hi,
Another way to enter a special character, or in general any code fragment, is to use code templates, as documented at:
https://www.oxygenxml.com/doc/versions/19.1/ug-editor/topics/code-templates-...
Best Regards, George
--
George Cristian Bina <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger http://www.oxygenxml.com

The UTF-8 table at http://www.utf8-zeichentabelle.de/unicode-utf8-table.pl?start=8592 shows these first four lines:
Unicode code point   Character   UTF-8 (hex.)   Name
U+2190               ←           e2 86 90       LEFTWARDS ARROW
U+2191               ↑           e2 86 91       UPWARDS ARROW
U+2192               →           e2 86 92       RIGHTWARDS ARROW
U+2193               ↓           e2 86 93       DOWNWARDS ARROW
When I tried to convert a UTF-8 hex value in a simple document, using Ctrl-Shift-X, I got:
(not a valid hexadecimal sequence to change)
I also tried the 0x1F926 from Ben's example. The same error. What am I doing wrong?
These arrows would be a good example since they will be used.
Regards
Bernhard

You probably need to enter the code point’s hex value (0x2192) rather than its UTF-8 representation.
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@le-tex.de, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer / Managing Directors: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt

Hi Bernhard,
The action converts the Unicode code point, that is, 2190, and not its UTF-8 encoding, so just type 2190 and invoke the action.
Best Regards, George
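The behaviour described in the thread (type four hex digits, then invoke an action that replaces them with the character) can be sketched as follows. This is an illustration of the idea only, not oXygen's actual implementation; the function name `convert_trailing_hex` is hypothetical and the error text is modelled on the message quoted above.

```python
import re

# Exactly four hex digits at the very end of the text, as described in the thread.
_TRAILING_HEX = re.compile(r"[0-9A-Fa-f]{4}\Z")


def convert_trailing_hex(text: str) -> str:
    """Replace the four hex digits before the caret (end of text) with their character."""
    match = _TRAILING_HEX.search(text)
    if match is None:
        raise ValueError("not a valid hexadecimal sequence to change")
    return text[:match.start()] + chr(int(match.group(), 16))


print(convert_trailing_hex("left 2190"))   # left ←
print(convert_trailing_hex("right 2192"))  # right →
```

In this sketch the UTF-8 byte notation `e2 86 90` is rejected because its last four characters include a space, so they are not four consecutive hex digits.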

The (hex.) column is the UTF-8 encoding of the character, that is, the sequence of bytes.
The actual Unicode character number is the value in the first column, e.g., U+2190.
So you should be able to type 2190 and get the character you want.
Unicode is the character set, and the character numbers (code points) are independent of how the characters are encoded.
The encoding is how the characters are translated to bytes when written as a byte sequence.
The Unicode standard defines a number of encodings, including UTF-8 and UTF-16.
So there are no “UTF-8 characters”, only UTF-8 encodings of Unicode characters.
The UTF-8 encoding was designed so that it is identical to ASCII for the first 128 characters (U+0000 to U+007F). Beyond that, a character takes two bytes up to U+07FF, three bytes up to U+FFFF, and four bytes above that.
Cheers,
E.
--
Eliot Kimber
http://contrext.com
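The distinction between a code point and its UTF-8 bytes can be checked directly in Python. The helper name `utf8_bytes` is made up for this sketch; `chr` and `str.encode` are standard.

```python
def utf8_bytes(code_point: int) -> list:
    """Return the UTF-8 encoding of a code point as a list of byte values."""
    return list(chr(code_point).encode("utf-8"))


for cp in (0x0041, 0x00A0, 0x2011, 0x2190, 0x1F926):
    encoded = utf8_bytes(cp)
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(f"U+{cp:04X}  utf-8: {hex_bytes}  decimal: {encoded}")
```

For U+00A0 this gives the bytes c2 a0, that is 194 160 in decimal, which are the two numbers tried with Alt at the start of the thread: they are the UTF-8 bytes of the non-breaking space, not its code point.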

That was a difficult birth. But it works now. Thanks you so much for your patience. Now It will be very easy to program some function keys on the Logitech keyboard to shortcut the use of the mouse. Best regards Bernhard Am 19.02.2018 um 16:28 schrieb Eliot Kimber:
The (hex.) column is the UTF-8 encoding of the character, that is, the sequence of bytes.
The actual Unicode character number is the value in the first column, e.g., \u2190.
So you should be able to type 2190 and get the character you want.
Unicode is the character set and the character numbers (code points) are independent of how the characters are encoded.
The encoding is how the characters are translated to bytes when written as a byte sequence.
The Unicode standard defines a number of encodings, including UTF-8 and UTF-16.
So there are not “UTF-8 characters”, only UTF-8 encodings of Unicode characters.
The UTF-8 encoding was designed so that it is identical to ASCII for the first 127 or 255 characters (depending on which version of ASCII you’re looking at). But after character 255 it takes at least 3 bytes to encode a character.
Cheers,
E.
--
Eliot Kimber
*From: *oXygen-user <oxygen-user-bounces@oxygenxml.com> on behalf of Bernhard Kleine <bernhard.kleine@gmx.net> *Date: *Monday, February 19, 2018 at 9:17 AM *To: *<oxygen-user@oxygenxml.com> *Subject: *Re: [oXygen-user] How to type an UTF8 symbol in text as well as in author mode
The UTF8 table at http://www.utf8-zeichentabelle.de/unicode-utf8-table.pl?start=8592shows this first four lines.
*Unicode Codepos.*
*Zeichen*
*UTF-8 (hex.)*
*Name*
U+2190
←
e2 86 90
LEFTWARDS ARROW
U+2191
↑
e2 86 91
UPWARDS ARROW
U+2192
→
e2 86 92
RIGHTWARDS ARROW
U+2193
↓
e2 86 93
DOWNWARDS ARROW
When I tried to change a utf8 hex value in a simple doc, using Ctrl-Shift-X, I get:
cid:part1.482BC927.158A9BBD@gmx.net
(not a valid hexadecimal sequence to change)
I also tried the 0x1F926 from Bens example below. The same error. What do I wrong?
These arrows would be a good example since they will be used.
Regards
Bernhard
Am 19.02.2018 um 10:03 schrieb Oxygen XML Editor Support (Radu Coravu) :
Hi,
Thanks for the reminder Ben. Indeed I forgot about this feature in Oxygen:
https://www.oxygenxml.com/doc/versions/19.1/ug-editor/topics/text-mode-actio...
which basically allows you to type away the hex digits in Oxygen and then invoke the special "Convert Hexadecimal Sequence to Character" action.
Regards, Radu
Radu Coravu <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger http://www.oxygenxml.com
On 2/19/2018 10:56 AM, Ben McGinnes wrote:
On Mon, Feb 19, 2018 at 09:33:28AM +0200, Oxygen XML Editor Support (Radu Coravu) wrote:
Hi Bernhard,
It seems that for "nbsp" which has the decimal equivalent "160" you would need to type "ALT" and then "0160", that leading "0" seems to be important. The same probably for all other characters, type their decimal equivalent but it needs to be four typed figures.
Oh, how quickly we forget certain things. :)
oXygen has had the ability to enter UTF-8 characters in the first plane by their four character hexadecimal code point value since version 17.1. I can't recall what the default hotkey is for invoking it because I changed mine (back) to F8 as soon as I installed that version. I believe I've still got the plugin you guys provided me during my trial period for 17.0.
Anyway, if Bernhard is happy with using hex instead of int, that's the solution instead of the Windows alt sequences (or the Mac alt/option sequences either, for that matter).
....
    bash-4.4$ unum.pl 0x1f926
       Octal  Decimal      Hex    HTML    Character   Unicode
     0374446   129318  0x1F926    🤦      "🤦"        FACE PALM
    bash-4.4$
Obviously some of us can see that character properly and some can't, but you all know which it is.
Regards, Ben

My presentation from XML Prague this year should also cover this aspect about conversions between bytes on disk and characters.
https://www.youtube.com/watch?v=JDOEMQD32Ss
Regards, Radu
Radu Coravu <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger http://www.oxygenxml.com
On 2/19/2018 5:28 PM, Eliot Kimber wrote:
The (hex.) column is the UTF-8 encoding of the character, that is, the sequence of bytes.
The actual Unicode character number is the value in the first column, e.g., \u2190.
So you should be able to type 2190 and get the character you want.

On Mon, Feb 19, 2018 at 04:17:26PM +0100, Bernhard Kleine wrote:
The UTF8 table at http://www.utf8-zeichentabelle.de/unicode-utf8-table.pl?start=8592 shows these first four lines.

Unicode code point   Character   UTF-8 (hex)   Name
U+2190               ←           e2 86 90      LEFTWARDS ARROW
U+2191               ↑           e2 86 91      UPWARDS ARROW
U+2192               →           e2 86 92      RIGHTWARDS ARROW
U+2193               ↓           e2 86 93      DOWNWARDS ARROW

When I tried to convert a UTF-8 hex value in a simple document using Ctrl-Shift-X, I got:
(not a valid hexadecimal sequence to change)
I also tried the 0x1F926 from Ben's example below. The same error. What am I doing wrong?
The face palm character probably isn't a good thing for you to test with since it's technically an emoji and is from one of the upper planes (it's near the end of the second plane out of 17). Your characters are all in the first plane and, prior to the popularisation of emojis, that's where all the action usually is.
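As a small illustration of the "planes" mentioned here (a sketch added for clarity, not something from the original message; the function name plane is hypothetical): Unicode code points run from U+0000 to U+10FFFF, and the plane number is simply the code point divided by 0x10000, which gives the 17 planes numbered 0 to 16.

```python
def plane(code_point: int) -> int:
    """Return the Unicode plane (0 to 16) that a code point belongs to."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    # Each plane holds 0x10000 (65536) code points.
    return code_point >> 16


if __name__ == "__main__":
    for cp in (0x00A0, 0x2011, 0x2190, 0x1F926):
        print(f"U+{cp:04X} is in plane {plane(cp)}")
```

The arrows, the no-break space and the non-breaking hyphen all land in plane 0 (the "first plane"), while U+1F926 lands in plane 1 (the "second plane"), which is why a four-digit hex value cannot express it.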
These arrows would be a good example since they will be used.
Use the four-character hexadecimal value of the code point without either a leading "U+" or a leading "0x" if you are doing this in oXygen. It does need to be four characters, there must be a space before it, and the cursor needs to be immediately adjacent to the end of it. So for the leftwards arrow you would enter 2190 and then, with the cursor next to the 0 (as it would be if you'd just typed it), press the key sequence to convert it to the character and it'll do it.

Getting unum.pl to display it would be:

    bash-4.4$ unum.pl 0x2190
       Octal  Decimal      Hex    HTML         Character   Unicode
      020620     8592   0x2190    ←,←,←,←,←    "←"         LEFTWARDS ARROW
    bash-4.4$

Note that that's offset too far because there are multiple methods of displaying it in HTML (which is why tests with arrows aren't always the best after all, but you did want this).

LibreOffice uses either a compose special characters hotkey or menu with options on what the hotkey actually does. I have it configured to let me enter the hexadecimal directly. It will provide the same range of characters as oXygenXML except I press the hotkey first and then enter the four characters. As with oXygen I must enter all four characters. Most other programs are the same as LibreOffice in that regard, including the IRC client hexchat.

Emacs shares some similarity with those programs in that I invoke the conversion function first and then enter the hexadecimal character without the leading "0x" or "U+", but I only have to enter the hex value and I can enter from one to six characters. So if I wanted to create a u with an umlaut (i.e. this little character: ü) then I only need to press my key binding (F8) and then enter fc; whereas in oXygenXML, LibreOffice, Hexchat and most other programs I would have needed to enter 00fc.

This sort of thing, however, is why I started keeping a cheat sheet in the first place.

Regards, Ben
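For anyone building such a cheat sheet, the conversion these editors perform can be sketched in a few lines of Python. This is only an illustration of the idea, not how oXygen, LibreOffice or Emacs actually implement it; the function name hex_to_char is made up, and unlike oXygen it tolerates a leading "U+" or "0x" and accepts one to six hex digits, as Emacs is described as doing above.

```python
import string


def hex_to_char(text: str) -> str:
    """Convert a hex code point such as '2190', 'fc', 'U+00A0' or '0x1F926' to its character."""
    digits = text.strip()
    # Assumption for this sketch: an optional "U+" or "0x" prefix is simply dropped.
    if digits[:2].lower() in ("u+", "0x"):
        digits = digits[2:]
    if not 1 <= len(digits) <= 6 or any(d not in string.hexdigits for d in digits):
        raise ValueError(f"not a valid hexadecimal sequence: {text!r}")
    # chr() raises ValueError for values above 0x10FFFF.
    return chr(int(digits, 16))


if __name__ == "__main__":
    for value in ("2190", "fc", "00fc", "U+2011", "0x1F926"):
        print(f"{value:>8} -> {hex_to_char(value)}")
```

Leading zeros do not change the result, so fc and 00fc name the same character; the four-digit requirement in oXygen and LibreOffice is a property of their input handling, not of the code point itself.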
participants (7)
- Ben McGinnes
- Bernhard Kleine
- Eliot Kimber
- George Bina
- Imsieke, Gerrit, le-tex
- Oxygen XML Editor Support (Radu Coravu)
- Radu Coravu