
Hi,

Thanks for the reminder, Ben. Indeed, I forgot about this feature in Oxygen:

https://www.oxygenxml.com/doc/versions/19.1/ug-editor/topics/text-mode-actio...

which basically allows you to type away the hex digits in Oxygen and then invoke the special "Convert Hexadecimal Sequence to Character" action.

Regards,
Radu

Radu Coravu
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 2/19/2018 10:56 AM, Ben McGinnes wrote:
On Mon, Feb 19, 2018 at 09:33:28AM +0200, Oxygen XML Editor Support (Radu Coravu) wrote:
Hi Bernhard,
It seems that for "nbsp", which has the decimal equivalent "160", you need to hold "ALT" and then type "0160"; that leading "0" seems to be important. The same probably applies to all other characters: type their decimal equivalent, but padded to four digits.
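As a rough illustration of the four-digit rule, here is a minimal Python sketch that produces the digits to type for a given character. It assumes that leading-zero Alt sequences index the Windows-1252 code page (which agrees with Unicode for 160 to 255) rather than Unicode as a whole; `alt_sequence` is a hypothetical helper name, not something provided by Windows or oXygen.

```python
def alt_sequence(ch: str) -> str:
    """Return the zero-padded decimal digits to type while holding ALT.

    Assumption: leading-zero Alt sequences index the Windows-1252 code
    page, so only characters present in that code page are supported.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    try:
        encoded = ch.encode("cp1252")
    except UnicodeEncodeError:
        raise ValueError(f"U+{ord(ch):04X} is not in Windows-1252") from None
    # Windows-1252 is a single-byte encoding, so encoded[0] is the whole value.
    return f"{encoded[0]:04d}"


print(alt_sequence("\u00a0"))  # no-break space -> 0160
print(alt_sequence("\u00e9"))  # e with acute accent -> 0233
```

Anything outside that code page raises an error here, which is one reason the hex entry route is more general.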
Oh, how quickly we forget certain things. :)
oXygen has had the ability to enter Unicode characters in the first plane (the Basic Multilingual Plane) by their four-character hexadecimal code point value since version 17.1. I can't recall what the default hotkey is for invoking it because I changed mine (back) to F8 as soon as I installed that version. I believe I've still got the plugin you guys provided me during my trial period for 17.0.
Anyway, if Bernhard is happy with using hex instead of int, that's the solution instead of the Windows alt sequences (or the Mac alt/option sequences either, for that matter).
Accessing characters in multiplanes beyond the first is difficult in most programs, including oXygenXML. Obviously XML can handle it, but the accessing problems are twofold:
1. Entering a code point made up of five or six hex digits on the remaining 16 planes (i.e. 0x10000 to 0x10ffff).
2. Rendering characters which can only be displayed using multiple fonts and guaranteeing font fallback capabilities.
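To make the plane arithmetic in the first point concrete, here is a small Python sketch. The helper names `plane_of` and `hex_digits_needed` are made up for illustration; the only facts relied on are that each plane holds 0x10000 code points and that Unicode ends at 0x10FFFF.

```python
MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode


def plane_of(code_point: int) -> int:
    """Return the Unicode plane (0 to 16) containing the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"{code_point:#x} is outside the Unicode range")
    return code_point >> 16  # each plane holds 0x10000 code points


def hex_digits_needed(code_point: int) -> int:
    """Return how many hex digits are needed, with a minimum of four."""
    return max(4, len(f"{code_point:X}"))


for cp in (0x00A0, 0x1F926, 0x10FFFF):
    print(f"U+{cp:04X}: plane {plane_of(cp)}, {hex_digits_needed(cp)} hex digits")
```

Plane 0 always fits in four digits; planes 1 to 15 need five and plane 16 needs six, which is exactly where a fixed four-character entry method runs out.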
I have only one program which can handle both of these natively for editing and that's GNU Emacs, but in those cases where I need to delve into the upper multiplanes I can open a file from oXygen in Emacs and that'll do for now.
It might be worth having a look at extending the hex entry feature to enable a way to enter a hex value longer than 3 bytes in UTF-8 (i.e. more than 4 hex characters), but oXygen takes that input differently to other programs and so it might be trickier. Emacs, LibreOffice and other programs work by activating the hex input function (it's "M-x insert-char" in Emacs) and then entering the code point hex value. In oXygen you enter the hex value as four characters in the document and then press the hotkey which reads the preceding four characters and transforms them.
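A rough Python sketch of the "read the preceding characters and transform them" approach may show why longer values are trickier. This is not oXygen's implementation, just a model of the behaviour described; `convert_trailing_hex` and its `width` parameter are hypothetical.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def convert_trailing_hex(text: str, width: int = 4) -> str:
    """Replace the last `width` hex digits of `text` with that character."""
    if width < 1:
        raise ValueError("width must be at least 1")
    tail = text[-width:]
    if len(tail) < width or not all(c in HEX_DIGITS for c in tail):
        return text  # nothing to convert
    code_point = int(tail, 16)
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return text  # out of range, or a lone surrogate
    return text[:-width] + chr(code_point)


print(repr(convert_trailing_hex("a00a0")))            # "a" followed by U+00A0
print(repr(convert_trailing_hex("x1f926")))           # width 4 only reads "f926"
print(repr(convert_trailing_hex("x1f926", width=5)))  # "x" followed by U+1F926
```

With a fixed width of four the editor never has to guess where the value starts. Once five or six digits are allowed, "a00a0" could mean the letter "a" followed by U+00A0 or the single code point U+A00A0, so the user would need some way to state the width, or the editor would need a delimiter.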
As for font fallback, there are pretty much no options for handling that in oXygen, but there are effective workarounds by doing sneaky things with CSS in the source files as well as the output formats.
I've got my own little Unicode cheat sheet which has been gradually growing over the last decade or so and covers most of this in more detail. Bear in mind two things: first, it's a personal cheat sheet that I only share because it often answers frequent questions I hear elsewhere; and second, it's a "living document" that gets updated frequently.
That said, it's here:
https://www.dropbox.com/s/8jifzcc8qks5cef/UnicodeNotes.pdf?dl=0
Or to download it:
https://www.dropbox.com/s/8jifzcc8qks5cef/UnicodeNotes.pdf?dl=1
It's only ever released as a PDF because of all the font/glyph embedding. It claims or attempts to export as PDF/A-1, but only to ensure font embedding, and it probably won't pass preflight checks (nor does it need to).
For those few readers of this list who also use Emacs, the last three pages of that file include those portions of my Emacs init file which specify the fallback fonts using fontset default. I've got coverage from 0x0000 to 0x2ffff and where things occasionally misbehave, they're easy to identify with the aid of the binding on F16 (i.e. M-x describe-char).
Finally, my current favourite code point checking tool, for any system with Perl installed, is unum.pl, available here:
https://www.fourmilab.ch/webtools/unum/
The current version of the cheat sheet discusses it on page 23, but here's a nice example of what it does:
bash-4.4$ unum.pl 0x1f926
   Octal  Decimal      Hex        HTML    Character   Unicode
 0374446   129318  0x1F926   &#129318;    "🤦"        FACE PALM
bash-4.4$
Obviously some of us can see that character properly and some can't, but you all know which it is.
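For anyone without Perl to hand, a similar lookup can be sketched in a few lines of Python using the standard `unicodedata` module. This is not a port of unum.pl, only an approximation of the columns shown above; `describe` is a made-up name, and the character name returned depends on the Unicode database bundled with the Python version in use.

```python
import unicodedata


def describe(code_point: int) -> dict:
    """Return the octal, decimal, hex, HTML and name forms of a code point."""
    ch = chr(code_point)
    return {
        "octal": f"0{code_point:o}",
        "decimal": code_point,
        "hex": f"0x{code_point:X}",
        "html": f"&#{code_point};",
        "character": ch,
        # Fall back to a placeholder if this Python build does not know the name.
        "name": unicodedata.name(ch, "<no name in this Unicode database>"),
    }


for key, value in describe(0x1F926).items():
    print(f"{key:>9}: {value}")
```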
Regards, Ben
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com https://www.oxygenxml.com/mailman/listinfo/oxygen-user