Hi Guillaume,

When you say the characters are missing, I presume you are referring to the result display in the BaseX GUI.

U+E16E lies in the private use area, U+E000 through U+F8FF. The BaseX GUI uses code points in this range for internal purposes, so it suppresses them when they occur in actual content that is being displayed. This is a limitation of the GUI implementation.
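
As a quick check of the arithmetic, here is a small Python sketch (not BaseX code; the function name is only for illustration) confirming that the decimal reference 57710 from your message is U+E16E and falls within that range:

```python
# Private use area of the Basic Multilingual Plane, per the Unicode standard.
PUA_START = 0xE000
PUA_END = 0xF8FF


def is_bmp_private_use(code_point: int) -> bool:
    """Return True if code_point lies in U+E000..U+F8FF (illustrative helper)."""
    return PUA_START <= code_point <= PUA_END


code_point = 57710  # decimal value taken from the character reference &#57710
print(f"U+{code_point:04X}", is_bmp_private_use(code_point))  # prints "U+E16E True"
```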

However, this affects only the GUI. The BaseX engine handles U+E16E just like any other valid Unicode code point; it is not dropped from the document. So when you send your results somewhere other than the GUI result panes, e.g. to files or REST responses, the affected code points should be present as expected.
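
If you want to verify this outside the GUI, one option is to write the result to a file and inspect it with a short script. The following is only a sketch in Python, under the assumptions that the output is encoded as UTF-8 and that the character is written literally rather than as a numeric character reference; the helper names are hypothetical:

```python
from pathlib import Path
from typing import List


def private_use_code_points(text: str) -> List[str]:
    """Return the BMP private use characters found in text as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text if 0xE000 <= ord(ch) <= 0xF8FF]


def check_file(path: str) -> List[str]:
    """Scan a file (assumed to be UTF-8) for private use code points."""
    return private_use_code_points(Path(path).read_text(encoding="utf-8"))


# In-memory demonstration with a made-up element containing U+E16E.
sample = "<w>a\uE16Eb</w>"
print(private_use_code_points(sample))  # prints ['U+E16E']
```

Calling `check_file` on the file you exported should list U+E16E if the code point survived serialization.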

Hope this helps.

Best regards,
Gunther

Sent: Tuesday, 9 June 2026 at 11:58

From: "Guillaume Porte via BaseX-Talk" <basex-talk@mailman.uni-konstanz.de>

To: basex-talk@mailman.uni-konstanz.de

Subject: [basex-talk] unicode characters missing in the output

Hi,

I'm trying to figure out a character display issue when using basex.

I have a document originally created by using a unicode font called
fedorovsk.otf (https://sci.ponomar.net/fonts.html)

When I open the file inside a text editor, I can see all characters and
ligatures (since the font is installed on the system), and it also works
in the browser when using the font with @font-face.

When I load the document with basex (either by specifying the path
or after indexing it), it returns the word with certain characters missing.

You can see the difference here : https://gprt.fr/unicode/test.html

Apparently, the codes for the first entity are &#57710 or 

When adding those codes to the XML document, they disappear in the output.

Do you have any idea on what's going on and how to fix it?

Regards

Guillaume