Hi Alex,

yes, that’s true. The collation feature was introduced in a later version of XQuery. XQuery Full Text was implemented quite some time before that.

The tokenization process reduces each non-ASCII character to a single alternative character. As a result, the following returns true…

  'anschließend' contains text 'anschliesend'

…whereas one would rather expect 'anschliessend' to be accepted.

You may have spotted Günter’s similar observation regarding the German long s (ſ) in [1]. It may be reasonable to bring all the normalization steps together (even if some of them are language-specific).

Best,
Christian

[1] https://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/msg11927.htm...

On Wed, Jul 31, 2019 at 1:42 PM Alexander Witzigmann <alexander.witzigmann@tanner.de> wrote:
Hi All,
the following query
declare default collation 'http://basex.org/collation?lang=de;strength=secondary';
"anschließend" contains text "anschliessend"
returns false
but the following query
declare default collation 'http://basex.org/collation?lang=de;strength=secondary';
"anschließend" = "anschliessend"
returns true.
and the following query
declare default collation 'http://basex.org/collation?lang=de;strength=secondary';
"anschließend" eq "anschliessend"
returns true.
I wonder why 'contains text' does not return true as well. None of the full-text related functions support ß = ss for the German collation.

Alex
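
Until the normalization steps are unified, one possible workaround is to expand ß to ss by hand before the full-text comparison. The following is only a minimal sketch: the function name local:expand-eszett is hypothetical and not part of BaseX, and it relies on nothing beyond the standard replace() function and the 'contains text' expression already used in this thread.

```xquery
(: Hypothetical helper, not a BaseX built-in: expand ß to ss so that
   the full-text tokenizer sees the same spelling on both sides. :)
declare function local:expand-eszett($s as xs:string) as xs:string {
  replace($s, 'ß', 'ss')
};

(: The left-hand string becomes 'anschliessend' before it is tokenized. :)
local:expand-eszett('anschließend') contains text 'anschliessend'
```

If the search term itself may contain ß, the same helper would presumably have to be applied to the right-hand side as well. This does not make 'contains text' collation-aware; it only covers the single ß = ss case.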