Hi Alex,
If I understood correctly, I used the Unicode code points from http://unicode.org/charts/PDF/U0370.pdf to produce the following mapping: [...]
Thanks; I have added your mappings and uploaded a new stable snapshot [1,2]. The following query should now return true:
"ά" contains text "α"
Next, I've added the Greek stemmer to our internal implementations. It can be invoked by setting "stemming" and "language"; e.g.:
"..." contains text "..." using stemming using language "el"
Since I don't know any Greek, I'm afraid I couldn't test it myself. Your feedback is welcome!
I am concerned, though, because this is not always the desired behavior. Sometimes (e.g. in an academic context) I could see the need for accent-sensitive searches.
In this particular case, you can switch off the removal of diacritics via:
"ά" contains text "α" using diacritics sensitive
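To illustrate the difference between the two modes outside of BaseX, here is a minimal Python sketch. It is not how BaseX works internally: BaseX relies on an explicit character mapping, whereas this sketch assumes that diacritics can be stripped via Unicode canonical decomposition (NFD). The function names are hypothetical, and the comparison is a plain substring check rather than the token-based matching of "contains text".

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD).

    Assumption: NFD decomposition stands in for an explicit mapping table.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains_text(haystack: str, needle: str, diacritics_sensitive: bool = False) -> bool:
    """Hypothetical stand-in for a diacritics-aware substring check."""
    if not diacritics_sensitive:
        haystack = strip_diacritics(haystack)
        needle = strip_diacritics(needle)
    return needle in haystack


alpha_tonos = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS
alpha = "\u03b1"        # GREEK SMALL LETTER ALPHA

print(contains_text(alpha_tonos, alpha))                             # insensitive mode
print(contains_text(alpha_tonos, alpha, diacritics_sensitive=True))  # sensitive mode
```

In the insensitive mode both strings are reduced to the bare alpha before comparing, so they match; in the sensitive mode the accented and unaccented letters remain distinct code points and do not match.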
OTOH, this would be reinventing the collation wheel (or rather an oversimplified version of it).
That's true. I'll write some more on that as a reply to Michael’s mail.
Christian
[1] http://docs.basex.org/wiki/Releases
[2] http://files.basex.org/releases/latest/