Hi Vincent,
I’m sorry, but this is currently not possible with BaseX, mainly due to various conceptual issues. The following query

  let $in :=
    <xml>
      <a xml:lang="en">yes</a>
      <a xml:lang="fr">oui</a>
    </xml>
  return $in//*[text() contains text '...' using stemming]

would, for example, require the specified query text to be stemmed differently depending on the language of each addressed target node. While I wouldn’t claim that this is impossible, it raises various open questions that would need to be solved first.
If indexing is a big issue, I would suggest for now that you preprocess your data and store the texts in separate databases, one for each language. You could even think about keeping your original file and creating additional language databases, which could then all be addressed from a single query.
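As a rough sketch of what such a single query could look like: the database names 'texts-en' and 'texts-fr' and the search terms below are made up for illustration, and I am assuming each database was created with the full-text index enabled and with the STEMMING and LANGUAGE options set for its language. Depending on your BaseX version, db:open may be called db:get instead.

```xquery
(: Hypothetical databases: 'texts-en' holds the English texts,
   'texts-fr' the French ones. Each sub-query names its language
   explicitly, so the query term is stemmed the same way as the
   indexed text of that database. :)
(
  db:open('texts-en')//*[text() contains text 'houses'
    using stemming using language 'en'],
  db:open('texts-fr')//*[text() contains text 'maisons'
    using stemming using language 'fr']
)
```

The results of both sub-queries come back as one sequence. Because the language is fixed per database, the question of how to stem the query text for mixed-language target nodes does not arise.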
Hope this helps,
Christian
___________________________
On Tue, May 21, 2013 at 11:54 AM, Vincent vbiragnet@1500signes.com wrote:
Dear BaseX Team,
I have some text in different languages stored in my XML files (tagged with xml:lang attributes) and I'd like to perform efficient full-text queries on it (using stemming, stopwords...). I planned to use a full-text index.
I can't figure out how to set up a full-text index for different languages in a given database. Is there a way to do it?
Thanks,
Vincent
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk