Hi Cerstin,
For testing purposes, you could use the ft:search function (see http://docs.basex.org/wiki/Full-Text_Module#ft:search ). It automatically applies the correct options.
@Christian: I could not find it on the wiki, but if I remember correctly, the full-text index is not used if the options in the query do not match the options used when the database was created (wildcards, stemming, etc.).
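Something along these lines might do as a first test. This is only a sketch: "texts" is a placeholder database name, and the exact signature of ft:search has changed between BaseX versions, so please check the linked page for the release you are running.

```xquery
(: "texts" is a hypothetical database name, not taken from this thread. :)
for $text in ft:search("texts", "Korb geben")
(: ft:search returns the matching text nodes; step up to the parent element. :)
return $text/..
```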
Regards,
Maximilian
Dear Christian,
Thanks for your quick answer.
Quoting Christian Grün <christian.gruen@gmail.com>:
To make inspection of results easier, I added ft:mark. A collection of
only a dozen texts (about 71 MB) with a full-text index for German,
optimized, etc., works quite well. However, the example query needs more than
9 s, which is rather slow.
First of all, it might be interesting to hear what the query compiler
does. Have you looked at the QueryInfo panel to check whether the
full-text index is applied? If yes, you should find something like..
Compiling:
- ...
- applying full-text index
- ...
..in the info panel.
The interesting thing is that for my original query
//(p|l) [text() contains text "Korb geben" using stemming using language "de"]
there is no Compiling information at all, only Timing, Result (number of results), and Query Plan.
If I change '(p|l)' to 'p', I get information on Compiling, but only:
Compiling:
- optimizing descendant-or-self step(s)
Result: root()/descendant::{http://www.tei-c.org/ns/1.0}p[text() contains text "Korb geben"]
Apparently, the index is not used.
Another hint: to enable index optimizations, your query...
//(p|l) [text() contains text "Korb geben" using stemming using language "de"]
..may have to be rewritten as follows:
//*[text() contains text "Korb geben"][self::p or self::l]
OK, this cuts the total time in half, but I only see:
Compiling:
- optimizing descendant-or-self step(s)
Result: root()/descendant::*[text() contains text "Korb geben"][self::{http://www.tei-c.org/ns/1.0}p or self::{http://www.tei-c.org/ns/1.0}l]
Memory usage is also reduced a bit, so this definitely helps. However, if I include 'using stemming using language "de"', the total time is almost the same.
I see no way to force the use of the index. I am using BaseX 7.0.2; maybe this is a bug? I will try the 7.1 beta.
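For reference, a sketch of the rewritten query with the stemming and language options added back and the result wrapped in ft:mark. It assumes that the database is opened as the query context and that the TEI namespace is declared as the default element namespace, as the compiled Result lines above suggest.

```xquery
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Assumes an opened database as context item.
   ft:mark puts marker elements around the matched tokens. :)
ft:mark(
  //*[text() contains text "Korb geben" using stemming using language "de"]
     [self::p or self::l]
)
```

If the index is picked up, the Compiling section should contain the line "applying full-text index".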
Best regards
Cerstin
--
Dr. phil. Cerstin Mahlow
Universität Basel
Deutsches Seminar
Nadelberg 4
4051 Basel
Schweiz
Tel: +41 61 267 07 65
Fax: +41 61 267 34 40
Mail: cerstin.mahlow@unibas.ch
Web: http://www.oldphras.net