Hi Cerstin,
for testing purposes you could use the ft:search function (see http://docs.basex.org/wiki/Full-Text_Module#ft:search ). This automatically applies the correct options.
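A minimal sketch of such a test call, under stated assumptions: "oldphras" is a hypothetical database name, and the argument types of ft:search differ between BaseX releases, so please check the wiki page above for the version you run.

```xquery
(: Sketch only: "oldphras" is a hypothetical database name. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: ft:search reads text nodes from the full-text index of the database. :)
for $text in ft:search("oldphras", "Korb geben")
let $hit := $text/..
(: Keep only hits whose parent element is a TEI p or l. :)
where $hit/self::tei:p or $hit/self::tei:l
return $hit
```

If this returns quickly while the "contains text" query stays slow, that would suggest the index itself is fine and only the query rewriting is failing.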
@Christian: I could not find it on the wiki, but if I remember correctly, the full-text index is not used if the options in the query do not match the options used when creating the database (wildcards, stemming, etc.).
Regards,
Maximilian
On 13 January 2012 at 16:05, Cerstin Mahlow cerstin.mahlow@unibas.ch wrote:
Dear Christian,
thanks for your quick answer.
Quoting Christian Grün christian.gruen@gmail.com:
To make inspection of results easier, I added ft:mark. A collection with only a dozen texts of about 71 MB, with a full-text index for German, optimized, etc., works quite well. However, the example query needs more than 9 s, which is rather slow.
First of all, it might be interesting to hear what the query compiler does. Have you looked at the QueryInfo panel to check out if the full-text index is applied? If yes, you should find something like..
Compiling:
- ...
- applying full-text index
- ...
..in the info panel.
The interesting thing is:
for my original query
//(p|l) [text() contains text "Korb geben" using stemming using language "de"]
there is no information on compiling, only information on Timing, Result (number of results) and Queryplan.
If I change '(p|l)' to 'p', I get information on Compiling, but only:
Compiling:
- optimizing descendant-or-self step(s)
Result: root()/descendant::{http://www.tei-c.org/ns/1.0}p[text() contains text "Korb geben"]
Apparently, the index is not used.
Another hint: to enable index optimizations, your query...
//(p|l) [text() contains text "Korb geben" using stemming using language "de"]
..may have to be rewritten as follows:
//*[text() contains text "Korb geben"][self::p or self::l]
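As a sketch, the rewritten form might look like this as a complete query, with the options from the original query and ft:mark added back. The default element namespace declaration is an assumption, based on the TEI namespace that appears in the compiler output:

```xquery
(: Sketch only. The default element namespace is an assumption based on
   the TEI namespace shown in the compiler output. :)
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: The full-text predicate comes first, the element test second. :)
ft:mark(
  //*[text() contains text "Korb geben"
        using stemming using language "de"]
     [self::p or self::l]
)
```

Whether the index is actually applied can be checked in the info panel, as described above.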
OK, this results in reducing total time by half, but I see only:
Compiling:
- optimizing descendant-or-self step(s)
Result: root()/descendant::*[text() contains text "Korb geben"][self::{http://www.tei-c.org/ns/1.0}p or self::{http://www.tei-c.org/ns/1.0}l]
Also memory used is reduced a bit, so this definitely helps. However, if I include 'using stemming using language "de"', total time is almost the same.
I see no possibility to enforce using the index. I use BaseX 7.0.2, maybe this is a bug? I will try the Beta 7.1.
Best regards
Cerstin
-- Dr. phil. Cerstin Mahlow
Universität Basel Deutsches Seminar Nadelberg 4 4051 Basel Schweiz
Tel: +41 61 267 07 65 Fax: +41 61 267 34 40 Mail: cerstin.mahlow@unibas.ch Web: http://www.oldphras.net